SASL Authentication for SIP

OpenFortress BV

Haarlebrink 5 Enschede Overijssel 7544 WP The Netherlands rick@openfortress.nl

Many protocols benefit from "pluggable" authentication choice as a result of SASL authentication. In the Session Initiation Protocol, the independent branch of HTTP Authentication has been elected. Recent progress has been made in bringing SASL to HTTP, but SIP has its own special considerations and needs its own embedding to gain the flexibility of SASL.

SASL authentication has long been used in protocol design to gain flexible authentication without changes to the protocol that carries it. This powerful idea is still being introduced in new protocols. SIP has fallen behind by choosing the HTTP framework, which is not actively developed because much of HTTP authentication has shifted to the application layer. A recent proposal was made to add SASL authentication to HTTP, and a variant of that is herein proposed for SIP.

A few aspects of SASL are especially interesting. Its support of Kerberos is useful in the many organisations that exercise central identity management under a single sign-on system. Any protocol that cannot use Kerberos interferes with such operational policies. Recent SASL mechanisms of cryptographic interest are the SCRAM family and the OPAQUE mechanism. Finally, work has been done to proxy SASL to a client's own Diameter backend to allow Realm Crossover. All these would benefit SIP usage profiles under the authentication profile specified herein.

SIP differs from HTTP and connection-bound protocols in a few important ways. First, SIP messages are often forwarded along a path, and this is not done transparently. Second, any TLS protection is tied to a single hop along this path and cannot be used to protect the SASL mechanism from transmission in plaintext. Third, SIP messages must be considered separate from their carrier connection but are tied together with identities, both for transactions and for dialogs. This mix of properties necessitates a separate profile to use SASL under SIP.

SIP authentication inherits most of its properties from HTTP authentication [Section 22 of ]. This includes the 401 and 407 responses and the corresponding headers, both in these responses and in the subsequent new requests that attempt to answer the authentication request. SIP adopts Digest authentication but abolishes Basic authentication. The general SIP grammar [Section 25 of ] welcomes other-challenge and other-response forms, which accept any token as an auth-scheme.

The auth-scheme values are not formally linked to HTTP, but are presumably inherited, along with their accompanying grammars. We hereby explicitly import the specifications for HTTP-SASL [draft-vanrein-httpauth-sasl] for use with SIP. This specifically adds the auth-scheme token SASL and the auth-param-name tokens realm, mech, c2s, s2c, c2c and s2s. The conditions under which each of these MAY or MUST be used follow HTTP-SASL [Section 2.1 of draft-vanrein-httpauth-sasl].
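As an illustration of the other-challenge form, the following Python sketch splits a challenge value into its auth-scheme and auth-param pairs. The header value, the function name parse_challenge and the reading of mech as a space-separated list of mechanism names are assumptions made for this example, and the token character set is simplified; HTTP-SASL remains authoritative for the parameter syntax.

```python
import re

# auth-param = auth-param-name EQUAL ( token / quoted-string )
# Simplified token character set; an assumption for this sketch.
_PARAM = re.compile(
    r'\s*([A-Za-z0-9!%*_+~.-]+)\s*=\s*'
    r'(?:"((?:[^"\\]|\\.)*)"|([^\s,"]+))\s*(?:,|$)'
)


def parse_challenge(value: str):
    """Split 'SASL realm="...", mech="..."' into (auth-scheme, params)."""
    scheme, _, rest = value.strip().partition(" ")
    params = {}
    pos = 0
    while pos < len(rest):
        match = _PARAM.match(rest, pos)
        if match is None:
            raise ValueError(f"malformed auth-param at offset {pos}")
        name, quoted, token = match.groups()
        if quoted is not None:
            # Undo backslash escapes inside a quoted-string.
            params[name.lower()] = re.sub(r"\\(.)", r"\1", quoted)
        else:
            params[name.lower()] = token
        pos = match.end()
    return scheme, params


# Hypothetical challenge value, as it might appear in a 401 response.
scheme, params = parse_challenge(
    'SASL realm="example.com", mech="SCRAM-SHA-256 GS2-KRB5-PLUS"'
)
mechanisms = params["mech"].split()
```

A receiver would only proceed with SASL processing when the returned scheme equals SASL, and leave Digest challenges to its existing code path.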

Most SASL protocols have a direct connection between client and server, and protect it with TLS. This is not the case for SIP, which may freely route its messages. It is also not like HTTP, in the sense that requests and responses may be even more detached when they are transmitted over UDP. Some things about SIP are actually simpler than in HTTP, namely the standardised notions of identities. There are dialog identities (Call-ID: value, From: tag and, once available, To: tag) and transaction identities (Via: branch) that help to see connections between individual SIP messages. They are not designed to be cryptographically secure, but as unique identifiers that may be used in nonces.

Each party may approve of more flexibility than a literal match between the authenticated identity and the remote header value. Group members may authenticate as a group, for example, and users may be replaced by an alias. More generally, an access control mechanism may be used to decide on identity substitutions, possibly based on the relation between remote and local identity.

The identities validated are the ones in the From: and To: headers, as client and server identities respectively. The precise parts of the SIP grammar [Section 25 of ] are the user or telephone-subscriber in the addr-spec non-terminal in these headers. The host (which often ends up being a domain) is used as the realm for SIP authentication. It is worth noting that the host/domain for From: and To: may differ, especially when SIP proxies connect over the Internet. In such cases, the From: host/domain is used as the client realm and the To: host/domain is used as the server realm. The implications for authentication access realms are discussed in .
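The mapping from headers to identities and realms can be sketched in Python as follows. The function names and the example header values are hypothetical, and the regular expression is a deliberate simplification: it handles sip: and sips: URIs only, and ignores tel: URIs and IPv6 host literals.

```python
import re

# Simplified match for sip:/sips: URIs; an assumption for this sketch.
_URI = re.compile(r"<?(sips?):(?:([^@;>\s]+)@)?([^;:>\s]+)", re.IGNORECASE)


def identity_from_header(value: str):
    """Return (user, realm) taken from a From: or To: header value."""
    match = _URI.search(value)
    if match is None:
        raise ValueError("no sip: or sips: URI found")
    _scheme, userinfo, host = match.groups()
    # userinfo may carry an optional ":password" part, which is dropped.
    user = userinfo.split(":", 1)[0] if userinfo else None
    return user, host.lower()


def sasl_identities(from_value: str, to_value: str):
    """From: yields the client identity and realm, To: the server ones."""
    return {
        "client": identity_from_header(from_value),
        "server": identity_from_header(to_value),
    }


ids = sasl_identities(
    '"Alice" <sip:alice@atlanta.example.com>;tag=1928301774',
    "Bob <sip:bob@biloxi.example.org:5060>",
)
# ids["client"] is the (user, realm) pair for From:, ids["server"] for To:
```

In this example the client realm and the server realm differ, which is the situation described above for proxies that connect over the Internet.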

Some SASL mechanisms are unsuitable for transfer over a plaintext channel. Such mechanisms SHOULD NOT be used with SIP, not even when protected by TLS, because the messages are often relayed; even when all legs on their path are encrypted channels, the SASL tokens are revealed to all intermediate proxies. This concern is also reflected in the original SIP authentication schemes, which inherit Digest from HTTP authentication, but not Basic. A further concern is that rogue proxies could insert a SASL mechanism of lower quality in the mech parameter while the message passes through, so clients MUST be able to filter the list of mechanisms, for example with conservative built-in lists or through configuration settings by the user.

The restriction to plaintext-ready SASL mechanisms does not mean that secure transports are no longer useful. They offer other values, such as privacy of SIP messages and the binding of these messages to the authentication exchange. This is a useful hop-to-hop property. It is worth noting, however, that TLS is not the only mechanism to establish these properties; a SIP connection may encapsulate messages in a secure body to achieve the same effect.
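Client-side filtering of the offered mechanisms can be as simple as the Python sketch below. The contents of ACCEPTABLE are an illustrative assumption rather than a recommendation, and the offered list is assumed to arrive as a space-separated string of mechanism names.

```python
# Illustrative built-in list, in order of client preference (assumption).
ACCEPTABLE = (
    "SCRAM-SHA-256-PLUS",
    "SCRAM-SHA-256",
    "GS2-KRB5-PLUS",
    "GS2-KRB5",
)


def filter_mechanisms(offered: str, acceptable=ACCEPTABLE):
    """Keep only offered mechanisms that the client is willing to run.

    The result follows the client's preference order, so a mechanism
    inserted by an intermediate cannot move to the front of the list.
    """
    offered_set = set(offered.split())
    return [mech for mech in acceptable if mech in offered_set]


usable = filter_mechanisms("PLAIN SCRAM-SHA-256 ANONYMOUS")
```

An empty result means that the client has no acceptable mechanism in common with the challenge and should not respond with SASL credentials.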

All SASL mechanisms offer client authentication; some offer mutual authentication and thereby also authenticate the server. Examples of the latter include GSS-API for Kerberos5, GS2-KRB5, GS2-KRB5-PLUS, EXTERNAL based on a mutually authenticating TLS connection, the SCRAM-* family, SXOVER-PLUS with suitable access control, and OPAQUE. SIP clients are those parties that send a request; this is fixed per transaction, but roles may vary during a dialog. When a request receives a 401 or 407 response to call for client authentication, then authentication always validates the client identity, but mutual authentication would validate the server in the same authentication process. Authentication never covers more than a SIP dialog, and may even be considered valid for a single transaction only (which makes cryptographic sense for unencrypted transports). When a dialog is covered, mutual authentication may be considered to already authenticate the other side.

SASL mechanism names have a reserved -PLUS ending to indicate Channel Binding. This facility is used to avoid passing of an authentication exchange from one channel onto another. Protocols that employ a single TLS connection can derive a channel binding value from the master key. For SIP, this might work as a hop-to-hop approach, but in general SIP needs end-to-end negotiation, at least for generally useful 401 response handling.

Channel Binding has progressed from simple endpoint data such as addresses and ports to much stronger cryptographic derivatives. These have the benefit of providing sufficient entropy that cannot be changed by a rogue intermediate. In SIP, both end points supply random tags and identifiers that involve entropy, and pass them in a textual form which is not size-constrained. This allows any amount of entropy desired.

Dialogs in SIP are identified with the Call-ID field value, the From: tag and, once set in a response, the To: tag. This allows a good level of entropy to describe the identity of a dialog with input from both endpoints. Transactions are identified by a Via: branch value, which may be added to the dialog identities because a transaction is always part of a dialog.

The values are identities, and they are unrelated to transport endpoint coordinates. As a result, they can be designed as non-repetitive strings and they serve as selectors for the active transactions. This qualifies the SIP identity information as useful channel binding values. It is vital that both sides can supply entropy, which is addressed by a first response. As long as SASL is initiated with a 401 response this is taken care of.

Channel binding can be used to incorporate more data into an authentication exchange, thereby validating the said data. For SIP, that is likely the SDP attachment. If the SDP information from both ends is to be integrated, then the first 401 must already add the SDP offer to bind.
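
Based on the aforementioned information, the following forms of channel binding are defined, each based on a different list of values:

sip-dialog: Call-ID, From: tag, To: tag
sip-transaction: Call-ID, From: tag, To: tag, Via: branch
sip-dialog-sdp: Call-ID, From: tag, To: tag, request SDP, [response SDP]
sip-transaction-sdp: Call-ID, From: tag, To: tag, Via: branch, request SDP, [response SDP]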

The channel binding value starts with the Channel Binding type and a colon (ASCII 0x3a) [Section 2.1 of ], followed by the binary value of the SHA-256 hash over a character sequence. The sequence is composed of the following elements, in this order, as far as they are covered by the Channel Binding type:

1. the contents of the callid terminal, followed by an ASCII space;
2. the contents of the token terminal in the tag-param in the from-param, followed by an ASCII space;
3. the contents of the token terminal in the tag-param in the to-param, followed by an ASCII space;
4. the token terminal in the via-branch of the first Via: header, followed by an ASCII space;
5. the media-type non-terminal in the Content-Type header, an ASCII space, the shortest possible 1*DIGIT value for the value of the Content-Length header, an ASCII space and the body from the original request in the transaction that is being authenticated;
6. the media-type non-terminal in the Content-Type header, an ASCII space, the shortest possible 1*DIGIT value for the value of the Content-Length header, an ASCII space and the body from the response to the original request in the transaction that is being authenticated.

The bodies are always specific to the authenticating transaction, even when Channel Binding covers a dialog. This avoids confusion when re-authenticating for a re-INVITE.

TODO: Are all these Channel Binding types practical? (1) The Via: branch differs at different parts of the path. (2) It is difficult to speak of a body when referencing a dialog, unless it is for the transaction that is authenticated; does this not mean that the body is tied to transaction authentication?

Note that only certain parts of the SIP message content are included in the Channel Binding information. This reflects the protocol support for changes along its path. The body, however, is only subjected to rewrites in special circumstances (SDP rewrites for traffic over IPv4) but will otherwise be sent as an opaque end-to-end object. The use of a secure transport is the customary approach to protect the unbound information from being changed in transit.
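The computation can be sketched in Python for the most inclusive case, in which dialog identities, the Via: branch and both bodies are covered. The function names and the sample values are hypothetical. The sketch follows the literal reading of the text above, in which each identity is followed by one ASCII space and no separator is placed between the request body and the media type of the response; whether the shorter binding types end in a trailing space is not settled here.

```python
import hashlib


def body_part(media_type: str, body: bytes) -> bytes:
    """media-type SP shortest decimal Content-Length SP body."""
    length = str(len(body)).encode("ascii")
    return media_type.encode("ascii") + b" " + length + b" " + body


def binding_input(call_id, from_tag, to_tag, via_branch,
                  request, response=None) -> bytes:
    """Build the character sequence that is hashed.

    request and response are (media_type, body_bytes) pairs; the
    response pair is optional because it may not be known yet.
    """
    identities = (call_id, from_tag, to_tag, via_branch)
    data = b"".join(v.encode("ascii") + b" " for v in identities)
    data += body_part(*request)
    if response is not None:
        data += body_part(*response)
    return data


def channel_binding(cb_type: str, hashed_input: bytes) -> bytes:
    """Channel Binding type, a colon, then the binary SHA-256 digest."""
    digest = hashlib.sha256(hashed_input).digest()
    return cb_type.encode("ascii") + b":" + digest


# Hypothetical identities and SDP body, for illustration only.
seq = binding_input(
    "a84b4c76e66710@pc33.example.com", "1928301774", "a6c85cf",
    "z9hG4bK776asdhds", ("application/sdp", b"v=0\r\n"),
)
cb_value = channel_binding("sip-transaction-sdp", seq)
```

Both endpoints need to arrive at the same byte sequence, so any rewrite of a covered element along the path makes the binding fail, which is the intended effect.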

SASL can negotiate security layers, which usually means that it can facilitate integrity and/or confidentiality based on the cryptographic material exchanged during authentication. A common example is the GSS-API implementation of Kerberos5, which passes a session key encrypted with a Ticket-specific secret. Subsequent GSS_Wrap and GSS_Unwrap calls, or GSS_GetMIC and GSS_VerifyMIC calls, map plaintext to wire messages and back.

The use of cryptographic content in a SIP message is generally confined to the body. When a suitable media type is registered to convey the confidentiality and/or integrity wrapper of the body, then an endpoint may choose to use this to transmit a more secure form of a body, or a full SIP message. A typical example of this mechanism is the S/MIME content in a body with Content-Type application/pkcs7-mime [Section 23 of ]. This specification introduces a media type message/gssapi, to capture bodies with one wire-format message as may be prepared with GSS-API calls like GSS_Wrap() or GSS_GetMIC(). SASL mechanisms that could use this media type include GSS-API, GS2-KRB5, GS2-KRB5-PLUS and SXOVER-PLUS.

When a proxy or server offers a SASL mechanism that derives a security layer, then the authenticating proxy or client MAY transmit SIP messages with a matching body media type after the authentication succeeds. When a proxy or client chooses to authenticate with a SASL mechanism that derives a security layer, then the proxy or server MAY transmit SIP messages with a matching body media type after the authentication succeeds.

Security layers negotiated with SASL can take various forms. Particularly interesting for SIP could be key derivation, because SIP initiates independently operated sessions which often benefit from keys to improve security. A general mechanism that works well for key derivation is to present a string label and an optional salt to extract distinct key material. SIP applications in want of such a facility would define such a string label and salt computation, and choose whether to prefer or even require SASL mechanisms that allow key derivation. When an application and SASL mechanism meet, then key derivation can be used, further processing the outcome as per the definitions of the application.

Such mechanisms are not enforced for SIP-SASL in a general sense, because that could lead to incompatibility issues, but applications or application profiles with suitable negotiation parameters may build upon key derivation facilities. Applications and application profiles MAY choose to extract key material from SASL mechanisms that were not originally designed with such facilities, but that happen to have a more extensive implementation. SASL mechanism negotiation would then restrict the mechanisms offered and accepted if applications depend on such extended forms of key derivation.

When elected, the general pattern starts with HKDF-Extract(salt,IKM) [Section 2.2 of ] with salt set to the hash input for sip-dialog Channel Binding and IKM set to a shared secret; in password-based mechanisms this is the password and in decryption challenges it is the encrypted material. Following this, the general pattern uses HKDF-Expand(PRK,info,L) [Section 2.3 of ] where the info consists of the string label and, if the application defines a salt for this string label, an additional NUL byte (0x00) and the salt bytes. The HMAC algorithm underneath HKDF-Extract and HKDF-Expand will be based on a hash used in the SASL mechanism if it uses one; otherwise it will be SHA-256.
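The general pattern can be sketched in Python with only the standard library. The function derive_key, its parameter names and the sample label are assumptions for illustration; the two HKDF steps follow the usual Extract-then-Expand construction over HMAC, with SHA-256 as the fallback hash.

```python
import hashlib
import hmac


def hkdf_extract(salt: bytes, ikm: bytes, hash_name: str = "sha256") -> bytes:
    """HKDF-Extract(salt, IKM): PRK = HMAC(salt, IKM)."""
    if not salt:
        # An absent salt is replaced by a string of zero bytes.
        salt = bytes(hashlib.new(hash_name).digest_size)
    return hmac.new(salt, ikm, hash_name).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int,
                hash_name: str = "sha256") -> bytes:
    """HKDF-Expand(PRK, info, L): chained HMAC blocks, cut to L bytes."""
    hash_len = hashlib.new(hash_name).digest_size
    if length > 255 * hash_len:
        raise ValueError("requested output is too long")
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]),
                         hash_name).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_key(cb_hash_input: bytes, shared_secret: bytes, label: bytes,
               length: int, app_salt: bytes = None,
               hash_name: str = "sha256") -> bytes:
    """Salt is the sip-dialog hash input; info is label [NUL salt]."""
    prk = hkdf_extract(cb_hash_input, shared_secret, hash_name)
    info = label if app_salt is None else label + b"\x00" + app_salt
    return hkdf_expand(prk, info, length, hash_name)


# Hypothetical inputs: dialog identities, a password and a label.
key = derive_key(b"a84b4c76e66710@pc33.example.com 1928301774 a6c85cf",
                 b"correct horse", b"example-media-key", 32)
```

Because the dialog identities serve as salt, the same shared secret and label yield different key material in every dialog.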

The term Realm Crossover refers to techniques [draft-vanrein-internetwide-realm-crossover] that allow existing protocols to benefit from identity providers under operational control of a user's domain. These techniques minimise the dependency on third parties while maximising the control of domains and their users over their online identity. Identities generally take a form user@domain.name which, in the case of SIP, can simply be prefixed with a sip: or sips: scheme. Techniques for which Realm Crossover can be achieved are:

SASL. This uses the SXOVER-PLUS mechanism as a secure wrapper for another SASL mechanism, relaying it to a Diameter backend, which in turn can relay the SASL content to the client's realm, where a Diameter server offers identity assurance. The Diameter cross-link assures a domain.name based on TLS and DNSSEC/DANE, and the client's server assures a userid based on the client's SASL content, so that a userid@domain.name can be formed on the relying Diameter node and fed back locally to the relying service.

Kerberos. This already facilitates Realm Crossover, but it is customarily founded on static keys. The KXOVER protocol allows Key Distribution Centers to exchange a dynamic key that can be used for a few weeks to cross over between their realms. KXOVER founds its security on TLS and DNSSEC/DANE.

Client certificates. Work is in progress on Client DANE. This may be used to publish a Root CA for a domain's client identities. After obtaining this with DNSSEC assurance from the client domain, a client certificate can be validated by any other party.

It is common for SIP messages to be internally routed from one endpoint to the public proxy for its realm, then cross over to the proxy of another realm, which internally routes them to another endpoint. The realm-crossing link is a sensitive place in terms of privacy, and may benefit from encrypting the SIP message. The 407 mechanism might be used to validate the From: and To: identities for the crossover, while encrypting the SDP content involved in the crossover.

Such proxy-to-proxy encryption may be used to connect SIP proxies for prolonged periods, independently of the SIP messages that pass as SIP messages contained in secured bodies. This can be set up with proxy hostnames in the From: and To: headers, and be used when SRV records indicate those hostnames. The setup would be an INVITE with SASL authentication, and the dialog is used until one party sends BYE or a transmission results in a 481 Call/Transaction Does Not Exist; the latter may be seen as a suggestion to send another INVITE to the responding proxy.

The longer messages caused by security wrapping, combined with MTU considerations, arguably make it better to use a transport with more control than UDP. When domains connect via their externally visible proxies, SCTP should be a reasonable default.

The mechanism of passing SIP messages in secure bodies need not be limited to Realm Crossover; it may also be used for internal security, such as for a device registration with its domain proxy. To combat MTU considerations it may be useful to forego UDP, but clients may be restricted to TCP and, given the low traffic rate and the higher administrative overhead of SCTP, this would be a good choice for individual registrations. For SIP trunks, the use of SCTP may still be a feasible option. TODO: Is this a good idea though? How would it relate to the REGISTER usage pattern? It does add proper security in a part of the infrastructure that usually is only mildly secure.

TODO: Some points made in the foregoing: plaintext sensitivity, binding of authentication to context, mutual-or-not, encrypting wrappers for SIP messages.

IANA is requested to register the following values in the Media Types registry:

IANA is requested to register the following types of channel binding in the Channel-Binding Types registry:

Subject: Registration of channel binding sip-dialog
Channel-binding unique prefix: sip-dialog
Channel-binding type: unique
Channel type: SIP
Published specification:
Channel binding is secret: no
Description: Call-ID, From: tag, To: tag
Intended usage: COMMON
Person and email:
Owner/Change controller: IESG

Subject: Registration of channel binding sip-transaction
Channel-binding unique prefix: sip-transaction
Channel-binding type: unique
Channel type: SIP
Published specification:
Channel binding is secret: no
Description: Call-ID, From: tag, To: tag, Via: branch
Intended usage: COMMON
Person and email:
Owner/Change controller: IESG

Subject: Registration of channel binding sip-dialog-sdp
Channel-binding unique prefix: sip-dialog-sdp
Channel-binding type: unique
Channel type: SIP
Published specification:
Channel binding is secret: no
Description: Call-ID, From: tag, To: tag, request SDP, [response SDP]
Intended usage: COMMON
Person and email:
Owner/Change controller: IESG

Subject: Registration of channel binding sip-transaction-sdp
Channel-binding unique prefix: sip-transaction-sdp
Channel-binding type: unique
Channel type: SIP
Published specification:
Channel binding is secret: no
Description: Call-ID, From: tag, To: tag, Via: branch, request SDP, [response SDP]
Intended usage: COMMON
Person and email:
Owner/Change controller: IESG

Since there is no separate registry for SIP authentication schemes, no work on this is requested from IANA.

Thanks to NLNet for funding this work.