<?xml version="1.0" encoding="utf-8"?>
<!-- This can be converted using the Web service at https://xml2rfc.tools.ietf.org/ -->
<!DOCTYPE rfc SYSTEM "rfc2629-xhtml.ent">
<?rfc toc="yes"?>
<!-- You want a table of contents -->
<?rfc symrefs="yes"?>
<!-- Use symbolic labels for references -->
<?rfc sortrefs="yes"?>
<!-- This sorts the references -->
<?rfc iprnotified="no" ?>
<!-- Change to "yes" if someone has disclosed IPR for the draft -->
<?rfc compact="yes"?>
<!-- This defines the specific filename and version number of your draft (and inserts the appropriate IETF boilerplate) -->
<rfc xmlns:xi="http://www.w3.org/2001/XInclude" category="std" docName="draft-carpenter-6man-rfc6874bis-03" ipr="trust200902" obsoletes="6874" updates="3986, 3987" submissionType="IETF" xml:lang="en" tocInclude="true" symRefs="true" sortRefs="true" version="3">
  <!-- xml2rfc v2v3 conversion 2.44.0 -->
<front>
    <title abbrev="IPv6 Zone IDs in URIs">Representing IPv6 Zone Identifiers in Address Literals and Uniform Resource Identifiers</title>
    <seriesInfo name="Internet-Draft" value="draft-carpenter-6man-rfc6874bis-03"/>
    
    <author initials="B." surname="Carpenter" fullname="Brian Carpenter">
      <organization abbrev="Univ. of Auckland"/>
      <address>
        <postal>
          <postalLine>School of Computer Science</postalLine>
          <postalLine>University of Auckland</postalLine>
          <postalLine>PB 92019</postalLine>
          <postalLine>Auckland 1142</postalLine>
          <postalLine>New Zealand</postalLine>
        </postal>
        <email>brian.e.carpenter@gmail.com</email>
      </address>
    </author>
    
    <author initials="S." surname="Cheshire" fullname="Stuart Cheshire">
      <organization abbrev="Apple Inc.">
        Apple Inc.
      </organization>
      <address>
        <postal>
          <postalLine>1 Infinite Loop</postalLine>
          <postalLine>Cupertino, CA 95014</postalLine>
          <postalLine>USA</postalLine>
        </postal>
        <email>cheshire@apple.com</email>
      </address>
    </author>
    

    <author fullname="Robert M. Hinden" initials="R" surname="Hinden">
      <organization>Check Point Software</organization>
      <address>
        <postal>
          <postalLine>959 Skyway Road</postalLine>
          <postalLine>San Carlos, CA 94070</postalLine>
          <postalLine>USA</postalLine>
        </postal>
        <phone/>
        <email>bob.hinden@gmail.com</email>
      </address>
    </author>
    
    
        <area>Internet</area>
    <workgroup>6MAN</workgroup>
    <!-- [rfced] Please insert any keywords (beyond those that appear in
the title) for use on http://www.rfc-editor.org/rfcsearch.html. 

<keyword>example</keyword>
-->

<abstract>
      <t>This document describes how the zone identifier of an IPv6 scoped address, defined
as &lt;zone_id&gt; in the IPv6 Scoped Address Architecture (RFC 4007), can be
represented in a literal IPv6 address and in a Uniform Resource Identifier 
that includes such a literal address. It updates the URI Generic Syntax
and Internationalized Resource Identifier
specifications (RFC 3986, RFC 3987) accordingly, and obsoletes RFC 6874.

</t>
    </abstract>
    
<note removeInRFC="true">
  <name>Discussion Venue</name>
      <t>Discussion of this document takes place on the
  6MAN mailing list (ipv6@ietf.org),
  which is archived at <eref target="https://mailarchive.ietf.org/arch/browse/ipv6/">https://mailarchive.ietf.org/arch/browse/ipv6/</eref>.</t>
</note>
    
    
    
    
  </front>
  <middle>
    <section anchor="intro" numbered="true">
      <name>Introduction</name>
      <t>The Uniform Resource Identifier (URI) syntax specification <xref target="RFC3986"/> defined how a
literal IPv6 address can be represented in the "host" part of a URI.
Two months later, the IPv6 Scoped Address Architecture specification <xref target="RFC4007"/> extended
the text representation of limited-scope IPv6 addresses such that a zone identifier may be concatenated
to a literal address, for purposes described in that specification. Zone identifiers are especially
useful in contexts in which literal addresses are typically used, for example, during fault diagnosis,
when it may be essential to specify which interface is used for sending to a link-local address. 
Zone identifiers have purely local meaning within the
node in which
they are defined, and are often the same as IPv6 interface names. They are
meaningless to any other node. Today, they are meaningful only when attached to addresses with less
than global scope, but it is possible that other uses might be defined in the future. </t>
      <t>The IPv6 Scoped Address Architecture specification <xref target="RFC4007"/> does not specify how zone identifiers are to be represented
in URIs. Practical experience has shown that such a representation is useful or necessary
in at least three use cases:</t>
<ol>
   <li>When using a web browser for simple debugging actions 
   involving link-local addresses on a host with more than one
   active link interface.</li>

   <li>When using a web browser to configure or reconfigure a
   device that has only a link-local address and whose only
   configuration tool is a web server, again from a host with
   more than one active link interface.</li>

   <li>When using an HTTP-based protocol for establishing link-local
   relationships, such as the Apple CUPS printing
   mechanism <xref target="CUPS"/>.</li>
</ol>

<t>Some operating systems and network APIs
support a default zone identifier as described in <xref target="RFC4007"/>;
others do not, and for them an appropriate URI syntax is particularly important.</t>

<t>In the past, some browser versions directly accepted the IPv6 Scoped Address
syntax <xref target="RFC4007"/>
for scoped IPv6 addresses embedded in URIs, i.e., they were coded to
interpret a "%" sign following the literal address as introducing a zone
identifier <xref target="RFC4007"/>, instead of introducing two hexadecimal
characters representing some percent-encoded octet <xref target="RFC3986"/>. Clearly, 
interpreting the "%" sign as introducing a zone identifier is very convenient
for users, although it is not supported by
the URI syntax <xref target="RFC3986"/> or the Internationalized Resource Identifier (IRI)
syntax <xref target="RFC3987"/>.
Therefore, this document updates RFC 3986 and RFC 3987 by adding syntax to allow a zone identifier
to be included in a literal IPv6 address within a URI. </t>
      <!-- It also updates <xref target="RFC4007"/>,
in particular by adding a second allowed delimiter for zone identifiers. -->

<t>It should be noted that in contexts other than a user interface, a zone identifier is mapped into
a numeric zone index or interface number. The MIB textual convention InetZoneIndex <xref target="RFC4001"/> and the
socket interface <xref target="RFC3493"/> define this as a 32-bit unsigned integer. The mapping
between the human-readable zone identifier string and the numeric value is a host-specific
function that varies between operating systems. The present document is concerned only
with the human-readable string. </t>
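<t>For illustration only, the mapping from a zone identifier string to a numeric zone index
might be sketched in Python as follows. The helper name zone_to_index is hypothetical, and the
convention that a purely decimal zone identifier already is an interface index is an assumption
made for this example; neither is defined by this document, and real hosts may behave differently.</t>
<sourcecode type="python"><![CDATA[
```python
import socket


def zone_to_index(zone_id: str) -> int:
    """Map a zone identifier string to a numeric zone index (host-specific)."""
    if zone_id.isascii() and zone_id.isdigit():
        # Assumption: a purely decimal zone identifier is already an index.
        return int(zone_id)
    # Otherwise treat it as an interface name known to the local stack.
    # socket.if_nametoindex raises OSError if no such interface exists.
    return socket.if_nametoindex(zone_id)
```
]]></sourcecode>
<t>On a typical host, a call such as zone_to_index("en1") or zone_to_index("eth0") would return
whatever index the local stack assigned to that interface, so the result is not portable between nodes.</t>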
      <t>Several alternative solutions were considered while this document was developed.
<xref target="AppendixA"/> briefly describes the various options and their advantages and disadvantages. </t>

<t>This document obsoletes its predecessor <xref target="RFC6874"/> by greatly
simplifying its recommendations and requirements for URI parsers.
Its effect on the formal URI syntax <xref target="RFC3986"/> is different
from that of RFC 6874.</t>

<t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 <xref target="RFC2119"/> <xref target="RFC8174"/> when, and
   only when, they appear in all capitals, as shown here.
</t>
</section>
<!-- intro -->

<section anchor="issues" numbered="true">
      <name>Issues with Implementing RFC 6874</name>
<t>Several issues prevented RFC 6874 from being implemented in browsers:</t>
<ol>
  <li>There was some disagreement with requiring percent-encoding of the "%" sign preceding a zone identifier.
  This requirement is dropped in the present document.</li>
  <li>The requirement to delete any zone identifier before emitting a URI from the host in an HTTP message
  was considered both too complex to implement and in violation of normal HTTP practice <xref target="RFC7230"/>.
  This requirement has been dropped from the present document.</li>
  <li>The suggestion to pragmatically allow a bare "%" sign when this would be unambiguous was considered both
  too complex to implement and confusing for users. This suggestion has been dropped from the present document
  since it is now irrelevant.</li>
</ol>
</section>
<!-- issues -->

<section anchor="spec" numbered="true">
      <name>Specification</name>
      <t>According to the IPv6 Scoped Address syntax <xref target="RFC4007"/>, a zone identifier is attached to the textual representation of an IPv6
address by concatenating "%" followed by &lt;zone_id&gt;, where &lt;zone_id&gt; is a string identifying the zone of the address.
However, the IPv6 Scoped Address Architecture specification gives no precise definition of the character set allowed in &lt;zone_id&gt;.
There are no rules or de facto standards for this. For example, the first Ethernet interface in a host
might be called %0, %1, %en1, %eth0, or whatever the implementer happened to choose. Also, %25
would be valid.</t>
      <t>In a URI, a literal IPv6 address is always embedded between "[" and "]". 
This document specifies how a &lt;zone_id&gt; can be appended to the address.
According to the text in Section 2.4 of <xref target="RFC3986"/>, "%" must be percent-encoded
as "%25" to be used as data within a URI.
However, in the formal ABNF syntax of RFC 3986, this only applies
where the "pct-encoded" element appears. For this reason, it is possible to extend the ABNF
such that the scoped address fe80::abcd%en1 would appear in a URI as http://[fe80::abcd%en1]
or https://[fe80::abcd%en1].
</t>


      <t>A &lt;zone_id&gt; MUST contain only ASCII characters classified 
as "unreserved" for use in URIs <xref target="RFC3986"/>. This excludes characters such as
"]" or even "%" that would complicate parsing.
The &lt;zone_id&gt; "25" cannot be forbidden since it is valid in some
operating systems, so a parser MUST NOT apply percent decoding to a URI such as http://[fe80::abcd%25].
</t>
      <t>If an operating system uses characters outside the
"unreserved" set in its zone or interface identifiers, those identifiers cannot be used in a URI.</t>
      <t>We now present the corresponding formal syntax. </t>
      <t>
The URI syntax specification <xref target="RFC3986"/> formally defines the
IPv6 literal format in ABNF <xref target="RFC5234"/> by the following rule:
</t>
      <artwork name="" type="" align="left" alt=""><![CDATA[
   IP-literal = "[" ( IPv6address / IPvFuture  ) "]"
]]></artwork>
      <t>To provide support for a zone identifier, 
the existing syntax of IPv6address is retained, and a zone identifier may be
added optionally to any literal address. This syntax allows flexibility for unknown future
uses. The rule quoted above from
<xref target="RFC3986"/> is replaced by three rules:</t>

<artwork name="" type="" align="left" alt=""><![CDATA[
   IP-literal = "[" ( IPv6address / IPv6addrz / IPvFuture  ) "]"
   
   ZoneID = 1*( unreserved )
   
   IPv6addrz = IPv6address "%" ZoneID
]]></artwork>

<t>This change also applies to <xref target="RFC3987"/>.</t>
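<t>As a non-normative illustration, the rules above might be applied by a parser roughly as in the
following Python sketch. The function name parse_ip_literal is hypothetical, IPvFuture literals are
deliberately not handled, and the standard ipaddress module is used as a stand-in for the IPv6address
rule, which it approximates but does not reproduce exactly.</t>
<sourcecode type="python"><![CDATA[
```python
import ipaddress
import re

# "unreserved" in RFC 3986: ALPHA / DIGIT / "-" / "." / "_" / "~"
_ZONE_ID = re.compile(r"[A-Za-z0-9._~-]+")


def parse_ip_literal(literal: str):
    """Split "[addr]" or "[addr%zone]" into (IPv6Address, zone or None).

    IPvFuture literals are not handled in this sketch.
    """
    if not (literal.startswith("[") and literal.endswith("]")):
        raise ValueError("IP-literal must be enclosed in brackets")
    body = literal[1:-1]
    # Split at the first "%"; a valid ZoneID can never contain "%" itself.
    addr_text, sep, zone = body.partition("%")
    if sep and not _ZONE_ID.fullmatch(zone):
        raise ValueError("ZoneID must be one or more unreserved characters")
    # No percent-decoding is applied: in "[fe80::abcd%25]" the ZoneID is "25".
    address = ipaddress.IPv6Address(addr_text)
    return address, (zone if sep else None)
```
]]></sourcecode>
<t>With this sketch, "[fe80::abcd%en1]" yields the address fe80::abcd with ZoneID "en1", while an
empty ZoneID or one containing a second "%" is rejected with ValueError.</t>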


      <t>This syntax fills the gap that is described at the end of Section 11.7 of
the IPv6 Scoped Address Architecture specification <xref target="RFC4007"/>. It replaces
and obsoletes the syntax in Section 2 of <xref target="RFC6874"/>.</t>
      <t>The established rules for textual representation of IPv6 addresses <xref target="RFC5952"/> SHOULD be applied in producing URIs. </t>
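<t>A producer of such URIs might normalise the address part before emitting it, as in the following
Python sketch. The function name format_scoped_uri is hypothetical, and the sketch assumes that the
compressed form produced by the standard ipaddress module matches the recommended textual
representation for ordinary addresses; it does not validate the ZoneID.</t>
<sourcecode type="python"><![CDATA[
```python
import ipaddress


def format_scoped_uri(scheme: str, address: str, zone_id: str) -> str:
    """Build scheme://[address%zone] with a compressed, lower-case address."""
    # ipaddress renders a lower-case form with the longest run of zero
    # fields replaced by "::"; assumed here to match RFC 5952 for
    # ordinary (non IPv4-mapped) addresses.
    canonical = ipaddress.IPv6Address(address).compressed
    # The "%" is emitted bare, as in the IPv6addrz rule.
    return f"{scheme}://[{canonical}%{zone_id}]"
```
]]></sourcecode>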
      <t>The URI syntax specification <xref target="RFC3986"/> states that URIs have a global scope, but that in some cases their
interpretation depends on the end-user's context. URIs including a ZoneID are
to be interpreted only in the context of the host at which they originate, since
the ZoneID is of local significance only. </t>
      <t>The IPv6 Scoped Address Architecture specification <xref target="RFC4007"/> offers guidance on how the ZoneID affects interface/address selection
inside the IPv6 stack. Note that the behaviour of an IPv6 stack, if it is passed a non-null
zone index for an address other than link-local, is undefined. </t> 
    </section>
    <!-- spec  -->

<section anchor="browsers" numbered="true">
      <name>URI Parsers</name>
      <t>This section discusses how URI parsers, such as those embedded in web browsers,
       might handle this syntax extension.
Unfortunately, there is no formal distinction between the syntax allowed
in a browser's input dialogue box and the syntax allowed in URIs. For this
reason, no normative statements are made in this section. </t>
<t>In practice, although parsers respect the established syntax, they are coded
pragmatically rather than being formally syntax-driven. Typically, IP address
literals are handled by an explicit code path. Parsers have been
inconsistent in providing for ZoneIDs. Most have no support, but there
have been examples of ad hoc support. For example, some versions of Firefox allowed the
use of a ZoneID preceded by a bare "%" character, but 
this feature was removed for consistency with established syntax <xref target="RFC3986"/>.
As another example, some
versions of Internet Explorer allowed use of a ZoneID preceded by a "%"
character encoded as "%25", still beyond the syntax allowed by the established
rules <xref target="RFC3986"/>. This
syntax extension is in fact used internally in the Windows operating system and some
of its APIs. </t>
      <t>It is desirable for all URI parsers to recognise a ZoneID according to the syntax
      defined in <xref target="spec"/>.
      </t>
      <t>URIs including
a ZoneID have no meaning outside the originating HTTP client node. However, in some use cases,
such as CUPS mentioned above, the URI 
will be reflected back to the client.

</t>
      <t>The various use cases for the ZoneID syntax will cause it to be entered in a browser's
input dialogue box. Thus, URIs including a ZoneID are unlikely to occur in
HTML documents. However, if they do (for example, in a diagnostic script coded in HTML),
it would be appropriate to treat them exactly as above. </t>
    </section>
    <!-- browsers -->

<section anchor="security" numbered="true">
      <name>Security Considerations</name>
      <t>The security considerations from the URI syntax specification <xref target="RFC3986"/> and the IPv6 Scoped Address Architecture specification <xref target="RFC4007"/> apply. 
In particular, this URI format creates a specific pathway by which a deceitful zone
index might be communicated, as mentioned in the final security consideration
of the Scoped Address Architecture specification.
</t>
      <t>To limit this risk, implementations MUST NOT allow use of this format except for
well-defined usages, such as sending to link-local addresses under prefix fe80::/10.
At the time of writing, this is the only well-defined usage known. </t>
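<t>A minimal, non-normative sketch of such a restriction in Python is shown below. The function name
zone_id_permitted is hypothetical, and the sketch covers only the link-local case named above; an
implementation supporting other well-defined usages would need a different test.</t>
<sourcecode type="python"><![CDATA[
```python
import ipaddress

# The link-local unicast prefix named in the text above.
LINK_LOCAL = ipaddress.IPv6Network("fe80::/10")


def zone_id_permitted(address: str) -> bool:
    """Return True if a ZoneID may accompany this address (link-local only)."""
    return ipaddress.IPv6Address(address) in LINK_LOCAL
```
]]></sourcecode>
<t>A parser could call such a check after splitting off the ZoneID and reject the URI if it returns False.</t>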

    </section>
    <!-- security -->



<section anchor="ack" numbered="true">
      <name>Acknowledgements</name>
      <t>
The lack of this format was first pointed out by Margaret Wasserman and
later by Kerry Lynn. A previous draft document by Bill
Fenner and <contact fullname="Martin Dürst"/> <xref target="LITERAL-ZONE"/> discussed this topic but was not finalised.
Michael Sweet and Andrew Cady explained some of the difficulties caused by RFC 6874. The ABNF syntax proposed above
was drafted by Andrew Cady.</t>
<t>Valuable comments and contributions were made by
Karl Auer,
Carsten Bormann,
Benoit Claise,
<contact fullname="Martin Dürst"/>,
Stephen Farrell,
Brian Haberman,
Ted Hardie,
Philip Homburg,
Tatuya Jinmei,
Yves Lafon,
Barry Leiba,
Radia Perlman,
Tom Petch,
Michael Richardson,
Tomoyuki Sahara,
Juergen Schoenwaelder,
Nico Schottelius,
Dave Thaler,
Martin Thomson,
Ole Troan,
and others.
      </t>
    </section>
    <!-- ack -->
    



</middle>
  <back>
    <references>
      <name>References</name>
      <references>
        <name>Normative References</name>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.3986.xml"/>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.3987.xml"/>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.4007.xml"/>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.5952.xml"/>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.2119.xml"/>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.8174.xml"/>
        <!-- &RFC5234; -->
<xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.5234.xml"/>
      </references>
      <references>
        <name>Informative References</name>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.3493.xml"/>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.4001.xml"/>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.6874.xml"/>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.7230.xml"/>
   <reference anchor="LITERAL-ZONE">
          <front>
            <title>Formats for IPv6 Scope Zone Identifiers in Literal Address Formats</title>
            <author initials="B." surname="Fenner" fullname="B. Fenner">
              <organization/>
            </author>
            <author surname="Dürst" initials="M." fullname="Martin Dürst" asciiSurname="Duerst" asciiFullname="Martin Duerst">
              <organization/>
            </author>
            <date month="October" year="2005"/>
          </front>
          <seriesInfo name="Work in" value="Progress"/>
    </reference>
        
    <reference anchor="CUPS" target="https://www.cups.org/">
          <front>
            <title>CUPS open source printing system</title>
            <author fullname="Apple"/>
            <date year="2021"/>
          </front>
    </reference>
        
      </references>
    </references>
    <section anchor="AppendixA" numbered="true">
      <name>Options Considered</name>

      <t>The syntax defined above allows a ZoneID to be added to any
IPv6 address. The 6man WG discussed and rejected an alternative in which
the existing syntax of IPv6address would be extended by an option
to add the ZoneID only for the case of link-local addresses. It
was felt that the solution presented in this document offers more flexibility for
future uses and is more straightforward to implement.
</t>
      <t>The various syntax options considered are now briefly described.</t>
      <ol spacing="normal" type="1"><li>
          <t>Leave the problem unsolved.
</t>
          <t>
This would mean that per-interface diagnostics would still have to be performed using ping or ping6:
</t>
          <t>
   ping fe80::abcd%en1
</t>
          <t> 
Advantage: works today.
</t>
          <t> 
Disadvantage: less convenient than using a browser. Leaves some use cases unsatisfied.
</t>
          <t/>
        </li>
        <li>
          <t>Simply use the percent character:
</t>
          <t> 
   http://[fe80::abcd%en1]
</t>
          <t>
Advantage: allows use of browser; allows cut and paste.
</t>
          <t>
Disadvantage: requires code changes to all URI parsers.
</t>
<t>
This is the option chosen for standardisation.
</t>
        </li>
        <li>
          <t>Simply use an alternative separator:
</t>
          <t> 
    http://[fe80::abcd-en1]
</t>
          <t>
Advantage: allows use of browser; simple syntax.
</t>
          <t> 
Disadvantage: requires all IPv6 address literal parsers and
generators to be updated in order to allow simple cut and paste; inconsistent
with existing tools and practice.
</t>
          <t> 
Note: The initial proposal for this choice was to use an underscore
as the separator, but it was noted that this becomes effectively invisible when
a user interface automatically underlines URLs.
</t>
          <t/>
        </li>
        <li>
          <t>Simply use the "IPvFuture" syntax left open in RFC 3986:
</t>
          <t>
    http://[v6.fe80::abcd_en1]
</t>
          <t>
Advantage: allows use of browser.
</t>
          <t>
Disadvantage: ugly and redundant; doesn't allow simple cut and paste.
</t>
          <t/>
        </li>
        <li>
          <t>Retain the percent character already specified for introducing
       zone identifiers for IPv6 Scoped Addresses <xref target="RFC4007"/>, and then
       percent-encode it when it appears in a URI, according to the
       already-established URI syntax rules <xref target="RFC3986"/>:
</t>
          <t>
   http://[fe80::abcd%25en1]
</t>
          <t>
Advantage: allows use of browser; consistent with general URI
       syntax.
</t>
          <t>
Disadvantage: somewhat ugly and confusing; doesn't allow simple
       cut and paste.
</t>
          
        </li>
      </ol>
    </section>
    
    <section anchor="changes" numbered="true" removeInRFC="true">
      <name>Change log</name>
      <ul>
      
      <li><t>draft-carpenter-6man-rfc6874bis-03, 2022-02-08:</t>
       <ul>
       <li>Changed to bare % signs.</li>
       <li>Added IRIs, RFC3987</li>
       <li>Editorial fixes</li></ul></li>      
      
      <li><t>draft-carpenter-6man-rfc6874bis-02, 2021-12-18:</t>
       <ul>
       <li>Give details of open issues</li>
       <li>Update authorship</li>
       <li>Editorial fixes</li></ul></li>
       
      <li><t>draft-carpenter-6man-rfc6874bis-01, 2021-07-11:</t>
       <ul><li>Added section on issues with RFC6874</li>
       <li>Removed suggested heuristic for bare % signs</li>
       <li>Editorial fixes</li></ul></li>
       
      <li><t>draft-carpenter-6man-rfc6874bis-00, 2021-07-05:</t>
       <ul><li>Initial version</li></ul></li>
      </ul>
    </section>
  </back>
</rfc>
