<?xml version="1.0" encoding="utf-8"?>
<?xml-model href="rfc7991bis.rnc"?>  <!-- Required for schema
      validation and schema-aware editing -->

<!DOCTYPE rfc [
  <!ENTITY filename "draft-eastlake-bess-evpn-vxlan-bypass-vtep-12">
  <!ENTITY nbsp     "&#160;">
  <!ENTITY zwsp     "&#8203;">
  <!ENTITY nbhy     "&#8209;">
  <!ENTITY wj       "&#8288;">
]>
<!-- <?xml-stylesheet type="text/xsl" href="rfc2629.xslt" ?> -->
<!-- This third-party XSLT can be enabled for direct transformations
in XML processors, including most browsers -->
<!-- If further character entities are required then they should be
added to the DOCTYPE above. Use of an external entity file is not
recommended. -->
<?rfc strict="yes" ?>
<?rfc toc="yes"?>

<rfc
  xmlns:xi="http://www.w3.org/2001/XInclude"
  category="std"
  docName="&filename;"
  ipr="trust200902"
  obsoletes=""
  submissionType="IETF"
  xml:lang="en"
  version="3">
<!--
    * docName should be the name of your draft * category should be
    one of std, bcp, info, exp, historic * ipr should be one of
    trust200902, noModificationTrust200902, noDerivativesTrust200902,
    pre5378Trust200902 * updates can be an RFC number as NNNN *
    obsoletes can be an RFC number as NNNN
-->


<!-- ____________________FRONT_MATTER____________________ -->
<front>
   <title abbrev="EVPN VXLAN Bypass VTEP">EVPN VXLAN Bypass VTEP</title>
   <!--  The abbreviated title is required if the full title is
        longer than 39 characters -->

<seriesInfo name="Internet-Draft"
		value="&filename;"/>
    
<author fullname="Donald Eastlake" initials="D." surname="Eastlake">
  <organization>Futurewei Technologies</organization>
  <address>
    <postal>
      <street>2386 Panoramic Circle</street>
      <city>Apopka</city>
      <region>FL</region>
      <code>32703</code>
      <country>USA</country>
    </postal>
    <phone>+1-508-333-2270</phone>
    <email>d3e3e3@gmail.com</email>
    <email>donald.eastlake@futurewei.com</email>
  </address>
</author>

<author fullname="Zhenbin Li" initials="Z." surname="Li">
  <organization>Huawei Technologies</organization>
  <address>
    <postal>
      <street>Huawei Building, No.156 Beiqing Road</street>
      <city>Beijing</city>
      <code>100095</code>
      <country>China</country>
    </postal>
    <email>lizhenbin@huawei.com</email>
  </address>
</author>

<author fullname="Shunwan Zhuang" initials="S." surname="Zhuang">
  <organization>Huawei Technologies</organization>
  <address>
    <postal>
      <street>Huawei Building, No.156 Beiqing Road</street>
      <city>Beijing</city>
      <code>100095</code>
      <country>China</country>
    </postal>
    <email>zhuangshunwan@huawei.com</email>
  </address>
</author>

<author fullname="Russ White" initials="R." surname="White">
  <organization>Juniper Networks</organization>
  <address>
    <email>russ@riw.us</email>
  </address>
</author>

<date year="2023" month="May" day="29"/>

  <!-- Meta-data Declarations -->

<area>Routing</area>

<workgroup>BESS Working Group</workgroup>
    <!-- WG name at the upperleft corner of the doc, IETF is fine for
       individual submissions.  If this element is not present, the
       default is "Network Working Group", which is used by the RFC
       Editor as a nod to the history of the IETF. -->

<keyword></keyword>
  <!-- Keywords will be incorporated into HTML output files in a meta
       tag but they have no effect on text or nroff output. If you
       submit your draft to the RFC Editor, the keywords will be used
       for the search engine. -->

<abstract>
  <t>A principal feature of EVPN is the ability to support multihoming
  from customer edge (CE) equipment to multiple provider edge (PE)
  devices with all-active links. This document specifies a mechanism to
  simplify PEs used with VXLAN tunnels and enhance VXLAN active-active
  reliability.</t>
</abstract>
  
</front>


<!-- ***** MIDDLE MATTER ***** -->

<middle>
  
<section anchor="Introduction">  <!-- 1. -->
  <name>Introduction</name>

<t>A principal feature of EVPN is the ability to support multihoming from
customer edge (CE) equipment to multiple provider edge (PE) devices
with links used in the all-active redundancy mode. In that mode,
a device is multihomed to a group of two or more PEs, and all PEs
in such a redundancy group can forward traffic to/from the multihomed
device or network for a given VLAN <xref target="RFC7209"/>. This document specifies a
VXLAN gateway mechanism to simplify PE processing in the multihomed
case and enhance VXLAN active-active reliability.</t>

<section>  <!-- 1.1 -->
  <name>Terminology and Acronyms</name>

<t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in BCP
14 <xref target="RFC2119"/> <xref target="RFC8174"/> when, and only when, they appear in all
capitals, as shown here.</t>

<t>This document uses the following acronyms and terms:</t>

<dl>
<dt>All-Active Redundancy Mode</dt><dd>- When a device is multihomed to a group of
two or more PEs and when all PEs in such a redundancy group can forward
traffic to/from the multihomed device or network for a given VLAN.</dd>

<dt>BUM</dt><dd>- Broadcast, Unknown unicast, and Multicast.</dd>

<dt>CE</dt><dd>- Customer Edge equipment.</dd>

<dt>DCI</dt><dd>- Data Center Interconnect.</dd>

<dt>ESI</dt><dd>- Ethernet Segment Identifier - A unique non-zero identifier that
identifies an Ethernet segment.</dd>

<dt>NVE</dt><dd>- Network Virtualization Edge.</dd>

<dt>PE</dt><dd>- Provider Edge equipment.</dd>

<dt>Single-Active Redundancy Mode</dt><dd>- When a device or a network is
multihomed to a group of two or more PEs and when only a single PE in
such a redundancy group can forward traffic to/from the multihomed
device or network for a given VLAN.</dd>

<dt>VTEP</dt><dd>- VXLAN Tunnel End Point.</dd>

<dt>VXLAN</dt><dd>- Virtual eXtensible Local Area Network <xref target="RFC7348"/>.</dd>
</dl>

</section>

</section>

<section>  <!-- 2. -->
  <name>VXLAN Gateway High Reliability</name>

<t>One example of the current situation is a DCI (data center
interconnect) using VXLAN tunnels that is multihomed for reliability,
as shown in Figure 1. Each PE, acting as a VXLAN Tunnel End Point (VTEP),
uses a different IP address. Thus each PE must process EVPN updates based on
the ESIs <xref target="RFC7432"/>.</t>

<figure anchor="Current">
   <name>Current Situation</name>
   <artwork align="center"><![CDATA[
                       .........
                       .  DCI  .
     +----------+      .       .      +----------+
     | PE       +---------------------+ PE       |
     |VTEP IP-1 +---   . VXLAN .   ---+VTEP IP-3 |
     +----------+   \  .Tunnels.  /   +----------+
    /     |          -----   -----          |     \
+--+      |            .  \ /  .            |      +--+
|CE|      |            .   X   .            |      |CE|
+--+      |            .  / \  .            |      +--+
    \     |          -----   -----          |    /
     +----------+   /  . VXLAN .  \   +----------+
     | PE       +---   .Tunnels.   ---+ PE       |
     |VTEP IP-2 +---------------------+VTEP IP-4 |
     +----------+      .       .      +----------+
                       .........
]]></artwork>
</figure>

<t>The situation is greatly simplified if the set of VTEPs connected to a
particular Ethernet segment all use the same anycast IP address. PEs
no longer need to concern themselves with whether a remote CE is
single-homed or multihomed. The situation is as shown in Figure 2. The IP
address within each VTEP group is synchronized by messages within that
group.</t>

<figure anchor="Anycast">
   <name>Situation Using Anycast</name>
   <artwork align="center"><![CDATA[
                       .........
                       .  DCI  .
     +----------+      .       .      +----------+
     | Anycast  |      .       .      | Anycast  |
     |VTEP IP-1 +---   .       .   ---+VTEP IP-2 |
     +----------+   \  .       .  /   +----------+
    /     ^          \ .       . /          ^     \
+--+      |           \.       ./           |      +--+
|CE|    Sy|nc          >-------<          Sy|nc    |CE|
+--+      |           /. VXLAN .\           |      +--+
    \     v          / . Tunnel. \          v    /
     +----------+   /  .       .  \   +----------+
     | Anycast  +---   .       .   ---+ Anycast  |
     |VTEP IP-1 |      .       .      |VTEP IP-2 |
     +----------+      .       .      +----------+
                       .........
]]></artwork>
</figure>

</section>

<section>  <!-- 3. -->
  <name>Detailed Problem and Solution Requirement</name>

<t>In the scenario illustrated in Figure 3, where an enterprise site and
a data center are interconnected, the VPN gateways (PE1 and PE2) and the
enterprise site (CPE) are connected through a VXLAN tunnel to provide
L2/L3 services between the enterprise site (CPE) and data center.  The
data center gateway (CE1) is dual-homed to PE1 and PE2 to access the
VXLAN network, which enhances network access reliability.  When one PE
fails, services can be rapidly switched to the other PE, minimizing
the impact on services.</t>

<t>As shown in Figure 3, PE1 and PE2 use a virtual address as a Network
Virtualization Edge (NVE) interface address at the network side,
namely, the Anycast VTEP address.  In this way, the CPE is aware of
only one remote NVE interface and establishes a VXLAN tunnel with the
virtual address.  The packets from the CPE can reach CE1 through
either PE1 or PE2.  However, single-homed CEs may exist, such as CE2
and CE3.  As a result, after reaching a PE, the packets from the CPE
may need to be forwarded by the other PE to a single-homed CE.
Therefore, a bypass VXLAN tunnel needs to be established between PE1
and PE2.  An EVPN peer relationship is established between PE1 and
PE2.  Different addresses, namely, bypass VTEP addresses, are
configured for PE1 and PE2 so that they can establish a bypass VXLAN
tunnel.</t>

<figure anchor="VXLAN">
   <name>Basic networking of the VXLAN active-active scenario</name>
   <artwork align="center"><![CDATA[
                           +-----+
          ---------------- | CPE |   Enterprise site
             ^             +-----+
             |               / \
             |              /   \
           VXLAN Tunnel    /     \
             |            /       \
             |           / Anycast \
             v      +-----+ VTEP +-----+
          --------- | PE1 |------| PE2 |
                    +-----+      +-----+
                      /\           /\
                     /  \         /  \
                    /    \ Trunk /    \
                   /      \     /      \
                  /       +\---/+       \
                 /        | \ / |        \
                /         +--+--+         \
               /             |             \
           +-----+        +-----+        +-----+
           | CE2 |        | CE1 |        | CE3 |
           +-----+        +-----+        +-----+
]]></artwork>
</figure>

</section>

<section>  <!-- 4. -->
  <name>The Bypass VXLAN Extended Community Attribute</name>

<t>This section specifies the extensions that meet the requirements given
in Section 3 and enhance VXLAN active-active reliability.</t>

<t>This document specifies two new BGP extended communities, the IPv4 and
IPv6 Bypass VXLAN Extended Communities.  These extended communities
are IPv4-address-specific or IPv6-address-specific, depending on
whether the VTEP address to be accommodated is IPv4 or IPv6.  In the
new extended communities, the 4-byte or 16-byte global administrator
field encodes the IPv4 or IPv6 address that is the VTEP address and
the 2-byte local administrator field is formatted as shown in Figures
4 and 5.</t>

<figure anchor="v4ExtComm">
   <name>IPv4-address-specific Bypass VXLAN Extended Community</name>
   <artwork align="center"><![CDATA[
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Type=0x01    | Sub-Type=TBA1 |         IPv4 Address          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      IPv4 Address (cont.)     |    Flags      |   Reserved    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
]]></artwork>
</figure>


<figure anchor="v6ExtComm">
   <name>IPv6-address-specific Bypass VXLAN Extended Community</name>
   <artwork align="center"><![CDATA[
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type=0x00/0x40| Sub-Type=TBA2 |    Target IPv6 Address        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           Target IPv6 Address (cont.)                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           Target IPv6 Address (cont.)                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           Target IPv6 Address (cont.)                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Target IPv6 Address (cont.)  |    Flags      |   Reserved    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
]]></artwork>
</figure>

<t>Where</t>

<dl>
  <dt>Type:</dt>
<dd>
<t>0x01 = type for transitive IPv4 specific use.</t>

<t>0x00 = type for transitive IPv6 specific use.</t>

<t>0x40 = type for non-transitive IPv6 specific use.</t>
</dd>

<dt>Sub-Type:</dt>
<dd>
<t>TBA1 = subtype for IPv4 specific use.</t>

<t>TBA2 = subtype for IPv6 specific use.</t>
</dd>

<dt>IPv4 Address / Target IPv6 Address:</dt><dd>The VTEP address, encoded
as an IPv4 or IPv6 address respectively.</dd>

<dt>Flags:</dt><dd>MUST be sent as zero and ignored on receipt.</dd>

<dt>Reserved:</dt><dd>MUST be sent as zero and ignored on
receipt.</dd>
</dl>
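
<t>As a non-normative illustration, the following Python sketch encodes
and decodes the two formats shown above.  The sub-type constants are
placeholders, since TBA1 and TBA2 have not yet been assigned, and the
function names are hypothetical.  The sketch only generates the
transitive IPv6 type (0x00) when encoding.</t>

<sourcecode type="python"><![CDATA[
import ipaddress

# Placeholder sub-type values: TBA1 and TBA2 are not yet assigned,
# so 0xFF is used here purely as a stand-in (an assumption).
SUBTYPE_TBA1 = 0xFF
SUBTYPE_TBA2 = 0xFF

TYPE_IPV4_TRANSITIVE = 0x01
TYPE_IPV6_TRANSITIVE = 0x00


def encode_bypass_ec(vtep: str) -> bytes:
    """Build the 8-byte (IPv4) or 20-byte (IPv6) extended community."""
    addr = ipaddress.ip_address(vtep)
    if addr.version == 4:
        head = bytes([TYPE_IPV4_TRANSITIVE, SUBTYPE_TBA1])
    else:
        head = bytes([TYPE_IPV6_TRANSITIVE, SUBTYPE_TBA2])
    # Flags and Reserved are each one byte and are sent as zero.
    return head + addr.packed + bytes([0, 0])


def decode_bypass_ec(data: bytes):
    """Return (type, sub-type, address); Flags/Reserved are ignored."""
    if len(data) == 8:
        addr = ipaddress.IPv4Address(data[2:6])
    elif len(data) == 20:
        addr = ipaddress.IPv6Address(data[2:18])
    else:
        raise ValueError("unexpected extended community length")
    return data[0], data[1], addr
]]></sourcecode>

<t>The decoder deliberately does not inspect the last two bytes, which
reflects the rule that Flags and Reserved are ignored on receipt.</t>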

</section>

<section>  <!-- 5. -->
  <name>Control Plane Processing</name>

<t>Using the topology in Figure 3:</t>

<ol>
<li>PE2 sends a multicast route to PE1.  The source address of the
route is the Anycast VTEP address shared by PE1 and PE2.  The route
carries the bypass VXLAN extended community attribute, including the
bypass VTEP address of PE2.</li>

<li>After receiving the multicast route from PE2, PE1 determines that
an anycast relationship is to be established with PE2, because the
source address (Anycast VTEP address) of the route is the same as the
local virtual address of PE1 and the route carries the bypass VXLAN
extended community attribute.  Based on the bypass VXLAN extended
community attribute of the route, PE1 establishes a bypass VXLAN tunnel
to PE2.</li>

<li>PE1 learns the MAC addresses of the CEs through upstream packets
from the CEs and advertises them as routes to PE2 through BGP EVPN.
The routes carry the ESI of the links accessed by the CEs,
information about the VLANs that the CEs access, and the bypass VXLAN
extended community attribute.</li>

<li>PE1 learns the MAC address of the CPE through downstream packets
at the network side, specifies that the next-hop address of the MAC
route can be recursively resolved to a static VXLAN tunnel, and
advertises the route to PE2.  The next-hop address of the MAC route
cannot be changed.</li>
</ol>
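
<t>The check that a PE applies to a received multicast route in the
second step can be sketched, non-normatively, in Python as follows.
The class, field, and function names are hypothetical and the addresses
are plain strings for brevity; the sketch assumes the route has already
been parsed and only returns the bypass VTEP address toward which a
bypass VXLAN tunnel would be established.</t>

<sourcecode type="python"><![CDATA[
from dataclasses import dataclass
from typing import Optional


@dataclass
class MulticastRoute:
    # Source (originating VTEP) address carried in the route.
    source_vtep: str
    # Address from the bypass VXLAN extended community, if present.
    bypass_vtep: Optional[str]


def bypass_tunnel_peer(local_anycast: str,
                       route: MulticastRoute) -> Optional[str]:
    """Return the bypass VTEP to tunnel to, or None if no anycast
    relationship is indicated by the route."""
    if route.bypass_vtep is None:
        return None  # no bypass VXLAN extended community attached
    if route.source_vtep != local_anycast:
        return None  # sender does not share the local Anycast VTEP
    return route.bypass_vtep
]]></sourcecode>

<t>For example, with a hypothetical local Anycast VTEP address of
192.0.2.100, a route sourced from 192.0.2.100 that carries bypass VTEP
address 192.0.2.2 yields 192.0.2.2 as the bypass tunnel endpoint, while
a route lacking the extended community yields None.</t>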

</section>

<section>  <!-- 6. -->
  <name>Data Packet Processing</name>

<t>This section describes how Layer 2 unicast and BUM (Broadcast,
Unknown unicast, and Multicast) packets are forwarded. A description
of how Layer 3 packets are forwarded, both within a subnet and across
subnets, will be provided in a future version of this document.</t>

<section>  <!-- 6.1 -->
  <name>Layer 2 Unicast Packet Forwarding</name>

<t>The following two subsections discuss Layer 2 unicast forwarding in
the topology shown in Figure 3.</t>

<section>  <!-- 6.1.1 -->
  <name>Uplink</name>

<t>After receiving Layer 2 unicast packets destined for the CPE from
CE1, CE2, and CE3, PE1 and PE2 search their local MAC address
tables to obtain outbound interfaces, perform VXLAN encapsulation on
the packets, and forward them to the CPE.</t>

</section>

<section>  <!-- 6.1.2 -->
  <name>Downlink</name>

<t>After receiving a Layer 2 unicast packet sent by the CPE to CE1,
PE1 performs VXLAN decapsulation on the packet, searches the local MAC
address table for the destination MAC address, obtains the outbound
interface, and forwards the packet to CE1.</t>

<t>After receiving a Layer 2 unicast packet sent by the CPE to CE2,
PE1 performs VXLAN decapsulation on the packet, searches the local MAC
address table for the destination MAC address, obtains the outbound
interface, and forwards the packet to CE2.</t>

<t>After receiving a Layer 2 unicast packet sent by the CPE to CE3,
PE1 performs VXLAN decapsulation on the packet, searches the local MAC
address table for the destination MAC address, and forwards it to PE2
over the bypass VXLAN tunnel.  After the packet reaches PE2, PE2
looks up the destination MAC address, obtains the outbound interface,
and forwards the packet to CE3.</t>

<t>The process for PE2 to forward packets from the CPE is the same as
that for PE1 to forward packets from the CPE with the roles of CE2 and
CE3 swapped.</t>
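
<t>The downlink decision described above can be illustrated by the
following non-normative Python sketch.  The table layout, the MAC and
interface names, and the action strings are all hypothetical; the
sketch assumes the destination MAC address is already known to the
PE.</t>

<sourcecode type="python"><![CDATA[
from typing import Dict, Tuple

# Each MAC entry maps to ("local", interface) or ("bypass", peer PE).
MacTable = Dict[str, Tuple[str, str]]


def downlink_action(mac_table: MacTable,
                    dst_mac: str) -> Tuple[str, str]:
    """Decide what a PE does with a decapsulated packet from the CPE.

    Raises KeyError for an unknown destination, which this sketch
    does not handle.
    """
    kind, target = mac_table[dst_mac]
    if kind == "local":
        return ("send-on-interface", target)
    return ("send-over-bypass-tunnel", target)


# Hypothetical MAC table of PE1 for the topology in Figure 3.
PE1_TABLE: MacTable = {
    "mac-ce1": ("local", "trunk-to-ce1"),
    "mac-ce2": ("local", "port-to-ce2"),
    "mac-ce3": ("bypass", "PE2"),
}
]]></sourcecode>

<t>With this table, packets for CE1 and CE2 leave PE1 on a local
interface, while packets for CE3 are sent to PE2 over the bypass VXLAN
tunnel.  The corresponding table on PE2 would list CE3 as local and CE2
as reachable through the bypass tunnel.</t>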

</section>

</section>

<section>  <!-- 6.2 -->
  <name>BUM Packet Forwarding</name>

<t>Using the topology in Figure 3, if the destination address of a
BUM packet from the CPE is the Anycast VTEP address of PE1 and PE2,
the BUM packet may be forwarded to either PE1 or PE2.  If the BUM
packet reaches PE2, PE2 sends a copy of the packet to CE3 and
CE1.  In addition, PE2 sends a copy of the packet to PE1 through the
bypass VXLAN tunnel between PE1 and PE2.  After the copy of the packet
reaches PE1, PE1 sends it to CE2, not to the CPE or CE1.  In this way,
CE1 receives only one copy of the packet.</t>

<t>Using the topology in Figure 3, after a BUM packet from CE2 reaches
PE1, PE1 sends a copy of the packet to CE1 and the CPE.  In addition,
PE1 sends a copy of the packet to PE2 through the bypass VXLAN tunnel
between PE1 and PE2.  After the copy of the packet reaches PE2, PE2
sends it to CE3, not to the CPE or CE1.</t>

<t>Using the topology in Figure 3, after a BUM packet from CE1 reaches
PE1, PE1 sends a copy of the packet to CE2 and the CPE.  In addition,
PE1 sends a copy of the packet to PE2 through the bypass VXLAN tunnel
between PE1 and PE2.  After the copy of the packet reaches PE2, PE2
sends it to CE3, not to the CPE or CE1.</t>
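
<t>The replication rules in the three cases above can be summarized by
the following non-normative Python sketch.  The function name and the
labels "cpe" and "bypass" are hypothetical, and the sketch assumes that
each PE knows which of its locally attached CEs are single-homed and
which are multihomed.</t>

<sourcecode type="python"><![CDATA[
from typing import List, Set


def bum_targets(source: str,
                single_homed: Set[str],
                multihomed: Set[str]) -> List[str]:
    """List where one PE sends copies of a BUM packet.

    source is "cpe" (packet arrived over the VXLAN tunnel from the
    CPE), "bypass" (arrived over the bypass VXLAN tunnel), or the
    name of the locally attached CE that sent the packet.
    """
    if source == "bypass":
        # The peer PE has already served the CPE and multihomed CEs.
        return sorted(single_homed)
    local = [ce for ce in sorted(single_homed | multihomed)
             if ce != source]
    if source == "cpe":
        return local + ["bypass"]
    return local + ["cpe", "bypass"]
]]></sourcecode>

<t>For PE2 in Figure 3, where CE3 is single-homed and CE1 is
multihomed, a packet from the CPE is copied to CE1, CE3, and the
bypass tunnel.  The copy that then arrives at PE1 over the bypass
tunnel is sent only to CE2.</t>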

</section>
</section>

<section>  <!-- 7. -->
  <name>IANA Considerations</name>

<t>IANA is requested to assign two new Extended Community attribute
SubTypes as follows:</t>

<section>  <!-- 7.1 -->
  <name>IPv4 Specific</name>

  <table>
    <thead>
<tr><th>Sub-Type Value</th><th
align="center">Name</th><th>Reference</th></tr> 
    </thead>
    <tbody>
<tr><td align="center">TBA1</td><td>Bypass VXLAN Extended
Community</td><td>[this doc]</td></tr>
    </tbody>
  </table>

  </section>

  <section>  <!-- 7.2 -->
    <name>IPv6 Specific</name>

  <table>
    <thead>
<tr><th>Sub-Type Value</th><th
align="center">Name</th><th>Reference</th></tr> 
    </thead>
    <tbody>
<tr><td align="center">TBA2</td><td>Bypass VXLAN Extended
Community</td><td>[this doc]</td></tr>
    </tbody>
  </table>

  </section>
</section>

<section>  <!-- 8. -->
  <name>Security Considerations</name>

<t>TBD</t>

<t>For general EVPN Security Considerations, see <xref
target="RFC7432"/>.</t>

</section>

</middle>


<!-- ____________________BACK_MATTER____________________ -->
<back>

<references>
  <name>Normative References</name>

<xi:include
    href="https://www.rfc-editor.org/refs/bibxml/reference.RFC.2119.xml"/>
<xi:include
    href="https://www.rfc-editor.org/refs/bibxml/reference.RFC.7432.xml"/>
<xi:include
    href="https://www.rfc-editor.org/refs/bibxml/reference.RFC.8174.xml"/>

</references>

<references>
  <name>Informative References</name>

<xi:include
    href="https://www.rfc-editor.org/refs/bibxml/reference.RFC.7209.xml"/>
<xi:include
    href="https://www.rfc-editor.org/refs/bibxml/reference.RFC.7348.xml"/>

</references>

<section anchor="Acknowledgements" numbered="false">
  <name>Acknowledgements</name>
  
<t>The authors would like to thank the following for their comments
and review of this document: TBD.</t>

</section>

<section anchor="Contributors" numbered="false">
  <name>Contributors</name>

  <t>Thanks to the following who made significant
  contributions to this document:</t>
     
 <contact fullname="Haibo Wang">
   <organization>Huawei Technologies</organization>
   <address>
    <postal>
      <street>Huawei Building, No.156 Beiqing Road</street>
      <city>Beijing</city>
      <code>100095</code>
      <country>China</country>
    </postal>
     <email>rainsword.wang@huawei.com</email>
   </address>
 </contact>
 
</section>

</back>
  
</rfc>
