BGP EVPN Multi-Homing Extensions for Split Horizon Filtering

Ethernet Virtual Private Networks (EVPN) are commonly used with the following tunnel encapsulations:

*  Network Virtualization Overlay (NVO) tunnels, where the EVPN procedures are specified in . MPLSoGRE, MPLSoUDP, GENEVE, or VXLAN tunnels are considered NVO tunnels.

*  MPLS and Segment Routing with MPLS data plane (SR-MPLS), where the relevant EVPN procedures are specified in . Segment Routing with MPLS data plane tunneling is specified in .

*  Segment Routing with IPv6 data plane (SRv6), where the relevant EVPN procedures are specified in . SRv6 is specified in .

Split Horizon, in this document, follows the definition in . Split Horizon refers to the EVPN multihoming procedure that prevents a PE from sending a frame back to a multihomed Customer Edge (CE) when that CE originated the frame in the first place.

EVPN multihoming procedures may vary depending on the type of tunnel utilized within the EVPN Broadcast Domain. Specifically, there are two multihoming Split Horizon procedures employed to prevent looped frames on multihomed CE devices: the ESI Label-based procedure and the Local Bias procedure. The ESI Label-based Split Horizon procedure is used for MPLS or MPLS-over-X (MPLSoX) tunnels, such as MPLS-over-UDP, and its procedures are detailed in . Conversely, the Local Bias procedure is used for IP-based tunnels, such as VXLAN tunnels, and it is described in .

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 when, and only when, they appear in all capitals, as shown here.

AC:  Attachment Circuit.

A-D per ES route:  refers to the EVPN Ethernet Auto-Discovery per ES route defined in .

Arg.FE2:  refers to the ESI filtering argument used for Split Horizon as specified in .

Broadcast Domain (BD):  an emulated Ethernet, such that two systems on the same BD will receive each other's broadcast, unknown, and multicast traffic. In this document, BD also refers to the instantiation of a Broadcast Domain on an EVPN PE. An EVPN PE can be attached to one or multiple BDs of the same tenant.

BUM:  Broadcast, Unknown unicast, and Multicast traffic.

Designated Forwarder (DF):  as defined in , an ES may be multihomed (attached to more than one PE). An ES may also contain multiple BDs, of one or more EVIs. For each such EVI, one of the PEs attached to the segment becomes that EVI's DF for that segment. Since a BD may belong to only one EVI, we can speak unambiguously of the BD's DF for a given segment.

ES and ESI:  Ethernet Segment and Ethernet Segment Identifier.

EVI:  EVPN Instance.

EVI-RT:  EVI Route Target. A group of NVEs attached to the same EVI will share the same EVI-RT.

GENEVE:  Generic Network Virtualization Encapsulation (tunnel type 19).

MPLS tunnels and non-MPLS NVO tunnels:  refer to Multi-Protocol Label Switching (or the absence of it) Network Virtualization Overlay tunnels. Network Virtualization Overlay tunnels use an IP encapsulation for overlay frames, where the source IP address identifies the ingress NVE and the destination IP address identifies the egress NVE.

MPLSoUDP:  Multi-Protocol Label Switching over User Datagram Protocol (tunnel type 13).

MPLSoGRE:  Multi-Protocol Label Switching over Generic Routing Encapsulation (tunnel type 11).

MPLSoX:  refers to MPLS over any IP encapsulation. Examples are MPLS-over-UDP or MPLS-over-GRE.

NVE:  Network Virtualization Edge device.

NVGRE:  Network Virtualization Using Generic Routing Encapsulation (tunnel type 9).

VXLAN:  Virtual eXtensible Local Area Network (tunnel type 8).

VXLAN-GPE:  VXLAN Generic Protocol Extension (tunnel type 12).

SHT:  Split Horizon Type. It refers to the Split Horizon method that a PE intends to use and advertises in an A-D per ES route.

SRv6:  Segment Routing with an IPv6 data plane, .

This document also assumes familiarity with the terminology of and .

EVPN supports two Split Horizon Filtering mechanisms:

ESI Label based Split Horizon filtering
   When EVPN is employed for MPLS transport tunnels, an MPLS label facilitates Split Horizon filtering to support All-Active multihoming. The ingress Network Virtualization Edge (NVE) device appends a label corresponding to the source Ethernet Segment Identifier (ESI label) during packet encapsulation. The egress NVE verifies the ESI label when attempting to forward a multi-destination frame through a local Ethernet Segment (ES) interface. If the ESI label matches the site identifier (ESI) associated with that ES interface, the packet is not forwarded. This mechanism effectively prevents forwarding loops for BUM traffic.

   The ESI Label Split Horizon filtering should also be utilized with Single-Active multihoming to prevent transient loops for in-flight packets when the egress NVE assumes the role of Designated Forwarder for an ES.

Local Bias
   Since IP tunnels, such as VXLAN or NVGRE, do not support the ESI label or any MPLS label, an alternative Split Horizon filtering procedure must be implemented for All-Active multihoming. This mechanism, known as Local Bias, relies on the source IP address of the tunnel to determine whether to forward BUM traffic to a local Ethernet Segment (ES) interface at the egress Network Virtualization Edge (NVE).

   In summary and as specified in , each NVE tracks the IP address(es) of other NVEs with which it shares multihomed ESs. Upon receiving a BUM frame encapsulated in an IP tunnel, the egress NVE inspects the source IP address in the tunnel header, which identifies the ingress NVE. The egress NVE then filters out the frame on all local interfaces connected to ESs that are shared with the ingress NVE.
   Due to this behavior at the egress NVE, the ingress NVE is required to perform local replication to all directly attached ESs, regardless of the Designated Forwarder election state, for all BUM traffic ingressing from the access Attachment Circuits (ACs). This local replication at the ingress NVE is the basis for the term Local Bias.

   Local Bias is not suitable for Single-Active multihoming, as the ingress NVE deactivates the ACs for which it is not the Designated Forwarder. Consequently, local replication to non-Designated Forwarder ACs cannot occur, which causes transient in-flight BUM packets to be looped back to the originating site by newly elected Designated Forwarder egress NVEs.

specifies that Local Bias is exclusively utilized for IP tunnels, while ESI Label-based Split Horizon is employed for IP-based MPLS tunnels. However, IP-based MPLS tunnels, such as MPLS over GRE (MPLSoGRE) or MPLS over UDP (MPLSoUDP), are also categorized as IP tunnels and have the potential to support both procedures. These tunnels are capable of carrying ESI labels and also utilize a tunnel IP header in which the source IP address identifies the ingress Network Virtualization Edge (NVE).

Similarly, certain IP tunnels that include an identifier for the source Ethernet Segment (ES) in the tunnel header may also support either procedure. Examples of such tunnels include GENEVE and SRv6:

*  In a GENEVE tunnel, the source IP address identifies the ingress NVE; therefore, Local Bias is possible. Also, section 4.1 defines an Ethernet option TLV (Type Length Value) to encode an ESI label value.

*  In an SRv6 tunnel, the source IP address identifies the ingress NVE. By default, and as outlined in , the ingress PE adds specific information to the SRv6 packet to enable the egress PE to identify the source ES of the BUM packet.
   This information is the ESI filtering argument (Arg.FE2) (section 6.1.1) (section 4.12) of the service Segment Identifier (SID) received on an A-D per ES route from the egress PE.

presents various tunnel encapsulations along with their supported and default Split Horizon methods. For GENEVE, the default Split Horizon Type (SHT) is contingent upon the negotiation of the Ethernet Option with the Source ID TLV. In the case of SRv6, the default SHT is specified as ESI Label filtering in the table, as its behavior is analogous to that of ESI Label filtering. In this document, ESI Label filtering refers to the Split Horizon filtering based on the presence of a source Ethernet Segment (ES) identifier in the tunnel header.

This document classifies the tunnel encapsulations used by EVPN into:

1.  IP-based MPLS tunnels

2.  (SR-)MPLS tunnels, that is, MPLS and Segment Routing with MPLS data plane tunnels

3.  IP tunnels

4.  SRv6 tunnels

lists the encapsulations supported by this document. Any tunnel encapsulation not listed there is out of scope. Tunnel encapsulations used by EVPN can be categorized into one of the four encapsulation groups mentioned above and support Split Horizon filtering based on the following rules:

*  IP-based MPLS tunnels and SRv6 tunnels are capable of supporting both Split Horizon filtering methods.

*  (SR-)MPLS tunnels only support ESI Label-based Split Horizon filtering.

*  IP tunnels support Local Bias Split Horizon filtering and may also support ESI Label-based Split Horizon filtering, provided they incorporate a mechanism to identify the source ESI in the header.
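The two egress-side filtering checks described earlier in this section can be sketched as follows. This is a minimal illustration, not an implementation of any product or of the normative procedures: the `LocalEs` class, its fields, and both function names are hypothetical, and Designated Forwarder checks are deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LocalEs:
    """Hypothetical model of a local ES interface on the egress NVE."""
    esi: str
    esi_label: Optional[int] = None        # ESI label associated with this ES
    peer_nve_ips: frozenset = frozenset()  # NVEs that share this ES


def esi_label_allows(es: LocalEs, received_esi_label: Optional[int]) -> bool:
    """ESI Label filtering: block the frame if its ESI label maps to this ES."""
    if received_esi_label is None:
        return True
    return received_esi_label != es.esi_label


def local_bias_allows(es: LocalEs, tunnel_source_ip: str) -> bool:
    """Local Bias: block the frame if the ingress NVE shares this ES."""
    return tunnel_source_ip not in es.peer_nve_ips
```

For example, with `es1 = LocalEs("ES1", esi_label=1001, peer_nve_ips=frozenset({"192.0.2.1"}))`, a BUM frame carrying ESI label 1001 is not forwarded to `es1` under ESI Label filtering, and a BUM frame whose tunnel source IP is 192.0.2.1 is not forwarded to `es1` under Local Bias.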
+--------------------------+-------------------------+------------+-----------+
| Tunnel Encapsulation     | Default Split Horizon   | Supports   | Supports  |
|                          | Type (SHT)              | Local Bias | ESI Label |
+--------------------------+-------------------------+------------+-----------+
| MPLSoGRE (IP-based MPLS) | ESI Label filtering     | Yes        | Yes       |
+--------------------------+-------------------------+------------+-----------+
| MPLSoUDP (IP-based MPLS) | ESI Label filtering     | Yes        | Yes       |
+--------------------------+-------------------------+------------+-----------+
| (SR-)MPLS                | ESI Label filtering     | No         | Yes       |
+--------------------------+-------------------------+------------+-----------+
| VXLAN (IP tunnels)       | Local Bias              | Yes        | No        |
+--------------------------+-------------------------+------------+-----------+
| NVGRE (IP tunnels)       | Local Bias              | Yes        | No        |
+--------------------------+-------------------------+------------+-----------+
| VXLAN-GPE (IP tunnels)   | Local Bias              | Yes        | No        |
+--------------------------+-------------------------+------------+-----------+
| GENEVE (IP tunnels)      | Local Bias (no ESI Lb), | Yes        | Yes       |
|                          | ESI Label (if ESI lb)   |            |           |
+--------------------------+-------------------------+------------+-----------+
| SRv6                     | ESI Label filtering     | Yes        | Yes       |
+--------------------------+-------------------------+------------+-----------+

The ESI Label method is applicable for both All-Active and Single-Active configurations, whereas the Local Bias method is suitable only for All-Active configurations. Moreover, the ESI Label method is effective across different network domains, while Local Bias is constrained to networks where there is no change in the next hop between the NVEs attached to the same ES. Nonetheless, some operators favor the Local Bias method due to its simplification of the encapsulation process, reduced resource consumption on NVEs, and the fact that the ingress NVE always forwards traffic locally to other interfaces, thereby decreasing the delay in reaching multihomed hosts.

This document extends the EVPN multihoming procedures to allow operators to select the preferred Split Horizon method for a given NVO tunnel according to their specific requirements. The choice between Local Bias and ESI Label Split Horizon is now allowed (by configuration) for tunnel encapsulations that support both methods, and this selection is advertised along with the EVPN A-D per ES route. IP tunnels that do not support both methods, such as VXLAN or NVGRE, will continue to adhere to the procedures specified in . Note that this document does not modify the Local Bias or the ESI Label Split Horizon procedures themselves; it only addresses the signaling and selection of the Split Horizon method to be applied by the multihomed NVEs.
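As an illustration, the encapsulation table above can be expressed as a small lookup structure. The sketch below is one possible encoding, offered under assumptions: the names `Caps`, `CAPS`, `default_method`, and `operator_may_choose` are hypothetical, and the GENEVE default is modeled with a boolean that stands for "an ESI label is present in the Ethernet option".

```python
from typing import NamedTuple


class Caps(NamedTuple):
    local_bias: bool  # encapsulation supports Local Bias
    esi_label: bool   # encapsulation supports ESI Label filtering


# Values copied from the encapsulation table in this section.
CAPS = {
    "MPLSoGRE": Caps(True, True),
    "MPLSoUDP": Caps(True, True),
    "(SR-)MPLS": Caps(False, True),
    "VXLAN": Caps(True, False),
    "NVGRE": Caps(True, False),
    "VXLAN-GPE": Caps(True, False),
    "GENEVE": Caps(True, True),
    "SRv6": Caps(True, True),
}


def default_method(encap: str, geneve_esi_label_present: bool = False) -> str:
    """Return the default Split Horizon method for an encapsulation."""
    caps = CAPS[encap]
    if encap == "GENEVE":
        # GENEVE defaults to ESI Label only when an ESI label is signaled.
        return "ESI Label" if geneve_esi_label_present else "Local Bias"
    return "ESI Label" if caps.esi_label else "Local Bias"


def operator_may_choose(encap: str) -> bool:
    """True when both methods are supported, so the method is configurable."""
    caps = CAPS[encap]
    return caps.local_bias and caps.esi_label
```

With this encoding, `operator_may_choose("MPLSoUDP")` is true, while VXLAN, NVGRE, VXLAN-GPE, and (SR-)MPLS offer no choice and always use their default method.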

Extensions to EVPN are required to enable NVEs to advertise their preferred Split Horizon method for a given ES. illustrates the ESI Label extended community ( Section 7.5), which is consistently advertised alongside the EVPN A-D per ES route. All NVEs connected to an ES advertise an A-D per ES route for that ES, including the extended community, which communicates information regarding the multihoming mode (either All-Active or Single-Active) and, if necessary, specifies the ESI Label to be utilized.

defines the low-order bit of the Flags octet (bit 0) as the "Single-Active" bit: A value of 0 means that the multihomed ES is operating in All-Active multihoming redundancy mode. A value of 1 means that the multihomed ES is operating in Single-Active multihoming redundancy mode. establishes a registry for the Flags octet, designating the "Single-Active" bit as the low-order bit of the newly defined multihoming redundancy mode field.

does not include any explicit indication regarding the Split Horizon method in the A-D per Ethernet Segment (ES) route. In this document, the Split Horizon procedure defined in (section 8.3.1) is considered the default behavior, presuming that Local Bias is employed exclusively for IP tunnels, while ESI Label-based Split Horizon is used for IP-based MPLS tunnels. This document specifies that the two high-order bits in the Flags octet (bits 6 and 7) constitute the "Split Horizon Type" (SHT) field, where:

   0 0 --> Default SHT
           Backwards compatible with [RFC8365] and [RFC7432]
   0 1 --> Local Bias
   1 0 --> ESI Label based filtering
   1 1 --> reserved for future use

*  SHT = 00 is backwards compatible with [RFC8365] and [RFC7432], and indicates that the advertising NVE intends to use the default or built-in SHT. The default SHT is shown in for each encapsulation. An egress NVE that follows the behavior and does not support this specification will ignore the SHT bits (which is equivalent to processing them as a value of 00).

*  SHT = 01 indicates that the advertising NVE intends to use Local Bias procedures in the ES for which the A-D per ES route is advertised.

*  SHT = 10 indicates that the advertising NVE intends to use the ESI Label based Split Horizon method procedures in the ES for which the A-D per ES route is advertised.

*  SHT = 11 is a reserved value, for future use.
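The Flags octet layout described above can be sketched as a pair of encode and decode helpers. This is a hedged illustration: it follows this document's numbering, in which bit 0 is the low-order bit (Single-Active) and bits 6 and 7 are the two high-order bits (SHT), and it assumes that the left-hand digit of a value such as "1 0" is the more significant bit of the field. The function and constant names are hypothetical.

```python
from typing import Tuple

SINGLE_ACTIVE_MASK = 0b0000_0001  # low-order bit of the Flags octet
SHT_SHIFT = 6                     # SHT occupies the two high-order bits
SHT_MASK = 0b11

SHT_NAMES = {
    0b00: "Default SHT",
    0b01: "Local Bias",
    0b10: "ESI Label based filtering",
    0b11: "Reserved",
}


def encode_flags(sht: int, single_active: bool) -> int:
    """Build the 1-octet Flags value from the SHT and the Single-Active bit."""
    if not 0 <= sht <= SHT_MASK:
        raise ValueError("SHT is a 2-bit field")
    return (sht << SHT_SHIFT) | (SINGLE_ACTIVE_MASK if single_active else 0)


def decode_flags(flags: int) -> Tuple[int, bool]:
    """Return (sht, single_active) from a 1-octet Flags value."""
    sht = (flags >> SHT_SHIFT) & SHT_MASK
    return sht, bool(flags & SINGLE_ACTIVE_MASK)
```

Under these assumptions, an All-Active ES advertising Local Bias would carry a Flags octet of 0x40, and one advertising ESI Label based filtering would carry 0x80.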

The following behavior is observed:

*  An SHT value of 01 or 10 MUST NOT be used with encapsulations that support only one SHT in , and MAY be used by encapsulations that support the two SHTs in .

*  An SHT value different from 00 expresses the intent to use a specific Split Horizon method, but does not reflect the actual operational SHT used by the advertising NVE, unless all the NVEs attached to the ES advertise the same SHT.

*  In case of inconsistency in the SHT value advertised by the NVEs attached to the same ES for a given EVI, all the NVEs MUST revert to the behavior, and use the default SHT in , irrespective of the advertised SHT.

*  An SHT different from 00 MUST NOT be set if the Single-Active bit is set. A received A-D per ES route where both the Single-Active bit and the SHT bits are different from zero MUST follow the treat-as-withdraw behavior .

*  The SHT MUST have the same value in each Ethernet A-D per ES route that an NVE advertises for a given ES and a given encapsulation (see for NVEs supporting multiple encapsulations).

As an example, egress NVEs that support IP-based MPLS tunnels, such as MPLSoGRE or MPLSoUDP, will advertise A-D per ES routes for the ES along with the BGP Encapsulation extended community, as defined in . This extended community indicates the encapsulation type (MPLSoGRE or MPLSoUDP), and the route may use the SHT value of 01 or 10 to signify the intent to use Local Bias or ESI Label, respectively.

An egress NVE MUST NOT use an SHT value other than 00 when advertising an A-D per ES route with Tunnel encapsulation types of VXLAN (type 8), NVGRE (type 9), MPLS (type 10), or no BGP tunnel encapsulation extended community at all. In all these cases, it is presumed that there is no choice for the Split Horizon method; therefore, the SHT value MUST be set to 00. If a route with any of the mentioned encapsulation options is received and has an SHT value different from 00, the receiving NVE SHOULD apply the treat-as-withdraw behavior, per .
An egress NVE advertising A-D per ES route(s) for an ES with GENEVE encapsulation (, Tunnel encapsulation type 19, ) MAY use an SHT value of 01 or 10. A value of 01 indicates the intent to use Local Bias, regardless of the presence of an Ethernet option TLV with a non-zero Source-ID, as described in . A value of 10 indicates the intent to use ESI Label-based Split Horizon, and it is only valid if an Ethernet option TLV with non-zero Source-ID is present. A value of 00 indicates the default behavior outlined in , which is to use Local Bias if: a) no ESI-Label is present in the Ethernet option TLV, or b) if there is no Ethernet option TLV. Otherwise, the ESI Label Split Horizon method is applied. These procedures assume a single encapsulation supported in the egress NVE. describes additional procedures for NVEs supporting multiple encapsulations.
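The receive-side checks in this section can be sketched as a single predicate. The sketch is illustrative only and covers just the two treat-as-withdraw conditions stated above; it does not model the GENEVE Source-ID check or the reserved SHT value 11, for which this text gives no receive-side action. The function name and the representation of tunnel types as a set of integers are assumptions.

```python
from typing import Optional, Set

# Tunnel types for which no Split Horizon choice exists (from the text).
SINGLE_SHT_TUNNEL_TYPES = {8: "VXLAN", 9: "NVGRE", 10: "MPLS"}


def should_treat_as_withdraw(
    sht: int,
    single_active: bool,
    tunnel_types: Optional[Set[int]],
) -> bool:
    """Decide whether a received A-D per ES route is treated as withdrawn.

    tunnel_types holds the tunnel types found in the BGP Encapsulation
    extended communities; None or an empty set means none was present.
    """
    if sht == 0b00:
        return False  # default SHT is always acceptable
    if single_active:
        return True   # non-zero SHT with Single-Active set (MUST)
    if not tunnel_types:
        return True   # non-zero SHT without any encapsulation (SHOULD)
    # Non-zero SHT with an encapsulation that supports one method only.
    return any(t in SINGLE_SHT_TUNNEL_TYPES for t in tunnel_types)
```

For instance, a route with SHT 01 and tunnel type 13 (MPLSoUDP) is accepted, while the same SHT with tunnel type 8 (VXLAN) is treated as withdrawn.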

This document also updates regarding the value that is advertised in the ESI Label field of the ESI Label extended community, as follows:

*  The A-D per ES route(s) for an ES MAY have an ESI Label value of zero if the SHT value is 01. specifies the scenarios where the SHT can be 01. An ESI Label value of zero eliminates the need to allocate labels in cases where they are not utilized, such as in the Local Bias method.

*  The A-D per ES route(s) for an ES MAY have an ESI Label value of zero for VXLAN or NVGRE encapsulations.

As discussed in this specification is backwards compatible with the Split Horizon filtering behavior in and a non-upgraded NVE can be attached to the same ES as other NVEs supporting this specification. An NVE maintains an administrative SHT value for an Ethernet Segment (ES), which is advertised alongside the A-D per ES route, and an operational SHT value, which is the one actually used regardless of what the NVE has advertised. The administrative SHT matches the operational SHT if all the NVEs attached to the ES have the same administrative SHT. This document assumes that an implementation of or that does not support the specifications in this document will ignore the values of all the Flags in the ESI Label extended community, except for the Single-Active bit. Based on this assumption, a non-upgraded NVE will disregard any SHT value other than 00. If an upgraded NVE receives at least one A-D per ES route for the ES with an SHT value of 00, it MUST revert its operational SHT to the default Split Horizon method, as described in , irrespective of its administrative SHT. For instance, consider an NVE attached to ES N that receives two A-D per ES routes for N from different NVEs, NVE1 and NVE2. If the route from NVE1 has an SHT value of 00 and the one from NVE2 has an SHT value of 01, the NVE MUST use the default Split Horizon method specified in as its operational SHT, regardless of its administrative SHT. All NVEs attached to an ES with an operational SHT value of 10 MUST advertise a valid, non-zero ESI Label. If the operational SHT value is 01, the ESI Label MAY be zero. If the operational SHT value is 00, the ESI Label may be zero only if the default encapsulation supports Local Bias exclusively, and the NVEs do not require the presence of a valid, non-zero ESI Label. 
If an NVE changes its operational SHT value from 01 (Local Bias) to 00 (Default SHT) due to the presence of a new non-upgraded NVE in the ES, and it previously advertised a zero ESI Label, it MUST send an update with a valid, non-zero ESI Label, unless all the non-upgraded NVEs in the ES support only Local Bias. For example, consider NVE1 and NVE2 using MPLSoUDP as encapsulation, attached to the same Ethernet Segment ES1, and advertising an SHT value of 01 (Local Bias) with a zero ESI Label value. Suppose NVE3, which does not support this specification, joins ES1 and advertises an SHT value of 00 (default). Upon receiving NVE3's A-D per ES route, NVE1 and NVE2 MUST update their A-D per ES routes for ES1 to include a valid, non-zero ESI Label value. The assumption here is that NVE3 only supports the default ESI Label-based Split Horizon filtering.
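The relationship between the administrative and operational SHT, and the resulting ESI Label requirement, can be sketched as below. This is a simplified model under stated assumptions: the function names are hypothetical, and the condition "the NVEs do not require the presence of a valid, non-zero ESI Label" is folded into the single boolean `encap_local_bias_only`.

```python
from typing import Iterable

DEFAULT_SHT = 0b00
LOCAL_BIAS = 0b01
ESI_LABEL = 0b10


def operational_sht(admin_sht: int, received_shts: Iterable[int]) -> int:
    """Return the SHT actually used for an ES.

    The administrative SHT is used only when every NVE attached to the ES
    advertises the same value; any mismatch, including a single SHT of 00
    from a non-upgraded NVE, falls back to the default SHT.
    """
    values = {admin_sht, *received_shts}
    return admin_sht if len(values) == 1 else DEFAULT_SHT


def esi_label_may_be_zero(op_sht: int, encap_local_bias_only: bool) -> bool:
    """Whether a zero ESI Label is acceptable for the operational SHT."""
    if op_sht == LOCAL_BIAS:
        return True
    if op_sht == DEFAULT_SHT:
        return encap_local_bias_only
    return False  # ESI Label filtering needs a valid, non-zero label
```

Applied to the example above, NVE1 computes `operational_sht(0b01, [0b01, 0b00])` once NVE3 joins ES1, which yields 00; since MPLSoUDP is not a Local Bias only encapsulation, a zero ESI Label is no longer acceptable and NVE1 has to re-advertise its route with a valid label.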

As specified in , an NVE that supports multiple data plane encapsulations (e.g., VXLAN, NVGRE, MPLS, MPLSoUDP, GENEVE) must indicate all supported encapsulations using BGP Encapsulation extended communities as defined in for all EVPN routes. This section provides clarification on the multihoming Split Horizon behavior for NVEs that advertise and receive multiple BGP Encapsulation extended communities along with the A-D per ES routes. This section uses the notation {x, y} (more than two encapsulations is possible too) to denote the encapsulations advertised in BGP Encapsulation extended communities (or BGP Tunnel Encapsulation Attribute), where x and y represent different encapsulation values. When GENEVE is one of the encapsulations, the tunnel type is indicated in either a BGP Encapsulation extended community or a BGP Tunnel Encapsulation Attribute. It is important to note that an NVE MAY advertise multiple A-D per ES routes for the same ES, rather than a single route, with each route conveying a set of Route Targets (RT). The total set of Route Targets associated with a given ES is referred to as the RT-set for that ES. Each of the EVIs represented in the RT-set will have its RT included in one, and only one, A-D per ES route for the ES. When multiple A-D per ES routes are advertised for the same ES, each route must have a distinct Route Distinguisher. As per , an NVE that advertises multiple encapsulations in the A-D per ES route(s) for an ES MUST advertise encapsulations that use the same Split Horizon filtering method in the same route. For example: An A-D per ES route for ES-x may be advertised with {VXLAN, NVGRE} encapsulations. An A-D per ES route for ES-y may be advertised with {MPLS, MPLSoUDP, MPLSoGRE} encapsulations (or a subset). But an A-D per ES route for ES-z MUST NOT be advertised with {MPLS, VXLAN} encapsulations. 
This document extends the described behavior as follows:

a.  An A-D per ES route for ES-x may be advertised with multiple encapsulations, some of which support a single Split Horizon method. In this case, the Split Horizon Type (SHT) value MUST be 00. For instance, encapsulations such as {VXLAN, NVGRE}, {VXLAN, GENEVE}, or {MPLS, MPLSoGRE, MPLSoUDP} can be advertised in an A-D per ES route. In all these cases, the SHT value MUST be 00, and the treat-as-withdraw behavior is applied in case of any other value.

b.  An A-D per ES route for ES-y may be advertised with multiple encapsulations that all support both Split Horizon methods. In this case, the SHT value MAY be 01 if the preferred method is Local Bias, or 10 if the ESI Label-based method is desired. For example, encapsulations such as {MPLSoGRE, MPLSoUDP, GENEVE} (or a subset) MAY be advertised in an A-D per ES route with an SHT value of 01. The ESI Label value in this case MAY be zero.

c.  If ES-z with RT-set composed of (RT1, RT2, RT3.. RTn) supports multiple encapsulations requiring different Split Horizon methods, a distinct A-D per ES route (or group of routes) per Split Horizon method MUST be advertised. For example, consider an ES-z with n Route Targets (RTs) where:

    *  the EVIs corresponding to (RT1..RTi) support VXLAN,

    *  the ones for (RTi+1..RTm) (with i<m) support MPLSoUDP with Local Bias, and

    *  the ones for (RTm+1..RTn) (with m<n) support GENEVE with ESI Label based Split Horizon.

    In this scenario, three groups of A-D per ES routes MUST be advertised for ES-z:

    *  A-D per ES route group 1, including (RT1..RTi), with encapsulation {VXLAN}, and an SHT value of 00. The ESI Label MAY be zero.

    *  A-D per ES route group 2, including (RTi+1..RTm), with encapsulation {MPLSoUDP}, and an SHT value of 01. The ESI Label MAY be zero.

    *  A-D per ES route group 3, including (RTm+1..RTn), with encapsulation {GENEVE}, and an SHT value of 10. The ESI Label MUST have a valid, non-zero value, and the Ethernet option as defined in MUST be advertised.
As per , it is the responsibility of the operator of a given EVI to ensure that all of the NVEs within that EVI support a common encapsulation. Failure to meet this condition may result in service disruption or failure.
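The ES-z example above, in which Route Targets are split into route groups, can be sketched as a grouping step. This is a simplified illustration rather than a complete procedure: it assumes one encapsulation per EVI and groups by the (encapsulation, SHT) pair, which mirrors the three groups of the example but is stricter than the text requires. All names are hypothetical.

```python
from typing import Dict, List, Tuple

LOCAL_BIAS_ONLY = {"VXLAN", "NVGRE"}  # zero ESI Label allowed with SHT 00


def build_route_groups(evis: List[Tuple[str, str, int]]) -> List[Dict]:
    """Group (route_target, encapsulation, sht) entries into route groups."""
    groups: Dict[Tuple[str, int], List[str]] = {}
    for route_target, encap, sht in evis:
        groups.setdefault((encap, sht), []).append(route_target)
    return [
        {
            "encapsulation": encap,
            "sht": sht,
            "route_targets": rts,
            # Zero is allowed for Local Bias, or for default SHT on
            # encapsulations that only support Local Bias.
            "esi_label_may_be_zero": sht == 0b01
            or (sht == 0b00 and encap in LOCAL_BIAS_ONLY),
        }
        for (encap, sht), rts in groups.items()
    ]
```

Feeding it `[("RT1", "VXLAN", 0b00), ("RT2", "VXLAN", 0b00), ("RT3", "MPLSoUDP", 0b01), ("RT4", "GENEVE", 0b10)]` produces three groups in that order, of which only the GENEVE group requires a valid, non-zero ESI Label.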

All the security considerations described in are applicable to this document. Additionally, this document modifies the procedures for Split Horizon filtering as outlined in , offering operators a choice between Local Bias and ESI Label-based filtering for tunnels that support both methods. Misconfiguration of the desired Split Horizon Type (SHT) could lead to forwarding behaviors that differ from the intended configuration. Apart from this risk, this document describes procedures to ensure that all Provider Edge (PE) devices or Network Virtualization Edges (NVEs) connected to the same Ethernet Segment (ES) agree on a common SHT method, with a fallback to a default behavior in case of a mismatch in the SHT bits being advertised by any two PEs or NVEs in the Ethernet Segment. Consequently, unauthorized changes to the SHT configuration by an attacker on a single PE or NVE of the Ethernet Segment should not cause traffic disruption (as long as the SHT value is valid as per this document) but may result in alterations to forwarding behavior.

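This document creates a registry called "EVPN ESI Label Extended Community Flags" for the 1-octet Flags field in the ESI Label Extended Community , as follows:

+--------------+-----------------------------+---------------+
| Bit Position | Name                        | Reference     |
+--------------+-----------------------------+---------------+
| 0-1          | Multihoming Redundancy Mode |               |
+--------------+-----------------------------+---------------+
| 2-5          | Unassigned                  |               |
+--------------+-----------------------------+---------------+
| 6-7          | Split Horizon Type          | This Document |
+--------------+-----------------------------+---------------+

This document also creates a registry for the "Multihoming Redundancy Mode" field of the EVPN ESI Label Extended Community Flags. This registry is called "Multihoming Redundancy Mode" and is initialized as follows:

+-------+-----------------------------+-----------+
| Value | Multihoming redundancy mode | Reference |
+-------+-----------------------------+-----------+
| 00    | All-Active mode             |           |
+-------+-----------------------------+-----------+
| 01    | Single-Active mode          |           |
+-------+-----------------------------+-----------+
| 10    | Unassigned                  |           |
+-------+-----------------------------+-----------+
| 11    | Unassigned                  |           |
+-------+-----------------------------+-----------+

Finally, this document creates a third registry for the "Split Horizon Type" field of the EVPN ESI Label Extended Community Flags. This registry is called "Split Horizon Type" and is initialized as follows:

+-------+---------------------------+---------------+
| Value | Split Horizon Type value  | Reference     |
+-------+---------------------------+---------------+
| 00    | Default SHT               | This document |
+-------+---------------------------+---------------+
| 01    | Local Bias                | This document |
+-------+---------------------------+---------------+
| 10    | ESI Label based filtering | This document |
+-------+---------------------------+---------------+
| 11    | Unassigned                |               |
+-------+---------------------------+---------------+

New registrations in the "EVPN ESI Label Extended Community Flags", "Multihoming Redundancy Mode", and "Split Horizon Type" registries will be made through the "IETF Review" procedure defined in . These registries are located in the "Border Gateway Protocol (BGP) Extended Communities" registry group.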

The authors would like to thank Anoop Ghanwani, Gyan Mishra and Jeffrey Zhang for their review and useful comments. Thanks to Gunter van de Velde and Sue Hares as well, for their thorough review.