<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?rfc toc="yes"?>
<?rfc tocompact="yes"?>
<?rfc tocdepth="3"?>
<?rfc tocindent="yes"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc comments="yes"?>
<?rfc inline="yes"?>
<?rfc compact="yes"?>
<?rfc subcompact="no"?>
<rfc category="std" docName="draft-kaliraj-idr-multinexthop-attribute-08"
     ipr="trust200902">
  <front>
    <title abbrev="BGP MultiNexthop attribute">BGP MultiNexthop
    Attribute</title>

    <author fullname="Kaliraj Vairavakkalai" initials="K." role="editor"
            surname="Vairavakkalai">
      <organization>Juniper Networks, Inc.</organization>

      <address>
        <postal>
          <street>1133 Innovation Way,</street>

          <city>Sunnyvale</city>

          <region>CA</region>

          <code>94089</code>

          <country>US</country>
        </postal>

        <email>kaliraj@juniper.net</email>
      </address>
    </author>

    <author fullname="Minto Jeyananth" initials="M." surname="Jeyananth">
      <organization>Juniper Networks, Inc.</organization>

      <address>
        <postal>
          <street>1133 Innovation Way,</street>

          <city>Sunnyvale</city>

          <region>CA</region>

          <code>94089</code>

          <country>US</country>
        </postal>

        <email>minto@juniper.net</email>
      </address>
    </author>

    <author fullname="Mohan Nanduri" initials="M" surname="Nanduri">
      <organization>Microsoft</organization>

      <address>
        <postal>
          <street>1 Microsoft Way,</street>

          <city>Redmond </city>

          <region>WA</region>

          <code>98052</code>

          <country>US</country>
        </postal>

        <email>mohannanduri@microsoft.com</email>
      </address>
    </author>

    <date day="10" month="July" year="2023"/>

    <abstract>
      <t>Today, a BGP speaker can advertise one nexthop for a set of NLRIs in
      an Update. This nexthop can be encoded in either the top-level
      BGP-Nexthop attribute (code 3), or inside the MP_REACH_NLRI attribute
      (code 14).</t>

      <t>This document defines a new optional non-transitive BGP attribute
      called "MultiNexthop (MNH)" with IANA BGP attribute type code TBD, that
      can be used to carry an ordered set of one or more Nexthops in the same
      route, with forwarding information scoped on a per nexthop basis.</t>
    </abstract>

    <note title="Requirements Language">
      <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
      "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
      document are to be interpreted as described in <xref
      target="RFC2119">RFC 2119</xref>.</t>
    </note>
  </front>

  <middle>
    <section title="Introduction">
      <t>Today, a BGP speaker can advertise one nexthop for a set of NLRIs in
      an Update. This nexthop can be encoded in either the top-level
      BGP-Nexthop attribute (code 3), or inside the MP_REACH_NLRI attribute
      (code 14).</t>

      <t>This document defines a new optional non-transitive BGP attribute
      called "MultiNexthop (MNH)" with IANA BGP attribute type code TBD, that
      can be used to carry an ordered set of one or more Nexthops in the same
      route, with forwarding information scoped on a per nexthop basis.</t>

      <t>A new BGP capability <xref target="RFC3392"/> called "MultiNexthop
      (MNH) Capability" is defined with IANA BGP capability type code: TBD.
      This capability is used to express the ability to send and receive the
      MNH attribute.</t>

      <t/>
    </section>

    <section title="Terminology">
      <t>SN: Service Node</t>

      <t>iSN: Ingress Service Node</t>

      <t>eSN: Egress Service Node</t>

      <t>NLRI: Network Layer Reachability Information</t>

      <t>AFI: Address Family Identifier</t>

      <t>SAFI: Subsequent Address Family Identifier</t>

      <t>PE : Provider Edge</t>

      <t>RT : Route-Target extended community</t>

      <t>RD : Route-Distinguisher</t>

      <t>MPLS: Multi Protocol Label Switching</t>

      <t>ECMP: Equal Cost Multi Path</t>

      <t>WECMP: Weighted Equal Cost Multi Path</t>

      <t>FRR: Fast Re Route</t>

      <t>PNH : Protocol Next hop address carried in a BGP Update message</t>

      <t>MNH: BGP MultiNextHop attribute</t>

      <t>NFI: Nexthop Forwarding Information</t>

      <t>FI: Forwarding Instruction</t>

      <t>FA: Forwarding Argument</t>

      <section title="Definitions">
        <t>MULTI_NEXT_HOP (aka MNH): BGP MultiNexthop attribute. The new
        attribute defined by this document.</t>

        <t>MNH TLV: Top level TLV contained in a MULTI_NEXT_HOP.</t>

        <t>NFI TLV: Nexthop Forwarding Information TLV, contained in a MNH
        TLV.</t>

        <t>FI TLV: Forwarding Instruction TLV, contained in a NFI TLV.</t>

        <t>FA TLV: Forwarding Argument TLV, contained as an argument to a FI
        in the FI TLV.</t>

        <t>Service Family : BGP address family used for advertising routes for
        "data traffic" as opposed to tunnels (e.g. AFI/SAFIs 1/1 or
        1/128).</t>

        <t>Transport Family : BGP address family used for advertising tunnels,
        which are in turn used by service routes for resolution (e.g.
        AFI/SAFIs 1/4 or 1/76).</t>
      </section>
    </section>

    <section title="Motivation">
      <t/>

      <t>For cases where multiple nexthops need to be advertised, BGP Addpath
      <xref target="RFC7911"/> is used with some address families. On some
      other address families like Flowspec, nexthop addresses are carried in
      one or more extended communities of specific type.</t>

      <t>Though Addpath provides a basic ability to advertise multiple nexthops,
      it does not allow the sender to express the desired relationship between
      the multiple nexthops being advertised, e.g., relative ordering, type of
      load balancing, or fast reroute. These are local decisions at the receiving
      node based on local configuration and path selection between the various
      additional paths, which may tie-break on some arbitrary step like
      Router-Id or BGP nexthop address. Some scenarios with a BGP free core
      may benefit from a mechanism by which an egress node can signal
      multiple nexthops, along with their relationship, to ingress nodes.</t>

      <t>It would be desirable to have a common way to carry one or more
      nexthops on a BGP route of any family.</t>

      <t>This document defines a new optional non-transitive BGP attribute
      "MultiNexthop (MNH)" that can be used for this purpose.</t>

      <t>The MNH attribute can be used in any BGP family that wants to carry
      one or more nexthops, with forwarding information scoped on a per nexthop
      basis. For example, the MNH can be used to advertise an MPLS label along
      with a nexthop for labeled and unlabeled families (e.g. Inet Unicast,
      Inet6 Unicast, Flowspec) alike, so that mechanisms at the transport layer
      can work uniformly on labeled and unlabeled BGP families to realize
      various use cases.</t>

      <t>The MNH plays a different role in the "downstream allocation" scenario
      than in the "upstream allocation" scenario. E.g. for <xref
      target="RFC8277"/> families that advertise downstream allocated labels,
      the MNH can play the "Label Descriptor" role, describing the forwarding
      semantics of the label being advertised. This can be useful in network
      visualization and controller based traffic engineering (e.g. EPE).</t>
    </section>

    <section title="Protocol Operations">
      <section title="BGP Capability for MNH Attribute">
        <t>A new BGP capability <xref target="RFC3392"/> called "MultiNexthop
        (MNH) Capability" is defined with IANA BGP capability type code: TBD.
        The MNH attribute MUST NOT be sent to a BGP speaker that has not
        negotiated the MNH capability. A BGP speaker SHOULD ignore the MNH
        attribute received from a peer which has not negotiated the MNH
        capability.</t>

        <t>The Capability Length field of this capability is 0. Advertising
        the MNH capability means the node is capable of sending and receiving
        the MNH attribute.</t>
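
        <t>The capability encoding above can be sketched as follows. This is
        a minimal illustration, not a definitive implementation: the
        capability code is still TBD, so MNH_CAPABILITY_CODE below is a
        hypothetical placeholder, and the helper names are assumptions made
        for this example. It relies only on the generic capability layout of
        code (1 octet), length (1 octet) and value.</t>

        <figure>
          <artwork><![CDATA[
```python
import struct

# Hypothetical placeholder: the real capability code is "TBD"
# and has not been assigned by IANA.
MNH_CAPABILITY_CODE = 250


def encode_mnh_capability(code: int = MNH_CAPABILITY_CODE) -> bytes:
    """Encode the capability: code (1 octet), length 0, no value."""
    return struct.pack("!BB", code, 0)


def peer_supports_mnh(capabilities: bytes,
                      code: int = MNH_CAPABILITY_CODE) -> bool:
    """Walk (code, length, value) triples looking for the MNH code."""
    offset = 0
    while offset + 2 <= len(capabilities):
        cap_code, cap_len = struct.unpack_from("!BB", capabilities, offset)
        if cap_code == code:
            return True
        offset += 2 + cap_len  # skip this capability's value
    return False
```
]]></artwork>
        </figure>

        <t>A speaker would consult a check like peer_supports_mnh() before
        attaching the MNH attribute to an Update sent on that session.</t>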
      </section>

      <section title="Scope of Use, and Propagation">
        <t>The MNH attribute is intended to be used in a BGP free core,
        between egress and ingress BGP speakers that understand this
        attribute. These BGP speakers may have an intra-AS or inter-AS BGP
        session between them.</t>

        <t>To avoid unintentionally leaking the MNH to another AS via a BGP
        speaker that does not understand the MNH attribute, it is defined as
        "optional non-transitive". This also means that an RR needs to be
        upgraded to support this attribute before any PEs in the network can
        make use of it.</t>

        <t>If the MNH attribute is received on a BGP session where MNH
        capability was not negotiated, the attribute is ignored.</t>

        <t>When a BGP speaker receives the MNH attribute on a BGP session that
        negotiates the MNH capability, it propagates the attribute unchanged
        when readvertising the route with nexthop unchanged on a BGP session
        that negotiates the MNH capability. The BGP speaker excludes the MNH
        attribute when readvertising the route with nexthop unchanged on a BGP
        session that has not negotiated MNH capability.</t>

        <t>The MNH attribute capability negotiation provides additional
        protection against unintentional propagation of this attribute on an
        EBGP session, when both BGP speakers understand MNH.</t>

        <t>Further, it is recommended to use export and import policy
        configuration to control propagating the MNH across AS boundaries,
        such that it is carried to ASes that are under the same administrative
        control, but is not unintentionally advertised to an AS outside this
        administrative control.</t>
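
        <t>The propagation rules in this section, for a route readvertised
        with its nexthop unchanged, can be sketched as below. The function
        and parameter names are hypothetical, and the MNH attribute is
        treated as an opaque byte string. Policy-based filtering at AS
        boundaries is deliberately left out of this sketch.</t>

        <figure>
          <artwork><![CDATA[
```python
from typing import Optional


def accept_mnh(mnh: Optional[bytes],
               rx_negotiated: bool) -> Optional[bytes]:
    """Ignore an MNH received on a session without the MNH capability."""
    return mnh if rx_negotiated else None


def propagate_mnh(mnh: Optional[bytes], rx_negotiated: bool,
                  tx_negotiated: bool) -> Optional[bytes]:
    """MNH to attach when readvertising with the nexthop unchanged."""
    accepted = accept_mnh(mnh, rx_negotiated)
    if accepted is None or not tx_negotiated:
        return None  # excluded from the readvertisement
    return accepted  # propagated unchanged
```
]]></artwork>
        </figure>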
      </section>

      <section title="Interaction of MNH with Nexthop (in attr codes 3, 14)">
        <t>When adding a MultiNexthop attribute to an advertised BGP route,
        the speaker MUST put the same next-hop address in the Advertising PNH
        field as it put in the Nexthop field inside MP_REACH_NLRI attribute if
        one exists, or the NEXT_HOP attribute.</t>

        <t>A speaker that recognizes the MNH attribute and does not change the
        PNH while readvertising the route, e.g. a Route Reflector, MUST
        propagate unchanged the MultiNexthop attribute in the readvertisement,
        satisfying the propagation scope constraints described in previous
        section.</t>

        <t>A speaker that recognizes MNH attribute and changes the PNH while
        readvertising the route MUST remove the MNH attribute in the
        readvertisement. The speaker MAY however add a new MNH attribute to
        the readvertisement. While doing so the speaker MUST record in the
        "Advertising PNH" field the same next-hop address as used in
        MP_REACH_NLRI attribute if one exists, or the NEXT_HOP attribute.</t>

        <t>A speaker receiving a MNH attribute SHOULD ignore it if the
        next-hop address contained in 'Advertising PNH' field is not the same
        as the nexthop address contained in MP_REACH_NLRI attribute if one
        exists, or the NEXT_HOP attribute.</t>

        <t>In case of <xref target="RFC2545"/>, the global (non link-local)
        IPv6 address should be used for this purpose.</t>

        <t>As specified in <xref target="RFC7606"/>, a BGP UPDATE message can
        contain no more than one instance of the MP_REACH_NLRI attribute or
        the NEXT_HOP attribute. Similarly, a BGP UPDATE message MUST contain
        only one instance of the MNH attribute. If the MNH attribute (whether
        recognized or unrecognized) appears more than once in an UPDATE
        message, then all the occurrences of the attribute other than the
        first one SHALL be discarded and the UPDATE message will continue to
        be processed.</t>
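
        <t>The receive-side checks described in this section can be sketched
        as follows. This is an illustration under stated assumptions: the
        names are hypothetical, addresses are compared as raw octets, and
        the MNH attribute type code (TBD in this document) is passed in by
        the caller instead of being hard-coded.</t>

        <figure>
          <artwork><![CDATA[
```python
from typing import List, Optional, Tuple


def mnh_is_usable(advertising_pnh: bytes,
                  mp_reach_nexthop: Optional[bytes],
                  next_hop: Optional[bytes]) -> bool:
    """Compare Advertising PNH with the route's next hop.

    The MP_REACH_NLRI next hop is used if one exists, else NEXT_HOP.
    """
    reference = (mp_reach_nexthop if mp_reach_nexthop is not None
                 else next_hop)
    return reference is not None and advertising_pnh == reference


def first_mnh_only(attributes: List[Tuple[int, bytes]],
                   mnh_type: int) -> List[Tuple[int, bytes]]:
    """Keep the first MNH attribute and discard later occurrences."""
    seen = False
    kept = []
    for type_code, value in attributes:
        if type_code == mnh_type:
            if seen:
                continue  # duplicate MNH: discard, keep processing
            seen = True
        kept.append((type_code, value))
    return kept
```
]]></artwork>
        </figure>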
      </section>

      <section title="Interaction with Addpath">
        <t><xref target="ADDPATH-GUIDELINES"/> suggests the following:</t>

        <t>"Diverse path: A BGP path associated with a different BGP next-hop
        and BGP router than some other set of paths. The BGP router associated
        with a path is inferred from the ORIGINATOR_ID attribute or, if there
        is none, the BGP Identifier of the peer that advertised the path."</t>

        <t>When selecting "diverse paths" for ADD_PATH as specified above, the
        MNH attribute should also be compared if it exists, to determine if
        two routes have "different BGP next-hop".</t>
      </section>

      <section title="Path Selection Considerations">
        <section title="Determining IGP Cost">
          <t>While tie breaking in the path-selection as described in <xref
          target="RFC4271"/>, 9.1.2.2. step (e) viz. the "IGP cost to
          nexthop", consider the highest cost among the nexthop-legs present
          in this attribute.</t>

          <t>The IGP cost thus calculated is also used when constructing the
          AIGP TLV (<xref target="RFC7311"/>).</t>
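
          <t>As a small worked sketch of this rule, the IGP cost used for
          tie breaking is the maximum over the per-leg costs. The function
          name is hypothetical, and how each leg's cost is resolved is
          outside the scope of the example.</t>

          <figure>
            <artwork><![CDATA[
```python
from typing import Iterable


def mnh_igp_cost(leg_costs: Iterable[int]) -> int:
    """Return the highest IGP cost among the nexthop legs."""
    costs = list(leg_costs)
    if not costs:
        raise ValueError("MNH attribute has no nexthop legs")
    return max(costs)
```
]]></artwork>
          </figure>

          <t>For example, legs with IGP costs 10, 30 and 20 yield a cost of
          30 for the route.</t>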
        </section>
      </section>

      <section title="Denoting Upstream or Downstream Semantics">
        <t>A MultiNexthop attribute may describe to a receiving speaker what
        the forwarding semantics of an Upstream-allocated label should be.
        This can be used with either labeled or unlabeled BGP families.</t>

        <t>A MultiNexthop attribute may also play "Downstream signaled Label
        Descriptor" role. A BGP speaker advertising a route carrying
        downstream allocated MPLS label MAY add this attribute to the BGP
        route, to "describe" to the receiving speaker what the label's
        forwarding semantics is at the Egress node.</t>

        <t>Today, the semantics of a downstream-allocated label are known only
        to the egress node advertising the label. The speaker receiving the
        label-binding doesn't know what the label's forwarding semantics at
        the advertiser are. In some environments, it may be useful to convey
        this information to the receiving speaker. This may help in better
        debugging and manageability, or enable the receiving speaker, which
        could also be a centralized controller, to make better decisions
        about which label to use, based on the label's forwarding
        semantics.</t>

        <t>While doing upstream-label allocation, this attribute can be used
        to convey what the forwarding semantics at the receiving node should
        be. Details of the BGP protocol extensions required for signaling
        upstream-label allocation are out of scope of this document, and are
        described in <xref target="MPLS-NAMESPACES"/>.</t>

        <t>In the rest of this document, the term "Label" means a downstream
        allocated label, unless it is explicitly specified as an
        upstream-allocated label.</t>

        <t>When using the MultiNexthop attribute for IP routes, the Upstream
        role is used, since IP prefixes are by nature upstream allocated and
        of global scope.</t>
      </section>
    </section>

    <section title="Encoding of BGP MultiNexthop (MNH) Attribute">
      <t>"MultiNexthop (MNH)" is a new BGP optional non-transitive attribute
      (code TBD), that can be used to carry an ordered set of one or more
      Nexthops in the same route, with forwarding information scoped on a per
      nexthop basis. This attribute describes forwarding instructions using
      TLVs described in this document.</t>

      <t>This section describes the organization and encoding of the MNH
      attribute.</t>

      <figure>
        <preamble/>

        <artwork>  

    MNH Attribute: {
       Num[MNH TLV]
    }

    MNH TLV: {
        { Type, Nexthop Forwarding Information TLV }
    }

    Nexthop Forwarding Information TLV: {
        Num[Forwarding Instruction TLV] 
    }

    Forwarding Instruction TLV: {
        {FwdAction, Forwarding Argument TLVs}
    }
          </artwork>

        <postamble>Fig 1: Overview of MNH Attribute Layout - Eye candy
        summary.</postamble>
      </figure>

      <t>A MNH attribute consists of one or more "MNH TLVs". A MNH TLV
      contains a Type and one unit of Nexthop Forwarding Information (NFI
      TLV).</t>

      <t>A NFI TLV contains one or more Forwarding Instructions (FI TLV).</t>

      <t>A Forwarding Instruction TLV contains a "Forwarding Action" and one
      or more "Forwarding Arguments" (FA TLVs). The Forwarding Arguments
      describe the parameters required to complete a Forwarding Action.</t>

      <figure>
        <artwork>
     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |  Attr. Flags  |Attr. Type Code|          Length               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |     MNH-Flags |  Advt-PNH-Len |       Advertising PNH ..      |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                  .. Address                                   |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                       MNH TLV                                 ~
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    ~                       MNH TLV                                 |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    
        </artwork>

        <postamble>Fig 2: MultiNexthop - BGP Attribute.</postamble>
      </figure>

      <figure>
        <artwork>
- Attr. Flags (1 octet)
       BGP Path-attribute flags, indicating an Optional Non-Transitive
       attribute, i.e. Optional bit set, Transitive bit reset.

 - Attr. Type Code (1 octet)
        Type code allotted by IANA. TBD.

 - Length (1 or 2 octets)
       One or Two bytes field stating length of attribute value in bytes.

 - MNH-Flags (1 octet)

           0 1 2 3 4 5 6 7
          +-+-+-+-+-+-+-+-+
          |R R R R R R R R|
          +-+-+-+-+-+-+-+-+
       
       All bits are reserved. 
 
           R: Reserved. MUST be set to zero, SHOULD be ignored by receiver.

 - Advt-PNH-Len (1 octet)
       Length in octets (4 for IPv4, 16 for IPv6, 12 for VPN-IPv4, 
       24 for VPN-IPv6) of Advertising PNH Address.

 - Advertising PNH Address (Advt-PNH-Len octets)
       BGP Protocol Nexthop address advertised in NEXT_HOP or MP_REACH_NLRI attr.
       Used to sanity-check the MNH attribute. In case of RFC-2545, this will be 
       the global (non link-local) IPv6 address.

 - MNH TLVs: One or more MNH TLVs are carried in a MNH attr. 
       MNH TLV is described in subsequent sections.

 </artwork>
      </figure>
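
      <t>The fixed part of the attribute value described above (MNH-Flags,
      Advt-PNH-Len and the Advertising PNH Address) can be decoded with a
      sketch like the one below. The names are hypothetical, and raising
      ValueError on malformed input is an assumption of this example, not a
      statement of this document's error handling procedures.</t>

      <figure>
        <artwork><![CDATA[
```python
import struct
from typing import NamedTuple

# Advt-PNH-Len values listed above:
# IPv4, IPv6, VPN-IPv4, VPN-IPv6.
VALID_PNH_LENGTHS = {4, 16, 12, 24}


class MnhHeader(NamedTuple):
    flags: int
    advertising_pnh: bytes
    tlvs: bytes  # the MNH TLVs, still undecoded


def parse_mnh_value(value: bytes) -> MnhHeader:
    """Split an MNH attribute value into header fields and TLV bytes."""
    if len(value) < 2:
        raise ValueError("too short for MNH-Flags and Advt-PNH-Len")
    flags, pnh_len = struct.unpack_from("!BB", value, 0)
    if pnh_len not in VALID_PNH_LENGTHS:
        raise ValueError(f"unexpected Advt-PNH-Len {pnh_len}")
    end = 2 + pnh_len
    if len(value) < end:
        raise ValueError("truncated inside Advertising PNH")
    # Reserved flag bits are ignored on receipt, so flags is not checked.
    return MnhHeader(flags, value[2:end], value[end:])
```
]]></artwork>
      </figure>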

      <section title="MNH TLV">
        <t>The type of MNH TLV describes how the forwarding information
        carried in the MNH TLV is used.</t>

        <figure>
          <preamble/>

          <artwork>
      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |  MNH-TLV Flags| MNH. Type Code|          Length               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                              Value                            |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

          </artwork>

          <postamble>Fig 3: MNH TLV</postamble>
        </figure>

        <figure>
          <artwork>

 - MNH-TLV Flags (1 octet)

           0 1 2 3 4 5 6 7
          +-+-+-+-+-+-+-+-+
          |R R R R R R R R|
          +-+-+-+-+-+-+-+-+
       
       All bits are reserved. 
 
           R: Reserved. MUST be set to zero, SHOULD be ignored by receiver.
           
  MNH Type Code        Meaning
 --------------     -------------
       0           None
       1           Upstream signaled primary forwarding path. 
       2           Upstream signaled backup forwarding path. 
       3           Downstream signaled Label Descriptor. 
       
 
 - Length
    Length of Value portion in octets.
 
 - Value
    Value portion contains the NFI TLV.
</artwork>
        </figure>

        <t/>

        <t>Type codes 1 and 2 are applicable for upstream allocated prefixes,
        example IP, MPLS, Flowspec routes.</t>

        <t>Type code 3 describes the forwarding behavior given to a
        downstream allocated MPLS label, advertised in a BGP route.</t>

        <t>Usage of Type code 1 in a BGP route containing IP prefix gives
        similar result as advertising the route with nexthop contained in BGP
        path-attributes: Nexthop (code 3) or MP_REACH_NLRI (code 14).</t>

        <t>Upstream allocation for MPLS routes is achieved by using mechanisms
        explained in <xref target="MPLS-NAMESPACES"/>.</t>

        <t>If an invalid Type Code (like 0) is received, the TLV is ignored,
        gracefully handling the error.</t>

        <t>If an unknown Type Code is received, it SHOULD be ignored but
        propagated further when the MNH attribute is propagated, because
        nexthop is not changed.</t>

        <t>If the received Type Code is incompatible for the prefix in BGP
        NLRI, the TLV should be ignored.</t>
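
        <t>A sketch of walking the MNH TLVs and applying the type code rules
        above follows. It assumes the layout of Fig 3 (1 octet of flags, 1
        octet of type code, 2 octets of length) and the type code table in
        this section. The names are hypothetical. Unknown type codes are
        skipped only for local use, since the attribute itself is propagated
        as the original octets.</t>

        <figure>
          <artwork><![CDATA[
```python
import struct
from typing import Iterator, List, Tuple

# Type codes from the MNH Type Code table.
KNOWN_TYPES = {
    1: "upstream-primary",
    2: "upstream-backup",
    3: "downstream-label-descriptor",
}


def iter_mnh_tlvs(data: bytes) -> Iterator[Tuple[int, int, bytes]]:
    """Yield (flags, type_code, value) for each MNH TLV in data."""
    offset = 0
    while offset < len(data):
        if offset + 4 > len(data):
            raise ValueError("truncated MNH TLV header")
        flags, type_code, length = struct.unpack_from("!BBH", data, offset)
        start = offset + 4
        end = start + length
        if end > len(data):
            raise ValueError("truncated MNH TLV value")
        yield flags, type_code, data[start:end]
        offset = end


def usable_mnh_tlvs(data: bytes) -> List[Tuple[int, bytes]]:
    """Return (type_code, value) pairs, skipping type 0 and unknowns."""
    return [(t, v) for _flags, t, v in iter_mnh_tlvs(data)
            if t in KNOWN_TYPES]
```
]]></artwork>
        </figure>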

        <section anchor="upstr-prim"
                 title="Upstream Signaled Primary Forwarding Path">
          <t>Type Code = 1 means the TLV describes forwarding state to be
          programmed at receiving speaker as primary path nexthop leg. This
          TLV is used with Upstream allocated or global scope prefixes carried
          in BGP NLRI. Value part of this TLV contains Nexthop Forwarding
          Information TLV.</t>

          <t>A BGP speaker uses the nexthop forwarding information received in
          this TLV as a primary path nexthop leg when programming the route
          for the NLRI prefix in its Forwarding table.</t>

          <figure>
            <preamble/>

            <artwork>
      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |  MNH-TLV Flags|  MNH Type = 1 |          Length               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |               Nexthop Forwarding Information TLV              |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

          </artwork>

            <postamble>Fig 4: Upstream signaled Primary forwarding path
            TLV</postamble>
          </figure>
        </section>

        <section anchor="upstr-bkp"
                 title="Upstream Signaled Backup Forwarding Path">
          <t>Type Code = 2 means the TLV describes forwarding state to be
          programmed at receiving speaker as backup-path nexthop leg. This TLV
          is used with Upstream allocated prefixes or global scoped prefixes.
          Value part contains Nexthop Forwarding Information TLV.</t>

          <t>Signaling a different nexthop for use as backup path is desired
          in some labeled forwarding scenarios, where two multihomed edge
          devices use each other as backup path to protect traffic when
          primary path fails.</t>

          <t>This is required to avoid label advertisement oscillation between
          the multihomed PEs when they implement per-nexthop label allocation
          mode.</t>

          <t>The label that PE1 advertises for the primary path is allocated
          and forwarded using external paths as the primary leg, and the
          backup-path label received from the other multihomed PE, PE2, as
          the backup leg. As a result, primary-path label allocation at PE1
          is not a function of the primary-path label advertised by PE2. The
          primary-path label therefore remains stable at a PE and does not
          change when a new primary-path label is received from the other
          multihomed PE. This prevents the label oscillation problem.</t>

          <figure>
            <preamble/>

            <artwork>
      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |  MNH-TLV Flags|  MNH Type = 2 |          Length               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |               Nexthop Forwarding Information TLV              |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

          </artwork>

            <postamble>Fig 5: Upstream signaled Backup forwarding path
            TLV</postamble>
          </figure>

          <t>The backup path label allocated and advertised by a PE is a
          function of only the primary path, e.g. the path to the CE device.
          So this label value does not change when a new label is received
          from the other multihomed PE.</t>
        </section>

        <section anchor="dnstr-lbl-descr"
                 title="Downstream Signaled Label Descriptor.">
          <t>Type Code = 3 means the TLV describes forwarding state associated
          with downstream allocated MPLS label at the egress node identified
          in Endpoint FA TLV. Value part of this TLV contains Endpoint FA-TLV,
          Payload Info FA-TLV to identify the label being described, along
          with Nexthop Forwarding Information TLV that describes the
          forwarding state.</t>

          <t>Signaling what a label advertised in BGP route signifies is
          helpful for debugging. The information provided by label descriptor
          can enable new usecases like network visualization and off box EPE
          decisions.</t>

          <figure>
            <preamble/>

            <artwork>
      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |  MNH-TLV Flags| MNH Type = 3  |          Length               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |            Endpoint Fwd Argument  TLV                         |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |            Encap Info. Fwd Argument TLV                       |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |           Nexthop Forwarding Information TLV                  |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

      Endpoint Fwd Argument  TLV:
          Specifies the IP endpoint. Section 5.5.1.

      Encap Info. Fwd Argument TLV:
          Specifies the Label value being described. Section 5.5.3.1.

      Nexthop Forwarding Information TLV:
          Indicates the forwarding state. Described in next section.
          
          </artwork>

            <postamble>Fig 6: Downstream signaled Label Descriptor
            TLV</postamble>
          </figure>

          <t>TBD: pointer to sec</t>
        </section>
      </section>

      <section title="Nexthop Forwarding Information TLV">
        <t>A Nexthop Forwarding Information TLV describes a MNH TLV. It
        contains one or more Forwarding Instruction TLVs. These Forwarding
        Instructions are the Forwarding Legs of the MNH.</t>

        <figure>
          <preamble/>

          <artwork>
       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |  NFI  Flags   |      Num-Nexthops             |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |        Forwarding Instruction TLV (F.I. TLV)                  ~
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      ~        Forwarding Instruction TLV (F.I. TLV)                  |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
          </artwork>

          <postamble>Fig 7: Nexthop Forwarding Information TLV</postamble>
        </figure>

        <figure>
          <artwork>

 - NFI Flags (1 octet)

           0 1 2 3 4 5 6 7
          +-+-+-+-+-+-+-+-+
          |R R R R R R R R|
          +-+-+-+-+-+-+-+-+
       
       All bits are reserved. 
 
           R: Reserved. MUST be set to zero, SHOULD be ignored by receiver.

 - Num-Nexthops
        Number of F.I. TLVs.

 - Forwarding Instruction TLV
        Each F.I. TLV describes a Nexthop Leg. 
        Layout of Forwarding Instruction TLV is described in next section.
    
</artwork>
        </figure>
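
        <t>The NFI TLV header can be encoded and decoded with a sketch such
        as the following. It assumes, from the widths drawn in Fig 7, that
        NFI Flags is 1 octet and Num-Nexthops is 2 octets; the text does not
        state the size of Num-Nexthops, so that width is an assumption. The
        names are hypothetical, and the F.I. TLVs are treated as opaque
        byte strings.</t>

        <figure>
          <artwork><![CDATA[
```python
import struct
from typing import List, Tuple

# NFI Flags (1 octet), Num-Nexthops (2 octets, assumed from Fig 7).
NFI_HEADER = struct.Struct("!BH")


def encode_nfi(fi_tlvs: List[bytes]) -> bytes:
    """Build an NFI TLV from already encoded F.I. TLVs."""
    # Reserved flag bits are sent as zero.
    return NFI_HEADER.pack(0, len(fi_tlvs)) + b"".join(fi_tlvs)


def decode_nfi_header(data: bytes) -> Tuple[int, int, bytes]:
    """Return (flags, num_nexthops, remaining F.I. TLV bytes)."""
    if len(data) < NFI_HEADER.size:
        raise ValueError("truncated NFI TLV header")
    flags, num_nexthops = NFI_HEADER.unpack_from(data, 0)
    return flags, num_nexthops, data[NFI_HEADER.size:]
```
]]></artwork>
        </figure>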
      </section>

      <section title="Forwarding Instruction TLV">
        <t>Each Forwarding Instruction TLV describes a Nexthop Leg. It
        expresses a "Forwarding Action" (FwdAction) along with arguments
        required to complete the action. The type of actions defined by this
        TLV are given below. The arguments are denoted by "Forwarding Argument
        TLVs". The Forwarding Argument TLVs take appropriate values based on
        the FwdAction.</t>

        <t>Each FwdAction should note the Arguments needed to complete the
        action. Any extraneous arguments should be ignored. If the minimum set
        of arguments required to complete an action is not received, the
        Forwarding Instruction TLV should be ignored. Appropriate logging and
        diagnostic info MAY be provided by an implementation to help
        troubleshoot such scenarios.</t>

        <figure>
          <preamble/>

          <artwork>
       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |  F.I. Flags   |          Relative Pref        |  FwdAction    |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |            Length             |  
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                   Fwd Argument TLV                            ~
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      ~                   Fwd Argument TLV                            |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      
          
          </artwork>

          <postamble>Fig 8: Forwarding Instruction TLV</postamble>
        </figure>

        <figure>
          <artwork>
  - F.I. Flags (1 octet)

           0 1 2 3 4 5 6 7
          +-+-+-+-+-+-+-+-+
          |R R R R R R R R|
          +-+-+-+-+-+-+-+-+
       
       All bits are reserved. 
 
           R: Reserved. MUST be set to zero, SHOULD be ignored by receiver.
         
 - Relative Pref (2 octets)
 
     Unsigned 2 octet integer specifying relative order or preference, among
     the many forwarding instructions, to use in FIB. All usable nexthop legs 
     with lowest relative-pref are installed in FIB as primary-path. Thus if 
     multiple legs exist with that lowest relative-pref, ECMP is formed.

 - FwdAction (1 octet)

 FwdAction         Meaning
 ---------      -------------
       0        None
       1        Forward
       2        Pop-And-Forward
       3        Swap
       4        Push
       5        Pop-And-Lookup
       6        Replicate

   A Forwarding Instruction TLV with an unknown FwdAction should be
   skipped and the rest of the attribute processed, so that the error
   is handled gracefully. The event may be logged for diagnosis.
   
 - Length (2 octets)
 
    Length in octets, of all Forwarding Argument TLVs.
    
</artwork>
        </figure>
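
        <t>As an illustration only, the header layout in Fig 8 and the
        relative-pref rule can be sketched in Python as follows. The names
        ForwardingInstruction, parse_forwarding_instruction and primary_legs
        are hypothetical and not defined by this document. The sketch assumes
        network byte order and that the caller passes only usable legs.</t>

        <figure>
          <artwork><![CDATA[
```python
import struct
from dataclasses import dataclass

# Header layout taken from Fig 8:
# F.I. Flags (1), Relative Pref (2), FwdAction (1), Length (2).
FI_HEADER = struct.Struct("!BHBH")


@dataclass
class ForwardingInstruction:
    flags: int
    relative_pref: int
    fwd_action: int
    arguments: bytes  # raw Forwarding Argument TLVs


def parse_forwarding_instruction(buf, offset=0):
    """Return (instruction, next_offset); raise ValueError on truncation."""
    if len(buf) - offset < FI_HEADER.size:
        raise ValueError("truncated Forwarding Instruction header")
    flags, pref, action, length = FI_HEADER.unpack_from(buf, offset)
    start = offset + FI_HEADER.size
    end = start + length
    if end > len(buf):
        raise ValueError("Forwarding Argument TLVs overrun the buffer")
    return ForwardingInstruction(flags, pref, action, buf[start:end]), end


def primary_legs(legs):
    """Legs sharing the lowest relative-pref form the primary path (ECMP)."""
    if not legs:
        return []
    best = min(leg.relative_pref for leg in legs)
    return [leg for leg in legs if leg.relative_pref == best]
```
]]></artwork>
        </figure>

        <t>In this sketch, two legs that carry the same lowest Relative Pref
        are both returned by primary_legs, which corresponds to the ECMP case
        described above.</t>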

        <t/>

        <t>The semantics of most of the FwdActions above are well understood.
        FwdAction 1 is applicable to both IP and MPLS routes. FwdActions 2-5
        are applicable to encapsulated payloads (like MPLS) only. FwdActions
        1 and 6 are applicable to Flowspec routes, for Redirect and Mirror
        actions. FwdAction 6 can also be used to indicate multicast
        replication like functionality.</t>

        <t>The "Forward" action means forward the IP/MPLS packet with the
        destination prefix (IP-dest-addr/MPLS-label) value unchanged. For IP
        routes, this is the forwarding-action given for next-hop addresses
        contained in BGP path-attributes: Nexthop (code 3) or MP_REACH_NLRI
        (code 14). For MPLS routes, usage of this action is equivalent to SWAP
        with same label-value; one such usage is explained in <xref
        target="MPLS-NAMESPACES"/> when Upstream-label-allocation is in
        use.</t>

        <t>The "Pop-And-Forward" action means Pop the payload header (e.g.
        MPLS-label) and forward the payload towards the Nexthop IP-address
        specified in the Endpoint Id TLV, using appropriate encapsulation to
        reach the Nexthop.</t>

        <t>When applied to an MPLS packet, the "Pop-And-Lookup" action may
        result in an MPLS-lookup or an upper-layer header (like IPv4, IPv6)
        lookup, depending on whether the label that was popped was the
        bottom-of-stack label.</t>

        <t>If an incompatible FwdAction is received for a prefix-type, or an
        unsupported FwdAction is received, it is considered a semantic-error
        and MUST be dealt with as explained in the "Error Handling"
        section.</t>
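
        <t>The applicability rules above can be sketched as a small lookup
        table. This is a minimal sketch, not a normative mapping: the
        route-type labels "ip", "mpls" and "flowspec" and the function
        check_fwd_action are hypothetical names, and the sets simply mirror
        the prose in this section.</t>

        <figure>
          <artwork><![CDATA[
```python
from enum import IntEnum


class FwdAction(IntEnum):
    NONE = 0
    FORWARD = 1
    POP_AND_FORWARD = 2
    SWAP = 3
    PUSH = 4
    POP_AND_LOOKUP = 5
    REPLICATE = 6


# Hypothetical route-type labels; the sets follow the prose above.
APPLICABLE = {
    "ip": {FwdAction.FORWARD},
    "mpls": {FwdAction.FORWARD, FwdAction.POP_AND_FORWARD, FwdAction.SWAP,
             FwdAction.PUSH, FwdAction.POP_AND_LOOKUP},
    "flowspec": {FwdAction.FORWARD, FwdAction.REPLICATE},
}


def check_fwd_action(route_type, code):
    """Classify a received FwdAction as 'ok', 'unknown' or 'incompatible'."""
    try:
        action = FwdAction(code)
    except ValueError:
        return "unknown"          # code point not in the table
    if action not in APPLICABLE.get(route_type, set()):
        return "incompatible"     # treated as a semantic error
    return "ok"
```
]]></artwork>
        </figure>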
      </section>

      <section title="Forwarding Argument TLV">
        <t>The Forwarding Argument TLV describes various parameters required
        to execute a FwdAction.</t>

        <t/>

        <figure>
          <preamble/>

          <artwork>
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  F.A. Flags   |     F.A. Type Code            |  Length       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Length     |     Value                                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
          </artwork>

          <postamble>Fig 9: Forwarding Argument TLV</postamble>
        </figure>

        <figure>
          <artwork>
 - F.A. Flags (1 octet)

           0 1 2 3 4 5 6 7
          +-+-+-+-+-+-+-+-+
          |R R R R R R R R|
          +-+-+-+-+-+-+-+-+
       
       All bits are reserved. 
 
           R: Reserved. MUST be set to zero, SHOULD be ignored by receiver.


  F.A. Type Code  Meaning
  -------------  ---------
     0           None
     1           Endpoint Identifier
     2           Path Constraints
     3           Payload encapsulation info signaling 
     4           Endpoint attributes advertisement

 - Length (2 octets)

    Length in bytes of Value field.

           </artwork>
        </figure>
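
        <t>A receiver could walk the Forwarding Argument TLVs roughly as
        sketched below. The header layout follows Fig 9; the function name
        iter_forwarding_arguments is hypothetical, and network byte order is
        assumed. Unknown type codes are yielded unchanged so that the caller
        can skip them.</t>

        <figure>
          <artwork><![CDATA[
```python
import struct

# Header layout taken from Fig 9:
# F.A. Flags (1), F.A. Type Code (2), Length (2).
FA_HEADER = struct.Struct("!BHH")


def iter_forwarding_arguments(buf):
    """Yield (type_code, value) for each Forwarding Argument TLV in buf."""
    offset = 0
    while offset < len(buf):
        if len(buf) - offset < FA_HEADER.size:
            raise ValueError("truncated Forwarding Argument header")
        _flags, type_code, length = FA_HEADER.unpack_from(buf, offset)
        start = offset + FA_HEADER.size
        end = start + length
        if end > len(buf):
            raise ValueError("Forwarding Argument value overruns the buffer")
        yield type_code, buf[start:end]
        offset = end
```
]]></artwork>
        </figure>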

        <section title="Endpoint Identifier">
          <t>F.A. Type Code = 1. This Forwarding Argument TLV identifies an
          Endpoint of different types.</t>

          <t/>

          <figure>
            <preamble/>

            <artwork>
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  F.A. Flags   |     F.A. Type Code =1         |  Length       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Length     | Endpoint Type |  Endpoint Len | Endpoint Value|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                  Endpoint Value                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
          
          </artwork>

            <postamble>Fig 10: Endpoint Identifier TLV</postamble>
          </figure>

          <figure>
            <artwork>

 - F.A. Flags (1 octet)

           0 1 2 3 4 5 6 7
          +-+-+-+-+-+-+-+-+
          |R R R R R R R R|
          +-+-+-+-+-+-+-+-+
 
           R: Reserved. MUST be set to zero, SHOULD be ignored by receiver.

 - Length (2 octets)
    Length in bytes of Value field.


  Endpoint Type   Value                    Len (octets)
  -------------  ---------                ---------------------
     0           None
     1           IPv4 Address                4
     2           IPv6 Address                16
     3           MPLS Label (Upstream        4 
                            allocated or 
                            Global scope)
     4           Fwd Context RD              8
     5           Fwd Context RT              8

 - Endpoint Len (1 octet)

    Length in bytes of Endpoint Value field.
 
           </artwork>
          </figure>
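
          <t>The Value portion of an Endpoint Identifier TLV could be decoded
          as in the following sketch. The expected lengths come from the table
          above; decode_endpoint is a hypothetical helper, and rejecting a
          mismatched Endpoint Len with an exception is an assumption of this
          sketch, not a rule stated in this document.</t>

          <figure>
            <artwork><![CDATA[
```python
import ipaddress

# Expected Endpoint Len per Endpoint Type, from the table above.
ENDPOINT_LEN = {1: 4, 2: 16, 3: 4, 4: 8, 5: 8}


def decode_endpoint(value):
    """Decode an Endpoint Identifier Value into (endpoint_type, endpoint)."""
    if len(value) < 2:
        raise ValueError("truncated Endpoint Identifier")
    ep_type, ep_len = value[0], value[1]
    ep_value = value[2:2 + ep_len]
    if len(ep_value) != ep_len:
        raise ValueError("Endpoint Value shorter than Endpoint Len")
    if ep_type in ENDPOINT_LEN and ep_len != ENDPOINT_LEN[ep_type]:
        raise ValueError("unexpected Endpoint Len for this Endpoint Type")
    if ep_type == 1:
        return ep_type, ipaddress.IPv4Address(ep_value)
    if ep_type == 2:
        return ep_type, ipaddress.IPv6Address(ep_value)
    # MPLS label, RD, RT and unknown types are returned as raw bytes.
    return ep_type, ep_value
```
]]></artwork>
          </figure>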
        </section>

        <section title="Path Constraints">
          <t>F.A. Type Code = 2. This Forwarding Argument TLV defines
          constraints for path to the Endpoint.</t>

          <t/>

          <figure>
            <preamble/>

            <artwork>
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  F.A. Flags   |     F.A. Type Code = 2        |  Length       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Length     | ConstrainType | Constrain Len | ConstrainValue|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                  ConstrainValue                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
          
          </artwork>

            <postamble>Fig 11: Path Constraints TLV</postamble>
          </figure>

          <figure>
            <artwork>

   - F.A. Flags (1 octet)

           0 1 2 3 4 5 6 7
          +-+-+-+-+-+-+-+-+
          |R R R R R R R R|
          +-+-+-+-+-+-+-+-+
       
           R: Reserved. MUST be set to zero, SHOULD be ignored by receiver.
   - Length (2 octets)
       Length in bytes of Value field.

  ConstrainType             Value                Len (octets)
  -------------  -------------------------    ---------------------
     0           None
     1           Proximity check                 2
     2           Transport Class ID (Color)      4
     3           Load balance factor             2

  - Constrain Len (1 octet)

    Length in bytes of Constrain Value field.

   - Proximity check Flags (2 octets)
        Flags describing whether the nexthop endpoint is expected to be single hop 
        away, or multihop away. Format of flags is described in next section.

   - Transport Class ID (Color) (4 octets)
   
    This is a 32 bit identifier, associated with the Nexthop address.
    The Nexthop IP-addresses specified in "Endpoint Identifier" TLVs
    are resolved over tunnels of this color. 
    Defined in [BGP-CT] [draft-kaliraj-idr-bgp-classful-transport-planes]
             
   - Load balance factor (2 octets)
          Balance Percentage 
              
           </artwork>
          </figure>

          <section title="Proximity Check">
            <t>Usually, routes received over a single-hop EBGP session are
            expected to have a directly connected nexthop, one hop away, and
            routes received over IBGP are expected to have a nexthop multiple
            hops away. Implementations today allow exceptions to this rule to
            be configured.</t>

            <t>The 'expected proximity' of the Nexthop can be signaled to the
            receiver using the Proximity check flags, so that irrespective of
            whether the route is received from an IBGP or EBGP peer, the
            nexthop can be treated as single-hop away or multihop away.</t>

            <t>The format of the Proximity check Sub-TLV is as follows:</t>

            <figure>
              <preamble/>

              <artwork>
   0                   1                   2                   3
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |  F.A. Flags   |     F.A. Type Code = 2        |  Length       |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |    Length     |ConstrainType=1|  Len = 2      |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |       Proximity Check Flags   |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

  - F.A. Flags (1 octet)

           0 1 2 3 4 5 6 7
          +-+-+-+-+-+-+-+-+
          |R R R R R R R R|
          +-+-+-+-+-+-+-+-+
 
           R: Reserved. MUST be set to zero, SHOULD be ignored by receiver.

  - Length (2 octets)
       Length in bytes of Value field.

  - Proximity check Flags (2 octets)

           0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
          +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
          |S M R R R R R R R R R R R R R R|
          +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       
         
           S: Restrict to Singlehop path.
           M: Expect Multihop path.
           R: Reserved. MUST be set to zero, SHOULD be ignored by receiver.
           
</artwork>

              <postamble>Fig 12: "Proximity check" sub-TLV</postamble>
            </figure>

            <t>This TLV would be valid with Forwarding Instructions TLV with
            FwdAction of Forward, Pop-And-Forward, Swap or Push.</t>

            <t>When S bit is set, receiver considers the nexthop valid only if
            it is directly connected to the receiver.</t>

            <t>When M bit is set, receiver assumes that the nexthop can be
            multiple hops away, and resolves the path to the nexthop via
            another route.</t>

            <t>When both S and M bits are set, M bit behavior takes
            precedence. When both S and M bits are Clear, the current behavior
            of deriving proximity from peer type (EBGP is singlehop, IBGP is
            multihop) is followed.</t>
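
            <t>The S and M bit rules above can be sketched as follows. The
            sketch assumes that bit 0 in the diagram is the most significant
            bit of the 16-bit field; the constant names and the function
            expected_proximity are hypothetical.</t>

            <figure>
              <artwork><![CDATA[
```python
# Assumption: bit 0 in the diagram is the most significant bit.
S_BIT = 0x8000  # restrict to single-hop path
M_BIT = 0x4000  # expect multihop path


def expected_proximity(flags, peer_is_ebgp):
    """Return 'singlehop' or 'multihop' for a received nexthop leg."""
    if flags & M_BIT:
        return "multihop"      # M takes precedence when both bits are set
    if flags & S_BIT:
        return "singlehop"
    # Both bits clear: derive proximity from the peer type.
    return "singlehop" if peer_is_ebgp else "multihop"
```
]]></artwork>
            </figure>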
          </section>

          <section title="Transport Class ID (Color)">
            <t>The Nexthop can be associated with a Transport Class, so as to
            resolve a path that satisfies required Transport tunnel
            characteristics. Transport Class is defined in <xref
            target="BGP-CT"/>.</t>

            <t>Transport Class is a per-nexthop scoped attribute. Without MNH,
            the Transport class is applied to the nexthop IP-address encoded
            in the BGP-Nexthop attribute (code 3), or inside the MP_REACH_NLRI
            attribute (code 14). With MNH, the Transport Class can be
            specified per Nexthop-Leg (Forwarding Instruction TLV). It is
            applied to the endpoint encoded in the Endpoint Identifier TLV of
            type "IPv4 Address", "IPv6 Address" or "MPLS Label (Upstream
            allocated or Global scope)".</t>

            <t>The format of the Transport Class ID Sub-TLV is as follows:</t>

            <figure>
              <preamble/>

              <artwork>
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  F.A. Flags   |     F.A. Type Code = 2        |  Length       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Length     |ConstrainType=2|  Len = 4      | Transport..   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|        .. Class ID (4 bytes)                  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

  - F.A. Flags (1 octet)

           0 1 2 3 4 5 6 7
          +-+-+-+-+-+-+-+-+
          |R R R R R R R R|
          +-+-+-+-+-+-+-+-+
 
           R: Reserved. MUST be set to zero, SHOULD be ignored by receiver.

  - Length (2 octets)
       Length in bytes of Value field.

  - Transport Class ID (Color):
    This is a 32 bit identifier, associated with the Nexthop address.
    The Nexthop specified in the Endpoint Identifier TLV
    is resolved over tunnels of this color. 
    Defined in [BGP-CT] [draft-kaliraj-idr-bgp-classful-transport-planes]
</artwork>

              <postamble>Fig 13: "Transport Class ID (Color)"
              sub-TLV</postamble>
            </figure>

            <t/>

            <t>This TLV would be valid with Forwarding Instructions TLV with
            FwdAction of Forward, Swap or Push.</t>
          </section>

          <section anchor="lb-perc" title="Load Balance Factor">
            <t/>

            <figure>
              <preamble/>

              <artwork>
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  F.A. Flags   |     F.A. Type Code = 2        |  Length       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Length     |ConstrainType=3|  Len = 2      |   Balance..   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|.. Percentage  |
+-+-+-+-+-+-+-+-+

 - F.A. Flags (1 octet)

           0 1 2 3 4 5 6 7
          +-+-+-+-+-+-+-+-+
          |R R R R R R R R|
          +-+-+-+-+-+-+-+-+
 
           R: Reserved. MUST be set to zero, SHOULD be ignored by receiver.
 - Length (2 octets)
       Length in bytes of Value field.

 - Len (1 octet)
    Length of the Constrain Value field.
    
 - Balance Percentage:
    This is the explicit "balance percentage" requested by the sender,
    for unequal load-balancing over these Nexthop-Descriptor-TLV legs.
    This balance percentage would override the implicit
    balance-percentage calculated using "Bandwidth" attribute
    sub-TLV.
</artwork>

              <postamble>Fig 14: "Load-Balance-Factor" sub-TLV</postamble>
            </figure>

            <t/>

            <t>This sub-TLV would be valid with Forwarding Instructions TLV
            with FwdAction of Forward, Swap or Push.</t>

            <t>This is the explicit "balance percentage" requested by the
            sender, for unequal load-balancing over these
            Nexthop-Descriptor-TLV legs. This balance percentage would
            override the implicit balance-percentage calculated using the
            "Available Bandwidth" attribute sub-TLV.</t>

            <t>When the sum of "balance percentage" on the nexthop legs does
            not equal 100, it is scaled up or down to match 100. The
            individual balance percentages in each nexthop leg are also scaled
            up or down proportionally to determine the effective balance
            percentage per nexthop leg.</t>
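
            <t>The proportional scaling described above amounts to the
            following arithmetic. This is a minimal sketch: the function name
            effective_balance is hypothetical, and rejecting an all-zero input
            is an assumption, since this document does not describe that
            case.</t>

            <figure>
              <artwork><![CDATA[
```python
def effective_balance(percentages):
    """Scale per-leg balance percentages so that they sum to 100."""
    total = sum(percentages)
    if total == 0:
        # Assumption: nothing to scale when every leg advertises zero.
        raise ValueError("balance percentages sum to zero")
    return [p * 100 / total for p in percentages]
```
]]></artwork>
            </figure>

            <t>For example, legs advertising 20, 20 and 10 sum to 50 and scale
            up to 40, 40 and 20, while legs advertising 150 and 50 sum to 200
            and scale down to 75 and 25.</t>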
          </section>
        </section>

        <section title="Payload Encapsulation Info">
          <t>F.A. Type Code = 3. This Forwarding Argument TLV defines payload
          encapsulation information.</t>

          <t/>

          <figure>
            <preamble/>

            <artwork>
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  F.A. Flags   |     F.A. Type Code =3         |  Length       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Length     |  Encap Type   |           Encap Len           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   Encap Value                                 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
          
          </artwork>

            <postamble>Fig 15: Payload encapsulation info signaling
            TLV</postamble>
          </figure>

          <figure>
            <artwork>

 - F.A. Flags (1 octet)

           0 1 2 3 4 5 6 7
          +-+-+-+-+-+-+-+-+
          |R R R R R R R R|
          +-+-+-+-+-+-+-+-+
 
           R: Reserved. MUST be set to zero, SHOULD be ignored by receiver.
 - Length (2 octets)
       Length in bytes of Value field.

 - Encap Type (1 octet)

   Encap Type         Value
  -------------  -------------- 
     0           None         
     1           MPLS Label Info          
     2           SR MPLS label Index Info
     3           SRv6 SID info 
     4           DSCP code point

 - Encap Len (2 octets)

    Length in octets of Encap Value field.

           </artwork>
          </figure>

          <section title="MPLS Label Info">
            <figure>
              <preamble/>

              <artwork>
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  F.A. Flags   |     F.A. Type Code =3         |  Length       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Length     | Encap Type=1  |           Encap Len           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|        Flags (2 bytes)        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|        MPLS Label (20 bits)           |Rsrv |S~
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
~        MPLS Label (20 bits)           |Rsrv |S|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

          </artwork>

              <postamble>Fig 16: MPLS Label Info.</postamble>
            </figure>

            <figure>
              <preamble/>

              <artwork> 

  - F.A. Flags (1 octet)

           0 1 2 3 4 5 6 7
          +-+-+-+-+-+-+-+-+
          |R R R R R R R R|
          +-+-+-+-+-+-+-+-+
 
           R: Reserved. MUST be set to zero, SHOULD be ignored by receiver.
  - Length (2 octets)
       Length in bytes of Value field.

  - Encap Type
  	  = 1, to signify MPLS Label Info.
  	  
  - Encap Len (2 octets)
       Length in bytes of following Encap Value field.

  - Flags (2 octets):

       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |E R R R R R R R R R R R R R R R|
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

       E: ELC bit. Indicates whether this egress NH is Entropy Label Capable.
             1 means capable of handling the Entropy Label.
             0 means not capable of handling the Entropy Label.

       R: Reserved. MUST be set to zero, SHOULD be ignored by receiver.

  - MPLS Label, Rsrv, S bit.
      20 bit MPLS Label stack encoded as in RFC 8277. 
      S bit set on last label in label stack.

      
          </artwork>
            </figure>
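
            <t>Decoding the Encap Value of an MPLS Label Info TLV might look
            like the sketch below. It assumes that each label stack entry
            occupies 3 octets (20-bit label, 3 reserved bits, S bit) as in RFC
            8277, and that the E flag is the most significant bit of the Flags
            field. The function name decode_mpls_label_info is hypothetical,
            and treating a misplaced S bit as an error is a choice made for
            this sketch.</t>

            <figure>
              <artwork><![CDATA[
```python
# Assumption: the E flag is bit 0, the most significant bit of Flags.
ELC_BIT = 0x8000


def decode_mpls_label_info(encap_value):
    """Return (entropy_label_capable, labels) from an MPLS Label Info value."""
    if len(encap_value) < 2 or (len(encap_value) - 2) % 3:
        raise ValueError("malformed MPLS Label Info")
    flags = int.from_bytes(encap_value[:2], "big")
    labels = []
    for i in range(2, len(encap_value), 3):
        entry = int.from_bytes(encap_value[i:i + 3], "big")
        labels.append(entry >> 4)              # top 20 bits are the label
        bottom_of_stack = bool(entry & 0x1)    # S bit
        is_last = (i + 3 == len(encap_value))
        if bottom_of_stack != is_last:
            raise ValueError("S bit must be set on the last label only")
    return bool(flags & ELC_BIT), labels
```
]]></artwork>
            </figure>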
          </section>

          <section title="SR MPLS Label Index Info">
            <figure>
              <preamble/>

              <artwork>
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  F.A. Flags   |     F.A. Type Code =3         |  Length       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Length     | Encap Type=2  |           Encap Len           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   RESERVED    |       LI Flags                |    Label ..   |        
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                ..Index                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

          </artwork>

              <postamble>Fig 17: SR MPLS Label Index Info.</postamble>
            </figure>

            <figure>
              <preamble/>

              <artwork> 

  - F.A. Flags (1 octet)

           0 1 2 3 4 5 6 7 
          +-+-+-+-+-+-+-+-+
          |R R R R R R R R|
          +-+-+-+-+-+-+-+-+
 
           R: Reserved. MUST be set to zero, SHOULD be ignored by receiver.
  - Length (2 octets)
       Length in bytes of Value field.

  - Encap Type
      = 2, to signify SR MPLS Label Index Info.
  	  
  - Encap Len (2 octets)
       Length in bytes of following Encap Value field.
  
  Rest of the value portion is encoded as specified in RFC-8669 sec 3.1.
  
  - RESERVED:  8-bit field. MUST be set to zero, SHOULD be ignored by
      receiver.

  - LI Flags:  16 bits of flags. None defined. MUST be set to zero,
      SHOULD be ignored by receiver.
  
  - Label Index:  
      32-bit value representing the index value in the SRGB space.
         
          </artwork>
            </figure>
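
            <t>A receiver would typically turn the Label Index into a local
            label by offsetting into its SRGB. The following is a sketch under
            that assumption; the function label_from_index and the srgb_base
            and srgb_size parameters are hypothetical, and raising an
            exception for an out-of-range index is a choice made here for
            brevity.</t>

            <figure>
              <artwork><![CDATA[
```python
import struct

# Encap Value layout from the figure above:
# RESERVED (1), LI Flags (2), Label Index (4).
LABEL_INDEX = struct.Struct("!BHI")


def label_from_index(encap_value, srgb_base, srgb_size):
    """Map the Label Index onto a label in the local SRGB."""
    if len(encap_value) != LABEL_INDEX.size:
        raise ValueError("unexpected Encap Len for SR MPLS Label Index Info")
    _reserved, _flags, index = LABEL_INDEX.unpack(encap_value)
    if index >= srgb_size:
        raise ValueError("Label Index falls outside the local SRGB")
    return srgb_base + index
```
]]></artwork>
            </figure>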
          </section>

          <section title="SRv6 SID Info">
            <figure>
              <preamble/>

              <artwork>
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  F.A. Flags   |     F.A. Type Code =3         |  Length       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Length     | Encap Type=3  |           Encap Len           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|         .. SRv6 SID Info (variable)                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

          </artwork>

              <postamble>Fig 18: SRv6 SID Info.</postamble>
            </figure>

            <figure>
              <preamble/>

              <artwork> 

  - F.A. Flags (1 octet)

           0 1 2 3 4 5 6 7 
          +-+-+-+-+-+-+-+-+
          |R R R R R R R R|
          +-+-+-+-+-+-+-+-+
 
           R: Reserved. MUST be set to zero, SHOULD be ignored by receiver.
  - Length (2 octets)
       Length in bytes of Value field.

  - Encap Type
      = 3, to signify SRv6 SID Info.
  	  
  - Encap Len (2 octets)
       Length in bytes of following Encap Value field.
  
  - SRv6 SID Info:
        SRv6 SID Information, as specified in RFC-9252 sec 3.1.

          </artwork>
            </figure>
          </section>

          <section title="DSCP">
            <figure>
              <preamble/>

              <artwork>
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  F.A. Flags   |     F.A. Type Code = 3        |  Length       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Length     | Encap Type=4  |           Encap Len           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|DSCP code point|
+-+-+-+-+-+-+-+-+

          </artwork>

              <postamble>Fig 19: Carrying DSCP.</postamble>
            </figure>

            <figure>
              <preamble/>

              <artwork> 

  - F.A. Flags (1 octet)

           0 1 2 3 4 5 6 7 
          +-+-+-+-+-+-+-+-+
          |R R R R R R R R|
          +-+-+-+-+-+-+-+-+
 
           R: Reserved. MUST be set to zero, SHOULD be ignored by receiver.
  - Length (2 octets)
       Length in bytes of Value field.

  - Encap Type
  	  = 4, to signify DSCP code point.
  	  
  - Encap Len (2 octets)
      = 1, Length in bytes of following Encap Value field.
  
  - DSCP code point:
        DS Field, as specified in RFC-2474 sec 3.

          </artwork>
            </figure>
          </section>
        </section>

        <section title="Endpoint Attributes">
          <t>F.A. Type Code = 4. This Forwarding Argument TLV defines
          attributes of an endpoint.</t>

          <t/>

          <figure>
            <preamble/>

            <artwork>
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  F.A. Flags   |     F.A. Type Code = 4        |  Length       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Length     |  Attrib Type  |   Attr Len    |  Attr Value   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Attr Value                                 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
          
          </artwork>

            <postamble>Fig 20: Endpoint attributes advertisement
            TLV</postamble>
          </figure>

          <figure>
            <artwork>

   EP Attrib Type      Attrib Value               Attrib Len (octets)
  ----------------  ------------------            ---------------------
     0               None
     1               Available Bandwidth             8
     
           </artwork>
          </figure>

          <section anchor="avail-bw" title="Available Bandwidth">
            <t/>

            <figure>
              <preamble/>

              <artwork>
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  F.A. Flags   |     F.A. Type Code = 4        |  Length       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Length     | Attrib Type=1 |  Attr Len=8   |  Bandwidth..  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Bandwidth (contd.)                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|              Bandwidth (contd.)               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

- Length (2 octets)
    Length in bytes of Value field.

- Attr Len (1 octet)
    = 8, Length in bytes of the Bandwidth field.

- Bandwidth
    The bandwidth of the link expressed as 8 octets,
    units being bits per second.
</artwork>

              <postamble>Fig 21: "Available Bandwidth" attribute
              sub-TLV</postamble>
            </figure>

            <t/>

            <t>This sub-TLV would be valid with Forwarding Instruction TLV
            with FwdAction of Forward, Swap or Push.</t>
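
            <t>One plausible way to derive an implicit balance percentage from
            the advertised bandwidth is a proportional share per leg, as
            sketched below. Two assumptions are made loudly here: that the 8
            octets carry an unsigned integer in network byte order, and that
            the implicit percentage is each leg's share of the total
            bandwidth. Neither is stated by this document, and both function
            names are hypothetical.</t>

            <figure>
              <artwork><![CDATA[
```python
def decode_bandwidth(attr_value):
    """Decode 8 octets of bandwidth in bits per second.

    Assumption: unsigned integer, network byte order.
    """
    if len(attr_value) != 8:
        raise ValueError("Available Bandwidth value must be 8 octets")
    return int.from_bytes(attr_value, "big")


def implicit_balance(bandwidths):
    """Assumed rule: each leg's percentage is its share of total bandwidth."""
    total = sum(bandwidths)
    if total == 0:
        raise ValueError("total bandwidth is zero")
    return [bw * 100 / total for bw in bandwidths]
```
]]></artwork>
            </figure>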
          </section>
        </section>
      </section>
    </section>

    <section title="Error Handling">
      <t/>

      <t>With MNH TLV Type = 4 (Downstream signaled Label Descriptor), this
      attribute is used to describe the label advertised by the BGP-peer. If
      the value in the attribute is syntactically parse-able, but not
      semantically valid, the receiving speaker should deal with the error
      gracefully and MUST NOT tear down the BGP session. In such cases the
      rest of the BGP-update can be consumed if possible.</t>

      <t>With other MNH TLV Types, this attribute is used to specify the
      forwarding action at the receiving BGP-peer. If the value in the
      attribute is syntactically parse-able, but not semantically valid, the
      receiving speaker SHOULD deal with the error gracefully by ignoring the
      MNH attribute, and continue processing the route. It MUST NOT tear down
      the BGP session.</t>

      <t>If a MNH TLV Type = 4 is received for an IP-route (SAFI Unicast), the
      MNH attribute SHOULD be ignored, because IP route prefixes are upstream
      allocated by nature.</t>

      <t>If a MNH TLV Type = 4 is received for an <xref
      target="MPLS-NAMESPACES"/> route, the MNH attribute SHOULD be ignored,
      because the label prefix in MPLS-NAMESPACE family routes is upstream
      allocated.</t>

      <t>The receiving BGP speaker MAY consider the "Num-Nexthops" value in a
      Nexthop Forwarding Information TLV not acceptable, based on its
      forwarding capabilities. In such cases, the MNH attribute SHOULD be
      considered unusable and ignored on receipt. The condition SHOULD be
      dealt with gracefully and MUST NOT tear down the BGP session.</t>

      <t>A TLV or sub-TLV of a certain Type in a MNH attribute can occur only
      once, unless specified otherwise by that type value. If multiple
      instances of such a TLV or sub-TLV are received, the instances other
      than the first occurrence are ignored.</t>

      <t>If a TLV or sub-TLV of an unknown Type value is received, it is
      ignored and skipped. The remaining part of the MNH attribute, if
      parseable, is used.</t>

      <t>In case of length errors inside a TLV, such that the MNH attribute
      cannot be used, but the length value in the MNH attribute itself is
      proper, the MNH attribute should be considered invalid and not used.
      The rest of the route update, if parseable, should still be used. This
      follows the 'Attribute discard' approach described in <xref
      target="RFC7606"/> Section 2.</t>
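
      <t>The duplicate, unknown-type and length-error rules above can be
      sketched in a single parsing loop. This is an illustration only: it
      borrows the Forwarding Argument TLV header layout (Flags, 2-octet Type
      Code, 2-octet Length) as the example TLV format, the set of known types
      is a stand-in for whichever registry applies, and collect_tlvs is a
      hypothetical name. Returning None models 'Attribute discard'; the BGP
      session is never torn down.</t>

      <figure>
        <artwork><![CDATA[
```python
import struct

HEADER = struct.Struct("!BHH")   # flags, type code, length
KNOWN_TYPES = {1, 2, 3, 4}       # example registry of known type codes


def collect_tlvs(buf):
    """Return {type: value}, or None when the attribute must be discarded."""
    seen = {}
    offset = 0
    while offset < len(buf):
        if len(buf) - offset < HEADER.size:
            return None                      # length error: discard attribute
        _flags, tlv_type, length = HEADER.unpack_from(buf, offset)
        offset += HEADER.size
        if offset + length > len(buf):
            return None                      # length error: discard attribute
        value = buf[offset:offset + length]
        offset += length
        if tlv_type not in KNOWN_TYPES:
            continue                         # unknown type: skip it
        seen.setdefault(tlv_type, value)     # keep the first occurrence only
    return seen
```
]]></artwork>
      </figure>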
    </section>

    <section title="Scaling Considerations">
      <t>The MNH attribute allows receiving multiple nexthops on the same BGP
      session. This flexibility also opens up the possibility that a peer can
      send a large number of multipath (ECMP/UCMP/FRR) nexthops that may
      overwhelm the local system's forwarding plane. Prefix-limit based checks
      will not avoid this situation.</t>

      <t>To keep scaling limits in check, a BGP speaker MAY keep count of the
      number of unique multipath nexthops received from a BGP peer, and impose
      a configurable max-limit on that count. This is especially useful for
      EBGP peers.</t>
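
      <t>Such a per-peer limit could be tracked as in the sketch below. The
      class PeerNexthopLimiter, its admit method and the notion of a hashable
      nexthop key are all hypothetical; how an implementation identifies a
      unique multipath nexthop, and what it does when the limit is hit, are
      left open by this document.</t>

      <figure>
        <artwork><![CDATA[
```python
class PeerNexthopLimiter:
    """Track unique multipath nexthops per peer against a configured limit."""

    def __init__(self, max_nexthops):
        self.max_nexthops = max_nexthops
        self._seen = {}  # peer -> set of nexthop keys already admitted

    def admit(self, peer, nexthop_key):
        """Return True if the nexthop may be used, False if over the limit."""
        known = self._seen.setdefault(peer, set())
        if nexthop_key in known:
            return True                  # already counted for this peer
        if len(known) >= self.max_nexthops:
            return False                 # limit reached: do not admit
        known.add(nexthop_key)
        return True
```
]]></artwork>
      </figure>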

      <t>Conveying multipath nexthops using the MNH attribute with N nexthop
      legs on one BGP session, as against BGP routes on N BGP sessions, has a
      good scaling property: it avoids the transient multipath combinatorial
      state of the latter model. Because the final multipath state is conveyed
      by one route update in a deterministic manner, there is no transient
      multipath combinatorial explosion created during establishment of N
      sessions.</t>
    </section>

    <section anchor="IANA" title="IANA Considerations">
      <t>This document requests IANA to allocate the following codes in the
      BGP attributes registry.</t>

      <section title="BGP Path Attributes">
        <t>A new BGP attribute code TBD for "BGP MultiNexthop Attribute
        (MULTI_NEXT_HOP)", in "BGP Path Attributes" registry.</t>
      </section>

      <section title="Capability Codes">
        <t>This document requests IANA to allocate a BGP capability code TBD
        for "BGP MultiNexthop Attribute (MULTI_NEXT_HOP)".</t>
      </section>

      <section title="Registries for MULTI_NEXT_HOP">
        <t>This document creates the following sub-registries for TLVs and
        Sub-TLVs within the MultiNexthop attribute.</t>

        <t>1. Registry of Type codes in "MULTI_NEXT_HOP TLV"</t>

        <figure>
          <artwork>
     Registration Procedure(s)
                 Expert Review
     Expert(s)
                 Kaliraj Vairavakkalai
     Reference
                 draft-kaliraj-idr-multinexthop-attribute
                 
      MNH Type Code        Meaning
     --------------     -------------
       0              None
       1              Upstream signaled primary forwarding path.
       2              Upstream signaled backup forwarding path.
       3              Downstream signaled Label Descriptor.
  
           </artwork>
        </figure>

        <t>2. Registry of FwdAction values in MNH "Forwarding Instruction
        TLV"</t>

        <figure>
          <artwork>
     Registration Procedure(s)
                 Expert Review
     Expert(s)
                 Kaliraj Vairavakkalai
     Reference
                 draft-kaliraj-idr-multinexthop-attribute


      FwdAction         Meaning
      ---------      -------------
       0        None
       1        Forward
       2        Pop-And-Forward
       3        Swap
       4        Push
       5        Pop-And-Lookup
       6        Replicate
  
           </artwork>
        </figure>

        <t>3. Registry of Type codes in MNH "Forwarding Argument TLV".</t>

        <figure>
          <artwork>

     Registration Procedure(s)
                 Expert Review
     Expert(s)
                 Kaliraj Vairavakkalai
     Reference
                 draft-kaliraj-idr-multinexthop-attribute

     F.A. Type Code      Meaning
     ---------------   ------------------
        0              None
        1              Endpoint Identifier
        2              Path Constraints
        3              Payload encapsulation info signaling
        4              Endpoint attributes advertisement
  
           </artwork>
        </figure>

        <t>4. Registry of Endpoint Types in MNH "Endpoint Identifier TLV"
        Forwarding Argument.</t>

        <figure>
          <artwork>
     Registration Procedure(s)
                 Expert Review
     Expert(s)
                 Kaliraj Vairavakkalai
     Reference
                 draft-kaliraj-idr-multinexthop-attribute

     Endpoint Type      Meaning
     -------------      ----------------
       0                None
       1                IPv4 Address
       2                IPv6 Address
       3                MPLS Label
       4                Fwd Context RD
       5                Fwd Context RT
  
           </artwork>
        </figure>

        <t>5. Registry of Constraint Types in MNH "Path Constraints TLV"
        Forwarding Argument.</t>

        <figure>
          <artwork>
     Registration Procedure(s)
                 Expert Review
     Expert(s)
                 Kaliraj Vairavakkalai
     Reference
                 draft-kaliraj-idr-multinexthop-attribute

     Constraint Type    Meaning
     ---------------    --------------------------
       0                None
       1                Proximity check
       2                Transport Class ID (Color)
       3                Load balance factor
  
           </artwork>
        </figure>

        <t>6. Registry of Encap Types in MNH "Payload Encapsulation Info TLV"
        Forwarding Argument.</t>

        <figure>
          <artwork>
     Registration Procedure(s)
                 Expert Review
     Expert(s)
                 Kaliraj Vairavakkalai
     Reference
                 draft-kaliraj-idr-multinexthop-attribute


     Encap Type         Meaning
     ----------         ------------------------
       0                None
       1                MPLS Label Info
       2                SR MPLS label Index Info
       3                SRv6 SID info
       4                DSCP code point
  
           </artwork>
        </figure>

        <t>7. Registry of Endpoint Attribute Types in MNH "Endpoint attributes
        TLV" Forwarding Argument.</t>

        <figure>
          <artwork>
     Registration Procedure(s)
                 Expert Review
     Expert(s)
                 Kaliraj Vairavakkalai
     Reference
                 draft-kaliraj-idr-multinexthop-attribute


     EP Attrib Type      Attrib Value      
     ----------------  ------------------ 
       0               None 
       1               Available Bandwidth 
     
           </artwork>
        </figure>

        <t>Note to RFC Editor: this section may be removed on publication as
        an RFC.</t>
      </section>
    </section>

    <section anchor="Security" title="Security Considerations">
      <t>The MULTI_NEXT_HOP attribute is defined as an optional
      non-transitive BGP attribute, so that it is not accidentally propagated
      or leaked by BGP speakers that do not support this feature. In
      particular, it does not unintentionally leak across EBGP
      boundaries.</t>
    </section>

    <section anchor="Contributors" numbered="false" title="Contributors">
      <author fullname="Reshma Das" initials="R." surname="Das">
        <organization>Juniper Networks, Inc.</organization>

        <address>
          <postal>
            <street>1133 Innovation Way,</street>

            <city>Sunnyvale</city>

            <region>CA</region>

            <code>94089</code>

            <country>US</country>
          </postal>

          <email>dreshma@juniper.net</email>
        </address>
      </author>

      <author fullname="Natrajan Venkataraman" initials="N."
              surname="Venkataraman">
        <organization>Juniper Networks, Inc.</organization>

        <address>
          <postal>
            <street>1133 Innovation Way,</street>

            <city>Sunnyvale</city>

            <region>CA</region>

            <code>94089</code>

            <country>US</country>
          </postal>

          <email>natv@juniper.net</email>
        </address>
      </author>
    </section>

    <section anchor="Acknowledgements" numbered="false"
             title="Acknowledgements">
      <t>Thanks to Jeff Haas, Robert Raszuk, and Ron Bonica for their review,
      discussions, and input to the draft.</t>

      <t>Thanks to Blaine Williams and Satya Mohanty for the discussions on
      some use cases.</t>
    </section>
  </middle>

  <back>
    <references title="Normative References">
      <?rfc include="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.3392.xml"?>

      <?rfc include="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.7606.xml"?>

      <?rfc include="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.8277.xml"?>

      <?rfc include="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.2545.xml"?>

      <?rfc include="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.7311.xml"?>

      <?rfc include="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.4271.xml"?>

      <?rfc include="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.7911.xml"?>
    </references>

    <references title="Informative References">
      <?rfc include="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.2474.xml"?>

      <?rfc include="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.2119.xml"?>

      <reference anchor="MPLS-NAMESPACES"
                 target="https://datatracker.ietf.org/doc/html/draft-kaliraj-bess-bgp-sig-private-mpls-labels-06">
        <front>
          <title abbrev="MPLS-NAMESPACES">BGP Signaled MPLS Namespaces</title>

          <author fullname="Kaliraj" initials="" role="editor"
                  surname="Vairavakkalai"/>

          <date day="10" month="07" year="2023"/>
        </front>
      </reference>

      <reference anchor="BGP-CT"
                 target="https://datatracker.ietf.org/doc/html/draft-ietf-idr-bgp-ct-12">
        <front>
          <title abbrev="BGP-CT">BGP Classful Transport Planes</title>

          <author fullname="Kaliraj" initials="" role="editor"
                  surname="Vairavakkalai"/>

          <author fullname="Natrajan" initials="" role="editor"
                  surname="Venkataraman"/>

          <date day="10" month="07" year="2023"/>
        </front>
      </reference>

      <reference anchor="FLWSPC-REDIR-IP"
                 target="https://datatracker.ietf.org/doc/html/draft-ietf-idr-flowspec-redirect-ip#section-3">
        <front>
          <title abbrev="FLWSPC-REDIR-IP">BGP Flow-Spec Redirect to IP
          Action</title>

          <author fullname="Adam" initials="" role="editor" surname="Simpson"/>

          <date day="2" month="2" year="2015"/>
        </front>
      </reference>

      <reference anchor="SRTE-COLOR-ONLY"
                 target="https://tools.ietf.org/html/draft-filsfils-spring-segment-routing-policy-06#section-8.8.1">
        <front>
          <title abbrev="SRTE-COLOR-ONLY">Segment Routing Policy
          Architecture</title>

          <author fullname="Clarence" initials="" role="editor"
                  surname="Filsfils"/>

          <date day="21" month="2" year="2018"/>
        </front>
      </reference>

      <reference anchor="ADDPATH-GUIDELINES"
                 target="https://datatracker.ietf.org/doc/html/draft-ietf-idr-add-paths-guidelines-08#section-2">
        <front>
          <title abbrev="ADDPATH-GUIDELINES">Best Practices for Advertisement
          of Multiple Paths in IBGP</title>

          <author fullname="Jim" initials="" role="editor" surname="Uttaro"/>

          <date day="25" month="4" year="2016"/>
        </front>
      </reference>
    </references>

    <section anchor="AppendixA" title="Example Use Cases">
      <t>This section describes various example use cases of the MNH
      attribute.</t>

      <section numbered="true" title="Signaling WECMP to Ingress Node">
        <t>This section describes how MNH can be used to provide weighted
        equal cost multipath in a network fabric, while not increasing RIB
        scale.</t>

        <figure anchor="topo-wecmp" suppress-title="false"
                title="WECMP topology with R1 connected to routers P21 .. P2n">
          <artwork align="left" xml:space="preserve">
                                   [RR1]  
                                     .        
                   . +-[P21]         |
                  .  +-[P22]         __
                 .   +-[P23]      _.(  )..           
            [R1].    +-[P24] ..  (_      _) .. [R2]     
                 .   +-[P25]       (._..)
                  .                       
                   . +-[P2n]     
                                    
                      
                    &lt;---- Traffic Direction ----

</artwork>
        </figure>

        <t><xref target="topo-wecmp"/> shows a network with BGP speaker R1
        connected to a number of routers P21 .. P2n in its region. R1 is the
        eSN and R2 is the iSN for the IP traffic under consideration. BGP
        service families IPv4 Unicast (AFI/SAFI: 1/1) and IPv6 Unicast
        (AFI/SAFI: 2/1) are negotiated on the BGP sessions between RR1 - R1
        and RR1 - R2. RR1 reflects the BGP routes between R1 and R2 with the
        next hop unchanged.</t>

        <t>When MNH is not in use, R1 advertises "n" BGP Addpath routes for a
        service prefix Pfx1, each having a distinct next hop, P21 .. P2n, and
        the desired Link Bandwidth Extended Community. These Addpath routes
        are received by R2, which can do WECMP based on the Link Bandwidth
        Extended Communities attached to the routes. This model achieves
        WECMP at the cost of increasing RIB scale "n" times.</t>

        <t>When MNH is used in this network, R1 advertises a single BGP route
        for prefix Pfx1, which contains an MNH attribute with "n" next hops,
        each carrying the desired link bandwidth using <xref
        target="lb-perc"/> or <xref target="avail-bw"/>.</t>

        <t>This allows achieving WECMP in the network without increasing RIB
        scale.</t>
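
        <t>The weighting described above can be illustrated with a minimal,
        non-normative Python sketch. It is not part of the protocol
        encoding. The function name wecmp_weights and the bandwidth values
        are assumptions made only for illustration; the sketch shows how a
        receiver such as R2 might turn per-nexthop bandwidths into traffic
        shares, and how the route count differs with and without MNH.</t>

        <figure>
          <artwork><![CDATA[
```python
from fractions import Fraction


def wecmp_weights(bandwidths):
    """Share of traffic per next hop, proportional to bandwidth."""
    total = sum(bandwidths.values())
    if total <= 0:
        raise ValueError("total bandwidth must be positive")
    return {nh: Fraction(bw, total) for nh, bw in bandwidths.items()}


# Hypothetical bandwidths (Gbps) that R1 signals for each next hop.
bandwidths = {"P21": 10, "P22": 10, "P23": 40, "P24": 40}
weights = wecmp_weights(bandwidths)

# Without MNH: one Addpath route per next hop for Pfx1.
routes_without_mnh = len(bandwidths)
# With MNH: one route whose attribute lists all the next hops.
routes_with_mnh = 1
```
]]></artwork>
        </figure>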
      </section>

      <section title="Signaling Optimal Forwarding Exitpoints to Ingress Node">
        <t>In a BGP-free core, the desired load balancing of traffic towards
        a set of exit nodes can be signaled dynamically to the ingress node
        in one BGP route containing this attribute.</t>

        <t>For example, for prefix1, perform equal load balancing towards exit
        nodes A and B; whereas for prefix2, perform weighted load balancing
        (40%, 30%, 30%) towards exit nodes A, B and C.</t>

        <t>As another example, for prefix1, use PE1 as the primary nexthop and
        PE2 as the backup nexthop.</t>
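
        <t>The two examples above can be sketched in Python as follows. This
        is a hedged illustration of how an ingress node might act on the
        signaled information, not a description of any implementation. The
        function names, the use of 100 hash buckets, and the CRC32 flow hash
        are all assumptions made for the sketch.</t>

        <figure>
          <artwork><![CDATA[
```python
import zlib


def exit_for_bucket(bucket, weighted_exits):
    """Map a bucket in 0..99 to an exit node by cumulative percent."""
    upper = 0
    for node, percent in weighted_exits:
        upper += percent
        if bucket < upper:
            return node
    raise ValueError("bucket not covered; percentages must sum to 100")


def exit_for_flow(flow_key, weighted_exits):
    """Hash a flow onto one of 100 buckets, then pick its exit node."""
    bucket = zlib.crc32(flow_key.encode("ascii")) % 100
    return exit_for_bucket(bucket, weighted_exits)


def active_nexthop(primary, backup, is_up):
    """Use the primary nexthop while it is up, else the backup."""
    return primary if is_up(primary) else backup


# Load balancing signaled for the two prefixes in the text.
prefix1 = [("A", 50), ("B", 50)]
prefix2 = [("A", 40), ("B", 30), ("C", 30)]
```
]]></artwork>
        </figure>

        <t>With this bucket mapping, 40 of the 100 buckets for prefix2 map to
        exit node A and 30 each to B and C, so flows hashed uniformly over
        the buckets follow the signaled split.</t>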
      </section>

      <section title="Choosing a Received Label Based on its Forwarding Semantic at Advertising Node">
        <t>In the downstream label allocation case, the MNH plays the role of
        a "Label Descriptor" and describes the forwarding treatment given to
        the label at the advertising speaker. The receiving speaker can
        benefit from this information, as in the following examples:</t>

        <t>- For a prefix, a label with an FRR-enabled nexthop-set can be
        preferred to another label with a nexthop-set that does not provide
        FRR.</t>

        <t>- For a prefix, a label pointing to a 10g nexthop can be preferred
        to another label pointing to a 1g nexthop.</t>

        <t>- A set of advertised labels can be aggregated if they have the
        same forwarding semantics (e.g. the VPN per-prefix-label case).</t>
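
        <t>The first two examples above can be sketched in Python as follows.
        The LabelDescriptor class and its fields are hypothetical summaries
        of what a receiver might extract from a received MNH; they are not
        defined by this document. Ranking FRR ahead of bandwidth is also an
        assumption, since the relative order is a matter of local
        policy.</t>

        <figure>
          <artwork><![CDATA[
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabelDescriptor:
    """Hypothetical summary of what an MNH says about a label."""
    label: int
    frr_enabled: bool       # does the nexthop-set provide FRR?
    bandwidth_gbps: float   # capacity of the nexthop behind the label


def preferred_label(candidates):
    """Prefer FRR-enabled labels first, then higher bandwidth."""
    return max(candidates,
               key=lambda d: (d.frr_enabled, d.bandwidth_gbps))


# Three labels received for the same prefix (illustrative values).
candidates = [
    LabelDescriptor(label=1001, frr_enabled=False, bandwidth_gbps=10),
    LabelDescriptor(label=1002, frr_enabled=True, bandwidth_gbps=1),
    LabelDescriptor(label=1003, frr_enabled=True, bandwidth_gbps=10),
]
best = preferred_label(candidates)
```
]]></artwork>
        </figure>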
      </section>

      <section title="Signaling Desired Forwarding Behavior for MPLS Upstream labels at Receiving Node">
        <t>In the upstream label allocation case, the receiving speaker's
        forwarding state can be controlled by the advertising speaker, thus
        enabling a standardized API to program the desired MPLS forwarding
        state at the receiving node. This is described in <xref
        target="MPLS-NAMESPACES"/>.</t>
      </section>

      <section title="Load Balancing over EBGP Parallel Links">
        <t>Consider N parallel links between two EBGP speakers. Different
        models are possible for load balancing over these links:<list>
            <t>N single-hop EBGP sessions over the N links. Interface
            addresses are used as next hops. N copies of the RIB are exchanged
            to form N-way ECMP paths. The routes advertised on the N sessions
            can carry the Link Bandwidth community to perform weighted
            ECMP.</t>

            <t>1 multi-hop EBGP session between loopback addresses, reachable
            via static route over the N links. Loopback addresses are used as
            next hops. 1 copy of the RIB is exchanged with the loopback
            address as nexthop. A static route to the loopback address can be
            configured to provide the desired N-way ECMP path. M loopbacks
            are configured in this model to achieve M different load
            balancing schemes: ECMP, weighted ECMP, Fast-reroute enabled
            paths etc.</t>

            <t>1 multi-hop EBGP session between loopback addresses, reachable
            via static route over the N links. Interface addresses are used as
            next hops, without using additional loopbacks. 1 copy of the RIB
            is exchanged with the MNH attribute to form N-way ECMP paths,
            weighted ECMP, Fast-reroute backup paths etc. BFD may be run
            towards these directly connected BGP nexthops to detect
            liveness.</t>
          </list></t>
      </section>

      <section title="Flowspec Routes with Multiple &quot;Redirect IP&quot; next hops">
        <t>Existing protocol machinery can benefit from the ability of MNH to
        clearly specify fallback behavior when multiple nexthops are
        involved. One example is the scenario described in <xref
        target="FLWSPC-REDIR-IP"/>, where multiple Redirect-to-IP nexthop
        addresses exist for a Flowspec prefix. In such a scenario, the
        receiving speakers may redirect the traffic to different nexthops,
        based on variables like IGP cost. If the MNH is used instead to
        specify the Redirect-to-IP nexthops, the order of preference between
        them can be clearly specified using one Flowspec route carrying an
        MNH that lists those nexthop addresses in the desired preference
        order. Then, irrespective of IGP cost, the receiving speakers
        redirect the flow towards the same traffic collector device.</t>
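
        <t>The difference between the two behaviors can be sketched in Python
        as follows. This is a non-normative illustration. The function
        names, the collector addresses (taken from the documentation range
        192.0.2.0/24), and the IGP costs are assumptions made for the
        sketch.</t>

        <figure>
          <artwork><![CDATA[
```python
def pick_by_igp_cost(nexthops, igp_cost):
    """Without MNH: each receiver picks its own closest nexthop."""
    return min(nexthops, key=lambda nh: igp_cost[nh])


def pick_by_preference(ordered_nexthops, reachable):
    """With MNH: first reachable nexthop in the signaled order."""
    for nh in ordered_nexthops:
        if nh in reachable:
            return nh
    return None


# Hypothetical collectors, listed in the desired preference order.
collectors = ["192.0.2.1", "192.0.2.2"]

# Hypothetical IGP costs seen by two different receiving speakers.
cost_at_rx1 = {"192.0.2.1": 10, "192.0.2.2": 20}
cost_at_rx2 = {"192.0.2.1": 30, "192.0.2.2": 5}

# IGP-cost based choice: the two receivers diverge.
by_cost = (pick_by_igp_cost(collectors, cost_at_rx1),
           pick_by_igp_cost(collectors, cost_at_rx2))

# Preference-order choice: same collector regardless of IGP cost.
by_pref = pick_by_preference(collectors, set(collectors))
```
]]></artwork>
        </figure>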
      </section>

      <section title="Color-Only Resolution next hop">
        <t>Another existing protocol machinery, which manufactures nexthop
        addresses from an overloaded Color Extended Community, is specified in
        <xref target="SRTE-COLOR-ONLY"/>. In a way, the color field is
        overloaded to carry one anycast BGP next hop with pre-specified
        fallback options. This approach provides only two next hops to work
        with: the 'BGP nexthop address' and the 'Color-only nexthop'.</t>

        <t>Instead, the MNH could be used to achieve the same result with more
        flexibility. Multiple BGP nexthops can be carried, each resolving over
        a desired Transport Class (Color), with a customizable fallback
        order. The solution also works for non-SRTE networks.</t>
      </section>

      <section title="Avoid Label Advertisement Oscillation Between Multihomed PEs">
        <t>In an MPLS network, a router may be multihomed to two PEs. The PEs
        may re-advertise routes received from the router to the IBGP core with
        self as nexthop and a "per-nexthop" label. The PEs may also protect
        against failure of the primary path to the router by using the IBGP
        path via the other multihomed PE as a backup path.</t>

        <t>In this scenario, label allocation oscillation may occur when one
        PE advertises a new label to the other PE. Reception of a new label
        results in a change of nexthop, because the received label is used as
        the backup nexthop leg and per-nexthop label allocation is in use.
        Thus a new label is allocated and advertised. When this new label is
        received by the first PE, it allocates a new label in turn. This
        process repeats.</t>

        <t>This oscillation can be stopped only if the primary path label
        allocated by a PE does not depend on the primary path label advertised
        by the other PE. A PE needs to be able to advertise multiple labels,
        one for use as the primary path and another for use as the backup
        path by the receiver.</t>

        <t>The MNH attribute allows advertising a backup forwarding path label
        using <xref target="upstr-bkp"/>, in addition to the primary
        forwarding path label using <xref target="upstr-prim"/>.</t>
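
        <t>The oscillation and its resolution can be illustrated with a small
        Python simulation. This is a simplified model under stated
        assumptions, not a description of any implementation: the PE class,
        the label ranges, and the representation of a nexthop as a (primary,
        backup label) pair are all hypothetical. Per-nexthop label allocation
        is modeled as one label per distinct nexthop pair.</t>

        <figure>
          <artwork><![CDATA[
```python
import itertools


class PE:
    """Toy PE doing per-nexthop label allocation."""

    def __init__(self, first_label):
        self._free = itertools.count(first_label)
        self._by_nexthop = {}

    def label_for(self, nexthop):
        # A nexthop not seen before gets a newly allocated label.
        if nexthop not in self._by_nexthop:
            self._by_nexthop[nexthop] = next(self._free)
        return self._by_nexthop[nexthop]


def churn_single_label(rounds):
    """One label per PE; the backup leg uses the peer's label."""
    pe1, pe2 = PE(1000), PE(2000)
    l1 = pe1.label_for(("CE", None))
    l2 = pe2.label_for(("CE", None))
    changes = 0
    for _ in range(rounds):
        new_l1 = pe1.label_for(("CE", l2))
        new_l2 = pe2.label_for(("CE", new_l1))
        changes += (new_l1 != l1) + (new_l2 != l2)
        l1, l2 = new_l1, new_l2
    return changes


def churn_primary_and_backup(rounds):
    """Separate backup labels that do not depend on the peer."""
    pe1, pe2 = PE(1000), PE(2000)
    b1 = pe1.label_for(("CE", None))   # backup label: CE only
    b2 = pe2.label_for(("CE", None))
    p1 = p2 = None
    changes = 0
    for _ in range(rounds):
        new_p1 = pe1.label_for(("CE", b2))
        new_p2 = pe2.label_for(("CE", b1))
        changes += (new_p1 != p1) + (new_p2 != p2)
        p1, p2 = new_p1, new_p2
    return changes
```
]]></artwork>
        </figure>

        <t>In this model, the single-label case allocates a new label at each
        PE in every round, while the primary-and-backup case allocates each
        primary label once and then stays stable.</t>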
      </section>
    </section>
  </back>
</rfc>
