<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?rfc toc="yes"?>
<?rfc tocompact="yes"?>
<?rfc tocdepth="3"?>
<?rfc tocindent="yes"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc comments="yes"?>
<?rfc inline="yes"?>
<?rfc compact="yes"?>
<?rfc subcompact="no"?>
<rfc category="std" docName="draft-ietf-idr-link-bandwidth-11"
     ipr="trust200902">
  <front>
    <title abbrev="BGP Link Bandwidth Extended Community">BGP Link Bandwidth
    Extended Community</title>

    <author fullname="Pradosh Mohapatra" initials="P." surname="Mohapatra">
      <organization>Sproute Networks</organization>

      <address>
        <email>pradosh@sproute.com</email>
      </address>
    </author>

    <author fullname="Rex Fernando" initials="R." surname="Fernando">
      <organization>Cisco Systems</organization>

      <address>
        <postal>
          <street>170 W. Tasman Drive</street>

          <city>San Jose</city>

          <region>CA</region>

          <code>95134</code>

          <country>US</country>
        </postal>

        <email>rex@cisco.com</email>
      </address>
    </author>

    <author fullname="Reshma Das" initials="R." role="editor" surname="Das">
      <organization>Juniper Networks, Inc.</organization>

      <address>
        <postal>
          <street>1133 Innovation Way,</street>

          <city>Sunnyvale</city>

          <region>CA</region>

          <code>94089</code>

          <country>US</country>
        </postal>

        <email>dreshma@juniper.net</email>
      </address>
    </author>

    <author fullname="Satya Mohanty" initials="S." role="editor"
            surname="Mohanty">
      <organization>Zscaler</organization>

      <address>
        <postal>
          <street>120 Holger Way,</street>

          <city>San Jose</city>

          <region>CA</region>

          <code>95134</code>

          <country>US</country>
        </postal>

        <email>smohanty@zscaler.com</email>
      </address>
    </author>

    <author fullname="Mankamana Mishra" initials="M." surname="Mishra">
      <organization>Cisco Systems</organization>

      <address>
        <postal>
          <street>821 Alder Drive,</street>

          <city>Milpitas</city>

          <region>CA</region>

          <code>95035</code>

          <country>US</country>
        </postal>

        <email>mankamis@cisco.com</email>
      </address>
    </author>

    <author fullname="Rafal Jan Szarecki" initials="R.J." surname="Szarecki">
      <organization>Google LLC</organization>

      <address>
        <postal>
          <street>1160 N Mathilda Ave,</street>

          <city>Sunnyvale</city>

          <region>CA</region>

          <code>94089</code>

          <country>US</country>
        </postal>

        <email>rszarecki@gmail.com</email>
      </address>
    </author>

    <date day="03" month="03" year="2025"/>

    <abstract>
      <t>This document describes an application of BGP extended communities
      that allows a router to perform unequal cost load balancing.</t>
    </abstract>

    <note title="Requirements Language">
      <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
      "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
      document are to be interpreted as described in <xref
      target="RFC2119"/>.</t>
    </note>
  </front>

  <middle>
    <section title="Introduction">
      <t>Load balancing is a critical aspect of network design, enabling
      efficient utilization of available bandwidth and improving overall
      network performance. Traditional equal-cost multi-path (ECMP) routing
      does not account for the varying capacities of different paths. This
      document suggests that the external link bandwidth be carried in the
      network using one of two new extended communities <xref
      target="RFC4360"/>: the transitive and the non-transitive Link
      Bandwidth Extended Community. The Link Bandwidth Extended Community provides a
      mechanism for routers to advertise the bandwidth of their downstream
      path(s), facilitating maximum utilization of network resources.</t>
    </section>

    <section title="Link Bandwidth Extended Community">
      <t>The Link Bandwidth Extended Community is a BGP extended community
      that carries the bandwidth information of a router, represented by the
      BGP Protocol Next Hop, connecting to a remote network. This community
      can be used to inform other routers about the bandwidth available
      through a given route.</t>

      <t>The Link Bandwidth Extended Community can be either transitive or
      non-transitive. Therefore, the value of the high-order octet of the
      extended Type field can be 0x00 or 0x40, respectively. The value of the
      low-order octet of the extended Type field for these communities is 0x04.
      The value of the Global Administrator subfield in the Value field SHOULD
      represent the Autonomous System of the router that attaches the Link
      Bandwidth Extended Community, but it can be set to any 2-octet value. If
      the Autonomous System number cannot be represented in two octets, as
      enabled by <xref target="RFC6793"/>, AS_TRANS should be used in the Global
      Administrator subfield. The bandwidth of the link is expressed as 4
      octets in <xref target="IEEE.754-2019"/> floating point format, units
      being bytes (not bits!) per second. It is carried in the Local
      Administrator subfield of the Value field.</t>

      <figure anchor="LBWExtCom" suppress-title="false"
              title="Link Bandwidth Extended Community">
        <artwork align="left" xml:space="preserve">
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Type=0x00/0x40 | SubType=0x04  |           AS Number           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Link Bandwidth Value                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

 Type:   1-octet field that MUST be set to 0x00 (transitive)
         or 0x40 (non-transitive).

 SubType: 1-octet field that MUST be set to 0x04
          to indicate 'Link-Bandwidth'.

 Global Administrator sub-field:
          2-octet field that represents the Autonomous System.

 Local Administrator sub-field:
          Bandwidth value (bytes per second) encoded as 4 octets
          in IEEE floating point format.
</artwork>
      </figure>
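
      <t>As a non-normative illustration of the encoding above, the following
      Python sketch packs and unpacks the 8-octet community. The function and
      constant names are illustrative and are not defined by this document.
      The sketch assumes network byte order for all fields and substitutes
      AS_TRANS (23456, per <xref target="RFC6793"/>) when the AS number does
      not fit in two octets.</t>

      <figure>
        <artwork align="left" xml:space="preserve"><![CDATA[
```python
import struct

TRANSITIVE = 0x00       # high-order Type octet, transitive
NON_TRANSITIVE = 0x40   # high-order Type octet, non-transitive
SUBTYPE_LINK_BW = 0x04  # low-order Type octet (Sub-Type)
AS_TRANS = 23456        # reserved 2-octet AS number (RFC 6793)


def encode_link_bandwidth(asn, bytes_per_sec, transitive=True):
    """Return the 8-octet encoding of a link bandwidth community."""
    if bytes_per_sec < 0:
        raise ValueError("negative bandwidth must not be originated")
    # A 4-octet AS number does not fit the 2-octet Global Administrator.
    global_admin = asn if asn <= 0xFFFF else AS_TRANS
    type_octet = TRANSITIVE if transitive else NON_TRANSITIVE
    # "!" = network byte order; "f" = 4-octet IEEE 754 float.
    return struct.pack("!BBHf", type_octet, SUBTYPE_LINK_BW,
                       global_admin, bytes_per_sec)


def decode_link_bandwidth(data):
    """Return (transitive, global_admin, bytes_per_sec), or None if
    the octets are not a link bandwidth extended community."""
    if len(data) != 8:
        return None
    type_octet, subtype, global_admin, bw = struct.unpack("!BBHf", data)
    if type_octet not in (TRANSITIVE, NON_TRANSITIVE):
        return None
    if subtype != SUBTYPE_LINK_BW:
        return None
    return type_octet == TRANSITIVE, global_admin, bw
```
]]></artwork>
      </figure>

      <t>Under these assumptions, a 10 Gbit/s link (1.25e9 bytes per second)
      advertised by AS 65001 as a transitive community would be encoded as
      the octets 00 04 FD E9 4E 95 02 F9.</t>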
    </section>

    <section title="Protocol Procedures">
      <section anchor="Originator"
               title="Sender (Originating Link Bandwidth Community)">
        <t>An originator of the link bandwidth community SHOULD be able to
        originate either a transitive or a non-transitive link bandwidth
        extended community. Implementations SHOULD provide configuration to
        set the transitivity type of the link bandwidth community, as well as
        the Global Administrator value and the bandwidth value (Local
        Administrator field), using local policy. No more than one link bandwidth extended
        community SHALL be attached to a route.</t>

        <t>The link bandwidth extended community MAY be attached or updated
        for a BGP route upon receipt during Adj-RIB-In processing. The link
        bandwidth extended community MAY be attached or updated for a BGP
        route's Adj-RIB-Out entry while being advertised to a neighboring BGP
        speaker.</t>

        <t>Note: Implementations MAY provide a configuration option to send
        non-transitive link bandwidth extended communities on external BGP
        sessions.</t>
      </section>

      <section anchor="Receiver"
               title="Receiver (Receiving Link Bandwidth Community)">
        <t>A BGP receiver MUST be able to process link bandwidth extended
        communities of both transitive and non-transitive types. The receiver
        MUST NOT flap or treat the route as malformed based on the
        transitivity of the link bandwidth extended community and/or BGP
        session type (internal vs. external).</t>

        <t>Note: Implementations MAY provide configuration to accept
        non-transitive link bandwidth extended communities from external BGP
        sessions.</t>
      </section>

      <section anchor="Re-advertisement" title="Re-advertisement Procedures">
        <section anchor="NextHopSelf"
                 title="Re-advertisement with Next Hop Self">
          <t>When a BGP speaker re-advertises a route with the link bandwidth
          extended community and sets the next hop to itself, it SHOULD follow
          the same procedures as outlined in <xref target="Originator"/>.</t>

          <t>In the absence of any import or export policies that alter the
          Link Bandwidth Extended Community, any received Link Bandwidth
          extended community on the route will be re-advertised unchanged, in
          accordance with standard BGP procedures.</t>
        </section>

        <section anchor="NextHopUnchanged"
                 title="Re-advertisement with Next Hop Unchanged">
          <t>A BGP speaker that receives a route with the link bandwidth
          extended community and re-advertises or reflects it without
          changing the next hop SHOULD NOT change the link bandwidth extended
          community in any way.</t>
        </section>
      </section>

      <section title="Link Bandwidth Community Arithmetic and BGP Multipath">
        <t>In a BGP multipath ECMP environment, the value of the link
        bandwidth community that is sent or re-advertised may be calculated
        based on the link bandwidth communities of the routes contributing to
        multipath in the Local Routing Information Base (Local-RIB). This
        topic is beyond the scope of this document.</t>
      </section>
    </section>

    <section anchor="Error" title="Error Handling">
      <t>If a receiver receives a route with more than one Link Bandwidth
      Extended Community, it SHOULD:<ul>
          <li>Prefer the lowest value of the attached link bandwidth
          community, irrespective of the extended community's
          transitivity.</li>

          <li>Prefer the transitive link bandwidth extended community when
          choosing between transitive and non-transitive types that have the
          same value.</li>
        </ul></t>

      <t>Implementations MAY provide configuration to change the above
      preferences.</t>

      <t>Link bandwidth extended communities with a negative value SHALL be
      ignored and MUST NOT be originated.</t>
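
      <t>The default preferences above, together with the rule for negative
      values, might be sketched as follows. This Python fragment is a
      non-normative illustration only. The function name and the
      (transitive, bytes_per_sec) tuple representation are assumptions made
      for the example, and configurable preferences are not modeled.</t>

      <figure>
        <artwork align="left" xml:space="preserve"><![CDATA[
```python
def select_link_bandwidth(communities):
    """Pick one (transitive, bytes_per_sec) pair from those on a route.

    Returns None when no usable community is attached.
    """
    # Communities with a negative value are ignored.
    usable = [c for c in communities if c[1] >= 0]
    if not usable:
        return None
    # Lowest bandwidth wins. On equal values the transitive community
    # is preferred: "not True" is False, and False sorts before True.
    return min(usable, key=lambda c: (c[1], not c[0]))
```
]]></artwork>
      </figure>

      <t>For instance, given a non-transitive community and a transitive
      community that both carry the same lowest value, this sketch selects
      the transitive one.</t>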

      <t>WECMP (Weighted Equal-Cost Multi-Path) can be utilized only when all
      contributing paths have a non-zero value in the link bandwidth extended
      community. If any of the paths lacks a valid link bandwidth extended
      community, ECMP (Equal-Cost Multi-Path) MUST be used instead.</t>
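
      <t>The fallback rule above can be illustrated with the following
      non-normative Python sketch. It assumes, for illustration only, that
      traffic is split in proportion to the advertised bandwidth; this
      document does not mandate a particular weighting. The function name is
      hypothetical, and None stands for a path without a valid link bandwidth
      extended community.</t>

      <figure>
        <artwork align="left" xml:space="preserve"><![CDATA[
```python
def multipath_weights(path_bandwidths):
    """Return one normalized traffic share per contributing path.

    path_bandwidths holds bytes-per-second values, or None for a path
    that lacks a valid link bandwidth extended community.
    """
    n = len(path_bandwidths)
    if n == 0:
        return []
    all_valid = all(bw is not None and bw > 0 for bw in path_bandwidths)
    if not all_valid:
        # At least one path has no usable value: fall back to ECMP.
        return [1.0 / n] * n
    # Assumed weighting: share proportional to advertised bandwidth.
    total = sum(path_bandwidths)
    return [bw / total for bw in path_bandwidths]
```
]]></artwork>
      </figure>

      <t>With two paths advertising bandwidths in a 1:3 ratio, this sketch
      yields shares of one quarter and three quarters; if either path
      carried no valid community, both would receive an equal share.</t>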

    </section>

    <section anchor="History" title="Document History">
      <t>The BGP Link Bandwidth Extended Community has evolved over several
      versions of the IETF draft. In the earlier versions up to
      draft-ietf-idr-link-bandwidth-08, only the non-transitive version of the
      link bandwidth extended community was supported. However, starting from
      draft-ietf-idr-link-bandwidth-09, both transitive and non-transitive
      versions of the link bandwidth extended community are supported.</t>

      <t>An old sender/receiver is a BGP speaker that uses the procedures
      specified in versions of this draft up to
      draft-ietf-idr-link-bandwidth-08
      (https://datatracker.ietf.org/doc/html/draft-ietf-idr-link-bandwidth-08)
      or any undocumented behavior for the link bandwidth extended
      community.</t>

      <t>A new sender/receiver is a BGP speaker that implements procedures
      specified in this document.</t>

      <t>Receiving BGP speakers need to be upgraded to support the procedures
      defined in this document to provide full interoperability for both
      transitive and non-transitive versions of the link bandwidth extended
      community. In order to simplify implementations, it is not a goal to
      provide interoperability between old receivers and new senders.</t>
    </section>

    <section anchor="IANA" title="IANA Considerations">
      <t>This document defines a specific application of the two-octet AS
      specific extended community.</t>

      <t>IANA is requested to update the name of Sub-Type 0x04 in the
      Transitive Two-Octet AS-Specific Extended Community Sub-Types registry
      (Type 0x00) to:</t>

      <figure>
        <artwork>    Name
    ----
    transitive Link Bandwidth Ext. Community</artwork>
      </figure>

      <t>IANA is requested to update the name of Sub-Type 0x04 in the
      Non-Transitive Two-Octet AS-Specific Extended Community Sub-Types
      registry (Type 0x40) to:</t>

      <figure>
        <artwork>    Name
    ----
    non-transitive Link Bandwidth Ext. Community</artwork>
      </figure>

      <t>Both updated entries are to reference this document.</t>
    </section>

    <section anchor="Security" title="Security Considerations">
      <t>There are no additional security risks introduced by this design.</t>
    </section>

    <section anchor="Contributors" title="Contributors">
      <author fullname="Kaliraj Vairavakkalai" initials="K."
              surname="Vairavakkalai">
        <organization>Juniper Networks, Inc.</organization>

        <address>
          <postal>
            <street>1133 Innovation Way,</street>

            <city>Sunnyvale</city>

            <region>CA</region>

            <code>94089</code>

            <country>US</country>
          </postal>

          <email>kaliraj@juniper.net</email>
        </address>
      </author>

      <author fullname="Natrajan Venkataraman" initials="N."
              surname="Venkataraman">
        <organization>Juniper Networks, Inc.</organization>

        <address>
          <postal>
            <street>1133 Innovation Way,</street>

            <city>Sunnyvale</city>

            <region>CA</region>

            <code>94089</code>

            <country>US</country>
          </postal>

          <email>natv@juniper.net</email>
        </address>
      </author>
    </section>

    <section anchor="Acknowledgments" title="Acknowledgments">
      <t>The authors would like to thank Yakov Rekhter, Srihari Sangli and Dan
      Tappan for proposing unequal cost load balancing as one possible
      application of the extended community attribute.</t>

      <t>The authors would like to thank Bruno Decraene, Robert Raszuk, Joel
      Halpern, Aleksi Suhonen, Randy Bush, Jeff Haas, Stephane Litkowski,
      Serge Krier and John Scudder for their comments and contributions.</t>
    </section>
  </middle>

  <back>
    <references title="Normative References">
      <?rfc include="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.2119.xml"?>

      <?rfc include="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.4360.xml"?>

      <?rfc include="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.6793.xml"?>

      <reference anchor="IEEE.754-2019"
                 target="https://ieeexplore.ieee.org/document/8766229">
        <front>
          <title abbrev="IEEE-754-2019">IEEE Standard for Floating-Point
          Arithmetic</title>

          <author>
            <organization showOnFrontPage="true">IEEE</organization>
          </author>

          <date day="22" month="July" year="2019"/>
        </front>
      </reference>
    </references>
  </back>
</rfc>
