<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE rfc [
  <!ENTITY nbsp    "&#160;">
  <!ENTITY zwsp   "&#8203;">
  <!ENTITY nbhy   "&#8209;">
  <!ENTITY wj     "&#8288;">
]>
<?xml-stylesheet type="text/xsl" href="rfc2629.xslt" ?>
<!-- generated by https://github.com/cabo/kramdown-rfc version 1.7.27 (Ruby 3.2.3) -->
<rfc xmlns:xi="http://www.w3.org/2001/XInclude" ipr="trust200902" docName="draft-wh-rtgwg-adaptive-routing-arn-04" category="std" consensus="true" submissionType="IETF" xml:lang="en" version="3">
  <!-- xml2rfc v2v3 conversion 3.28.0 -->
  <front>
    <title abbrev="ARN">Adaptive Routing Notification</title>
    <seriesInfo name="Internet-Draft" value="draft-wh-rtgwg-adaptive-routing-arn-04"/>
    <author initials="H." surname="Wang" fullname="Haibo Wang">
      <organization>Huawei</organization>
      <address>
        <email>rainsword.wang@huawei.com</email>
      </address>
    </author>
    <author initials="H." surname="Huang" fullname="Hongyi Huang">
      <organization>Huawei</organization>
      <address>
        <email>hongyi.huang@huawei.com</email>
      </address>
    </author>
    <author initials="X." surname="Geng" fullname="Xuesong Geng">
      <organization>Huawei</organization>
      <address>
        <email>gengxuesong@huawei.com</email>
      </address>
    </author>
    <author initials="X." surname="Xu" fullname="Xiaohu Xu">
      <organization>China Mobile</organization>
      <address>
        <email>xuxiaohu_ietf@hotmail.com</email>
      </address>
    </author>
    <author initials="Y." surname="Xia" fullname="Yinben Xia">
      <organization>Tencent</organization>
      <address>
        <email>forestxia@tencent.com</email>
      </address>
    </author>
    <date year="2025" month="March" day="31"/>
    <area>General</area>
    <workgroup>Network Working Group</workgroup>
    <abstract>
      <?line 57?>

<t>Large-scale supercomputing and AI data centers use multipath forwarding to implement load balancing and to improve transport reliability. Adaptive routing (AR), widely used in direct topologies such as dragonfly, is becoming popular in commodity data centers as a way to adjust routing policies dynamically in response to path congestion and failures.
When congestion or failure occurs, the sensing node can not only apply AR locally but also send the congestion or failure information to other nodes in a timely and accurate manner so that they apply AR as well, thus avoiding exacerbating congestion on the reported path.
This document specifies Adaptive Routing Notification (ARN), a general mechanism to proactively disseminate congestion detection and congestion elimination information so that remote nodes can apply re-routing policies.
For AI workloads such as DeepSeek's MoE models, which exhibit dynamic all-to-all communication patterns with bursty traffic, such a mechanism becomes crucial to enable an immediate network response to transient congestion conflicts.</t>
    </abstract>
  </front>
  <middle>
    <?line 66?>

<section anchor="intro">
      <name>Introduction</name>
      <t>Adaptive routing (AR) is widely used in high-performance computing (HPC) environments with directly connected topologies such as Dragonfly <xref target="I-D.draft-agt-rtgwg-dragonfly-routing"/>. These topologies offer advantages such as scalability with a small network diameter, which has led to their wide adoption in HPC and supercomputing systems.</t>
      <t>In networks with directly connected topologies, multiple non-equivalent paths exist to reach the destination node. Typically, the shortest path is preferred for forwarding traffic. However, traffic congestion can occur on these shortest paths. AR addresses this by enabling nodes to make dynamic routing decisions based on network traffic variations, such as link congestion.</t>
      <t>AR is also applicable to symmetrical topologies such as Clos, which are the most prevalent in current AI data centers. In symmetrical topologies, multiple equivalent paths are available. When congestion occurs on one path, AR can steer traffic flows away from the congested path, keeping the traffic distribution balanced and path usage optimal.</t>
      <t>For example, by proactively detecting link congestion status or receiving remote congestion notifications, network nodes can forward packets along shorter, non-congested paths, improving overall throughput and resilience while reducing latency. When the link is non-congested, packets are forwarded over the shortest paths. When congestion occurs on any shortest path, the local node that detects it applies adaptive routing immediately and advertises congestion signals to other remote nodes. This allows the network to select another non-congested but non-shortest path temporarily until a congestion elimination signal is received. Adaptive routing helps mitigate traffic collisions and utilize idle links, enhancing bandwidth utilization.</t>
      <t>When data centers using symmetrical topologies employ Equal-Cost Multi-Path routing (ECMP), AR can correct the membership of ECMP groups by providing timely congestion updates. This ensures traffic is balanced across optimal paths and prevents overload on specific links.</t>
      <t>AR mechanisms are also effective in handling path failure scenarios, as path failures can be considered severe congestion cases. The re-routing strategy differs in these cases: when a link failure occurs, no traffic can pass through the failed link, necessitating a complete re-route of all affected traffic. In contrast, when congestion occurs, some traffic can still flow through the link, so AR ensures that only the excess or partial traffic is re-routed, maintaining some level of flow through the congested link.</t>
      <t>To standardize the process of disseminating information for triggering re-routing, including but not limited to congestion and failures, the concept of Adaptive Routing Notification (ARN) is introduced. ARN allows for a unified approach to adaptive routing across different network environments, ensuring consistent and efficient handling of network changes and improving overall network performance and reliability. Additionally, standardizing ARN reduces the need for multiple implementations by switch vendors, simplifying network management and deployment.</t>
      <t>This document proposes a proactive notification mechanism for adaptive routing and describes the conditions for triggering dissemination and the information carried in ARN to notify remote nodes for re-routing. ARN can be used for congestion notifications, link failure notifications, and even to convey other relevant network events for re-routing. ARN is applicable to both directly connected topologies and indirectly connected topologies. The detailed mechanisms for detecting congestion or failures are beyond the scope of this document.</t>
      <section anchor="terminology">
        <name>Terminology</name>
        <t>AR: Adaptive Routing</t>
        <t>ARN: Adaptive Routing Notification</t>
        <t>BPT: Best Path Table</t>
        <t>ECMP: Equal-Cost Multi-Path routing</t>
        <t>HPC: High-Performance Computing</t>
        <t>VXLAN: Virtual eXtensible Local Area Network</t>
      </section>
      <section anchor="requirements-language">
        <name>Requirements Language</name>
        <t>The key words "<bcp14>MUST</bcp14>", "<bcp14>MUST NOT</bcp14>", "<bcp14>REQUIRED</bcp14>", "<bcp14>SHALL</bcp14>", "<bcp14>SHALL
NOT</bcp14>", "<bcp14>SHOULD</bcp14>", "<bcp14>SHOULD NOT</bcp14>", "<bcp14>RECOMMENDED</bcp14>", "<bcp14>NOT RECOMMENDED</bcp14>",
"<bcp14>MAY</bcp14>", and "<bcp14>OPTIONAL</bcp14>" in this document are to be interpreted as
described in BCP 14 <xref target="RFC2119"/> <xref target="RFC8174"/> when, and only when, they
appear in all capitals, as shown here.</t>
        <?line -18?>

</section>
    </section>
    <section anchor="arn-mechanism">
      <name>ARN Mechanism</name>
      <t>The ARN mechanism primarily consists of three steps:</t>
      <ol spacing="normal" type="1"><li>
          <t>Detect changes in the status of network links or nodes (e.g., link congestion or link signal interruption).</t>
        </li>
        <li>
          <t>Assess the impact range of the status change and, if local measures cannot completely mitigate the impact, send an ARN message to specified remote nodes.</t>
        </li>
        <li>
          <t>When remote nodes receive the ARN message, they make rerouting decisions based on the specific information carried by the ARN, thus minimizing the impact on subsequent traffic (e.g., selecting a new path for subsequent traffic).</t>
        </li>
      </ol>
      <t>Here, link congestion is taken as an example to show how ARN works. An ARN can be triggered whenever link congestion is detected to appear or disappear (e.g., by analyzing the queue length of the output port). The detecting node sends a congestion signal, carried in the ARN, to the other nodes of interest (usually the upstream nodes).</t>
      <t><xref target="topology"/> depicts a simplified dragonfly topology (only relevant links are drawn). The nodes in each Group are directly connected to each other. The groups are all connected with direct links. As shown in <xref target="topology"/>, Node1 has a direct link connecting Group1 and Group2. When the direct link (Node1 &lt;-&gt; Group2) is congested, all nodes of Group1 should be notified and immediately update the path selection policy. For example, partial or all flows originating from Group1 to Group2 may choose Group3 as a transmission path instead of using the direct link (Node1 &lt;-&gt; Group2) until congestion elimination.</t>
      <figure anchor="topology">
        <name>ARN Example in Dragonfly</name>
        <artwork><![CDATA[
            +----------------+            +----------------+
            |                |            |                |
            |     Group 2    | -----------|     Group 3    |
            |                |            |                |
            +----------------+            +----------------+
                     |                             |
                     |                             |
                     |                             |
  +------------------|-------------------+         |
  |                  *                   |         |
  |      @@     +----*---+     @@        |         |
  |     +-------+  Node1 +--------+      |         |
  |     |       +----+---+        |      |         |
  |     |            |            |      |         |
  | +---v----+       |       +----v---+  |         |
  | | Node2  |       |@      |  Node4 +------------+
  | +--------+       |@      +--------+  |
  |                  |                   |
  |             +----v---+               |
  |             |  Node3 |               |
  |             +--------+               |   **: congestion
  |  Group 1                             |   @@: ARN
  +--------------------------------------+
]]></artwork>
      </figure>
      <t><xref target="example2"/></t>
      <figure anchor="example2">
        <name>ARN Example in Spine-Leaf</name>
        <artwork><![CDATA[
    +---------+       +---------+               
    |  spine1 |       |  spine2 |               
    +--+---+--+       +-+----+--+               
       |   |**          |    |                  
       |   +------------+-+  |                  
     @@|                | |  |                  
       |   +------------+ |  |                  
       |   |              |  |       **: failure
    +--+---+--+       +---+--+--+    @@: ARN    
    |  leaf1  |       |  leaf2  |               
    +---------+       +---------+               
]]></artwork>
      </figure>
      <t><xref target="example2"/> depicts a reduced Spine-Leaf topology with an example of how ARN is triggered in case of a failure. Specifically, when Spine1 detects a failure on the link to Leaf2, and finds no local backup path, Spine1 sends an ARN message to Leaf1, instructing it to reroute subsequent traffic destined for Leaf2 through Spine2.</t>
      <section anchor="triggering-arn">
        <name>Triggering ARN</name>
        <t>The local node can sense the change of network states by monitoring interface status, such as bandwidth utilization and queue depth of the interface. The sensing method is out of scope in this document.</t>
        <t>When the monitored value exceeds the preset threshold, the state is determined to be congested and a congestion notification is triggered.
When the monitored value falls back below the preset threshold, the state is determined to be non-congested and a notification of congestion elimination is triggered.</t>
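        <t>The threshold behaviour described above can be sketched as a two-state monitor. The following Python fragment is illustrative only: the class name <tt>CongestionMonitor</tt>, the returned labels, and the sample values are assumptions of this sketch and are not defined by this document.</t>
        <sourcecode type="python"><![CDATA[
from typing import Optional


class CongestionMonitor:
    """Tracks one monitored value against one preset threshold."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold
        self.congested = False

    def update(self, value: int) -> Optional[str]:
        """Return the notification to trigger, or None."""
        if not self.congested and value > self.threshold:
            self.congested = True
            return "congestion-detection"
        if self.congested and value < self.threshold:
            self.congested = False
            return "congestion-elimination"
        return None


# Queue depth samples for one interface (values are made up).
monitor = CongestionMonitor(threshold=100)
events = [monitor.update(v) for v in (50, 120, 130, 80)]
]]></sourcecode>
        <t>In this sketch a notification is produced only when the state changes, so the samples 120 and 130 yield a single congestion detection event.</t>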
        <t>When the local node detects any change in congestion status, it can send the corresponding ARN continuously to other network nodes in the same group.
The notifications can be sent to multiple nodes using multicast technology provided by the network.
ARN packets <bcp14>SHOULD</bcp14> be marked as high priority to ensure timely processing.
It is <bcp14>RECOMMENDED</bcp14> that the congestion level be included in the ARN for fine-grained control of adaptive routing.</t>
        <t>The methods for detection and sensing can vary widely and include techniques such as active probing and passive monitoring. The specific methods are out of scope for this document.</t>
        <t>However, the detection of a state change does not necessarily trigger the ARN mechanism. Nodes can decide whether to trigger remote notifications based on predefined rules. For instance, local measures might be sufficient to handle the issue. If a link failure occurs but the local node has multiple backup links available, the local rerouting might suffice to resolve the problem without needing to trigger an ARN.</t>
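        <t>One possible predefined rule is sketched below in Python. It is a hypothetical example rather than a rule specified by this document: the <tt>LinkEvent</tt> type, its field names, and the choice to notify only when no local backup link exists are all assumptions made for illustration.</t>
        <sourcecode type="python"><![CDATA[
from dataclasses import dataclass


@dataclass
class LinkEvent:
    kind: str           # "failure" or "congestion" (labels assumed)
    local_backups: int  # usable alternative links on this node


def should_trigger_arn(event: LinkEvent) -> bool:
    """Example rule: notify only if no local backup link exists."""
    return event.local_backups == 0


# A node that loses its only link to a destination must notify.
no_backup = LinkEvent(kind="failure", local_backups=0)
# A node with two spare links can reroute locally instead.
two_backups = LinkEvent(kind="failure", local_backups=2)
decisions = (should_trigger_arn(no_backup),
             should_trigger_arn(two_backups))
]]></sourcecode>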
        <t>When the local node decides to trigger ARN based on a change in network status, it can send the ARN by unicast or multicast.</t>
      </section>
      <section anchor="receiving-arn">
        <name>Receiving ARN</name>
        <t>After receiving an ARN, the node generally performs re-route operations, which include, but are not limited to:</t>
        <ul spacing="normal">
          <li>
            <t>Selecting a new optimal path for the traffic that avoids the congested or failed link.</t>
          </li>
          <li>
            <t>Adjusting the sending rate of traffic to prevent overload and reduce congestion.</t>
          </li>
        </ul>
        <t>If the node determines that it cannot effectively re-route the traffic based on the received ARN, it may propagate the ARN information to other nodes in the network, continuing the dissemination process to ensure network-wide adaptation and optimal traffic flow.</t>
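        <t>The choice between re-routing locally and propagating the ARN can be sketched as follows. This Python fragment is a minimal illustration under stated assumptions: the function <tt>handle_arn</tt>, the representation of a path as a list of link names, and the link names themselves are hypothetical and not part of this specification.</t>
        <sourcecode type="python"><![CDATA[
from typing import Dict, List, Optional, Tuple


def handle_arn(paths: Dict[str, List[str]],
               affected_link: str) -> Tuple[Optional[str], bool]:
    """Pick a path avoiding affected_link, else ask to propagate.

    paths maps a path name to the links it traverses, listed in
    order of preference. Returns (selected_path, propagate).
    """
    for name, links in paths.items():
        if affected_link not in links:
            return name, False
    return None, True


# Leaf1's paths towards Leaf2 in the Spine-Leaf example.
leaf1_paths = {
    "via-spine1": ["leaf1-spine1", "spine1-leaf2"],
    "via-spine2": ["leaf1-spine2", "spine2-leaf2"],
}
result = handle_arn(leaf1_paths, "spine1-leaf2")
]]></sourcecode>
        <t>Here Leaf1 finds a path through Spine2 that avoids the failed link, so no further propagation is requested. A node whose paths all traverse the affected link would get no path back and a request to propagate.</t>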
      </section>
    </section>
    <section anchor="adaptive-routing-notification">
      <name>Adaptive Routing Notification</name>
      <section anchor="basic-concept">
        <name>Basic Concept</name>
        <t>An ARN packet should include two kinds of information:</t>
        <ul spacing="normal">
          <li>
            <t>Information reflecting the type of notification and quantifiable metrics (e.g., congestion level). The Metric value helps in quantifying the severity of the congestion or failure, enabling fine-grained control of adaptive routing.</t>
          </li>
          <li>
            <t>Information carrying details about the affected object (e.g., affected traffic or affected paths), for example, the identifier of the router attached to the compromised link or the identifiers of the flows that are impacted by the congestion or failure.</t>
          </li>
        </ul>
        <t>These details are essential to assist remote nodes in making informed rerouting decisions, ensuring minimal disruption and optimal network performance despite the presence of congestion or failures.</t>
        <t>For example, when a network node receives an ARN packet indicating congestion detection, it evaluates the optimal forwarding path in its local best path table (BPT). If the optimal path passes through the affected interface, the node deletes this path from the BPT and selects another, sub-optimal path. How a node responds to ARN packets typically depends on the AR rerouting implementation of the specific device.</t>
        <t>ARN can also be used to notify the elimination of specific network conditions (e.g., congestion recovery). When such an ARN message is received, the previously made rerouting decisions can be revoked. In this case, each ARN message should be configured with an identifier (carried through parameters) to ensure the correspondence between the state notification and the state revocation notification. If ARN is not used for elimination, mechanisms such as timeouts can be employed to revoke rerouting decisions.</t>
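        <t>The interplay of detection, elimination, and timeout-based revocation can be sketched with a toy best path table. The Python fragment below is illustrative only. It assumes that the path name serves as the identifier that correlates a notification with its revocation, and it marks a path as suppressed instead of deleting it so that the path can be restored later. The class <tt>BestPathTable</tt> and the <tt>hold_time</tt> parameter are hypothetical names.</t>
        <sourcecode type="python"><![CDATA[
from typing import Dict, List, Optional


class BestPathTable:
    """Toy BPT: candidate paths ordered from best to worst."""

    def __init__(self, paths: List[str], hold_time: float) -> None:
        self.paths = list(paths)
        self.hold_time = hold_time  # fallback timeout, seconds
        self.suppressed: Dict[str, float] = {}  # path -> expiry

    def on_detection(self, path: str, now: float) -> None:
        """Detection ARN for `path`: stop using it for a while."""
        if path in self.paths:
            self.suppressed[path] = now + self.hold_time

    def on_elimination(self, path: str) -> None:
        """Elimination ARN for `path`: revoke the decision."""
        self.suppressed.pop(path, None)

    def best(self, now: float) -> Optional[str]:
        """Most preferred path that is not suppressed at `now`."""
        expired = [p for p, t in self.suppressed.items() if t <= now]
        for path in expired:
            del self.suppressed[path]  # timeout-based revocation
        for path in self.paths:
            if path not in self.suppressed:
                return path
        return None


bpt = BestPathTable(["direct", "via-group3"], hold_time=5.0)
bpt.on_detection("direct", now=0.0)
during = bpt.best(now=1.0)        # direct path is avoided
bpt.on_elimination("direct")
after_revoke = bpt.best(now=2.0)  # direct path is usable again
bpt.on_detection("direct", now=10.0)
after_timeout = bpt.best(now=16.0)  # suppression expired at 15.0
]]></sourcecode>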
        <t>Simple and direct ARN messages may cause routing oscillation issues and packet reordering problems within the same flow. These issues can be better addressed in future enhancements. Additionally, ARN is primarily a rapid rerouting mechanism and is typically used in conjunction with robust BGP mechanisms. Once BGP routes converge, they will replace the rerouting strategies triggered by ARN, ensuring routing correctness, loop-freeness, and reducing the side effects caused by the simplistic ARN mechanism.</t>
      </section>
      <section anchor="packet-format">
        <name>Packet Format</name>
        <figure anchor="ref-to-fig">
          <name>ARN Format</name>
          <artwork><![CDATA[
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      Type     |Version|  Rvsd |     Metric    |    Para-Type  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
|                                                               |
+                      Parameters(Optional)                     +
|                                                               |
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
]]></artwork>
        </figure>
        <t>where:</t>
        <dl>
          <dt>Type:</dt>
          <dd>
            <t>This field indicates the purpose of the ARN. The following values are defined:</t>
            <ul spacing="normal">
              <li>
                <t>Type 1: notifies remote nodes of congestion detection, to trigger adaptive routing.</t>
              </li>
              <li>
                <t>Type 2: notifies remote nodes of congestion elimination, to revoke adaptive routing.</t>
              </li>
              <li>
                <t>Type 3: notifies remote nodes of failure detection, to trigger adaptive routing.</t>
              </li>
              <li>
                <t>Type 4: notifies remote nodes of failure elimination, to revoke adaptive routing.</t>
              </li>
            </ul>
          </dd>
          <dt>Version:</dt>
          <dd>
            <t>This field indicates the version number. The default value is 0.</t>
          </dd>
          <dt>Rvsd:</dt>
          <dd>
            <t>Reserved.</t>
          </dd>
          <dt>Metric:</dt>
          <dd>
            <t>A quantified value. For example, it can convey the degree of congestion or the variation in available bandwidth.</t>
          </dd>
          <dt>Para-Type:</dt>
          <dd>
            <t>The Para-Type field is an 8-bit bitmap that specifies which parameters are included in the Parameters field of the ARN packet. Each bit in this field corresponds to a specific parameter. When a bit is set to 1, the corresponding parameter is present. The following subsections describe each bit in the Para-Type field.</t>
          </dd>
          <dt>Parameters:</dt>
          <dd>
            <t>The Parameters field can carry information about the affected object to help other devices determine the target of adaptive routing. The presence of each parameter is indicated by the Para-Type bitmap, and the parameters are packed in the bit order of that bitmap.</t>
          </dd>
        </dl>
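        <t>The fixed part of the ARN format can be encoded and decoded as sketched below. This Python fragment is a non-normative illustration. It assumes network byte order, that the reserved bits are sent as zero, and that bit 0 of Para-Type is the most significant bit, none of which is stated explicitly in the field descriptions above. The names <tt>ArnHeader</tt>, <tt>pack_header</tt>, and <tt>unpack_header</tt> are hypothetical.</t>
        <sourcecode type="python"><![CDATA[
import struct
from typing import NamedTuple


class ArnHeader(NamedTuple):
    arn_type: int   # 8 bits: 1..4 as listed for the Type field
    version: int    # 4 bits
    metric: int     # 8 bits
    para_type: int  # 8-bit bitmap


def pack_header(h: ArnHeader) -> bytes:
    """Encode the first 32 bits of an ARN."""
    if not 0 <= h.version <= 0xF:
        raise ValueError("version must fit in 4 bits")
    # Version fills the upper 4 bits of the second octet; the
    # reserved lower 4 bits are sent as zero (assumption).
    return struct.pack("!BBBB", h.arn_type, h.version << 4,
                       h.metric, h.para_type)


def unpack_header(data: bytes) -> ArnHeader:
    """Decode the first 32 bits of an ARN, ignoring reserved bits."""
    arn_type, ver_rsvd, metric, para_type = struct.unpack(
        "!BBBB", data[:4])
    return ArnHeader(arn_type, ver_rsvd >> 4, metric, para_type)


# Congestion detection (Type 1), Metric 200, Para-Type bit 0 set.
header = ArnHeader(arn_type=1, version=0, metric=200,
                   para_type=0x80)
wire = pack_header(header)
decoded = unpack_header(wire)
]]></sourcecode>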
        <section anchor="illustration-of-para-type-and-corresponding-parameter">
          <name>Illustration of Para-Type and Corresponding Parameter</name>
          <section anchor="para-type-bit-0">
            <name>Para-Type Bit 0</name>
            <t>When bit 0 of Para-Type is 1, the following parameter is included in Parameters to indicate the identifier of the affected flow (five-tuple from the packet header):</t>
            <artwork><![CDATA[
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Opcode |   Mask  |             Rsvd            |    Protocol   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
~                         Source IPv4/v6                        ~
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
~                      Destination IPv4/v6                      ~
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           Source Port         |        Destination Port       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
]]></artwork>
            <t>where:</t>
            <dl>
              <dt>Opcode:</dt>
              <dd>
                <t>This field indicates whether an IPv4 address or an IPv6 address is used in the parameter.</t>
              </dd>
              <dt>Rsvd:</dt>
              <dd>
                <t>Reserved for future use.</t>
              </dd>
              <dt>Mask:</dt>
              <dd>
                <t>A bitmap used to indicate the presence of the subsequent 5 fields, excluding the reserved field. Each bit in this bitmap corresponds to a specific field, with a value of 1 indicating that the field is present and 0 indicating that it is absent.</t>
              </dd>
            </dl>
            <t>The bits of the Mask field are defined as follows:</t>
            <ul spacing="normal">
              <li>
                <t>Bit 0 (Protocol): Indicates whether the Protocol field is present.</t>
              </li>
              <li>
                <t>Bit 1 (Source IPv4/v6): Indicates whether the Source IP address field is present.</t>
              </li>
              <li>
                <t>Bit 2 (Destination IPv4/v6): Indicates whether the Destination IP address field is present.</t>
              </li>
              <li>
                <t>Bit 3 (Source Port): Indicates whether the Source Port field is present.</t>
              </li>
              <li>
                <t>Bit 4 (Destination Port): Indicates whether the Destination Port field is present.</t>
              </li>
            </ul>
            <dl>
              <dt>Protocol:</dt>
              <dd>
                <t>Indicates the specific protocol type used by the packet, such as TCP or UDP.</t>
              </dd>
              <dt>Source IPv4/v6:</dt>
              <dd>
                <t>The IP address of the sender, in IPv4 or IPv6 format as determined by Opcode.</t>
              </dd>
              <dt>Destination IPv4/v6:</dt>
              <dd>
                <t>The IP address of the receiver, in IPv4 or IPv6 format as determined by Opcode.</t>
              </dd>
              <dt>Source Port:</dt>
              <dd>
                <t>The port number used by the sender.</t>
              </dd>
              <dt>Destination Port:</dt>
              <dd>
                <t>The port number used by the receiver.</t>
              </dd>
            </dl>
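            <t>The first 32-bit word of the flow parameter can be built as sketched below. This Python fragment is illustrative only. It assumes that Mask bit 0 is the most significant bit of the 5-bit field and that the reserved bits are zero. Opcode values are not assigned in this document, so the sketch passes the value through without interpreting it. The helper names are hypothetical.</t>
            <sourcecode type="python"><![CDATA[
import struct

# Mask bit i (counted from the most significant bit, which is an
# assumption of this sketch) announces MASK_FIELDS[i].
MASK_FIELDS = ("protocol", "src_addr", "dst_addr",
               "src_port", "dst_port")


def build_mask(**present: bool) -> int:
    """Return the 5-bit Mask for the fields marked as present."""
    mask = 0
    for i, name in enumerate(MASK_FIELDS):
        if present.get(name, False):
            mask |= 1 << (4 - i)
    return mask


def pack_first_word(opcode: int, mask: int,
                    protocol: int = 0) -> bytes:
    """Pack Opcode (4), Mask (5), Rsvd (15, zero), Protocol (8)."""
    if not (0 <= opcode <= 0xF and 0 <= mask <= 0x1F
            and 0 <= protocol <= 0xFF):
        raise ValueError("field out of range")
    word = (opcode << 28) | (mask << 23) | protocol
    return struct.pack("!I", word)


# Match UDP (protocol number 17) by destination address and port.
mask = build_mask(protocol=True, dst_addr=True, dst_port=True)
word = pack_first_word(opcode=0, mask=mask, protocol=17)
]]></sourcecode>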
          </section>
          <section anchor="para-type-bit-1">
            <name>Para-Type Bit 1</name>
            <t>When bit 1 of Para-Type is 1, the following parameter is included in Parameters to indicate the identifier of the affected path:</t>
            <artwork><![CDATA[
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                            Path ID                            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
]]></artwork>
            <dl>
              <dt>Path ID:</dt>
              <dd>
                <t>The 32-bit field is used to uniquely identify the affected path in the network.</t>
              </dd>
            </dl>
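            <t>The relationship between the Para-Type bitmap and the order of the parameters can be sketched as follows. This Python fragment is a non-normative illustration that assumes bit 0 of Para-Type is the most significant bit and that the Path ID is carried in network byte order. The constant and function names are hypothetical.</t>
            <sourcecode type="python"><![CDATA[
import struct
from typing import List

PARA_FLOW_ID = 0x80  # bit 0, taking bit 0 as the MSB (assumption)
PARA_PATH_ID = 0x40  # bit 1 under the same assumption


def present_parameters(para_type: int) -> List[str]:
    """List the announced parameters in bit order, bit 0 first."""
    names = []
    if para_type & PARA_FLOW_ID:
        names.append("flow-id")
    if para_type & PARA_PATH_ID:
        names.append("path-id")
    return names


def pack_path_id(path_id: int) -> bytes:
    """Encode the 32-bit Path ID in network byte order."""
    return struct.pack("!I", path_id)


order = present_parameters(PARA_FLOW_ID | PARA_PATH_ID)
encoded = pack_path_id(0x0A0B0C0D)
]]></sourcecode>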
          </section>
          <section anchor="para-type-bit-2-to-bit-7">
            <name>Para-Type Bit 2 to Bit 7</name>
            <t>Bits 2-7: Reserved for future use or other parameter types.</t>
            <t>In scenarios such as VXLAN tunnels, there may be both inner and outer five-tuple flow identifiers. Different parameters can be used to distinguish between these two types of identifiers, allowing for more granular routing control. Specifically:</t>
            <ul spacing="normal">
              <li>
                <t>Bit 0 is used for the outer five-tuple identifier (related to the VXLAN tunnel).</t>
              </li>
              <li>
                <t>Additional bits (e.g., Bit 2) can be used for the inner five-tuple identifier (related to the actual payload traffic within the tunnel).</t>
              </li>
            </ul>
            <t>By using different bits in the Para-Type field, the ARN mechanism can indicate the presence of these different parameters, enabling precise and fine-grained adaptive routing decisions.</t>
          </section>
        </section>
      </section>
    </section>
    <section anchor="security-considerations">
      <name>Security Considerations</name>
      <t>TBD.</t>
    </section>
    <section anchor="iana-considerations">
      <name>IANA Considerations</name>
      <t>TBD.</t>
    </section>
  </middle>
  <back>
    <references anchor="sec-combined-references">
      <name>References</name>
      <references anchor="sec-normative-references">
        <name>Normative References</name>
        <reference anchor="RFC2119">
          <front>
            <title>Key words for use in RFCs to Indicate Requirement Levels</title>
            <author fullname="S. Bradner" initials="S." surname="Bradner"/>
            <date month="March" year="1997"/>
            <abstract>
              <t>In many standards track documents several words are used to signify the requirements in the specification. These words are often capitalized. This document defines these words as they should be interpreted in IETF documents. This document specifies an Internet Best Current Practices for the Internet Community, and requests discussion and suggestions for improvements.</t>
            </abstract>
          </front>
          <seriesInfo name="BCP" value="14"/>
          <seriesInfo name="RFC" value="2119"/>
          <seriesInfo name="DOI" value="10.17487/RFC2119"/>
        </reference>
        <reference anchor="RFC8174">
          <front>
            <title>Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words</title>
            <author fullname="B. Leiba" initials="B." surname="Leiba"/>
            <date month="May" year="2017"/>
            <abstract>
              <t>RFC 2119 specifies common key words that may be used in protocol specifications. This document aims to reduce the ambiguity by clarifying that only UPPERCASE usage of the key words have the defined special meanings.</t>
            </abstract>
          </front>
          <seriesInfo name="BCP" value="14"/>
          <seriesInfo name="RFC" value="8174"/>
          <seriesInfo name="DOI" value="10.17487/RFC8174"/>
        </reference>
      </references>
      <references anchor="sec-informative-references">
        <name>Informative References</name>
        <reference anchor="I-D.draft-agt-rtgwg-dragonfly-routing">
          <front>
            <title>Routing in Dragonfly+ Topologies</title>
            <author fullname="Dmitry Afanasiev" initials="D." surname="Afanasiev">
         </author>
            <author fullname="Roman Glebov" initials="R." surname="Glebov">
              <organization>Yandex</organization>
            </author>
            <author fullname="Jeff Tantsura" initials="J." surname="Tantsura">
              <organization>Nvidia</organization>
            </author>
            <date day="4" month="March" year="2024"/>
            <abstract>
              <t>   This document provides an overview of Dragonfly+ network topology and
   describes routing implementation for IP networks with Dragonfly+
   topology with support for non-minimal routing.

              </t>
            </abstract>
          </front>
          <seriesInfo name="Internet-Draft" value="draft-agt-rtgwg-dragonfly-routing-01"/>
        </reference>
      </references>
    </references>
    <?line 376?>

<section numbered="false" anchor="acknowledgements">
      <name>Acknowledgements</name>
    </section>
    <section numbered="false" anchor="contributors">
      <name>Contributors</name>
    </section>
  </back>
  <!-- ##markdown-source:
H4sIAAAAAAAAA91c6XIbR5L+309RK/0waQGwSGl8IDweUZRsMUIHh6Kv2NiY
KHQXgDYb3ZiublCwJD/I/toH2V/7QvsK+2VmVXU1DkqyNRsToxnJRKOOrKw8
vjyaw+EwsY0us7/poirNWDV1a5JUN2ZW1euxsk2W2HayyK3Nq/JyvcSQs8eX
3yb5subBtjm+e/eru8dJocvZWJkySZq8KTDsJNPLJl8ZdVG1TV7O1POqyac5
1sZKiZ5MarPCqIvnSValpV5gSlbraTO8ng/rZnY9G2q3wrCWFYa6Lod37yfV
xFaFaYwdJ+0y0/zDbUU/jNXx3ePj4V36vxoO+ZnKrZrmRWEylZdKt021AAmp
Loq1mqzVq0VxXE9TlU9VWTVqhv1wBl0bPVbfmdLUukiuq/pqBiKWY/XcNPRJ
/Yh/6FDf0eMkuboeJ0oN1ZVZ4+vsKP5wHH+4h7XbZl7VGD/EF3lpx+rJSP0I
9uGj8OGJzieVf1TVM13mvzLb8FWrr02Ox2ah82Ksao0VaOHRNYY/mPPXo7Ra
9JfHtHj9qpyt8/Dw5h3mPHg0b/eu/9OIONUt/1NrbEW8Me+x/AyDXsmEvav/
1HZr57qat/Kkv/DpPC+1elZN8sJ0y79qX/GMv+WmmT6YVw093tjh5xEtG7b4
OS8npnSP+ntcmjI1ZdMtP61qYxts8aCRr3jpJCmrmqRsZcZJkpfT6FOSDCGY
emKbWqcNPj7V9cwMLQTSKNsuTY0VlqIx0Et1ckZSrBUtbmqr8E2R/2rUoi2a
fKmbuWoqlS+WhVlghCoqnamJhjamboXPqpq+ryuoIvYs7bKqG1WbItdgVd6s
R52qOkVTBycXhwN1nWcGStJaUZ0sr03aYLtlVVSz3FiQm86VtqS3s6qcFusB
KRs05ZoWWVbLttA1TcWRFlWGvfpnAeXZGkx32qizX2BPAhHYJk9pm4kmCqpS
8XFTSAp4jvtg/kxxDy0uYZT8ODdl/C3O7b5UVZq2tR2oZg4em9LS8mWVGZXq
ktW+Kmn/5RL/nlyAic48tI3Sha1oTsaTu+U/82uH68WWOFCFYTUvbtneqCZf
EBeJVk1kkEla6BKWhYYbmp0a2pVI7iYTsa1VelXlGZFrXunU1BPNrIlPWTJh
taFrBZuIR6Pkco6LgF1tWSjs0qQwvaDoRqNM1/4c965JKcnwqYVJ5xB/uyBS
IUKQWEzGaTI4BLOAwjUxT1QGq5yGq4m+gLTxaPo5Zhh+AumLCssIy2gfU9MA
PB9uisJIJee6hvUmwQIZNB0KQhaZBN+qIr8y6pExy5fGXH1iYQ4eK0ieKbDy
XDfg4jyf5I0XO9xuMWyqIf7DMtqWnhVgI0S0tFACyNwE0gPhhfpMwSwFppDy
mjrH6VJcFStCYBYk1mA1cKZu0xxs5HvWEyh4vliYLCeulc6RQHSXVWkNDWL1
zOnGItalpFl52kDCxXYs8iyDiUvg887Kpq6yVlj++nZOH9/im50KTbq5odLz
fDYfOn7DYtBdettz8OT89BBkr/K6KkmKHCvECmAN0FXiJywUWQThvzcHr1//
5Wz4aCRuXc8a59eDufD3+/btSF3ODTMhLFVNp9AFna102ehZZG3IVDrTJSTZ
Bd2fZyjYuzBkXgZQM/bR0I+FP7nOqmUjh8cBWUw3jK5d28YsiNtnpV/0fc4+
cCa5IFEuh+bvbb6CTcdlkkpaiB6khW4Z0AIHIaXN6IqdVpD4gwvrpRhDZ6vm
pNRWlqD7W9YGXKmxMYk+/l7rms2DE014+urarEw96IQ1EiVYOzaFzmrYjR2g
X7BDOssglJaUkawIMBILr7eZrKPgrAlK5IUsg5UhmBhZbH8pnpiVrnM+r1ca
TTJTXkVUgvEgAhuz5SWbDI6Q7mBbu4b+NDWxqMf563lOa8EaE9cWFZ0H8FLY
T/6nBc/w44YzHYi80pzTorJ+SXhEXP2+vcItb90w7a9XcAtELiDdpj9iJ8Rm
vjQ8ZUDspktxns9zaVpU18xmtv6x43EG3jkHeLK2Js6LwydxdCvAPoN0+C9v
iyH1OdREBKm1UCjw+VvIEPwKgYcB3XPPwospx+Ib94OYQDfYnC13avIVjXE2
PBpVRo4FXPOCIBJEZ3bCC4rSK9PQdRNmFHmE+JIO9Q+NVQTH0IZAMzVpfTOH
9M3mS/LUOCYEF4aBoBjJREF+EeaRT6EJoa3dtRBL+VyQs95Og44e3KajkWR5
Rf56UyXtTbesy3V/tOg0gwtBH+yShNEAC40IO9ijN813cBseSGQgp8lJR+OL
yWelLmyHQmLPShaWlUpka955INIrUxC006UHIDHvCQXRk74xgo0E4oA6kzsp
AUoBG/Y4fKGLWC0CY7IdkHNuiqWFb2vyGbnHznoVhTMqdHCPfnN4QL5ACIUp
5w7tTjAEdp4knMdpZ0/4jvoo2oqp36XiwPbLolqrx39vdTE8JWPyjHR+eE7n
Dh718emz88OgwWlVCzom+2MWE2wyz5dwYorGKY4erdOxlSA6Bwwjprlo1l0V
azeZW8cKssVez3VaV9b2tFoYRGaPnTUJLAcDdAGC/1LhmBjYCK2w3SJja+By
Wf8ZG2A5Nvp83R7tWnAQl16B7zDd8Vei1RM2AhbOlpyUJVdk+i7IygFNjO8o
GGrMjHAleX1GzuKfePwYymwIS7PKboL6suqkRRNys9abBb4OGg9aaC7ZoRS+
LW8ER2sGPJRL8NQYujIyLJp50ZlU9gk4CD7aZiAEbak9nBrhvpgcfI3VyKL3
iBJqwHNcRbhoMgccidAI84ooJTO7JMhLItoJgqc2I5AD2Ie/zEfavQDTCzrG
1qadTtP2kIPLSnH2h0DEr+IIIZ+y7zQC+WyDNmA79GY2M7VYf3+TMNFlWrQs
32I3EJLCEghU2he7DTx1qVk2tPN7hCnEhdzBXzYoF8+9cSPqNIwShTwZGVVy
axwnbxlWp0ciduTJvVGMge+gc7Qs2+BgKe7G0H0wYA/aAur9GqRgBFxp5Lbn
8qNi+C0urBeZI2rGmQUSdndFC9GJ2b0Zb88dLAwIJaQFxAuT+bEAsuAFbERW
sbzSmHy6ZnTnKAItwAcLf8jMkDmkjyQxvcASZ1pW5IV0hx16nj+KIPlWtvjP
69sUWMWdAhyWI9tNKYvE0ckPjY/FMtV1nQu4J+bgwpmWdT/ElJjTS6xIjjNd
HBfR9/uBTM8EbXzHIrEypRP1lVkHRwyl1LF4iZHeRQo56R7onVTvDLtYwsob
x4jNBdgQaxiZfyKiQ3s7EyjiIyZmXTmu27RasqlsYnmgCPX2bXVpalwTI2ly
NdupYHr6/F0Z4uTh+eVYPSTAwY73kviRJOROxzf75iRBaDdWTyi4PY+069TH
eEnyw09PT0DBD3ndYCFlfmooK0QMf8rw7ARBms/18pkuCO3XRgLhp9DrFipC
6mAos0sZiMyqW8++f3l5ayD/Vc9f8M8Xj//6/dnF40f088snJ0+fhh8SN+Ll
kxffP33U/dTNPH3x7Nnj549kMp6q3qPk1rOTn2+J2N16cX559uL5ydNb4jtj
LeW4qCL5zgn9ACKQaGibeM1jjXl4ev4//3V0X71+/W8X354eHx199fat+/Dl
0Rf38YF8nuzGTko+QhrWCQTWSKaPMyl6CQdbCEQAarwGmIB1hXR8+u/Emf8Y
q68n6fLo/jfuAR2499DzrPeQebb9ZGuyMHHHox3bBG72nm9wuk/vyc+9z57v
0cOv/wIjYdTw6Mu/fEMJ3+Q2K/Yzr3IiNvSos47LGkCOwbRzMVa0qzZQtsYs
gYGS5GikHrGiBtciOCmEZJ3vYaT3mRi8gz2RtjNmHp6TbNTtUr4xTTo6xI0d
wyhRLkCMM5wFjLyqaXOhL+wtFJF0DKiWIlHOwmjroSFBAY+2cMwO6Yd1B5Jn
1aXjjaUwlcMTl7/M+hFNktxz8VfPvrsggxeOFhJRlcxFbW7IWfChPGLe5V4m
a7+2i8Nh7YBxfnW5Js8lAt7txMJwkBJ68HZgRrPRwEVcgkFLc+2QNCzu9hS6
huQJFGiwFYpDyxucp6TLBdtcLM8sg9op+ksc4BRWz9E5t4rDkBYTSt9a2xE6
oYhTF+twOtDWEsgsZ6DYyQCYSRE4paAZmIk7EcznbAM5mdzKB5CyHbcG9nrE
6pyhJVY4lod1JXzuZ9tBC8sweYyD1racwqdZiLwa2POFjDscQZOS1699sgeG
DQiHEqy4CgeHiIyQpQxpIXXAdi84c9Yxtq4Ye10eiosNqX9O83F9UMbs8s4y
iE8hs12gKCFZEY2NEpAujINiOuuKzeLzDOBIM3METEpHiub45ULh8oitOf94
HGVG4ikHstbXw2/cOL7gKF/CWNZfgFsVZLVFRoImCIncDYPgLo0hsa6EHCT7
TiEo9U7JfmDfXnbKx0AEIl08RbFRPvPhybSuFn578FVohbaD3/MKGFWe3GNN
kUS7K2q73GqJ01C0PHW5gffgg6Q9dic9oLO//fZboqI/d4Ybf+7c/G1v8hu1
8efNzd/umCyieCwPoo3ib+/tm/w7d/5DZ96/yQ07/qMnbdEMBm4/is5Jk3Ys
++mNu0eTHjzg//C+n4aV3dM9k+50VIjY3tmga9ck/4zH3onP8Obdk/Z+2JxE
C69iDvW2XckXm5Pe8DGOuy/ePAiT6Zv7/Wu5E3bq3YWfFH+x53Z2ycH20Jjk
dwx1hN7bWnnPqlu64oj69NNxZHFkrqju0Q6S+3MfPJB+m51CvOvPHbZhr8fq
dnCBirt7/nyLnPNjBzfgfkLJ7xaVH+FdneE+fvu2M4Tdrv5k20/8n8QRbZeA
0kfdvbsnx1t89DvcEeHtdvDivHMHx5k3n0bq+Cb8s39CX9z6Irsx4cGDHRb0
zQfu8B4TNr6OJpDMuEB+L5fko3viBCXsgIUKo6dHKr4HenK8TdWH37SXMS8y
+2TsJV388Cm2JSGLRSwCcJIQy6LBHXxjCBXBZPh6j5AJSAdInEuamjPBnnEj
rCghgaTiOP/7UoTTV3B0l5qOKkwAI0TGsUTO07zMqOTkAqSJTq+gu1IdcstR
EGR3REG0ytGAkUrdCojLXUFZMtc7og2pMLukFpMR4DXvduxTNl2ajSyERKhR
pYoz2YbbFChHN/cBoA83KQg0nGBcVGXeVLVkjIHGpzr1MWJX8t1ZqGH+SHSB
++yCi7CMIGTfv7MwzbzK6OZweBorOanN7Icv/0hhmGkDO1a6aCXFbjLr8t7G
moYjbqDXIhuE4Nb4iIayWoLaJ3EunQty+zKGPcka7SdlCrGyLA5YXPL2H05T
v24ndPWIAZf2NeX0yOzojGQgiHm59hKQ92og/pYhlU5efO26li6XzKetqY6S
l23VWg6vfBjXqxH71IZeuKholEh4FaVcfTjLMSK1JXQdGLSEIHl+CIXGCJPO
JS/pynBdMO/2HlFmMpSAXdqIN2hIbqlfhlI1kO9mLW09lN/wpTxXO6FULtMa
8UaKMuBylFcKSTmqmHRpa+7sIOs1o+ZOk0nVqeKSzmYCndT3n1hXRUd7KWY3
12sxEbvS9dr158T1I76tHJt0rT+uwgA+T7jYRLU+etAdxOUnSZ5s3AwT53Q8
VRRkv8t29FbojsCuQRTRMTirjOVyl1QYJZfnVCpKRbl834jRoMgvpaAy6lcw
rAXcBybTQlorFvmuGxKqaqYsIXVbUIafImZyD5TtHmym4BaQ3YZluQ1VK+zF
hSuXhbO2xcWdTXdXWrmgt2ETKMUQlM45M5cW8Y0wcddDl3cTaoQUI07MVsUq
1CAxccEOmy6IKlssER1rxD3utVTEURuPJ+73MnydBQtyESvIR7NjoggYTEkT
lxTxKS1uN4Rd8jU7+iBpuo9ggVTiChe+R4cd+8m0MXHjjvBR7oh551o/aS0p
ndioNI5HvtYlLVfOdu2o9Y4TQoJD9XIj0dlrROIK37wrmHMFnJueQjHQOTNX
iwqla177hNumfLaG7omr0Vqq+GHRyndFdE0RUmglqNjvPDubdpwI7tVV5kUc
6JShT4KzgY458Tl6kuabXoTRWIWSUlQ51SH/zRD0xj7iyEcNvNh1aaq4MOoL
+J1suHlDMrHiQTob7q8jbj0bScni5uocROuhtphxKmV7iFYZia5PAXoBAQXq
irEvp2nDUSEmQ3UWHb02Uy8wzNG11Bl7MEacjy7pEVdJpYvH+pz1pt91mdln
PMoBLuk3AmfdQutOjCAkpFzOpe0siA66nsj3d9X9g1K6ey0lCKrJwmJOKmdg
Q+9JNfmFEpDuWJstKdETbgA6HLBChZwpyyU8QmaYU6aO0skO9lBBpq4WuXWK
RUfsxlvfReI0gBymFDfiFXawhxsFqH0nnA0zqae0bFw/NDlu6vWPCze4DNex
KwLC9Z6tQk3UjcF1F6wHBXCFq55M7+qwwD7L3OedCV7Twz4sjsrezr9wfUT3
7LrX6hCoObGnKny61aQfgMPGDcEYXLOaUK9q662Bpz9q73U5akywPnLsOvFY
BQ4enl8esvOOl+ABhJJMvycqCE4AbINeT6AzgPyWlWAisdeUZKdx2MzBuIJj
ArFWCECHvZY0RmDOuy+5d6DquTeKOnzHM9VVtKuK9OBaZlbACJ/YSBj6jS0b
TSYnF+T7fLGLG9t8a0fXEcIdVlH8QxhwEwdErSjbhgX3T95kfejqJgJR+0F7
1PE48BK3ygUtLHS2uw7pQhqMrK6osenMgVJKSQykXhTv0ZVa6EWBfNbWvliE
dSLVP9gsry117RrlD2Mc0cM5rB0TsMOYrtBsto1x9xWRnfqO9m4UC6ZLtJAH
DY020R0M4qYUj/gJ14BFgS3SmylXKSzaxUTyYC9ZSKTDSOo4EdusVIY06Ai9
SJVN86Lw4bBtXV+NU+zaVHUmGRIHT+WlgDhMZefp3mRwKziywcOG32aQ1nqO
9aZtQxyX9lXpLNls+HIc65oDNNDNMo/tYif7XGGLFcq/5QHB+KUtJXBhyQD9
1G7+8LvziOUj9YJumx6y47DSwlSHyvk1NTLWZllQlCjAZqN/MzdxGm2yFsQT
zLUf7XplAayon6qqlsNpDQHjjwGYBXdMoEUAl5ULC65HirX0Cs5GcCXI91wu
7lv2uC4LfXc7h7ozcX6849k9mn6Er+6p++pP6nP1hfpSffUhzxLKRv+h/yUu
3Upv4koi9gcoMC4Wzy9WNnPpWId0lEvPnkPVhzLlzcej4Xf/efMRVtjMIrs/
58GoHbxYih4d7hz4T3KKP3wXPnkO2EwvssEB9NLnIvyUL7+mFiygbXmHOxkr
aTKHbygyj1pcC+ayraWrs2KbDW/KsnPUGyaGvJfnJHsu7nUf/HF4T4LlEMpv
AWXe7vh3bxf79XhD5y/27Hfvw/bzuZHfdbb7v2+vDz5Y4qzD+Mb7XskgVbb0
3oJvFJ3qtmhcsISJdwlUkYnhtS7g4uoVp4rF1vDTv7qYzGe1N3o4XD5lNxjL
zIx63baguCdVKPWvj3Groc8wdSlKkBNsnTuziayfOzxj9i+H9B4o/i70UqKb
7i1ZyW10AEkCnyhN27hl3deyrosXO3w7Uo8JrdE+PrMoIzuAJW95ddAz7OlQ
pZbZltM/GHrEXNxQ1yiM2U5TRSsSM6YV9cizz6ZqUSrwVsI099oBHHwHik3v
BFvMdAwXPgSOLzdZw2/IULy71bJNgfJGtEtJSQTnLqYQ9B8VOyQtQC/NNzvD
bKEg4klEDb8zILwLKKI7kQiDWwA3yKCQQJ/nbHyuqnuRitgj47pGxS12OUnz
XANAua3OiqJl6OQ40Y0mGHTau8fAZp56Oxr7ENvfdYlQ7HK3vxKOfCTBR3f1
4RyulasT7Eio6dcKxLoXxRLxpfFLJgdT+jUdTUuAm2NEB5rnBlFOfTj+l0Ff
L5YpRcbk+Z9pe7VZ/r6wq6zn4emf87pqqrQqPpbH34snXlYt/SqBs/PV/c9W
n+8b9ds/joZH0fvMN1LxMWiIOe9Ofk6/2aLH+02qohEfC30FZCWysd/VmpwN
GjHGB4Lk4PD58/A5tyFq6xkcWAySrZ7vlUKhBJGYRK4YIslDTry58Y62p8mb
DiPqG/iTUE1JtlehDsehnt+T7deHO7asovfUBi4z4WAF9j+KM2a7iJOJbBDv
xmMpE6YnPHQk7dGfOGtcG32VUVvsDvfFSsu0cuKZTac68Bp6OFZn4bpCPY7M
uFfhgCCEzGbkFjlSB33t27tUGBbufN+ax+pgh0LtXbg/9p2r3wsUk068i1zW
m31L3e8TeuN6W8q4vWjiuS1hylkP7HRIyV8JVwjitID4nq6AfXl6Tnr2/aNz
rN2/Jezwv//9nwJaIp55vTBlRqVfgYMOuuZyE0FzBcrEvRggQ0wBtttxgTfv
6fKFf2zX6NLi3fgX/wjI7zFMzrlBbTf5pqmeXIdoNmHJUQdLjv5/YQmln/91
YMduLyp/+M23s0c3Dflors7tFYD+vWOOooIOe3fTct8GglN3O+t+0cFXM+LG
G7Vbgo5pOfrhiyR5SDb/ePjFeJ8LJPWQqKETJTIP7pe5hHfXg2ng1wBV05al
KeQt5NpwZphSthUTWXKzQaaklhbjXAK+Ua1spB6Ft4ijQGEj4qVfzAFZb3M7
jxPrViqkTCxXSLt1B/JaM+cBqFGgqqkfSpf8u7W6lCrXHftdipGL83fjC+5b
p4krBRuFmJhJh2Tzuxw1KXYojvB1HW69SisRX/ne2+mUX8lc6jVX6n1hOsq1
B1KSh2vX49W9wM0U7Q5WB9t9OEztTeCIapg7rjWq/2JCmlvjuzu7WvDWy85d
eUJRgf2lSVsuNJ+6X5ggrRVJcvnwkZTgz06en+z5djgccr8NV+rTq7K6Lkwm
L23b5PVYTLXJ/nxrqgtrpCv7Nq0lvxamqveOSpL/Azfy5xkVUgAA

-->

</rfc>
