<?xml version="1.0" encoding="utf-8"?>
<?xml-model href="rfc7991bis.rnc"?>  <!-- Required for schema validation and schema-aware editing -->


<!DOCTYPE rfc [
  <!ENTITY nbsp    "&#160;">
  <!ENTITY zwsp   "&#8203;">
  <!ENTITY nbhy   "&#8209;">
  <!ENTITY wj     "&#8288;">
]>

<rfc
  xmlns:xi="http://www.w3.org/2001/XInclude"
  category="info"
  docName="draft-hhz-fantel-sar-wan-00"
  ipr="trust200902"
  obsoletes=""
  updates=""
  submissionType="IETF"
  xml:lang="en"
  version="3">

  <front>
    <title abbrev="FANTEL Scenarios and Requirements in WAN">FANTEL Scenarios and Requirements in Wide Area Networks</title>

    <seriesInfo name="Internet-Draft" value="draft-hhz-fantel-sar-wan-00"/>

    <author fullname="Jiayuan Hu" initials="J" surname="Hu">
      <organization>China Telecom</organization>
      <address>
        <postal>
          <street>109, West Zhongshan Road, Tianhe District</street>
          <city>Guangzhou</city>
          <region>Guangdong</region>
          <code>510000</code>
          <country>CN</country>
        </postal>
        <email>hujy5@chinatelecom.cn</email>
      </address>
    </author>

    <author fullname="Zehua Hu" initials="Z" surname="Hu">
      <organization>China Telecom</organization>
      <address>
        <postal>
          <street>109, West Zhongshan Road, Tianhe District</street>
          <city>Guangzhou</city>
          <region>Guangdong</region>
          <code>510000</code>
          <country>CN</country>
        </postal>
        <email>huzh2@chinatelecom.cn</email>
      </address>
    </author>

    <author fullname="Yongqing Zhu" initials="Y" surname="Zhu">
      <organization>China Telecom</organization>
      <address>
        <postal>
          <street>109, West Zhongshan Road, Tianhe District</street>
          <city>Guangzhou</city>
          <region>Guangdong</region>
          <code>510000</code>
          <country>CN</country>
        </postal>
        <email>zhuyq8@chinatelecom.cn</email>
      </address>
    </author>

    <date year="2025"/>

    <area>Routing</area>
    <workgroup>FANTEL</workgroup>

    <keyword>FANTEL</keyword>
    <keyword>fast notification</keyword>
    <keyword>traffic engineering</keyword>
    <keyword>load balancing</keyword>
    <keyword>WAN</keyword>

    <abstract>
      <t>
        This document introduces the main scenarios related to AI services in Wide Area Networks (WANs),
        as well as the requirements these scenarios place on FANTEL (Fast Notification for Traffic
        Engineering and Load Balancing). Traditional network management mechanisms are often constrained
        by slow feedback and high overhead, which limits their ability to react quickly to sudden link
        failures, congestion, or load imbalance. These new AI services therefore need FANTEL to provide
        real-time, proactive notifications for traffic engineering and load balancing, meeting their
        ultra-high throughput and lossless data transmission requirements.
      </t>
    </abstract>

  </front>

  <middle>

    <section>
      <name>Introduction</name>
      <t>
        The rapid development of Large Language Models (LLMs) necessitates substantial computing power.
        Hyperscalers build their own AI Data Centers (AIDCs) to train foundational models. Most
        enterprises, however, have no demand for training foundational models; they want to fine-tune
        and run inference cost-effectively. A practical solution is to rent third-party AIDCs for LLM
        fine-tuning and inference, which requires the IP network to fulfill their needs. The IP network
        consists of the IP Backbone and IP Metropolitan Area Networks (IP MANs). An IP MAN interconnects
        customers and data centers (including AIDCs) within a metropolitan area, while the IP Backbone
        interconnects IP MANs and data centers (including AIDCs). Together, the IP Backbone and IP MANs
        form the IP Wide Area Network (IP WAN, or WAN for short).
      </t>
      <t>
        AI services in the WAN, including sample data transmission and coordinated model training and
        inference, require networks to manage traffic efficiently and adapt rapidly to network changes.
        <xref target="draft-geng-fantel-fantel-requirements"/> points out that existing network
        management mechanisms such as FRR, BFD, and ECN often rely on delayed feedback or reactive
        responses, resulting in degraded network performance, longer service disruptions, or inefficient
        resource utilization. FANTEL is therefore proposed to deliver real-time, reliable notifications
        of network events, effectively supporting Traffic Engineering (TE) functions such as load
        balancing, failure protection, and congestion control. The WAN needs to deploy FANTEL to ensure
        high-throughput, lossless data transmission, meeting the new demands of AI services.
      </t>
    </section>

    <section>
      <name>Conventions Used in This Document</name>
      <section>
        <name>Requirements Language</name>
        <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL",
          "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT
          RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be
          interpreted as described in BCP 14 <xref target="RFC2119"/>
          <xref target="RFC8174"/> when, and only when, they appear in
          all capitals, as shown here.</t>
      </section>

      <section>
        <name>Abbreviations</name>
        <dl spacing="compact">
          <dt>AIDC:</dt><dd>AI Data Center</dd>
          <dt>BFD:</dt><dd>Bidirectional Forwarding Detection</dd>
          <dt>ECMP:</dt><dd>Equal Cost Multi-Path</dd>
          <dt>ECN:</dt><dd>Explicit Congestion Notification</dd>
          <dt>FANTEL:</dt><dd>Fast Notification for Traffic Engineering and Load Balancing</dd>
          <dt>FRR:</dt><dd>Fast Reroute</dd>
          <dt>HPCC:</dt><dd>High Precision Congestion Control</dd>
          <dt>INT:</dt><dd>Inband Network Telemetry</dd>
          <dt>IOAM:</dt><dd>In-situ Operations, Administration, and Maintenance</dd>
          <dt>LLM:</dt><dd>Large Language Model</dd>
          <dt>MAN:</dt><dd>Metropolitan Area Network</dd>
          <dt>RDMA:</dt><dd>Remote Direct Memory Access</dd>
          <dt>RTT:</dt><dd>Round-Trip Time</dd>
          <dt>TE:</dt><dd>Traffic Engineering</dd>
          <dt>WAN:</dt><dd>Wide Area Network</dd>
        </dl>
      </section>
    </section>

    <section>
      <name>Use Cases</name>
      <t>
        For most customers, the cost of owning and maintaining AI facilities is prohibitively high. A
        convenient solution is to rent AI facilities located in third-party AIDCs to fulfill their LLM
        training or inference requirements. Under these circumstances, customers access AIDCs via the
        WAN to obtain various AI services, as shown in Figure 1: sample data transmission, coordinated
        model training across AIDCs, and coordinated model inference between the customer and AIDCs.
      </t>
      <figure>
          <name>AI service scenarios in WAN</name>
        <artwork align="center"><![CDATA[
                             S2: Coordinated
                             model training
               +--------+ <------------------> +--------+
               |  AIDC  |----------------------|  AIDC  |
               +--------+                      +--------+
                    ^ |                           | ^
        S1: Sample  | |    Wide Area Network      | |   S3: Coordinated model
            data    | |                           | |   inference
      transmission  | |                           | |   
                    v |                           | v
               +-------------+           +--------------+
               |   Customer  |           |   Customer   |
               +-------------+           +--------------+
        ]]></artwork>
      </figure>
      <section>
        <name>Scenario 1: Sample data transmission</name>
        <t>
          When customers train AI models in third-party AIDCs, they need to transmit massive sample
          data into the AIDCs. Because customers have differing data security requirements, two
          sub-scenarios are involved.
        </t>
        <section>
          <name>Sub-scenario 1.1: Transmitting sample data into storage system</name>
          <t>
            The training of LLMs requires rounds of fine-tuning for performance improvement, with each
            round consuming massive sample data (ranging from terabytes to petabytes). Customers
            therefore need to upload sample data into AIDCs as quickly as possible to start training.
            To provide high-efficiency, cost-effective sample data transmission, the WAN needs to meet
            the following requirements:
          </t>
          <t>
            1. Customers usually obtain fixed bandwidth through dedicated line services to meet their
            service needs. However, their sample data transmission demands are intermittent and need to
            complete as quickly as possible, so the WAN must be able to flexibly adjust bandwidth per
            task.
          </t>
          <t>
            2. To maximize the bandwidth available for data transmission, this scenario requires the WAN
            to support fast notification of network status changes between devices, enabling real-time
            service bandwidth adjustment and efficient hourly transmission of terabyte-scale sample data.
          </t>
          <t>
            3. During high-speed sample data transmission, network failures can cause a large number of
            packet losses, sharply reducing data transmission efficiency. This scenario requires the WAN
            to provide millisecond-level failure protection, enabling rapid failure detection and
            failover.
          </t>
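          <t>
            To put requirement 2 in concrete terms (the figures below are illustrative assumptions, not
            taken from any specification), sustaining a one-terabyte-per-hour transfer already requires
            more than 2 Gbit/s end to end, and any stall caused by congestion or failure raises the peak
            rate needed to finish on time:
          </t>
          <sourcecode type="python"><![CDATA[
def required_gbps(data_bytes: float, seconds: float) -> float:
    """Sustained rate in Gbit/s needed to move data_bytes within seconds."""
    return data_bytes * 8 / seconds / 1e9

# Moving 1 TB (1e12 bytes) within one hour needs ~2.22 Gbit/s sustained;
# halving the available time doubles the required rate.
rate = required_gbps(1e12, 3600)
]]></sourcecode>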
        </section>

        <section>
          <name>Sub-scenario 1.2: Directly transmitting sample data to AI servers</name>
          <t>
            Certain customers with stringent data security requirements prohibit storing sample data
            outside their own facilities. To address this, sample data must be uploaded directly to the
            AI training servers over RDMA protocols while those servers are performing training tasks.
            Current mainstream RDMA protocols rely on a Go-Back-N retransmission mechanism, making them
            highly sensitive to latency and packet loss (even a 0.1% packet loss rate can degrade
            computational efficiency by 50%).
          </t>
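          <t>
            The quoted sensitivity can be sketched with a first-order model (an illustrative
            approximation, not taken from any RDMA specification): under Go-Back-N, one lost packet
            forces retransmission of roughly the whole in-flight window, so the bytes on the wire grow
            quickly with the loss rate.
          </t>
          <sourcecode type="python"><![CDATA[
def gbn_goodput_fraction(loss_rate: float, window_pkts: int) -> float:
    """First-order Go-Back-N goodput estimate.

    Delivering one packet costs on average 1 + loss_rate * window_pkts
    transmissions, because each loss replays ~window_pkts packets.
    """
    return 1.0 / (1.0 + loss_rate * window_pkts)

# With a 1000-packet in-flight window (plausible for a high
# bandwidth-delay-product WAN path), a 0.1% loss rate halves goodput.
half = gbn_goodput_fraction(0.001, 1000)  # -> 0.5
]]></sourcecode>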
          <t>
            To achieve lossless transmission of sample data, in addition to the requirements of
            sub-scenario 1.1, the WAN needs to further support millisecond-level congestion control to
            meet the zero-packet-loss requirement of RDMA transmission. This scenario therefore requires
            a FANTEL mechanism that quickly notifies upstream devices to reduce their sending rates
            based on router buffer occupancy.
          </t>
        </section>
      </section>

      <section>
        <name>Scenario 2: Coordinated model training</name>
        <t>
          The scaling laws demonstrate that LLM performance scales with model size, sample dataset size,
          and the amount of computing power used for training. The computing power demand of LLMs grows
          rapidly (it is estimated that GPT-6 will require ZFLOPS-scale computing power, a roughly 2000x
          increase over GPT-4). Customers' model training requirements consist of foundational model
          training and fine-tuning. The computing power of a single AIDC is limited by physical
          infrastructure (e.g., space and power supply), making it inadequate for ultra-scale LLM
          training. A solution is therefore proposed to fulfill the computing power demand of
          ultra-scale LLM training through the efficient coordination of distributed computing resources
          across multiple AIDCs. In addition, there are often residual computing resources that are
          insufficient for a single customer's demand; these resources can be coordinated across AIDCs
          to serve more customers. For customers with high data security requirements and self-built
          DCs, collaborative training between the customer's own DC and AIDCs can achieve ultra-scale
          LLM training while ensuring that sample data does not leave the customer's DC (this requires
          the input and output layers of the LLM to be deployed within the customer's DC).
        </t>
        <t>
          In this scenario, the LLM training task is split across multiple AIDCs based on
          parallelization strategies such as pipeline parallelism and data parallelism. During model
          training, the parameters of the LLM need to be synchronized among AIDCs. The parameter-plane
          synchronization traffic is transmitted via RDMA and typically features elephant flows. The WAN
          should therefore provide efficient, lossless transmission of parameter-plane data and needs to
          meet the following requirements:
        </t>
        <t>
          1. Elephant flows are long-lived and carry large volumes of data, so they can easily cause
          network congestion, while parameter-plane synchronization requires low latency and zero packet
          loss. This scenario requires the WAN to provide millisecond-level congestion control, rapidly
          notifying upstream devices to slow their sending rates upon detecting impending congestion. In
          addition, because traditional load balancing often relies on static policies, the WAN needs
          fast-reacting load balancing that immediately adjusts decisions in response to network
          changes, ensuring optimal resource utilization and performance.
        </t>
        <t>
          2. Interruption of parameter-plane synchronization due to a network failure may force a
          rollback to the last checkpoint, wasting computing power and sharply reducing computational
          efficiency <xref target="draft-cheng-rtgwg-ai-network-reliability-problem"/>. This scenario
          requires the WAN to implement millisecond-level failure protection with rapid failure
          detection and failover.
        </t>
      </section>

      <section>
        <name>Scenario 3: Coordinated model inference</name>
        <t>
          Many customers have deployed AI servers in their own DCs to support LLM inference applications. However, the
          high deployment cost and operational complexity of on-premises deployment limit the scale of computing power.
          Due to the increasing inference concurrency, this on-premises deployment method cannot meet the computing
          power demand. To address this, the collaboration model inference between customer and AIDCs presents a
          more efficient, agile, and cost-effective approach to realize elastic computing power scaling.
        </t>
        <t>
          In this scenario, the LLM inference task is split between the customer and AIDCs based on
          parallelization strategies such as pipeline parallelism and expert parallelism. Taking LLM
          inference based on the Prefill-Decode disaggregation architecture as an example, the input and
          output layers of the prefill and decode stages are placed in the customer's DC, while the
          other layers are placed in the AIDC. This utilizes the computing resources of AIDCs to handle
          larger-scale inference concurrency while ensuring that data does not leave the customer's DC.
        </t>
        <t>
          During model inference, parameter synchronization between the customer's DC and the AIDCs is
          transmitted via RDMA. As in Scenario 2, this scenario requires the WAN to provide real-time
          elephant-flow load balancing, millisecond-level congestion control, and fast network failure
          protection.
        </t>
      </section>
    </section>

    <section>
      <name>Problem Statement</name>
      <t>
        In the AI scenarios described above, the primary challenge for the WAN is real-time traffic
        engineering and load balancing. Current traffic engineering mechanisms struggle to provide
        low-latency, low-overhead solutions that meet these requirements, exhibiting the following
        issues:
      </t>
      <t>
        1. Current load balancing techniques face great challenges in highly dynamic environments. A
        core issue is the lack of timely awareness of, and adaptive response to, network state changes.
        Traditional mechanisms often rely on periodic global state synchronization or static policies,
        which results in delayed decision-making. Current controller-based load balancing uses In-situ
        OAM (IOAM) to obtain network status information. IOAM provides visibility into traffic by
        embedding telemetry data directly in packets; however, IOAM data is extracted and reported to a
        controller by the device CPU, which adds latency and limits responsiveness
        <xref target="draft-geng-fantel-fantel-gap-analysis"/>. Moreover, controllers typically process
        telemetry in software, further delaying decisions. The resulting control loop is typically at
        second scale, which inevitably leads to network congestion and severe packet loss.
      </t>
      <t>
        2. Existing flow control mechanisms rely on delayed feedback or reactive responses, which can
        lead to suboptimal network performance in high-latency, long-RTT environments such as WANs. TCP
        congestion control is receiver-driven: it uses feedback signals from the receiver to adjust the
        sender's transmission rate, and these signals are subject to RTT delays, which is especially
        problematic in high-speed dynamic environments. ECN <xref target="RFC3168"/> marks packets to
        indicate congestion, but it relies on end-to-end signaling and lacks precise real-time feedback.
        INT provides path-level telemetry by inserting metadata at each hop, which is returned to the
        sender via the ACK; congestion control algorithms such as High Precision Congestion Control
        (HPCC) use INT for precise load awareness, but INT-based telemetry still incurs an RTT of delay
        before the sender receives feedback, limiting its response capability. These end-to-end
        signaling mechanisms introduce tens of milliseconds of latency in a large-scale WAN, failing to
        meet the requirements for lossless data transmission.
      </t>
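      <t>
        The feedback-delay gap can be quantified with a back-of-the-envelope calculation (the distances
        and the fiber propagation constant below are illustrative assumptions): end-to-end feedback
        costs at least one round trip over the whole path, whereas a hop-by-hop notification only has to
        cross the link to the upstream neighbor.
      </t>
      <sourcecode type="python"><![CDATA[
US_PER_KM = 5.0  # ~5 microseconds per km of fiber (c divided by refractive index ~1.5)

def e2e_feedback_ms(path_km: float) -> float:
    """Earliest end-to-end congestion feedback: one full round trip."""
    return 2 * path_km * US_PER_KM / 1000

def hop_notification_ms(hop_km: float) -> float:
    """A hop-by-hop notification only crosses the upstream link once."""
    return hop_km * US_PER_KM / 1000

# On a 2000 km WAN path the sender learns of congestion after >= 20 ms,
# while notifying a neighbor 100 km upstream takes ~0.5 ms.
e2e = e2e_feedback_ms(2000)     # -> 20.0
hop = hop_notification_ms(100)  # -> 0.5
]]></sourcecode>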
      <t>
        3. Existing failure protection mechanisms such as BFD <xref target="RFC5880"/> and FRR
        <xref target="RFC7490"/> are widely deployed, but both have limitations in speed and scope. BFD
        performs rapid failure detection by exchanging frequent control packets between peers, but a
        high probe frequency not only increases CPU and bandwidth usage but also strains the control
        plane in large-scale networks. Furthermore, a 50 ms detection cycle makes it difficult for BFD
        to meet the link failure detection requirements of some large-scale networks. Recovery by
        routing convergence depends on routing protocol convergence, which may take hundreds of
        milliseconds. FRR complements routing convergence, achieving millisecond-level failover through
        pre-computed backup paths; however, because it protects only against adjacent failures, FRR
        lacks flexibility and responsiveness in complex topologies, with recovery latency reaching tens
        of milliseconds. In general, traditional failure protection mechanisms rely on periodic failure
        detection and centralized rerouting, resulting in recovery times that are not fast enough.
      </t>
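      <t>
        For context, BFD's detection time follows directly from its negotiated parameters: a session
        declares failure only after the detection multiplier times the agreed transmit interval elapses
        without receiving a packet <xref target="RFC5880"/>. The interval values below are illustrative:
      </t>
      <sourcecode type="python"><![CDATA[
def bfd_detection_time_ms(tx_interval_ms: float, detect_mult: int) -> float:
    """Simplified BFD asynchronous-mode detection time: detect_mult
    consecutive intervals must pass with no BFD packet received."""
    return tx_interval_ms * detect_mult

# A common 50 ms interval with the usual multiplier of 3 means ~150 ms
# before a failure is declared -- far from millisecond-level protection.
detect = bfd_detection_time_ms(50, 3)  # -> 150
]]></sourcecode>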
    </section>

    <section>
      <name>Requirements</name>
        <t>
          To solve the above-mentioned problems, FANTEL is needed to provide real-time, rapid
          notification of network events to relevant network nodes, including:
        </t>
        <t>
          1. Fast network status notification. FANTEL uses traffic state detection to monitor traffic
          patterns, link utilization, and node load, triggering notifications on significant deviations
          <xref target="draft-geng-fantel-fantel-requirements"/>. Based on the notified link status,
          nodes can adjust paths and traffic rates in real time, achieving efficient traffic engineering
          and load balancing.
        </t>
        <t>
          2. Fast congestion notification. FANTEL provides a fast, low-latency notification mechanism
          that can detect congestion events and alert network devices in real time. When congestion
          occurs, nodes can adjust their data transmission rates and re-route traffic based on FANTEL,
          preventing packet loss.
        </t>
        <t>
          3. Fast failure protection. FANTEL uses fast failure detection and notification to monitor
          link and node status in real time. When a failure occurs, a node with protection mechanisms
          can immediately switch to backup paths, reroute traffic, or suppress affected routes, ensuring
          service reliability.
        </t>
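        <t>
          The three notification types above could share a common dispatch skeleton on a receiving
          node. The sketch below is purely illustrative: FANTEL does not define these message names,
          fields, or handler actions.
        </t>
        <sourcecode type="python"><![CDATA[
from dataclasses import dataclass, field
from typing import Callable, Dict

@dataclass
class Notification:
    """Hypothetical fast-notification event; field names are assumptions."""
    kind: str            # "status" | "congestion" | "failure"
    link: str            # affected link, e.g. "R1-R2"
    detail: dict = field(default_factory=dict)

class NotificationDispatcher:
    """Routes each incoming notification to the matching local reaction."""
    def __init__(self) -> None:
        self.handlers: Dict[str, Callable[[Notification], str]] = {
            "status": lambda n: f"rebalance traffic around {n.link}",
            "congestion": lambda n: f"slow senders upstream of {n.link}",
            "failure": lambda n: f"switch to backup path avoiding {n.link}",
        }

    def handle(self, n: Notification) -> str:
        return self.handlers[n.kind](n)
]]></sourcecode>
        <t>
          In practice such reactions would run in the forwarding plane rather than in controller
          software, since the point of FANTEL is to avoid slow control loops.
        </t>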
        <t>
          In summary, FANTEL provides a real-time notification mechanism for the WAN, enabling
          efficient bandwidth utilization, lossless transmission, and fast failover across these AI
          scenarios.
        </t>
    </section>

    <section anchor="IANA">
      <name>IANA Considerations</name>
      <t>TBC</t>
    </section>

    <section anchor="Security">
      <name>Security Considerations</name>
      <t>TBC</t>
    </section>

  </middle>

  <back>
    <references>
      <name>References</name>
      <references>
        <name>Normative References</name>

        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.2119.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8174.xml"/>

      </references>
      <references>
        <name>Informative References</name>

        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.3168.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.5880.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.7490.xml"/>

        <reference anchor="draft-geng-fantel-fantel-requirements">
          <front>
            <title>Requirements of Fast Notification for Traffic Engineering and Load Balancing</title>
            <author/>
          </front>
          <seriesInfo name="Internet-Draft" value="draft-geng-fantel-fantel-requirements"/>
        </reference>

        <reference anchor="draft-geng-fantel-fantel-gap-analysis">
          <front>
            <title>Gap Analysis of Fast Notification for Traffic Engineering and Load Balancing</title>
            <author/>
          </front>
          <seriesInfo name="Internet-Draft" value="draft-geng-fantel-fantel-gap-analysis"/>
        </reference>

        <reference anchor="draft-cheng-rtgwg-ai-network-reliability-problem">
          <front>
            <title>AI Network Reliability Problem Statement</title>
            <author/>
          </front>
          <seriesInfo name="Internet-Draft" value="draft-cheng-rtgwg-ai-network-reliability-problem"/>
        </reference>

      </references>
    </references>

    <section anchor="Contributors" numbered="false">
      <name>Contributors</name>
      <t>Thanks to all the contributors.</t>
    </section>

 </back>
</rfc>
