<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?rfc toc="yes"?>
<?rfc tocompact="yes"?>
<?rfc tocdepth="3"?>
<?rfc tocindent="yes"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc comments="yes"?>
<?rfc inline="yes"?>
<?rfc compact="yes"?>
<?rfc subcompact="no"?>
<rfc category="info" docName="draft-hwy-opsawg-ifl-framework-04"
     ipr="trust200902">
  <front>
    <title abbrev="Inband Flow Learning Framework">Inband Flow Learning
    Framework</title>

    <author fullname="Liuyan Han" initials="L." surname="Han">
      <organization>China Mobile</organization>

      <address>
        <postal>
          <street/>

          <city>Beijing</city>

          <code/>

          <country>China</country>
        </postal>

        <email>hanliuyan@chinamobile.com</email>
      </address>
    </author>

    <author fullname="Minxue Wang" initials="M." surname="Wang">
      <organization>China Mobile</organization>

      <address>
        <postal>
          <street/>

          <city>Beijing</city>

          <country>China</country>
        </postal>

        <email>wangminxue@chinamobile.com</email>
      </address>
    </author>

    <author fullname="Xuanxuan Wang" initials="X." surname="Wang">
      <organization>Huawei</organization>

      <address>
        <postal>
          <street/>

          <city>Nanjing</city>

          <country>China</country>
        </postal>

        <email>wxxuan@huawei.com</email>
      </address>
    </author>

    <author fullname="Tianran Zhou" initials="T." surname="Zhou">
      <organization>Huawei</organization>

      <address>
        <postal>
          <street/>

          <city>Beijing</city>

          <country>China</country>
        </postal>

        <email>zhoutianran@huawei.com</email>
      </address>
    </author>

    <date day="27" month="July" year="2023"/>

    <workgroup>OPSAWG Working Group</workgroup>

    <abstract>
      <t>On-path telemetry techniques can provide high-precision inband flow insight and real-time network performance monitoring by embedding instructions or metadata into user packets. They are beneficial but still have deployability and flexibility problems in large-scale deployment scenarios. This document proposes a reference framework called Inband Flow Learning (IFL), which outlines the architecture and functional modules for automatic deployment and adjustment of flow-oriented monitoring using on-path telemetry techniques, as a reference solution to these problems. This document also describes different deployment approaches and considerations for practical network deployment.</t>
    </abstract>

    <note title="Requirements Language">
      <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
      "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
      "OPTIONAL" in this document are to be interpreted as described in BCP 14
      <xref target="RFC2119"/> <xref target="RFC8174"/> when, and only when,
      they appear in all capitals, as shown here.</t>
    </note>
  </front>

  <middle>
    <section title="Introduction">
      <t>On-path telemetry techniques described in <xref target="I-D.song-opsawg-ifit-framework"/>, such as IOAM <xref target="RFC9197"/> and Alternate-Marking <xref target="RFC9341"/>, can provide high-precision inband flow insight and real-time network performance monitoring (e.g., jitter, latency, packet loss) by embedding instructions or metadata into user packets. They are beneficial for network operation because they monitor live traffic running in the network, based on inband flow information telemetry along the entire forwarding path.</t>

      <t>However, when flow-oriented monitoring using on-path telemetry techniques is deployed on live traffic, flow characteristics or paths may change, which makes the traditional static configuration mode no longer applicable. <xref target="I-D.hwyh-ippm-ps-inband-flow-learning"/> states the problems of flow identification when applying on-path telemetry techniques in real network scenarios, and describes the requirements for an inband flow learning mechanism that is intended to address the problems of deployability and flexibility. This document proposes a reference framework called Inband Flow Learning (IFL), which outlines the architecture and functional modules for automatic deployment and adjustment of flow-oriented monitoring using on-path telemetry techniques. This document also describes different deployment approaches and considerations for practical network deployment. Note that this document focuses on the generation of inband flow telemetry objects; inband flow performance measurement methods are out of the scope of this document.</t>
    </section>

    <section title="Terminology and Conventions">


      <section title="Terminology">
        <t>IFL: Inband Flow Learning</t>

        <t>IFITI: Inband Flow Information Telemetry Instance</t>
      </section>
    </section>

    <section title="Framework of Inband Flow Learning">
      <t>The domain of inband flow information telemetry consists of ingress nodes, transit nodes, and egress nodes. The ingress nodes are responsible for enabling monitoring functions and the egress nodes are responsible for terminating them. All the nodes in the domain may participate in inband flow learning by executing the corresponding functions in the framework of Inband Flow Learning (IFL). As shown in Figure 1, the IFL framework includes three components: Service Discovery, Inband Flow Information Telemetry Deployment, and Inband Flow Information Telemetry Adjustment. Across these components, inband flow learning takes the form of automatic service discovery, automatic flow telemetry deployment, and automatic flow telemetry adjustment.</t>

      <t><figure align="center">
          <preamble/>

          <artwork><![CDATA[
   +---------+-------------------+------------------+------------------+
   |Component|      Service      |   Inband Flow    |   Inband Flow    |
   |         |     Discovery     |   Information    |   Information    |
   |         |                   |    Telemetry     |    Telemetry     |
   |         |                   |    Deployment    |    Adjustment    |
   +---------+-------------------+------------------+------------------+
   |Functions|  Sampling policy  | Telemetry policy |                  |
   |         |-------------------+------------------+       Aging      |
   |         |Flow characteristic|Telemetry instance|                  |
   |         |    acquisition    |                  |                  |
   +---------+-------------------+------------------+------------------+
]]></artwork>

          <postamble>Figure 1 Framework of Inband Flow Learning</postamble>
        </figure></t>


      <section title="Service Discovery">
        <t>Before telemetry starts on service flows, the service should be discovered in order to determine which flows should be monitored. The goal of the service discovery function is to obtain the flow characteristics, which are represented in terms of IP source address, IP destination address, TCP/UDP port number, VRF, incoming/outgoing interface, etc.</t>

        <t>Automatic service discovery is implemented based on the sampling policy delivered by the control plane and flow characteristic acquisition on the forwarding plane, and is usually performed on the ingress node. A sampling policy is a set of rules that instructs the forwarding plane to identify service flow characteristics within a specific scope. Flow characteristic acquisition is the process in which the forwarding plane identifies, extracts, and reports service flow characteristics from the live traffic based on the sampling policy.</t>
        
        <t>For example, if the service traffic to be monitored uses a particular port number, a sampling policy can be configured to match live traffic with that port number and generate flow information at 5-tuple granularity, so that all flows of the service, each identified by its 5-tuple, are discovered automatically. When live traffic passes through the ingress node, the forwarding plane filters traffic based on the specified sampling policy, identifies all flows with the particular port number, and reports the flows with their 5-tuple information. The automatically discovered service flow information can be stored in a distributed manner on the ingress nodes, or reported to the network controller for centralized management.</t>
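
        <t>The port-based example above can be sketched as follows. This is a minimal illustration and not a definition of the sampling policy: the Python names FiveTuple, SamplingPolicy, and discover_flows, the sample addresses, and the choice to match on the destination port only are assumptions made for this sketch and are not defined by this document.</t>

        <figure>
          <artwork><![CDATA[
```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class FiveTuple:
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str


@dataclass(frozen=True)
class SamplingPolicy:
    """Hypothetical policy: match on destination port only."""
    dst_port: int

    def matches(self, pkt: FiveTuple) -> bool:
        return pkt.dst_port == self.dst_port


def discover_flows(packets: Iterable[FiveTuple],
                   policy: SamplingPolicy) -> List[FiveTuple]:
    """Return each matching 5-tuple once, in first-seen order."""
    seen: Set[FiveTuple] = set()
    flows: List[FiveTuple] = []
    for pkt in packets:
        if policy.matches(pkt) and pkt not in seen:
            seen.add(pkt)
            flows.append(pkt)
    return flows


# Illustrative traffic: one duplicate packet, one other service.
packets = [
    FiveTuple("10.0.0.1", "10.0.1.1", 40000, 443, "tcp"),
    FiveTuple("10.0.0.2", "10.0.1.1", 40001, 443, "tcp"),
    FiveTuple("10.0.0.1", "10.0.1.1", 40000, 443, "tcp"),
    FiveTuple("10.0.0.3", "10.0.1.2", 40002, 53, "udp"),
]
flows = discover_flows(packets, SamplingPolicy(dst_port=443))
```
]]></artwork>
        </figure>

        <t>In this sketch the repeated packet and the packet with a different port are filtered out, so two flows are reported at 5-tuple granularity. The resulting list could be kept on the ingress node or reported to the controller, matching the two storage options described above.</t>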
      </section>

      <section title="Inband Flow Information Telemetry Deployment">
        <t>After acquiring the flow characteristics by service discovery, telemetry based on the inband flow information can be deployed automatically. Automatic flow telemetry deployment is implemented by creating telemetry instances based on telemetry policy, and executed on different types of network nodes in the domain according to the telemetry mode.</t>

        <section title="Telemetry Mode">
          <t>There are two modes for deploying inband flow information telemetry: End-to-End (E2E) and Hop-by-Hop (HbH). For the majority of services, E2E telemetry of service flows can meet the requirements of network operators by providing end-to-end performance insight into the service. In E2E mode, shown in Figure 2, the ingress node discovers the characteristics of service flows and performs on-path telemetry on the flows to be monitored. The egress node needs to deploy telemetry for the same monitored flows and completes the telemetry. If the telemetry data is not carried in the data packet but is reported by each node, a flow identifier is required so that the data consumer can associate the data. Documents such as <xref target="RFC9326"/>, <xref target="RFC9343"/>, and <xref target="I-D.ietf-mpls-inband-pm-encapsulation"/> provide encapsulation formats for the flow identifier.</t>

          <t><figure align="center">
              <preamble/>

              <artwork align="center"><![CDATA[                    +-------------+
                    |Data Consumer| compute E2E flow info
                    +-------------+
                       |        |
         ___flow info__|        |____flow info____
        |   telemetry                telemetry    |
        |                                         |
 +---------+   +---------+    +---------+   +---------+
 | Ingress |---| Transit | ...| Transit |---| Egress  | 
 |   Node  |   |   Node  |    |   Node  |   |   Node  |  
 +---------+   +---------+    +---------+   +---------+
]]></artwork>

              <postamble>Figure 2 End-to-End Telemetry Mode</postamble>
            </figure></t>

          <t>HbH mode differs from E2E mode in that transit nodes also
          participate in inband flow information learning and telemetry. In
          HbH mode, shown in Figure 3, telemetry covers the flow information
          on every node of the forwarding path that the packets of the flow
          traverse, which provides detailed flow information for each hop.
          Hop-by-Hop telemetry is usually used for on-demand fault
          diagnosis.</t>

          <t><figure align="center">
              <preamble/>

              <artwork align="center"><![CDATA[                  +-------------+
                  |Data Consumer| compute HbH flow info
                  +-------------+
                    |   |  |   |  flow info telemetry
      ______________|   |  |   |_________________
     |               ___|  |___                  |
     |              |          |                 |
 +---------+   +---------+    +---------+   +---------+
 | Ingress |---| Transit | ...| Transit |---| Egress  | 
 |   Node  |   |   Node  |    |   Node  |   |   Node  |  
 +---------+   +---------+    +---------+   +---------+
]]></artwork>

              <postamble>Figure 3 Hop-by-Hop Telemetry Mode</postamble>
            </figure></t>
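
          <t>When nodes report telemetry data directly to the data consumer, both modes rely on the flow identifier to associate the per-node data. A rough sketch of that association is given below. The Report record, its fields, the node names, and the correlate function are hypothetical and serve only to illustrate the idea; what the data consumer computes from the grouped data is outside the scope of this document.</t>

          <figure>
            <artwork><![CDATA[
```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Report(NamedTuple):
    flow_id: int   # flow identifier carried in the packets
    node: str      # name of the reporting node (hypothetical)
    role: str      # "ingress", "transit" or "egress"
    data: str      # opaque telemetry data from that node


def correlate(reports: List[Report],
              mode: str = "HbH") -> Dict[int, List[Report]]:
    """Group per-node reports by flow identifier.

    In "E2E" mode only ingress and egress reports are kept;
    in "HbH" mode reports from every node are kept.
    """
    grouped: Dict[int, List[Report]] = defaultdict(list)
    for r in reports:
        if mode == "E2E" and r.role == "transit":
            continue
        grouped[r.flow_id].append(r)
    return dict(grouped)


reports = [
    Report(7, "PE1", "ingress", "t1"),
    Report(7, "P1", "transit", "t2"),
    Report(9, "PE1", "ingress", "t3"),
    Report(7, "PE2", "egress", "t4"),
]
e2e = correlate(reports, mode="E2E")
hbh = correlate(reports, mode="HbH")
```
]]></artwork>
          </figure>

          <t>For flow 7 in this sketch, the E2E view contains the reports from PE1 and PE2 only, while the HbH view also contains the report from the transit node P1.</t>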
          
        </section>

        <section title="Telemetry Policy">
          <t>The telemetry policy is used to determine which flows should be
          monitored. Configuring a telemetry policy can raise the priority of
          learning and telemetry for critical flows and reduce or filter out
          the learning and telemetry of unimportant flows. This is crucial to
          network deployment for two reasons: the number of flows can be
          huge, and the processing capability of both the controller and the
          network nodes is limited. There might be millions of flows in a
          large-scale network, for example a 5G mobile backhaul network. It
          is therefore important to choose the granularity of inband flow
          information telemetry wisely.</t>

          <t>For IP traffic, the telemetry policy can be based on a single
          flow characteristic or a combination of flow characteristics, such
          as IP source/destination address, TCP/UDP port number, VRF, or
          network device interface. An IP address with a flexible wildcard
          mask can also be used to apply a telemetry policy to an
          aggregation of flows.</t>
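
          <t>As an illustration of matching an aggregation of flows with a wildcard mask, consider the sketch below. It assumes the common access-list convention in which wildcard bits set to 1 are ignored during comparison; this document does not define the mask semantics. The TelemetryPolicy class, its fields, the integer priority, and the select_policy helper are likewise hypothetical.</t>

          <figure>
            <artwork><![CDATA[
```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


def _ip(text: str) -> int:
    return int(ipaddress.IPv4Address(text))


@dataclass(frozen=True)
class TelemetryPolicy:
    """Hypothetical rule; wildcard bits set to 1 are ignored."""
    dst_addr: str
    dst_wildcard: str
    dst_port: Optional[int] = None  # None matches any port
    priority: int = 0               # higher is more important

    def matches(self, dst_ip: str, dst_port: int) -> bool:
        care = ~_ip(self.dst_wildcard) & 0xFFFFFFFF
        if (_ip(dst_ip) & care) != (_ip(self.dst_addr) & care):
            return False
        return self.dst_port is None or self.dst_port == dst_port


def select_policy(policies, dst_ip, dst_port):
    """Return the highest-priority matching policy, or None."""
    hits = [p for p in policies if p.matches(dst_ip, dst_port)]
    return max(hits, key=lambda p: p.priority, default=None)


policies = [
    # Aggregate rule: any destination in 10.1.x.x, any port.
    TelemetryPolicy("10.1.0.0", "0.0.255.255", priority=1),
    # Critical flow: one host and one port, higher priority.
    TelemetryPolicy("10.1.0.5", "0.0.0.0", 443, priority=10),
]
critical = select_policy(policies, "10.1.0.5", 443)
aggregate = select_policy(policies, "10.1.9.9", 80)
unmatched = select_policy(policies, "10.2.0.1", 80)
```
]]></artwork>
          </figure>

          <t>Here the critical flow is selected by the more specific, higher-priority rule, other traffic to 10.1.x.x falls back to the aggregate rule, and traffic that matches no rule is not monitored.</t>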
        </section>

        <section title="Telemetry Instance">
          <t>An Inband Flow Information Telemetry Instance (IFITI), called a telemetry instance for short, is the management object of a monitored flow for the deployment of flow-oriented on-path telemetry techniques under the IFL framework. During its life cycle, an IFITI is responsible for providing performance telemetry data on the nodes traversed by the flow it monitors.</t>

          <t>On ingress nodes, IFITIs can be generated automatically, in either a distributed or a centralized way, by applying telemetry policies to automatically discovered service flows. The transit nodes and egress nodes can also generate IFITIs automatically, without flow characteristics being configured, by learning special information about the monitored flows that is embedded by the ingress nodes. The flow identifier is one such piece of special information. It may be a value that is unique within a domain and is encapsulated in the service packets to set up the relationship among the characteristic information, the telemetry instance, and the service flow. It not only correlates the telemetry data of flows on each node, as mentioned in the previous section, but also serves as the key marker by which the forwarding plane identifies the monitored flow. For the forwarding plane, it is much easier to identify a single field in a service packet than to identify various types of flow characteristics.</t>

          <t>The following uses the flow identifier as an example to describe the flow learning process on transit and egress nodes. Once the telemetry instance is created, the ingress node can start the telemetry of flow information using an on-path telemetry technique. At the same time, the ingress node encodes inband monitoring information, including the flow identifier, in the service packets. When a service flow packet passes through a transit node or egress node and the node detects that the packet contains a flow identifier, the node treats the packet as belonging to a monitored service flow and automatically creates a telemetry instance using the identifier as the key.</t>

          <t>The automatic creation of telemetry instances on network nodes can greatly facilitate dynamic and incremental deployment. On all types of nodes, network operators do not need to statically configure the characteristics of monitored flows, which saves considerable effort and reduces the likelihood of errors in large-scale deployment scenarios. When the path of a monitored flow changes, the flow can be detected automatically on the nodes of the new path and the corresponding telemetry instances can be deployed automatically.</t>
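
          <t>The learning step on transit and egress nodes described in this section can be sketched as follows. This is only an illustration under simplifying assumptions: a packet is reduced to its optional flow identifier, and the TelemetryInstance and Node classes, along with the per-instance packet counter, are hypothetical names introduced for this sketch.</t>

          <figure>
            <artwork><![CDATA[
```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class TelemetryInstance:
    flow_id: int
    packets: int = 0


class Node:
    """Transit or egress node that learns flows from packets."""

    def __init__(self) -> None:
        self.instances: Dict[int, TelemetryInstance] = {}

    def on_packet(self, flow_id: Optional[int]) -> None:
        # Packets without a flow identifier are not monitored.
        if flow_id is None:
            return
        inst = self.instances.get(flow_id)
        if inst is None:
            # First packet of a monitored flow: create the
            # telemetry instance, keyed by the flow identifier.
            inst = TelemetryInstance(flow_id)
            self.instances[flow_id] = inst
        inst.packets += 1


node = Node()
for fid in [7, None, 7, 9]:
    node.on_packet(fid)
```
]]></artwork>
          </figure>

          <t>After the four packets in this sketch, the node holds instances for flow identifiers 7 and 9 without any flow characteristics having been configured on it. If the flow moves to another path, the same logic on the nodes of the new path would create the instances there.</t>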
          
        </section>
      </section>
    </section>

    <section title="Inband Flow Information Telemetry Adjustment">
      <t>When routes reconverge in the network, a service flow may switch to
      other forwarding nodes. When the traffic changes, the telemetry
      instances change as well. For telemetry instances running on the
      failed path, aging of IFITIs should be supported in order to reclaim
      network resources. An IFITI should be deleted once it becomes stale.
      To keep monitoring the same flow, new telemetry instances need to be
      added on the new transit or egress nodes. Note that aging and
      adjustment of an IFITI can be initiated by the controller or by a
      network node. When the timer associated with the flow information
      telemetry expires, the IFITI is deleted to stop the telemetry of the
      flow.</t>
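
      <t>A timer-based aging mechanism of this kind might be sketched as below. The AgingTable class, the idle-timeout rule, and the explicit "now" argument (used here instead of a real clock so that the example is deterministic) are assumptions of this sketch; this document does not specify timer values or the aging algorithm.</t>

      <figure>
        <artwork><![CDATA[
```python
from typing import Dict, List


class AgingTable:
    """Tracks the last-seen time of each flow identifier."""

    def __init__(self, timeout: float) -> None:
        self.timeout = timeout
        self.last_seen: Dict[int, float] = {}

    def touch(self, flow_id: int, now: float) -> None:
        # Called whenever a packet of the monitored flow is seen.
        self.last_seen[flow_id] = now

    def expire(self, now: float) -> List[int]:
        """Delete and return flow ids idle for the timeout."""
        stale = [fid for fid, seen in self.last_seen.items()
                 if now - seen >= self.timeout]
        for fid in stale:
            del self.last_seen[fid]
        return stale


table = AgingTable(timeout=30.0)
table.touch(7, now=0.0)    # flow 7 last seen at t=0
table.touch(9, now=20.0)   # flow 9 last seen at t=20
expired = table.expire(now=35.0)
```
]]></artwork>
      </figure>

      <t>At time 35 in this sketch, flow 7 has been idle for longer than the timeout and its entry is removed, while flow 9 is retained. The same check could be run on a network node or driven by the controller.</t>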

    </section>

    <section title="IANA Considerations">
      <t>This document has no IANA actions.</t>
    </section>

    <section title="Security Considerations">
      <t>TBD</t>
    </section>
  </middle>

  <back>
    <references title="Normative References">
      <?rfc include="reference.RFC.2119"
?>

      <?rfc include="reference.RFC.8174"
?>

      <?rfc ?>
    </references>

    <references title="Informative References">
      <?rfc include='reference.RFC.9197'?>
      
      <?rfc include='reference.RFC.9341'?>
      
      <?rfc include='reference.RFC.9326'?>
      
      <?rfc include='reference.RFC.9343'?>

      <?rfc include='reference.I-D.hwyh-ippm-ps-inband-flow-learning'?>

      <?rfc include='reference.I-D.ietf-mpls-inband-pm-encapsulation'?>
      
      <?rfc include='reference.I-D.song-opsawg-ifit-framework'?>

      <?rfc ?>
    </references>
  </back>
</rfc>
