<?xml version="1.0" encoding="US-ASCII"?>
<!-- This template is for creating an Internet Draft using xml2rfc,
     which is available here: http://xml.resource.org. -->
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
<!-- used by XSLT processors -->
<!-- For a complete list and description of processing instructions (PIs), 
     please see http://xml.resource.org/authoring/README.html. -->
<?rfc strict="yes" ?>
<!-- give errors regarding ID-nits and DTD validation --> <!-- control the table of contents (ToC) -->
<?rfc toc="yes"?>
<!-- generate a ToC -->
<?rfc tocdepth="3"?>
<!-- the number of levels of subsections in ToC. default: 3 -->
<!-- control references -->
<?rfc symrefs="yes"?>
<!-- use symbolic references tags, i.e, [RFC2119] instead of [1] -->
<?rfc sortrefs="yes" ?>
<!-- sort the reference entries alphabetically -->
<!-- control vertical white space 
     (using these PIs as follows is recommended by the RFC Editor) -->
<?rfc compact="yes" ?>
<!-- do not start each main section on a new page -->
<?rfc subcompact="no" ?>
<!-- keep one blank line between list items -->
<!-- end of list of popular I-D processing instructions -->
<rfc category="info" docName="draft-song-ippm-postcard-based-telemetry-14" ipr="trust200902">
  <front>
    <title abbrev="Marking-based Direct Export">Marking-based Direct Export for On-path Telemetry</title>

    <author fullname="Haoyu Song" initials="H." surname="Song">
      <organization>Futurewei Technologies</organization>
      <address>
        <postal>
          <street>2330 Central Expressway</street>
          <city>Santa Clara, 95050</city>
          <country>USA</country>
        </postal>
        <email>hsong@futurewei.com</email>
      </address>
    </author>

    
	<author fullname="Greg Mirsky" initials="G." surname="Mirsky">
      <organization>Ericsson</organization>
      <address>
             <postal>
          <street></street>
          <city></city>
          <country></country>
        </postal>
        <email>gregimirsky@gmail.com</email>
      </address>
    </author>

	<author fullname="Clarence Filsfils" initials="C." surname="Filsfils">
      <organization>Cisco Systems, Inc.</organization>
      <address>
             <postal>
          <street></street>
          <city></city>
          <country>Belgium</country>
        </postal>
        <email>cfilsfil@cisco.com</email>
      </address>
    </author>

	<author fullname="Ahmed Abdelsalam" initials="A." surname="Abdelsalam">
      <organization>Cisco Systems, Inc.</organization>
      <address>
             <postal>
          <street></street>
          <city></city>
          <country>Italy</country>
        </postal>
        <email>ahabdels@cisco.com</email>
      </address>
    </author>

	<author fullname="Tianran Zhou" initials="T." surname="Zhou">
      <organization>Huawei</organization>
      <address>
        <postal>
          <street>156 Beiqing Road</street>
          <city>Beijing, 100095</city>
          <country>P.R. China</country>
        </postal>
        <email>zhoutianran@huawei.com</email>
      </address>
    </author>

    <author fullname="Zhenbin Li" initials="Z." surname="Li">
      <organization>Huawei</organization>
      <address>
        <postal>
          <street>156 Beiqing Road</street>
          <city>Beijing, 100095</city>
          <country>P.R. China</country>
        </postal>
        <email>lizhenbin@huawei.com</email>
      </address>
    </author>
	
	<author initials="T." surname="Graf" fullname="Thomas Graf">
      <organization>Swisscom</organization>
      <address>
        <postal>
          <street></street>
          <country>Switzerland</country>
        </postal>
        <email>thomas.graf@swisscom.com</email>
      </address>
    </author>
	
	<author fullname="Gyan Mishra" initials="G." surname="Mishra">
      <organization>Verizon Inc.</organization>
      <address>
        <postal>
          <street></street>
          <city></city>
          <country></country>
        </postal>
        <email>hayabusagsm@gmail.com</email>
      </address>
    </author>
	
    <author fullname="Jongyoon Shin" initials="J." surname="Shin">
      <organization>SK Telecom</organization>

      <address>
        <postal>
          <street></street>

          <city></city>

          <country>South Korea</country>
        </postal>

        <email>jongyoon.shin@sk.com</email>
      </address>
    </author>


    <author fullname="Kyungtae Lee" initials="K." surname="Lee">
      <organization>LG U+</organization>

      <address>
        <postal>
          <street></street>

          <city></city>

          <country>South Korea</country>
        </postal>

        <email>coolee@lguplus.co.kr</email>
      </address>
    </author>

	<date day="7" month="September" year="2022"/>

    <area>Operation and Management Area</area>
    <workgroup>IPPM</workgroup>

    <abstract>
      <t>
	 The document describes a postcard-based telemetry method using packet-marking, referred to as PBT-M. 
     PBT-M does not carry the telemetry data in user packets but sends the telemetry data through a dedicated packet. 
	 PBT-M does not require an extra instruction header but claims a bit as a trigger in existing header fields. 
	 PBT-M raises some unique issues that need to be considered.	 
     This document describes the high level scheme and covers the common requirements and issues when applying PBT-M in different networks. 
	 PBT-M is complementary to the other on-path telemetry schemes such as IOAM.      
      </t>
    </abstract>
  </front>

  <middle>

    <section title="Motivation">
      <t>To gain detailed data plane visibility to support effective network OAM, 
	 it is essential to be able to examine the trace of user packets along their forwarding paths. Such on-path flow data reflect  
	 the state and status of each user packet's real-time experience and provide valuable information 
	 for network monitoring, measurement, and diagnosis. 
      </t>
        
      <t>The telemetry data include but not limited to the detailed forwarding path, the timestamp/latency at each network node, and, in case of
       packet drop, the drop location, and the reason. 
	 The emerging programmable data plane devices allow user-defined data collection
	 or conditional data collection based on trigger events.
	 Such on-path flow data are from and about the live user traffic, which 
	 complements the data acquired through other passive and active OAM mechanisms such as 
	 <xref target="RFC7011">IPFIX</xref> and <xref target="RFC2925">ICMP</xref>. 
      </t>
            
      <t> On-path telemetry was developed to cater to the need of collecting on-path flow data. 
        There are two basic modes for on-path telemetry: the passport mode (e.g., <xref target="RFC9197"> IOAM trace option</xref>) and the postcard mode (e.g., <xref target="I-D.ietf-ippm-ioam-direct-export"> IOAM direct export option (DEX) </xref>).
      </t>
		
	  <t> This document describes a new variation of the postcard mode on-path telemetry, PBT-M. 
         PBT-M does not require a telemetry instruction header but a trigger bit in some existing header fields. 
		 However, PBT-M has some unique issues that need to be considered.	 
         This document discusses the challenges and their solutions which are common to the high-level scheme of PBT-M. </t>  
            
      
    </section>

    <section title="PBT-M: Marking-based Direct Export for On-path Telemetry">

      <t>As the name suggests, PBT-M only needs a marking-bit in the existing headers of user packets to trigger the telemetry data collection and export. 
          The sketch of PBT-M is as follows. 
		  If on-path data need to be collected, the user packet is marked at the path head node.   
          At each PBT-M-aware node, if the mark is detected, a postcard packet (i.e., the dedicated OAM packet triggered by a marked user packet) is generated and sent to a collector. 
	  The postcard contains the data requested by the management plane. 
	  The requested data are configured by the management plane.   
	  Once the collector receives all the postcards for a single user packet, it can infer the packet's forwarding path and analyze the data set. 
	  The path end node is configured to un-mark the packets to its original format if necessary. 
	</t><t>
          The overall architecture of PBT-M is depicted in Figure 1. 

	</t><t>
        
	<figure title="Architecture of PBT-M" anchor="figure_1">
          <artwork>

                      +------------+        +-----------+ 
                      | Network    |        | Telemetry | 
                      | Management |(-------| Data      | 
                      |            |        | Collector | 
                      +-----:------+        +-----------+ 
                            :                     ^ 
                            :configurations       |postcards 
                            :                     |(OAM pkts) 
             ...............:.....................|........
             :             :               :      |       :
             :   +---------:---+-----------:---+--+-------:---+
             :   |         :   |           :   |          :   |
             V   |         V   |           V   |          V   |
          +------+-+     +-----+--+     +------+-+     +------+-+
usr pkts  | Head   |     | Path   |     | Path   |     | End    |
     ====>| Node   |====>| Node   |====>| Node   |====>| Node   |===>
          |        |     | A      |     | B      |     |        |
          +--------+     +--------+     +--------+     +--------+
        mark usr pkts  gen postcards  gen postcards  gen postcards
        gen postcards                                unmark usr pkts  

          </artwork>
         </figure>

	</t>
          
      <t>    
        The advantages of PBT-M are summarized as follows.  
      </t>	      

        <t>
          <list style="symbols">
            <t>
              1: PBT-M avoids augmenting user packets with new headers and 
	             the signaling for telemetry data collection remains in the data plane. 
	    </t><t>
              2: PBT-M is extensible for collecting arbitrary new data to support possible future use cases. 
	      The data set to be collected can be configured through the management plane or control plane. 
	    </t><t>
              3: PBT-M can avoid interfering with the normal forwarding. 
	      The collected data are free to be transported independently through in-band or out-of-band channels. 
	      The data collecting, processing, assembly, encapsulation, and transport are, therefore, decoupled 
	      from the forwarding of the corresponding user packets and can be performed in data-plane slow-path if necessary. 
	    </t><t>
              4: For PBT-M, the types of data collected from each node can vary depending on application requirements and node capability.
	    </t><t>
              5: PBT-M makes it easy to secure the collected data without exposing it to unnecessary entities. 
	      For example, both the configuration and the telemetry data can be encrypted and/or authenticated before being transported,
	      so passive eavesdropping and a man-in-the-middle attack can both be deterred.
	    </t><t>
	          6: Even if a user packet under inspection is dropped at some node in the network, 
	      the postcards collected from the preceding nodes are still valid and can be used to 
	      diagnose the packet drop location and reason.
            </t>
			<t>7: Raw data can be processed or aggregated in data plane to reduce the exporting traffic load.
			</t>
         </list>
        </t>
        
      </section>

      <section anchor="challenge" title="New Challenges">
	   <t>
          Although PBT-M has some unique features, it also introduces a few new challenges. 
        </t><t>
	  <list style="symbols">
            <t>
              Challenge 1 (Packet Marking Bit): A user packet needs to be marked to trigger the path-associated data collection. 
			  Since PBT-M aims to avoid the need to augment user packets with new headers, 
	          it needs to reserve or reuse a single bit from the existing header fields.
			  This raises a similar issue as in <xref target="RFC8321">the Alternate Marking Scheme</xref>
            </t><t> 	      
			  Challenge 2 (Configuration): Since the packet header will not carry telemetry instructions anymore, 
	          the data plane devices need to be configured to know what data to collect. 
	          However, in general, the forwarding path of a flow packet (due to ECMP or dynamic routing) is unknown beforehand (note that there are some
              notable exceptions, such as segment routing). If the per-flow customized data collection is required, 
	          configuring the data set for each flow at all data plane devices might be expensive in terms of configuration load and data plane resources.
	        </t><t>  
              Challenge 3 (Data Correlation): Due to the variable transport latency, the dedicated postcard packets for a single packet may arrive at the collector 
	          out of order or be dropped in networks for some reason. In order to infer the packet forwarding path, 
	          the collector needs some information from the postcard packets to identify the user packet affiliation and the order of path node traversal.  
            </t><t>
              Challenge 4 (Load Overhead): Since each postcard packet has its header, the overall network bandwidth overhead of PBT-M can be high. 
			  A large number of postcards could add processing pressure on data collecting servers. That can be used as an attack vector for DoS. 
            </t>            
          </list>
        </t>  
      </section>

    <section title="Design Considerations">
        <t>      
          To address the above challenges, we propose several design details of PBT-M.
        </t>	      
	  <section title ="Packet Marking">
          <t>
            To trigger the path-associated data collection, usually, a single bit from some header field is sufficient. 
	    While no such bit is available, other packet-marking techniques are needed. 
	    We discuss several possible application scenarios.
	    </t><t>
	    <list style="symbols">
              <t>
		IPv4. <xref target="RFC8321">Alternate Marking (AM)</xref> is an IP flow performance measurement 
		framework that also requires a single bit for packet coloring. 
		The difference is that AM conducts in-network measurements such as latency and packet loss rate based on the bit alternating patterns while PBT-M only collects and exports data at each network nodes when the trigger bit is set. AM suggests to use some reserved bit of the Flag field or 
		some unused bit of the TOS field. PBT-M can share the same bit with AM, and rely on   
		the management plane to configure the actual operation mode.
              </t><t>	
	        SFC NSH. The OAM bit in the NSH header can be used to trigger the on-path data collection <xref target="RFC8300"></xref>. 
	        PBT-M does not add any other metadata to NSH.
	      </t><t>
	        MPLS. Instead of choosing a header bit, we take advantage of <xref target="I-D.bryant-mpls-synonymous-flow-labels"> 
		the synonymous flow label </xref> approach to mark the packets. A synonymous flow label indicates 
		the on-path data should be collected and 
		forwarded through a postcard. The ongoing MPLS Network Action (MNA) work <xref target="I-D.andersson-mpls-mna-fwk"/> may provide new in-stack headers for MNAs. A bit can be claimed for PBT-M. 
              </t><t>
            SRv6: A flag bit in SRH can be reserved to trigger the on-path data collection <xref target="I-D.song-6man-srv6-pbt"/>. <xref target= "RFC9259">SRv6 OAM</xref> 
		has adopted the O-bit in SRH flags as the marking bit to trigger the telemetry.  
            </t>
            </list>
          </t>
        </section>	  
	  <section title ="Flow Path Discovery">
          <t>
            In case the path that a flow traverses is unknown in advance, all PBT-M-aware nodes should be configured to react to the marked packets by 
            exporting some basic data, such as node ID and TTL before a data set template for that flow is configured. 
	    This way, the management plane can learn the flow path dynamically. 
          </t><t>	
            If the management plane wants to collect the on-path data for some flow, 
	    it configures the head node(s) with a probability or time interval for the flow packet marking. 
	    When the first marked packet is forwarded in the network, the PBT-M-aware nodes will export the basic data set to the collector. 
	    Hence, the flow path is identified. If other data types need to be collected, 
	    the management plane can further configure the data set's template to the target nodes on the flow's path. 
	    The PBT-M-aware nodes collect and export data accordingly if the packet is marked and a data set template is present. 
          </t><t>	
	    If the flow path is changed for any reason, the new path can be quickly learned 
	    by the collector. Consequently, the management plane controller can be directed 
	    to configure the nodes on the new path. The outdated configuration can be automatically timed out or 
	    explicitly revoked by the management plane controller.
          </t>	
        </section>	  
	  <section anchor="correlation" title ="Packet Identity for Export Data Correlation">
          <t>
            The collector needs to correlate all the postcard packets for a single user packet. 
	    Once this is done, the TTL (or the timestamp, if the network time is synchronized) 
	    can be used to infer the flow forwarding path. The key issue here is to correlate all the postcards for the same user packet. 
          </t><t>	
	    The first possible approach includes the flow ID plus the user packet ID in the OAM packets. 
            For example, the flow ID can be the 5-tuple IP header of the user traffic, and
	    the user packet ID can be some unique information pertaining to a user packet (e.g., the sequence number of a TCP packet).
          </t><t>	
            If the packet marking interval is large enough, the flow ID is enough to identify a user packet. 
	    As a result, it can be assumed that all the exported postcard packets for the same flow during a short time interval belong to the same user packet.
          </t><t>	
            Alternatively, if the network is synchronized, then the flow ID plus the timestamp at each node can also infer the postcard affiliation. 
	    However, some errors may occur under some circumstances. For example, 
	    two consecutive user packets from the same flows are marked, but one exported postcard from a node is lost. 
	    It is difficult for the collector to decide to which user packet the remaining postcard is related. 
	    In many cases, such a rare error has no catastrophic consequence. Therefore it is tolerable.    
          </t>	
      </section>	  
    
      <section title ="Load Control">
	  
		   <t>PBT-M should not be applied to all the packets all the time. It is better to be used in an interactive environment where the 
		   network telemetry applications dynamically decide which subset of traffic is under scrutiny. The network devices can limit the packet marking rate
           through sampling and metering. The postcard packets can be distributed to different servers to balance the processing load. </t>	

           <t>Because PBT-M sends telemetry data by dedicated postcard packets, it allows data aggregation and compression. Each node can process the generated raw data according to the configured local data-export policies. Such policies may specify how raw data is used to calculate performance metrics, e.g., max, min, mean, percentile, etc. 
           </t> 		   
	  
	  </section>
         
    </section>	    

    <section title="Implementation Recommendation">
	
	  <section title="Configuration">
        <t>Access lists with an optional sampler, <xref target="RFC5476"/>,
        should be configured and attached at the ingress of the PBT-M
        encapsulation node's to select the intended flows for PTB-M.</t>

        <t>Based on the PBT-M, the flow data should be exported at each
         transit node and at the end edge node with IPFIX <xref
        target="RFC7011"/>.</t>
      </section>

      <section title="Data Export">
        <t>The data decomposition can be achieved on the PBT-M-aware node exporting
        the data or on the IPFIX data collection. <xref
        target="I-D.spiegel-ippm-ioam-rawexport"/> describes how data is being
        exported when decomposed at IPFIX data collection. When being
        decomposed on the PBT-M-aware node the data can be aggregated according to
        section 5 of <xref target="RFC7015"/>. The following IPFIX entities
        are of interest to describe the relationship to the forwarding
        topology and the control-plane.</t>

        <t><list style="symbols">
            <t>node id and egressInterface(IE14) describes on which node which
            logical egress interfaces have been used to forward the
            packet.</t>

            <t>Node id and egressPhysicalInterface(253) describes on which
            node which physical egress interfaces have been used to forward
            the packet.</t>

            <t>Node id and ipNextHopIPv4Address(IE15) or
            ipNextHopIPv6Address(IE62), describes the forwarding path to which
            next-hop IP address.</t>

            <t>Node id and mplsTopLabelIPv4Address(IE47) or
            srhActiveSegmentIPv6 from <xref
            target="I-D.tgraf-opsawg-ipfix-srv6-srh"/> describes the
            forwarding path to which MPLS top label IPv4 address or SRv6
            active segment.</t>

            <t>BGP communities are often used for setting a path priority or
            service selection. bgpDestinationExtendedCommunityList(488) or
            bgpDestinationCommunityList(485) or
            bgpDestinationLargeCommunityList(491) describes which group of
            prefixes have been used to forward the packet.</t>

            <t>Node id and destinationIPv4Address(13),
            destinationTransportPort(11), protocolIdentifier (4) and
            sourceIPv4Address(IE8) describes the forwarding path on each node
            from each IPv4 source address to a specific application in the
            network.</t>

            <t>In order to distinguish wherever the packet has been export due
            to the packet marking or not, in case of SRv6, srhFlagsIPv6
            as described in section 4.1 of <xref
            target="I-D.tgraf-opsawg-ipfix-srv6-srh"/> can be added to the
            data export.</t>
          </list></t>
      </section>
	  
	</section>
	  

	  <section title="Use Cases">
	  
		  <t>The MPLS Open Design Team has been investigating network action support on the MPLS data plane <xref target="I-D.andersson-mpls-mna-fwk"/>.</t>
		  
		  <t>The challenge has been to continue to support existing MPLS architecture, backwards compatibility as well as not excessively increase the depth 
		  of the MPLS label stack with a variety of functional SPL labels and NAI indicators similar in concept to the MPLS Entropy label ELI, EL added to the 
		  label stack, as well as the MPLS extension headers being in stack or post stack.</t>
		  
		  
		  <t>Reference Augmented Forwarding (RAF) <xref target="I-D.raszuk-mpls-raf-fwk"></xref> utilizes In Stack Data (ISD) with parity to Entropy Label stack {TL,RFI,RFV,AL} 
		  and control plane extension to distribute special network actions and forwarding behaviors.</t> 
				 
		  <t>The MPLS Design Team may come up with other alternatives to carry network actions and PBT-M can be supported as a use case.</t> 
		  
		  
		  <t>With Segment Routing SR-MPLS and SRv6 as Maximum SID Depth(MSD) as well as PMTU in SR Policy are critical issues for SR path instantiation by a controller, PBT-M can become
		  a critical solution to ensure that on-path telemetry can be viable for operators by eliminating telemetry data from being carried in-situ in the SR-TE policy path.</t>
		  		   
		  <t>This draft provides a critical optimization that fills the gaps with IOAM DEX related to packet marking triggers using existing mechanisms as well 
		  as flow path discovery mechanisms to avoid configuration of on path data plane node complexity and helps mitigate SR MSD and PMTU issues.</t>  
		
	  
	  </section>


    <section anchor="Security" title="Security Considerations">
	    <t> Several security issues need to be considered. </t>

	    <t>
      <list style="symbols">
	<t>      
          Eavesdrop and tamper: the postcards can be encrypted and authenticated to avoid such security threats.		    
        </t><t>
	DoS attack: PBT-M can be limited to a single administrative domain. The mark must be removed at the egress domain edge. 
	The node can rate-limit the extra traffic incurred by postcards.	
        </t>
      </list>
      </t>      
    </section>

    <section anchor="IANA" title="IANA Considerations">
      <t>
        No requirement for IANA is identified.
      </t>
    </section>

    
    <section anchor="Contributors" title="Contributors">
      <t>
		TBD.
      </t>
    </section>

    <section anchor="Acknowledgments" title="Acknowledgments">
      <t>
        We thank Robert Raszuk, Alfred Morton who provided valuable suggestions and comments helping improve this draft.
      </t>
    </section>

  </middle>

  <back>
    <references title="Informative References">
	  <?rfc include="reference.RFC.8321"?>
	  <?rfc include='reference.RFC.7011"?>
	  <?rfc include='reference.RFC.2925"?>
	  <?rfc include='reference.RFC.8300"?>
	  <?rfc include='reference.RFC.9197"?>
	  <?rfc include='reference.RFC.9259"?>
	  <?rfc include='reference.I-D.ietf-ippm-ioam-direct-export"?>  
	  <?rfc include='reference.I-D.song-6man-srv6-pbt"?>
      <?rfc include='reference.I-D.spiegel-ippm-ioam-rawexport"?>
	  <?rfc include='reference.I-D.bryant-mpls-synonymous-flow-labels"?>
	  <?rfc include='reference.I-D.raszuk-mpls-raf-fwk"?>
	  <?rfc include='reference.RFC.5476"?>
      <?rfc include='reference.RFC.7015"?>
      <?rfc include='reference.I-D.tgraf-opsawg-ipfix-srv6-srh"?>
	  <?rfc include='reference.I-D.andersson-mpls-mna-fwk"?>
    </references>
  </back>
</rfc>

