<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?rfc toc="yes"?>
<?rfc tocompact="yes"?>
<?rfc tocdepth="3"?>
<?rfc tocindent="yes"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc comments="yes"?>
<?rfc inline="yes"?>
<?rfc compact="yes"?>
<?rfc subcompact="no"?>
<rfc category="std" docName="draft-ietf-idr-rpd-15" ipr="trust200902">
  <front>
    <title abbrev="BGP RPD">BGP Extensions for Routing Policy
    Distribution (RPD)</title>

    <author fullname="Zhenbin Li" initials="Z. " surname="Li">
      <organization>Huawei</organization>
      <address>
        <postal>
          <street>Huawei Bld., No.156 Beiqing Rd.</street>
          <city>Beijing</city>
          <code>100095</code>
          <country>China</country>
        </postal>
        <email>lizhenbin@huawei.com</email>
      </address>
    </author>

    <author fullname="Liang Ou" initials="L. " surname="Ou">
      <organization>China Telecom Co., Ltd.</organization>
      <address>
        <postal>
          <street>109 West Zhongshan Ave,Tianhe District</street>
          <city>Guangzhou</city>
          <code>510630</code>
          <country>China</country>
        </postal>
        <email>ouliang@chinatelecom.cn</email>
      </address>
    </author>

    <author fullname="Yujia Luo" initials="Y. " surname="Luo">
      <organization>China Telecom Co., Ltd.</organization>
      <address>
        <postal>
          <street>109 West Zhongshan Ave,Tianhe District</street>
          <city>Guangzhou</city>
          <code>510630</code>
          <country>China</country>
        </postal>
        <email>luoyuj@sdu.edu.cn</email>
      </address>
    </author>

    <author fullname="Sujian Lu" initials="S." surname="Lu">
      <organization>Tencent</organization>
      <address>
        <postal>
          <street>Tengyun Building,Tower A ,No. 397 Tianlin Road</street>
          <city>Shanghai</city>
          <region>Xuhui District</region>
          <code>200233</code>
          <country>China</country>
        </postal>
        <phone/>
        <facsimile/>
        <email>jasonlu@tencent.com</email>
        <uri/>
      </address>
    </author>

    <author fullname="Gyan S. Mishra" initials="G" surname="Mishra">
      <organization>Verizon Inc.</organization>
      <address>
        <postal>
          <street>13101 Columbia Pike</street>
          <city>Silver Spring</city>
          <code>MD 20904</code>
          <country>USA</country>
        </postal>
        <phone> 301 502-1347</phone>
        <email>gyan.s.mishra@verizon.com</email>
      </address>
    </author>


     <author initials="H" surname="Chen" fullname="Huaimo Chen">
      <organization>Futurewei</organization>
      <address>
        <postal>
          <street></street>
          <city>Boston, MA</city>
          <region></region>
          <code></code>
          <country>USA</country>
        </postal>
        <email>Huaimo.chen@futurewei.com</email>
      </address>
    </author>

    <author fullname="Shunwan Zhuang" initials="S. " surname="Zhuang">
      <organization>Huawei</organization>
      <address>
        <postal>
          <street>Huawei Bld., No.156 Beiqing Rd.</street>
          <city>Beijing</city>
          <code>100095</code>
          <country>China</country>
        </postal>
        <email>zhuangshunwan@huawei.com</email>
      </address>
    </author>

    <author fullname="Haibo Wang" initials="H. " surname="Wang">
      <organization>Huawei</organization>
      <address>
        <postal>
          <street>Huawei Bld., No.156 Beiqing Rd.</street>
          <city>Beijing</city>
          <code>100095</code>
          <country>China</country>
        </postal>
        <email>rainsword.wang@huawei.com</email>
      </address>
    </author>
    <date year="2022"/>

    <abstract>
      <t>Adjusting traffic and optimizing traffic paths in a traditional
      IP network through repeated manual configuration is difficult.
      A mechanism that sets up routing policies automatically, and thereby
      adjusts traffic and optimizes traffic paths, is desirable.
      This document describes BGP Extensions for Routing Policy Distribution 
      (BGP RPD) to support this.</t>
    </abstract>

    <note title="Requirements Language">
      <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
      "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
      document are to be interpreted as described in <xref
      target="RFC2119"></xref> <xref target="RFC8174"></xref>
      when, and only when, they appear in all
      capitals, as shown here.</t>
    </note>
  </front>

  <middle>
    <section title="Introduction">
      <t>It is difficult to optimize traffic paths in a traditional
      IP network because of the following:<list style="symbols">

          <t>Complex. 
             Traffic can only be adjusted device by device. 
             The configurations on all the routers that
             the traffic traverses need to be changed or added.

             There are already lots of policies configured on the routers 
             in an operational network. There are different types of policies,
             which include security, management and control policies. 
             These policies are relatively stable.

             However, the policies for adjusting traffic are dynamic. 
             Whenever the traffic on a route deviates from what is expected,
             policies to adjust the traffic for that route are 
             configured on the related routers.

             It is complex to dynamically add policies to, or change
             policies among, the existing policies 
             on specific routers to adjust the traffic. 

             Some people would like to separate the stable route
             policies from the dynamic ones even though they have 
             configuration automation systems (including YANG models). 
             </t>

          <t>Difficult maintenance. 
             The routing policies used to adjust network traffic 
             are dynamic, which makes subsequent maintenance difficult
             and demands a high level of skill.</t>

          <t>Slow. Adding or changing route policies on routers
             through a configuration automation system to adjust
             traffic and avoid congestion may be slow. </t>
        </list></t>

      <t>An automatic mechanism for setting up routing policies is
      desirable: one that simplifies routing policy
      configuration and takes effect quickly.
      This document describes extensions to BGP for Routing
      Policy Distribution to resolve these issues.
      </t>
    </section> <!-- Introduction -->


    <section title="Terminology">
    <t>The following terminology is used in this document.</t>

      <t><list style="symbols">
          <t>ACL: Access Control List</t>

          <t>BGP: Border Gateway Protocol <xref target="RFC4271"></xref></t>

          <t>FS: Flow Specification</t>
          <t>NLRI: Network Layer Reachability Information 
            <xref target="RFC4271"></xref> </t>

          <t>PBR: Policy-Based Routing</t>

          <t>RPD: Routing Policy Distribution</t>

          <t>VPN: Virtual Private Network</t>
        </list></t>
    </section> <!-- Terminology -->

    <section title="Problem Statement">
      <t>Providers need to adjust the routing policies for their
      business traffic from time to time because of the following:
        <list style="symbols">
          <t>Business development or network failure introduces link
          congestion and overload.</t>

          <t>Business changes or network additions produce unused resources
          such as idle links.</t>

          <t>Network transmission quality degrades because of delay or
          loss, and providers need to move traffic to other paths.</t>

          <t>To control OPEX and CAPEX, providers may prefer the transit provider 
          with the lower price.</t>
        </list></t>

      <section title="Inbound Traffic Control">
        <t>In <xref target="in-traffic-ctr"/>, for the reasons above, 
        the provider P of AS100
        may want the inbound traffic from AS200 to enter AS100 through
        link L3 instead of the other links. Since P does not have any 
        administrative control over AS200, 
        P cannot directly modify the route selection criteria
        inside AS200.
<figure anchor="in-traffic-ctr" align="center" 
       title="Inbound Traffic Control case">
            <artwork><![CDATA[
               Traffic from PE1 to Prefix1
          ----------------------------------->

+-----------------+            +-------------------------+ 
|     +---------+ |        L1  | +----+      +----------+|
|     |Speaker1 | +------------+ |IGW1|      |policy    ||
|     +---------+ |**      L2**| +----+      |controller||
|                 |  **    **  |             +----------+|
| +---+           |    ****    |                         |
| |PE1|           |    ****    |                         |
| +---+           |  **    **  |                         |
|     +---------+ |**      L3**| +----+                  |
|     |Speaker2 | +------------+ |IGW2|      AS100       |
|     +---------+ |        L4  | +----+                  |
|                 |            |                         |
|    AS200        |            |                         |
|                 |            |  ...                    |
|                 |            |                         |
|     +---------+ |            | +----+      +-------+   |
|     |Speakern | |            | |IGWn|      |Prefix1|   |
|     +---------+ |            | +----+      +-------+   |
+-----------------+            +-------------------------+   

            Prefix1 advertised from AS100 to AS200    
          <----------------------------------------]]></artwork>
          </figure></t>
      </section>

      <section title="Outbound Traffic Control">
        <t>In <xref target="out-traffic-ctr"/>, the provider P of AS100 prefers
        link L3 for the traffic to the destination Prefix2 among multiple
        exits and links to AS200. This preference can be dynamic and might change
        frequently for the reasons above, so provider P needs an efficient
        and convenient solution.
<figure anchor="out-traffic-ctr" align="center" 
       title="Outbound Traffic Control case">
            <artwork><![CDATA[
               Traffic from PE2 to Prefix2
          ----------------------------------->
+-------------------------+            +-----------------+
|+----------+      +----+ |L1          | +---------+     |
||policy    |      |IGW1| +------------+ |Speaker1 |     |
||controller|      +----+ |**        **| +---------+     |
|+----------+             |L2**    **  |        +-------+|
|                         |    ****    |        |Prefix2||
|                         |    ****    |        +-------+|
|                         |L3**    **  |                 |
|      AS100       +----+ |**        **| +---------+     |
|                  |IGW2| +------------+ |Speaker2 |     |
|                  +----+ |L4          | +---------+     |
|                         |            |                 |
|+---+                    |            |    AS200        |
||PE2|              ...   |            |                 |
|+---+                    |            |                 |
|                  +----+ |            | +---------+     |
|                  |IGWn| |            | |Speakern |     |
|                  +----+ |            | +---------+     |
+-------------------------+            +-----------------+

            Prefix2 advertised from AS200 to AS100    
          <----------------------------------------]]></artwork>
          </figure></t>
      </section>
    </section>


    <section title="Protocol Extensions">
       <t>This document specifies a solution using a new AFI and SAFI with 
          the BGP Wide Community for encoding a routing policy.
       </t>
      <section title="Using a New AFI and SAFI">
       <t>A new AFI and SAFI are defined: the Routing Policy AFI, 
        to which IANA has assigned codepoint 16398, and a
        SAFI, to which IANA has assigned codepoint 75.</t>

        <t>The AFI and SAFI pair uses a new NLRI, 
        which is defined as follows:</t>
        <t><figure anchor="nlri" align="center" 
            title="AFI and SAFI with new NLRI">
            <artwork><![CDATA[
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+
| NLRI Length   |
+-+-+-+-+-+-+-+-+
| Policy Type   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Distinguisher (4 octets)                                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Peer IP (4/16 octets)                                         ~
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+]]></artwork>
          </figure></t>
<t>
Where:
<list style="hanging">
<t hangText="  NLRI Length:"> 1 octet giving the length of the NLRI.
   If the Length is anything other than 9 or 21, the NLRI is corrupt 
   and the enclosing UPDATE message MUST be ignored.</t>
<t hangText="  Policy Type:"> 1 octet indicating the type of the policy. 
   1 is for Export policy. 2 is for Import policy.
   If the Policy Type is any other value, the NLRI is corrupt and 
   the enclosing UPDATE message MUST be ignored.</t>
<t hangText="  Distinguisher:"> 4-octet unsigned integer that 
   uniquely identifies the content/policy. It also orders 
   the policies: they are applied in ascending order, so
   a policy with a lower distinguisher is applied before
   a policy with a higher distinguisher.</t>
<t hangText="  Peer IP:"> 4- or 16-octet value identifying an IPv4 or IPv6 peer.
   Its default value is 0, which indicates that a BGP speaker receiving a 
   BGP UPDATE message with the NLRI will apply
   the policy in the message to all its IPv4/IPv6 peers.</t>
</list>
</t>
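        <t>As a rough illustration of the field layout and the two
        "corrupt NLRI" checks above, the following Python sketch decodes
        a single NLRI. It is not part of the specification. The function
        name parse_rpd_nlri is hypothetical, and the sketch assumes that
        NLRI Length counts the octets that follow the length octet
        (1 + 4 + 4 = 9 or 1 + 4 + 16 = 21). Returning None stands in for
        "ignore the enclosing UPDATE message".</t>
        <t><figure>
            <artwork><![CDATA[
```python
import ipaddress
import struct

EXPORT, IMPORT = 1, 2  # Policy Type values from the field list


def parse_rpd_nlri(data: bytes):
    """Decode one RPD NLRI; None means 'treat as corrupt'."""
    if len(data) < 1:
        return None
    length = data[0]
    # Assumption: Length excludes the length octet itself.
    if length not in (9, 21) or len(data) != 1 + length:
        return None
    policy_type = data[1]
    if policy_type not in (EXPORT, IMPORT):
        return None
    (distinguisher,) = struct.unpack("!I", data[2:6])
    # 4 packed octets give an IPv4 address, 16 give IPv6.
    peer = ipaddress.ip_address(data[6:])
    return policy_type, distinguisher, peer
```
]]></artwork>
          </figure></t>
        <t>The returned tuple follows the key order (Policy Type,
        Distinguisher, Peer IP) used to store RPD routes.</t>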
       <t>Under RPD AFI/SAFI, 
          the RPD routes are stored and ordered according to the keys
          (Policy type, Distinguisher, Peer IP).

          Under IPv4/IPv6 Unicast AFI/SAFI,
          there are IPv4/IPv6 unicast routes learned and 
          various static policies configured.
 
          In addition, 
          there are dynamic RPD policies from the RPD AFI/SAFI 
          when RPD is enabled.</t>

       <t>Before advertising an IPv4/IPv6 Unicast AFI/SAFI route,
          the configured policies are applied to it first, 
          and then the RPD Export policies are applied. </t>


        <t>The NLRI containing the Routing Policy is carried in
        MP_REACH_NLRI and MP_UNREACH_NLRI path attributes 
        in a BGP UPDATE 
        message, which MUST also contain the BGP mandatory attributes 
        and MAY contain some BGP optional attributes.</t>

       <t>When receiving a BGP UPDATE message with routing policy, 
          a BGP speaker processes it as follows:<list style="symbols">

          <t>If the peer IP in the NLRI is 0, then apply the routing policy
          to all the remote peers of this BGP speaker.</t>

          <t>If the peer IP in the NLRI is non-zero, then the IP address
          indicates a remote peer of this BGP speaker and the routing
          policy will be applied to it.</t>
          </list>
        </t>

        <t>The content of the Routing Policy is encoded 
           in a BGP Wide Community.</t>
    </section> <!-- Using a New AFI and SAFI  -->


      <section title="BGP Wide Community and Atoms">
        <t>The BGP wide community attribute is defined in 
        <xref target="I-D.ietf-idr-wide-bgp-communities"/>.

This document 
specifies how two wide communities are used with the 
Routing Policy NLRI (section 4.1) to distribute 
routing policy to BGP peers. The wide communities 
that define routing policy are: 

<list style="symbols">
<t>MATCH AND SET ATTR (TBD1) </t>
<t>MATCH AND NOT ADVERTISE (TBD2) </t>  
</list>

These wide communities are passed in the 
BGP wide community container in the wide community attribute. 
These communities support three of the optional TLVs: 
Target TLV, Exclude Target TLV, and Parameter TLV. 
The value of each of these TLVs comprises a series of Atoms, 
each of which is a TLV (or sub-TLV). 
</t>

<t>
A new wide community Atom is defined for BGP Wide Community Target(s) TLV 
(RouteAttr), and two new Atoms are defined for 
BGP Wide Community Parameter(s) TLV. 
For reference, the format of the TLV is illustrated below:
</t>


          <t><figure anchor="atom-tlv" align="center" 
              title="Format of Wide Community Atom TLV">
            <artwork><![CDATA[
  0                   1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |     Type      |             Length            |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                         Value (variable)                      ~
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+]]></artwork>
          </figure></t>
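          <t>Because the value of a Target, Exclude Target, or Parameter
          TLV is a series of Atoms in this format, a receiver can walk
          it with a simple loop. The Python sketch below is illustrative
          only. The name iter_atoms is hypothetical, and the sketch
          assumes that Length counts only the Value octets, not the
          3-octet Type/Length header.</t>
          <t><figure>
            <artwork><![CDATA[
```python
import struct


def iter_atoms(value: bytes):
    """Yield (type, payload) for each Atom packed in value."""
    offset = 0
    while offset < len(value):
        if offset + 3 > len(value):
            raise ValueError("truncated Atom header")
        # 1-octet Type, 2-octet Length, network byte order.
        atom_type, length = struct.unpack_from("!BH", value, offset)
        offset += 3
        if offset + length > len(value):
            raise ValueError("Atom length exceeds enclosing TLV")
        yield atom_type, value[offset:offset + length]
        offset += length
```
]]></artwork>
          </figure></t>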


<!--
<t>Target TLV contains RouteAttr TLV. The Type of Target TLV is 1.
At the same level, there are ExcTarget TLV (Type = 2) and
Param TLV (Type = 3).
 </t>
-->

      <section title="RouteAttr atom Sub-TLV">

        <t>A RouteAttr Atom sub-TLV (or RouteAttr sub-TLV for short) 
        is defined and may be included in a Target TLV. 
        It has the following format.</t>
          <t><figure anchor="rta-atom-sub-tlv" 
              title="Format of RouteAttr Atom sub-TLV">
            <artwork align="center"><![CDATA[
  0                   1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |  Type (TBD3)  |        Length (variable)        |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                         sub-sub-TLVs                          ~
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
]]></artwork>
          </figure></t>

<t>The Type for RouteAttr atom is TBD3.
In RouteAttr sub-TLV, four sub-sub-TLVs are defined:
IPv4 Prefix, IPv6 Prefix, AS-Path, and Community sub-sub-TLV.
 </t>

          <t>An IPv4 Prefix sub-sub-TLV gives match criteria on IPv4 prefixes.
 
          Its format is illustrated below:</t>
          <t><figure anchor="ipv4-range-sub-tlv" align="center" 
              title="Format of IPv4 Prefix sub-sub-TLV">
            <artwork><![CDATA[
  0                   1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |  Type  1      |         Length (N x 8)        |M-Type | Flags |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                          IPv4 Address                         |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |     Mask      |     GeMask    |     LeMask    |M-Type | Flags |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 ~       . . . 
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                          IPv4 Address                         |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |     Mask      |     GeMask    |     LeMask    |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
]]></artwork>
          </figure></t>
<t>
     <list style="hanging">
       <t hangText="Type:"> 1 for IPv4 Prefix.</t>
       <t hangText="Length:"> N x 8, 
          where N is the number of tuples 
          &lt;M-Type, Flags, IPv4 Address, Mask, GeMask, LeMask&gt;.
          If Length is not a multiple of 8, the Atom is corrupt and the enclosing 
          UPDATE message MUST be ignored.</t>
       <t hangText="M-Type:"> 4-bit field specifying match type. 
          The following four values are defined. 
           IPaddress is the IP address in the sub-sub-TLV 
           while IProute is the IP route being matched.
          <list style="hanging">
            <t hangText="M-Type = 0:"> Exact match with the Mask length 
               IP address prefix.
               GeMask and LeMask MUST be sent as zero and ignored on receipt.</t>
            <t hangText="M-Type = 1:"> 
               Matches if the Mask number of prefix bits exactly
               match between IPaddress and IProute and the actual 
               prefix length of IProute is greater than or equal to 
               GeMask. LeMask MUST be sent as zero and ignored on receipt.
               </t>
            <t hangText="M-Type = 2:"> 
               Matches if the Mask number of prefix bits exactly
               match between IPaddress and IProute and the actual 
               prefix length of IProute is less than or equal to 
               LeMask. GeMask MUST be sent as zero and ignored on receipt.
               </t>
            <t hangText="M-Type = 3:"> 
               Matches if the Mask number of prefix bits exactly
               match between IPaddress and IProute and the actual 
               prefix length of IProute is less than or equal to LeMask
               and greater than or equal to GeMask.
               </t>
          </list>
       </t>
       <t hangText="Flags:"> 4 bits. No flags are currently defined.
          They MUST be sent as zero and ignored on receipt.</t>
       <t hangText="IPv4 Address:"> 4 octets for an IPv4 address.</t>
       <t hangText="Mask:"> 1 octet for the IP address prefix length
          that needs to exactly match between 
          the IP address in the sub-sub-TLV and the route.</t> 
       <t hangText="GeMask:"> 1 octet for the lower bound of the route
          prefix length match range. 
          It MUST be either 0 or greater than or equal to Mask.</t> 
       <t hangText="LeMask:"> 1 octet for the upper bound of the route
          prefix length match range.
          It MUST be either 0 or greater than Mask.</t> 
      </list>
</t>
          <t>For example, tuple 
          &lt;M-Type=0, Flags=0, IPv4 Address = 1.1.0.0, Mask = 22, 
          GeMask = 0, LeMask = 0&gt; represents an exact IP prefix 
          match for 1.1.0.0/22. </t>

          <t>&lt;M-Type=1, Flags=0, IPv4 Address = 16.1.0.0, Mask = 24, 
          GeMask = 24, LeMask = 0&gt; represents a match on IP prefix 
          16.1.0.0/24 greater-equal 24 
          (i.e., a route matches if its first Mask=24 bits match
           16.1.0 and 24 &lt;= route's prefix length &lt;= 32).</t> 

          <t>&lt;M-Type=2, Flags=0, IPv4 Address = 17.1.0.0, Mask = 24, 
          GeMask = 0, LeMask = 26&gt; represents a match on IP prefix 
          17.1.0.0/24 less-equal 26 
          (i.e., a route matches if its first Mask=24 bits match
           17.1.0 and 24 &lt;= route's prefix length &lt;= 26).</t>

          <t>&lt;M-Type=3, Flags=0, IPv4 Address = 18.1.0.0, Mask = 24, 
          GeMask = 24, LeMask = 30&gt; represents a match on IP prefix 
          18.1.0.0/24 greater-equal 24 and less-equal 30
          (i.e., a route matches if its first Mask=24 bits match
           18.1.0 and 24 &lt;= route's prefix length &lt;= 30).</t>
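          <t>The four match types and the examples above can be
          summarized in a short Python sketch. It is a non-normative
          illustration: the function name prefix_matches and its
          argument order are hypothetical, and it assumes that a route
          whose prefix length is shorter than Mask never matches,
          since such a route does not define Mask prefix bits.</t>
          <t><figure>
            <artwork><![CDATA[
```python
import ipaddress


def prefix_matches(m_type, address, mask, ge_mask, le_mask, route):
    """True if route (e.g. '16.1.0.128/25') matches one tuple."""
    net = ipaddress.ip_network(route)
    base = ipaddress.ip_network(f"{address}/{mask}", strict=False)
    if net.version != base.version:
        return False
    # The first `mask` bits of the route must equal the address.
    if net.prefixlen < mask or not net.subnet_of(base):
        return False
    if m_type == 0:   # exact match on the Mask-length prefix
        return net.prefixlen == mask
    if m_type == 1:   # prefix length >= GeMask
        return net.prefixlen >= ge_mask
    if m_type == 2:   # prefix length <= LeMask
        return net.prefixlen <= le_mask
    if m_type == 3:   # GeMask <= prefix length <= LeMask
        return ge_mask <= net.prefixlen <= le_mask
    return False      # undefined M-Type: treated here as no match
```
]]></artwork>
          </figure></t>
          <t>For instance, under these assumptions
          prefix_matches(2, "17.1.0.0", 24, 0, 26, "17.1.0.64/26")
          is true, while the same tuple does not match 17.1.0.0/27.</t>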

          <t>Similarly, an IPv6 Prefix sub-sub-TLV represents 
          match criteria on IPv6 prefixes. 
          Its format is illustrated below:</t>

          <t><figure anchor="ipv6-range-sub-tlv" align="center" 
              title="Format of IPv6 Prefix sub-sub-TLV">
            <artwork><![CDATA[
  0                   1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |   Type  4     |         Length (N x 20)       |M-Type | Flags |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                     IPv6 Address (16 octets)                  ~
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |     Mask      |     GeMask    |     LeMask    |M-Type | Flags |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 ~       . . . 
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                     IPv6 Address (16 octets)                  ~
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |     Mask      |     GeMask    |     LeMask    |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
]]></artwork>
         </figure></t>

          <t>An AS-Path sub-sub-TLV represents match criteria as 
          a regular expression string. 
          Its format is illustrated below:</t>
          <t><figure anchor="as-path-sub-tlv" align="center" 
              title="Format of AS Path sub-sub-TLV">
            <artwork><![CDATA[
  0                   1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |  Type  2      |      Length (Variable)        |  
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                    AS-Path Regex String                       |
 :                                                               :
 |                                                               ~
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
]]></artwork>
          </figure></t>
        <t>
          <list style="hanging">
            <t hangText="Type:"> 2 for AS-Path.</t>
            <t hangText="Length:"> Variable, maximum is 1024.</t>
            <t hangText="AS-Path Regex String:">
               AS-Path regular expression string.</t>
          </list>
        </t>

          <t>A Community sub-sub-TLV represents a list of 
          communities, all of which must be matched.
          Its format is illustrated below:</t>
          <t><figure anchor="com-sub-tlv" align="center" 
              title="Format of Community sub-sub-TLV">
            <artwork><![CDATA[
  0                   1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |  Type  3      |      Length (N x 4 + 1)       |     Flags     |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                      Community 1 Value                        |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 ~                              . . .                            ~
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                      Community N Value                        |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
]]></artwork>
          </figure></t>
<t>
     <list style="hanging">
       <t hangText="Type:"> 3 for Community.</t>
       <t hangText="Length:"> N x 4 + 1, 
          where N is the number of communities.
          If Length is not a multiple of 4 plus 1, the Atom is corrupt and 
          the enclosing UPDATE MUST be ignored.
          </t>
       <t hangText="Flags:"> 1 octet. No flags are currently defined.
          These bits MUST be sent as zero and ignored on receipt.</t>
      </list>
</t>
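          <t>The Length check and the "match all" rule for the Community
          sub-sub-TLV might be sketched in Python as follows. This is
          an illustration under stated assumptions, not a normative
          procedure: the function names are hypothetical, each
          community is treated as a 4-octet unsigned integer, and
          returning None stands in for treating the Atom as corrupt.</t>
          <t><figure>
            <artwork><![CDATA[
```python
import struct


def parse_community_value(value: bytes):
    """Decode the Value: 1 Flags octet, then N 4-octet values."""
    if len(value) % 4 != 1:
        return None  # Length is not N x 4 + 1
    # value[0] is Flags; no flags are defined, so it is ignored.
    count = (len(value) - 1) // 4
    return list(struct.unpack(f"!{count}I", value[1:]))


def communities_match(required, route_communities):
    """True only if every listed community is on the route."""
    return set(required) <= set(route_communities)
```
]]></artwork>
          </figure></t>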
        </section> <!-- RouteAttr sub-TLV -->

      <section title="Sub-TLVs of the Parameters TLV">

        <t>This document introduces two community values:
<list style="hanging">
    <t hangText="MATCH AND SET ATTR (TBD1):">
    If the IPv4/IPv6 unicast routes to a remote peer
    match the specific conditions defined in the routing policy extracted
    from the RPD route, then the attributes of the IPv4/IPv6 unicast
    routes will be modified when sending to the remote peer per the
    actions defined in the RPD route.</t>

    <t hangText="MATCH AND NOT ADVERTISE (TBD2):">
    If the IPv4/IPv6 unicast routes to a remote
    peer match the specific conditions defined in the routing policy
    extracted from the RPD route, then the IPv4/IPv6 unicast routes will
    not be advertised to the remote peer.</t>
</list>
        </t>

        <t>For the Parameter(s) TLV, two action sub-TLVs are defined:
        MED change sub-TLV and AS-Path change sub-TLV.
        When the community in the container is MATCH AND SET ATTR,
        the Parameter(s) TLV can include these sub-TLVs. 
        When the community is MATCH AND NOT ADVERTISE,
        the Parameter(s) TLV's value is empty.</t>

          <t>A MED change sub-TLV indicates an action to change 
          the MED.
          Its format is illustrated below:</t>
          <t><figure anchor="med-sub-tlv" align="center" 
              title="Format of MED Change sub-TLV">
            <artwork><![CDATA[
  0                   1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |  Type  1      |          Length (5)           |      OP       |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                           Value                               |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
]]></artwork>
          </figure></t>
<t>
     <list style="hanging">
       <t hangText="Type:"> 1 for MED Change.</t>
       <t hangText="Length:"> 5.
          If Length is any other value, the sub-TLV is corrupt and the enclosing 
          UPDATE MUST be ignored.</t>
       <t hangText="OP:"> 1 octet. Three values are defined:
          <list style="hanging">
            <t hangText="OP = 0:"> replace the existing MED with the Value.</t>
            <t hangText="OP = 1:"> add the Value to the existing MED.
               If the sum is greater than the maximum value for MED, 
               assign the maximum value to MED.</t>
            <t hangText="OP = 2:"> subtract the Value from the existing MED.
               If the existing MED minus the Value is less than 0, 
               assign 0 to MED.</t>

            <t hangText="If OP is any other value, the sub-TLV is ignored."></t>
          </list>
        </t>
       <t hangText="Value:"> 4 octets.</t>
      </list>
</t>
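          <t>The three OP values amount to saturating arithmetic on the
          MED. A minimal Python sketch follows, assuming the MED is the
          4-octet unsigned MULTI_EXIT_DISC of
          <xref target="RFC4271"></xref>, so that its maximum is
          2^32 - 1. The function name apply_med_change is hypothetical,
          and leaving the MED unchanged is one way to read "the
          sub-TLV is ignored" for an undefined OP.</t>
          <t><figure>
            <artwork><![CDATA[
```python
MED_MAX = 0xFFFFFFFF  # largest 4-octet unsigned value


def apply_med_change(op: int, value: int, med: int) -> int:
    """Return the new MED after applying one MED Change sub-TLV."""
    if op == 0:                      # assign
        return value
    if op == 1:                      # add, clamped at the maximum
        return min(med + value, MED_MAX)
    if op == 2:                      # subtract, clamped at zero
        return max(med - value, 0)
    return med                       # undefined OP: ignored
```
]]></artwork>
          </figure></t>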

          <t>An AS-Path change sub-TLV indicates an action to change the AS-Path. 
          Its format is illustrated below:</t>
          <t><figure anchor="as-path-change-sub-tlv" align="center" 
              title="Format of AS-Path Change sub-TLV">
            <artwork><![CDATA[
  0                   1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |  Type  2      |        Length (n x 5)         |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                             AS1                               |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |    Count1     |  
 +-+-+-+-+-+-+-+-+
 ~       . . . 
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                             ASn                               |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |    Countn     |  
 +-+-+-+-+-+-+-+-+
]]></artwork>
          </figure></t>
<t>
     <list style="hanging">
       <t hangText="Type:"> 2 for AS-Path Change.</t>
       <t hangText="Length:"> n x 5.
           If Length is not a multiple of 5, the sub-TLV is corrupt and the 
           enclosing UPDATE MUST be ignored.</t>
       <t hangText="ASi:"> 4 octets. An AS number. </t>
       <t hangText="Counti:"> 1 octet. ASi repeats Counti times.</t>
      </list>
</t>
          <t>The sequence of AS numbers is added 
          to the existing AS-Path.</t>
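          <t>As a non-normative illustration, the following Python 
          sketch decodes the value part of an AS-Path Change sub-TLV 
          into (AS, Count) pairs and expands them into the sequence of 
          AS numbers. The function names are hypothetical. How the 
          resulting sequence is combined with the existing AS-Path is 
          left to the caller and is not shown.</t>
          <t><figure>
            <artwork><![CDATA[
```python
import struct


def parse_as_path_change(value):
    """Decode the value part into a list of (asn, count) pairs.

    Each entry is 5 octets: a 4-octet AS number and a 1-octet count.
    """
    if len(value) % 5 != 0:
        # Corrupt sub-TLV: the caller is expected to ignore the UPDATE.
        raise ValueError("Length is not a multiple of 5")
    return [struct.unpack("!IB", value[i:i + 5])
            for i in range(0, len(value), 5)]


def expand_as_sequence(pairs):
    """Repeat each AS number 'count' times, preserving order."""
    sequence = []
    for asn, count in pairs:
        sequence.extend([asn] * count)
    return sequence
```
]]></artwork>
          </figure></t>
          <t>For example, the pairs (65001, 2) and (65002, 1) expand to 
          the sequence 65001, 65001, 65002 in this sketch.</t>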

        </section> <!-- New Wide Community Atoms -->
      </section> <!-- BGP Wide Community -->

      <section title="Capability Negotiation">
        <t>It is necessary to negotiate the capability to support BGP 
        Extensions for Routing Policy Distribution (RPD). The BGP RPD
        Capability is a new BGP capability <xref target="RFC5492"/>. The
        Capability Code for this capability is 72, assigned by IANA.
        The Capability Length field of this capability is variable. The
        Capability Value field consists of one or more of the following
        tuples:</t>

        <t><figure anchor="rpd-cap" align="center" 
              title="BGP RPD Capability">
            <artwork align="center"><![CDATA[
+--------------------------------------------------+
|  Address Family Identifier (2 octets)            |
+--------------------------------------------------+
|  Subsequent Address Family Identifier (1 octet)  |
+--------------------------------------------------+
|  Send/Receive (1 octet)                          |
+--------------------------------------------------+
]]></artwork>
          </figure></t>

        <t>The meaning and use of the fields are as follows:</t>

        <t>Address Family Identifier (AFI): This field is the same as the one
        used in <xref target="RFC4760"/>.</t>

        <t>Subsequent Address Family Identifier (SAFI): This field is the same
        as the one used in <xref target="RFC4760"/>.</t>

        <t>Send/Receive: This field indicates whether the sender (a)
        is willing to receive Routing Policies from its peer
        (value 1), (b) would like to send Routing Policies to
        its peer (value 2), or (c) both (value 3) for the &lt;AFI,
        SAFI&gt;.
        If Send/Receive is any other value, that tuple is ignored, 
        but any other tuples present are still used.</t>
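        <t>The tuple handling above can be sketched in Python as 
        follows. This is a non-normative illustration; the function 
        and constant names are hypothetical. Rejecting a Capability 
        Value whose length is not a multiple of 4 is an assumption of 
        the sketch, not a rule stated in this document.</t>
        <t><figure>
            <artwork><![CDATA[
```python
import struct

RECEIVE, SEND, BOTH = 1, 2, 3  # Send/Receive values defined above


def parse_rpd_capability(value):
    """Map (AFI, SAFI) to Send/Receive for every valid 4-octet tuple."""
    if len(value) % 4 != 0:
        # Assumption of this sketch: a partial tuple is an error.
        raise ValueError("Capability Value is not a whole number of tuples")
    result = {}
    for i in range(0, len(value), 4):
        afi, safi, send_receive = struct.unpack("!HBB", value[i:i + 4])
        if send_receive not in (RECEIVE, SEND, BOTH):
            continue  # ignore this tuple, keep processing the others
        result[(afi, safi)] = send_receive
    return result
```
]]></artwork>
          </figure></t>
        <t>In this sketch, a Capability Value carrying a valid tuple 
        followed by a tuple with Send/Receive set to 7 yields only the 
        first tuple.</t>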
      </section> <!-- Capability Negotiation -->
    </section> <!-- Protocol Extensions -->


    <section title="Operations">
        <t>This section presents a typical application scenario and
           some details about handling a related failure. </t>

      <section title="Application Scenario">
        <t><xref target="ctr-rr"/> illustrates a typical scenario, 
           where RPD is used by a controller with a Route Reflector 
           (RR) to adjust traffic dynamically.
 
          <figure anchor="ctr-rr" align="center" 
                  title="Controller with RR Adjusts Traffic">
            <artwork align="center"><![CDATA[
    +--------------+  
    |  Controller  |   
    +-------+------+       
             \                 
              \ RPD               
            .--\._.+--+                       ___...__
        __(     \       '.---...             (         )
       /      RR o -------- A o) ---------- (o X   AS2  )
      (o E       |\             )     _____//(___   ___)
       (         | \_______ B o) ____/     /     '''
        (o F      \           )       ____/
         (         \_____ C o) ______/         ___...__
          '    AS1        _)  \_____          (         )
           '---._.-.     )          \_______ (o Y   AS3  )
                    '---'                     (___   ___)
                                                  '''
]]></artwork>
          </figure>
        The controller connects to the RR through a BGP session. 
        There is a BGP session between the RR and each of  
        routers A, B and C in AS1; these sessions are shown in the figure. 
        Other sessions in AS1 are not shown in the figure. </t>

        <t>There is router X in AS2. 
           There is a BGP session between X and each of routers 
           A, B and C in AS1.</t>

        <t>There is router Y in AS3. 
           There is a BGP session between Y and router C 
           in AS1.</t>

        <t>The controller sends an RPD route to the RR. 
           After receiving the RPD route from the controller,
           the RR reflects the RPD route to routers A, B and C.
           After receiving the RPD route from the RR, 
           routers A, B and C extract the routing policy from 
           the RPD route. 
           If the peer IP in the NLRI of the RPD route is 0, 
           the routing policy is applied to all the remote peers 
           of routers A, B and C.
           If the peer IP in the NLRI of the RPD route is non-zero,
           the IP address identifies a remote peer of 
           routers A, B and C, and the routing policy is applied
           only to that remote peer.
           The IPv4/IPv6 unicast routes towards router X in AS2 
           and router Y in AS3 are then adjusted based on the 
           routing policy sent by the controller via the RPD route.</t>
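        <t>The peer selection described above can be illustrated with 
           a short, non-normative Python sketch. The function name 
           select_target_peers and the representation of peers as 
           address strings are assumptions made only for this example.</t>
        <t><figure>
            <artwork><![CDATA[
```python
import ipaddress


def select_target_peers(nlri_peer_ip, remote_peers):
    """Return the remote peers to which the routing policy applies.

    nlri_peer_ip: peer IP taken from the NLRI of the RPD route.
    remote_peers: addresses of the router's remote peers.
    """
    target = ipaddress.ip_address(nlri_peer_ip)
    peers = [ipaddress.ip_address(p) for p in remote_peers]
    if target.is_unspecified:
        # Peer IP of 0: the policy applies to all remote peers.
        return peers
    # Non-zero peer IP: the policy applies to that peer only.
    return [p for p in peers if p == target]
```
]]></artwork>
          </figure></t>
        <t>With documentation addresses 192.0.2.1 and 192.0.2.2 as 
           remote peers, a peer IP of 0.0.0.0 selects both peers, 
           while a peer IP of 192.0.2.2 selects only that peer.</t>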

        <t>The controller uses the Route Target (RT) extended community 
           to indicate whether a router is to receive an RPD policy.
           For example, if no adjustment is needed on router B,
           the controller sends RPD routes with the RTs for A and C only.
           B will not receive the routes.</t>

       <t>The process of adjusting traffic in a network is a closed loop. 
          The loop starts from the controller with some traffic 
          expectations on a set of routes. 
          The controller obtains the information about traffic flows 
          for the related routes.
          It analyzes the traffic and checks whether the current
          traffic flows meet the expectations.
          If the expectations are not met, 
          the controller adjusts the traffic. 
          The loop then returns to its start, where the controller 
          again obtains the information about traffic flows.</t>
      </section> <!-- Application Scenario -->

      <section title="About Failure">
       <t>This section describes some details about 
          handling a failure related to an RPD route being applied.
          </t>
       <t>An RPD route is not a configuration. 
          When it is sent to a router from a controller,
          no acknowledgement is needed from the router. 
          The existing BGP mechanisms are reused for delivering
          an RPD route. After the route is delivered to a router,
          the delivery is considered successful. This is guaranteed by BGP.
          </t>

       <t>If the router fails to install the route
          locally, this failure is a bug in the router. The bug needs 
          to be fixed.</t>

       <t>Errors covered by <xref target="RFC7606"/>
          are handled according to that document.
          These errors are bugs, which need to be resolved.</t>

       <t>When the controller fails
          while an RPD route is being applied (for example, while 
          the route is on its way to the router), 
          existing mechanisms such as BGP Graceful Restart (GR)
          <xref target="RFC4724"/> 
          and BGP Long-lived Graceful Restart (LLGR) can be used
          to let the router keep the routes from the controller for
          some time.</t>

       <t>With support of "Long-lived Graceful Restart Capability"
          <xref target="I-D.ietf-idr-long-lived-gr"/>, 
          the routes can be retained for a longer time after the controller
          fails.</t>

       <t>After the controller recovers from its failure, 
          the router will have all the routes (including 
          the RPD route being applied) from the controller. </t>

       <t>In the worst case, the controller fails and 
          the RPD routes for adjusting the traffic are withdrawn.
          The adjusted or redirected traffic may then revert to its old path.
          This should be acceptable.</t>

      </section> <!-- About Failure -->

    </section> <!-- Operations -->

    <section title="Contributors">
      <t>The following people have substantially contributed to the definition
      of the BGP-FS RPD and to the editing of this document:<figure
          align="left">
          <artwork><![CDATA[Peng Zhou
Huawei
Email: Jewpon.zhou@huawei.com
]]></artwork>
        </figure></t>
    </section>

    <section anchor="Security" title="Security Considerations">
      <t>Protocol extensions defined in this document do not 
      affect BGP security other than as discussed 
      in the Security Considerations section of <xref target="RFC8955"/>.</t>
    </section>

    <section anchor="Acknowledgements" title="Acknowledgements">
      <t>The authors would like to thank Acee Lindem, Jeff Haas, Jie Dong,
      Lucy Yong, Qiandeng Liang, Zhenqiang Li, Robert Raszuk, 
      Donald Eastlake, Ketan Talaulikar, and Jakob Heitz
      for their comments on this work.</t>
    </section>

    <section anchor="IANA" title="IANA Considerations">

    <section anchor="existing" title="Existing Assignments">
      <t>IANA has assigned an AFI of value 16398 from the registry 
      "Address Family Numbers" for Routing Policy.</t>

      <t>IANA has assigned a SAFI of value 75 from the registry 
         "Subsequent Address Family Identifiers (SAFI) Parameters" 
         for Routing Policy.</t>

      <t>IANA has assigned a Code Point of value 72 from the registry
         "Capability Codes" for Routing Policy Distribution.</t>
    </section>

<section anchor="wide-com-values" title="Registered IANA Wide Communities">
<t>IANA is requested to assign the following values from the "Registered Wide Community Values" registry:
<figure>
<artwork align="center">
<![CDATA[ 
+---------------------+------------------------------+-------------+ 
| Community Value     | Description                  | Reference   | 
+---------------------+------------------------------+-------------+ 
| TBD1                | MATCH AND SET ATTR           |This document|
+---------------------+------------------------------+-------------+ 
| TBD2                | MATCH AND NOT ADVISE         |This document|
+---------------------+------------------------------+-------------+ ]]>
</artwork>
</figure>
</t>
</section>

    <section anchor="rt-attr-type" title="RouteAttr Atom Type">
      <t>IANA is requested to assign a code-point from the registry
         "BGP Community Container Atom Types" as follows:

        <figure>
            <artwork align="center"><![CDATA[
   +---------------------+------------------------------+-------------+
   | Atom Code Point     | Description                  | Reference   |
   +---------------------+------------------------------+-------------+
   | TBD3 (48 suggested) | RouteAttr Atom               |This document|
   +---------------------+------------------------------+-------------+]]></artwork>
          </figure>
      </t>
    </section>

    <section anchor="rt-attr-sub-tlv" title="Route Attributes Sub-sub-TLV Registry">
      <t>IANA is requested to create a registry called 
         "Route Attributes Sub-sub-TLV"
         under RouteAttr Atom Sub-TLV.
         The allocation policy of this registry is "First Come First
         Served (FCFS)".
      </t>
      <t>The initial code points are as follows:

        <figure>
            <artwork align="center"><![CDATA[
   +-------------+-----------------------------------+-------------+
   | Code Point  | Description                       | Reference   |
   +-------------+-----------------------------------+-------------+
   |      0      |  Reserved                         |             |
   +-------------+-----------------------------------+-------------+
   |      1      |  IPv4 Prefix Sub-sub-TLV          |This document|
   +-------------+-----------------------------------+-------------+
   |      2      |  AS-Path Sub-sub-TLV              |This document|
   +-------------+-----------------------------------+-------------+
   |      3      |  Community Sub-sub-TLV            |This document|
   +-------------+-----------------------------------+-------------+ 
   |      4      |  IPv6 Prefix Sub-sub-TLV          |This document|
   +-------------+-----------------------------------+-------------+
   |   5 - 255   |  Available                        |             |
   +-------------+-----------------------------------+-------------+]]></artwork>
          </figure>
      </t>
    </section>

    <section anchor="attr-change-sub-tlv" title="Attribute Change Sub-TLV Registry">
      <t>IANA is requested to create a registry called "Attribute Change Sub-TLV"
         under Parameter(s) TLV.
         The allocation policy of this registry is "First Come First
         Served (FCFS)".
      </t>
      <t>Initial code points are as follows:
        <figure>
            <artwork align="center"><![CDATA[
   +-------------+-----------------------------------+-------------+
   | Code Point  | Description                       | Reference   |
   +-------------+-----------------------------------+-------------+
   |      0      |  Reserved                         |             |
   +-------------+-----------------------------------+-------------+
   |      1      |  MED Change Sub-TLV               |This document|
   +-------------+-----------------------------------+-------------+
   |      2      |  AS-Path Change Sub-TLV           |This document|
   +-------------+-----------------------------------+-------------+
   |   3 - 255   |  Available                        |             |
   +-------------+-----------------------------------+-------------+]]></artwork>
          </figure>
</t>
    </section>
    </section>
  </middle>

  <back>
    <references title="Normative References">
      <?rfc include="reference.RFC.2119"?>

      <?rfc include='reference.RFC.4271'?>

      <?rfc include='reference.RFC.4760'?>

      <?rfc include='reference.RFC.8955'?>

      <?rfc include='reference.RFC.5492'?>

      <?rfc include='reference.RFC.8174'?>

      <?rfc include='reference.I-D.ietf-idr-wide-bgp-communities'?>
    </references>

    <references title="Informative References">
      <?rfc include='reference.RFC.4724'?>
      <?rfc include='reference.RFC.7606'?>
      <?rfc include='reference.I-D.ietf-idr-registered-wide-bgp-communities'?>
      <?rfc include='reference.I-D.ietf-idr-long-lived-gr'?>
    </references>


  </back>
</rfc>
