<?xml version="1.0" encoding="iso-8859-1" ?>
<!--<!DOCTYPE rfc SYSTEM "rfc4748.dtd"> -->
    <!DOCTYPE rfc SYSTEM 'rfc2629.dtd' [ 
    <!ENTITY rfc2629 PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.2629.xml'> 
		<!ENTITY rfc2119 PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml'> 
    <!ENTITY rfc8279 PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.8279.xml'> 
    <!ENTITY rfc7761 PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.7761.xml'>
    <!ENTITY rfc7450 PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.7450.xml'>
    <!ENTITY rfc4875 PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.4875.xml'>
    <!ENTITY rfc6388 PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.6388.xml'>
    <!ENTITY rfc9026 PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.9026.xml'>
    <!ENTITY rfc5880 PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.5880.xml'>
    <!ENTITY rfc5884 PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.5884.xml'>
    <!ENTITY rfc8562 PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.8562.xml'>
    <!ENTITY rfc0792 PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.0792.xml'>
    <!ENTITY rfc4443 PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.4443.xml'>
    <!ENTITY rfc8029 PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.8029.xml'>
    <!ENTITY I-D.ietf-pim-sr-p2mp-policy PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml3/reference.I-D.ietf-pim-sr-p2mp-policy.xml'>
    <!ENTITY I-D.ietf-bier-bfd PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml3/reference.I-D.ietf-bier-bfd.xml'>
    <!ENTITY I-D.ietf-bier-ping PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml3/reference.I-D.ietf-bier-ping.xml'>
		]>

<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>		

<?rfc toc="yes"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc disable-output-escaping="yes"?>

<rfc category="info"  docName="draft-ietf-mboned-redundant-ingress-failover-01"
     ipr="trust200902">
  <!-- ***** FRONT MATTER ***** -->
  <front>
    <title abbrev="Multicast Redundant Ingress Router Failover">Multicast Redundant Ingress Router Failover</title>
	
    <author fullname="Greg Shepherd" initials="G" surname="Shepherd">
      <organization>Cisco Systems, Inc.</organization>
      <address>
        <postal>
          <street>170 W. Tasman Dr.</street>
          <city>San Jose</city>
          <region></region>
          <code></code>
          <country>US</country>
        </postal>
        <email>gjshep@gmail.com</email>
      </address>
    </author>

    <author fullname="Zheng Zhang" initials="Z" role="editor" surname="Zhang">
      <organization>ZTE Corporation</organization>
      <address>
        <postal>
          <street></street>
          <city>Nanjing</city>
          <region></region>
          <code></code>
          <country>China</country>
        </postal>
        <email>zhang.zheng@zte.com.cn</email>
      </address>
    </author>

    <author fullname="Yisong Liu" initials="Y" surname="Liu">
      <organization>China Mobile</organization>
      <address>
        <postal>
          <street/>
          <city>Beijing</city>
          <region/>
          <code/>
          <country/>
        </postal>
        <email>liuyisong@chinamobile.com</email>
      </address>
    </author>
    
    <author fullname="Ying Cheng" initials="Y" surname="Cheng">
      <organization>China Unicom</organization>
      <address>
        <postal>
          <street></street>
          <city>Beijing</city>
          <region></region>
          <code></code>
          <country>China</country>
        </postal>
        <email>chengying10@chinaunicom.cn</email>
      </address>
    </author>

     <author fullname="Gyan Mishra" initials="G" surname="Mishra">
      <organization>Verizon Inc.</organization>
      <address>
        <postal>
          <street></street>
          <city></city>
          <region></region>
          <code></code>
          <country></country>
        </postal>
        <email>gyan.s.mishra@verizon.com</email>
      </address>
    </author>
    
	<date year="2022"/>	
    <area>OPS</area>
    <workgroup>MBONED WG</workgroup>
    <keyword>Multicast Redundant Ingress Router, Failover</keyword>
    <abstract>
     <t>This document discusses failover between redundant multicast ingress routers, covering both global multicast and the Service Provider Network MVPN use case.  It analyzes the specifications for global multicast and Multicast VPN Fast Upstream Failover, as well as the Ingress PE Standby Modes and the benefits of each mode.
     </t>
    </abstract>
  </front>

  <!-- ***** MIDDLE MATTER ***** -->

  <middle>
  <section title="Introduction">
    <t>Failover between redundant multicast ingress routers is an important issue in 
    multicast deployment. 
    This document examines the issue within a multicast domain. 
    A multicast domain is a domain that forwards multicast flows using 
    a specific multicast technology, such as PIM (<xref target="RFC7761"/>), 
    BIER (<xref target="RFC8279"/>), P2MP TE tunnels (<xref target="RFC4875"/>), 
    or MLDP (<xref target="RFC6388"/>). 
    The domain may or may not connect to the multicast source and receiver directly.</t>

    <t>The ingress router is the router in the multicast domain that is closest to the multicast source. 
    It may connect to the multicast source directly, 
    or there may be multiple hops between the ingress router and the multicast source. 
    It is also called the first-hop router in PIM, the BFIR in BIER, 
    and the Ingress LSR in P2MP TE tunnels or MLDP.</t>

    <t>Failover between the multicast source and the ingress router can be achieved in many ways 
    and is out of scope for this document.</t>

    <t>The egress router is the router in the multicast domain that is closest to the multicast receiver. 
    It may connect to the multicast receiver directly, 
    or there may be multiple hops between the egress router and the multicast receiver. 
    It is also called the last-hop router in PIM, the BFER in BIER, 
    and the Egress LSR in P2MP TE tunnels or MLDP.</t>

    <t>Other functions may also be deployed in the multicast domain, 
    such as static configuration, AMT (<xref target="RFC7450"/>), 
    or SR P2MP Policy (<xref target="I-D.ietf-pim-sr-p2mp-policy"/>).</t>

    <t>This document does not discuss the details of these technologies. 
    Instead, it discusses general approaches to redundant ingress router failover 
    in the multicast domain.</t>

     <t>This document covers global multicast and the Service Provider Network MVPN use case, in which redundant ingress PE nodes are Upstream Multicast Hop (UMH) candidates and failover takes place from the primary UMH to the standby UMH within a multicast domain.  It analyzes the Multicast VPN Fast Upstream Failover specification (<xref target="RFC9026"/>) 
     and the Ingress PE Standby Modes, along with the benefits of each mode.
     </t>
  </section>

  <section title="Terminology">
    <t>The following abbreviations are used in this document:</t>
    <t>IR: Ingress Router. The router in the multicast domain that is closest to the multicast source.</t>
    <t>ER: Egress Router. The router in the multicast domain that is closest to the multicast receiver.</t>
    <t>SIR: Selected-IR. The IR that is in charge of sending the multicast flow, 
    or the IR whose flow is accepted by the ERs.</t>
    <t>BIR: Backup-IR. The IR that is not in charge of sending the multicast flow, 
    or whose flow is not accepted by the ERs,
    but that takes over the role of the SIR once the SIR fails.</t>
  </section>
 
  <section title="Multicast Redundant Ingress Router Failover"> 
    <figure  align="center">
          <artwork align="center"><![CDATA[			
                source
                 ...
           +-----+      +-----+
+----------+ IR1 +------+ IR2 +---------+
|multicast +-----+      +-----+         |
|domain            ...                  |
|                                       |
|          +-----+      +-----+         |
|          | Rm  |      | Rn  |         |
|          ++---++      +--+--+         |
|           |   |          |            |
|     +-----+   +---+      +-----+      |
|     |             |            |      |
|   +-v---+      +--v--+      +--v--+   |
+---+ ER1 +------+ ER2 +------+ ER3 +---+
    +-----+      +-----+      +-----+
     ...           ...          ...
   receiver      receiver     receiver
                Figure 1
            ]]></artwork>
       <postamble></postamble>
    </figure>

    <t>Usually, a multicast source connects to two IRs, either directly or across multiple hops, 
    to avoid a single point of failure. As shown in Figure 1, 
    there are two IRs close to the multicast source. 
    The two IRs are UMH (Upstream Multicast Hop) candidates for the ERs.</t>

    <t>Both IRs receive the multicast flow from the multicast source. 
    How the flow is forwarded to the ERs depends on the technology 
    deployed in the multicast domain. 
    For example, when PIM is used in the domain, 
    two PIM trees, one rooted at each IR, may be built separately.</t>

    <t>The IRs work with the other routers in the multicast domain, such as the ERs, to 
    minimize multicast packet loss during an IR switchover.</t>
  
    <section title="Switchover"> 
      <t>Failures such as link or node failures may occur in the domain. 
      If the failed link or node is on the forwarding path of a multicast flow, 
      packets of that flow may be lost.</t>
      <t>If there are multiple paths from the IR to the ERs, there is no need to 
      switch IRs when some nodes or links fail.</t>

      <t>
      <list style="symbols">
        <t>When PIM is used in the domain as the multicast forwarding protocol, 
        the forwarding tree for (S, G) or (*, G) is built in advance. 
        When a node or link in the forwarding tree fails, the tree is partially rebuilt.</t>

        <t>When BIER is used in the domain as the multicast forwarding protocol, 
        there is no need to rebuild a forwarding tree after a node or link failure. 
        BIER forwarding recovers along with IGP routing convergence.</t>
        
        <t>When a P2MP TE tunnel or MLDP is used in the domain as the multicast forwarding protocol, 
        the forwarding LSP is built in advance. When a node or link in the LSP fails, 
        the LSP may be partially rebuilt.</t>

        <t>When a static multicast tree or SR P2MP policy is used in the domain, 
        the controller needs to compute a new forwarding path that bypasses 
        the failed node or link.</t>
      </list>
      </t>
  
      <t>In some situations, however, the network contains key nodes or links 
      whose failure leaves no alternative path, so the multicast path cannot be recovered. 
      In that case, an IR switchover is needed.</t>
  
      <figure  align="center">
          <artwork align="center"><![CDATA[			
                  source
                   ...
           +-----+      +-----+
+----------+ IR1 +------+ IR2 +---------+
|          +--+--+      +--+--+         |
|             |            |            |
|          +--+--+      +--+--+         |
|          | Rx  |      | Ry  |         |
|          +-+-+-+      ++---++         |
|            | |         |   |          |
|            | +-----------+ |          |
|            |           | | |          |
|            | +---------+ | |          |
|            | |           | |          |
|          +-v-v-+      +--v-v+         |
|          | Rm  |      | Rn  |         |
|          ++---++      +--+--+         |
|           |   |          |            |
|     +-----+   +---+      +-----+      |
|     |             |            |      |
|   +-v---+      +--v--+      +--v--+   |
+---+ ER1 +------+ ER2 +------+ ER3 +---+
    +-----+      +-----+      +-----+
     ...           ...          ...
   receiver      receiver     receiver
                Figure 2
            ]]></artwork>
       <postamble></postamble>
    </figure>

      <t>For example, in Figure 2 part of the network offers only a single path. 
      IR1 and Rx are key nodes in the domain: when IR1 or Rx fails, no other path remains 
      between IR1 and the ERs.</t>

      <t>
      <list style="symbols">
        <t>When PIM is used in the domain, Rm and Rn may choose Ry as the 
        upstream node and send Join messages to it to build a new tree rooted at IR2.</t>

        <t>When BIER is used in the domain, IR2 should take over the 
        forwarding role and forward the flow to the ERs.</t>
        
        <t>When a P2MP TE tunnel or MLDP is used in the domain, an LSP starting from IR2 can be built 
        to replace the LSP starting from IR1 when the latter no longer works.</t>

        <t>When a static multicast tree or SR P2MP policy is used in the domain, 
        the controller should direct IR2 to forward the multicast flow to the ERs.</t>
      </list>
      </t>    
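      <t>The decision between local repair and IR switchover can be illustrated with a small 
      reachability check. The Python sketch below is illustrative only. It assumes a simplified, 
      downstream-only reading of the links in Figure 2 and models node failures only. 
      The names FIGURE_2, reachable and needs_ir_switchover are hypothetical and are not part of any protocol.</t>
      <figure>
        <artwork><![CDATA[
```python
from collections import deque

# Hypothetical adjacency map read off Figure 2 (downstream direction only).
FIGURE_2 = {
    "IR1": ["Rx"],
    "IR2": ["Ry"],
    "Rx": ["Rm", "Rn"],
    "Ry": ["Rm", "Rn"],
    "Rm": ["ER1", "ER2"],
    "Rn": ["ER3"],
}
ERS = {"ER1", "ER2", "ER3"}


def reachable(topology, root, failed):
    """Return the set of nodes reachable from root, skipping failed nodes."""
    if root in failed:
        return set()
    seen = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nxt in topology.get(node, []):
            if nxt not in failed and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def needs_ir_switchover(topology, sir, ers, failed):
    """True when at least one ER can no longer be reached from the SIR."""
    return not ers <= reachable(topology, sir, failed)


print(needs_ir_switchover(FIGURE_2, "IR1", ERS, {"Ry"}))  # False
print(needs_ir_switchover(FIGURE_2, "IR1", ERS, {"Rx"}))  # True
print(needs_ir_switchover(FIGURE_2, "IR2", ERS, {"Rx"}))  # False
```
]]></artwork>
      </figure>
      <t>In this sketch, a failure of Ry leaves IR1 able to reach every ER, so no switchover is needed. 
      A failure of Rx cuts IR1 off from all ERs, while IR2 can still reach them through Ry.</t>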
    </section>
    
    <section title="Failure detection">
    <t>To achieve a successful IR switchover, the IR node and the path between the IR and the ERs should be monitored for failures, so that the IR can be switched once a failure occurs. BFD or PING methods can be used for this purpose.</t>
    
    <t>BFD <xref target="RFC5880"/> can be used in all deployments. Multipoint BFD <xref target="RFC8562"/> can also be used for failure detection between the IR and the ERs. BFD for MPLS LSPs <xref target="RFC5884"/> can be used in P2MP TE tunnel or MLDP deployments. BIER BFD <xref target="I-D.ietf-bier-bfd"/> can be used in BIER deployments.</t>
    
    <t>IPv4 PING <xref target="RFC0792"/> and IPv6 PING <xref target="RFC4443"/> can also be used in all deployments. LSP-Ping <xref target="RFC8029"/> can be used in P2MP TE tunnel or MLDP deployments. BIER PING <xref target="I-D.ietf-bier-ping"/> can be used in BIER deployments.</t>
    
    <t>The BIR and the ERs can detect SIR node and path failures by using the BFD and PING methods. If the monitoring runs between the SIR and the ER, the challenge is how to trigger the switchover quickly when the BIR needs to start forwarding the multicast flow. If the monitoring runs between the BIR and the SIR, the path between the BIR and the SIR may fail even though it is not on the path from the SIR to the ERs. The BIR may then trigger a switchover by mistake, which causes an unnecessary duplicate flow. To tolerate such a mistaken switchover, the ER must support selective receiving. To minimize mistaken switchovers, the reliability of SIR/BIR detection needs to be enhanced, for example by using redundant reliable paths for detection.</t>
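    <t>One way to picture detection over redundant paths is the following Python sketch. 
    It is a simplified illustration and not an implementation of BFD or PING. It assumes that a path is declared down 
    when no probe has been received for the probe interval multiplied by a detection multiplier, 
    and that the SIR is declared failed only when every monitored path has timed out. 
    The function names, the path names, the default multiplier of 3 and the millisecond values are hypothetical.</t>
    <figure>
      <artwork><![CDATA[
```python
def path_timed_out(last_rx, now, interval, detect_mult=3):
    """A path is treated as down once no probe has arrived
    for longer than interval * detect_mult (same time unit as now)."""
    return now - last_rx > interval * detect_mult


def sir_failed(last_rx_by_path, now, interval, detect_mult=3):
    """Declare the SIR failed only when every monitored path timed out.

    With no monitored paths there is no evidence, so return False.
    """
    return bool(last_rx_by_path) and all(
        path_timed_out(last_rx, now, interval, detect_mult)
        for last_rx in last_rx_by_path.values()
    )


# Times in milliseconds; probes are expected every 50 ms.
one_path_alive = {"direct": 1000, "via_backup": 1180}
all_paths_dead = {"direct": 1000, "via_backup": 1020}
print(sir_failed(one_path_alive, now=1200, interval=50))  # False
print(sir_failed(all_paths_dead, now=1200, interval=50))  # True
```
]]></artwork>
    </figure>
    <t>Requiring all monitored paths to time out trades a slightly later detection for fewer mistaken switchovers 
    when only the direct BIR-to-SIR path fails.</t>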

    <t>Multicast VPN Fast Upstream Failover <xref target="RFC9026"/> defines a mechanism to detect the status of the P-tunnel associated with an x-PMSI A-D route by using P2MP BFD <xref target="RFC8562"/>, together with a new optional transitive BGP attribute called the BFD Discriminator attribute.</t>
    
    <t>Multicast VPN Fast Upstream Failover <xref target="RFC9026"/> also defines a new "Standby PE" BGP Community. The downstream PE originates and sends a "Standby BGP C-multicast route" that carries this community. The route carries the RT import EC of the Standby Upstream PE's UMH route, which identifies the Standby Upstream PE, and its NLRI is constructed using the RD of that UMH route.</t>
    
    </section>
  </section>

  <section title="Stand-by Modes"> 
    <t>When more than one IR can serve as the UMH, 
    and an IR failure leaves no other path from that IR to the ERs, 
    the IR needs to be switched.</t>
  
    <t>There are usually three stand-by modes in multicast IR protection. 
    <xref target="RFC9026"/> describes them briefly. 
    This document discusses the three modes in more detail.</t>

    <t>The ER may send a request to its upstream router or to an IR when it detects a node or path failure.
    The request from the ER may take the form of PIM tree building, BIER overlay protocol 
    signaling, LSP building, or some other way of letting the IR know whether to forward the multicast flow.</t>

    <section title="Cold Root Standby Mode">
      <t>In Cold Root Standby Mode, the ER selects one IR, 
      for example IR1 in Figure 1, as the SIR and signals to it to get the multicast flow.</t>
      <t>When the ER finds that the SIR is down, or that it cannot receive the flow from IR1, 
      the ER signals to IR2 to get the multicast flow.</t>
      <t>
      <list style="symbols">
        <t>For the IRs, both the SIR and the BIR just perform the regular operation of 
        forwarding the flow according to the request from the ER.</t>

        <t>For the ER, the ER must select an IR as the SIR and signal to it. 
        When the SIR fails or the path between the SIR and the ER fails, 
        the ER must signal to the BIR to get the flow.</t>

        <t>For the intermediate routers, they know nothing about the roles of the IRs 
        and just forward packets. There are no duplicate packets in the domain.</t>
      </list>
      </t>
  
      <t>During an IR switchover, the ER detects the failure of the SIR and signals to the BIR.
      Packets are lost while the signaling takes place, until the ER receives the flow from the BIR.</t>
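      <t>The ER behavior described above can be sketched as a small state holder. 
      The Python code below is illustrative only. The class name ColdStandbyER and the send_join callback 
      are hypothetical stand-ins for whatever signaling the deployed technology uses, 
      such as a PIM Join, BIER overlay signaling, or LSP setup.</t>
      <figure>
        <artwork><![CDATA[
```python
class ColdStandbyER:
    """ER that signals to one IR at a time (Cold Root Standby)."""

    def __init__(self, sir, bir, send_join):
        self.sir = sir
        self.bir = bir
        # Hypothetical callback: send_join(ir_name) requests the flow.
        self.send_join = send_join
        self.send_join(self.sir)  # only the selected IR is signaled

    def on_sir_failure(self):
        """Swap roles and signal to the former BIR.

        Packets are lost until the new SIR starts delivering the flow.
        """
        self.sir, self.bir = self.bir, self.sir
        self.send_join(self.sir)
        return self.sir


joins = []
er = ColdStandbyER("IR1", "IR2", joins.append)
er.on_sir_failure()
print(joins)  # ['IR1', 'IR2']
```
]]></artwork>
      </figure>
      <t>The second request is sent only after the failure is detected, which is why this mode has the longest interruption.</t>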
    </section>
    
    <section title="Warm Root Standby Mode">
      <t>In Warm Root Standby Mode, the ER signals to both IR1 and IR2.</t>
      <t>If IR1 is the SIR, IR1 forwards the flow to the ER. 
      The BIR, IR2 in this example, must not forward the flow to the ER until the SIR is down.</t>
      <t>
      <list style="symbols">
        <t>For the IRs, each IR takes the role of SIR or BIR. 
        The BIR must not forward the flow to the ER. 
        When the SIR fails or the path between the SIR and the ER fails, 
        the BIR must start forwarding the flow to the ER. 
        However, it is hard for the BIR to detect such a failure by itself, so some method is needed 
        to deliver the failure notification to the BIR.</t>

        <t>For the ER, the ER does not select the SIR or BIR. 
        It just signals to both of them.</t>
	
        <t>For the intermediate routers, they know nothing about the roles of the IRs 
        and just forward packets. There are no duplicate packets in the domain.</t>
      </list>
      </t>
    
      <t>During an IR switchover, the BIR detects the failure of the SIR 
      and takes over as the SIR. Packets are lost during the IR switchover.</t>
      
      <t>In some deployments, the two IRs may be in charge of different multicast flows. For one multicast flow the SIR may be IR1, and for another the SIR may be IR2, so that the two IRs share the multicast forwarding load. In another possible deployment, the two IRs are in charge of different ERs for the same multicast flow. For example, IR1 sends the multicast flow to some of the ERs, and IR2 sends it to the other ERs. If IR1 detects a problem between itself and its ERs, IR1 may notify IR2 to take over forwarding the multicast flow to the ERs that previously received the flow from IR1.</t>
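      <t>The per-flow role split can be pictured with the following Python sketch. 
      It is a minimal illustration under stated assumptions: each IR keeps a set of flows for which it is the SIR, 
      forwards only those flows, and takes over a peer's flows when it is told that the peer has failed. 
      The class name WarmStandbyIR, its methods and the flow labels are hypothetical.</t>
      <figure>
        <artwork><![CDATA[
```python
class WarmStandbyIR:
    """IR that forwards only the flows for which it is the SIR."""

    def __init__(self, name, selected_flows):
        self.name = name
        self.selected = set(selected_flows)

    def should_forward(self, flow):
        """A BIR for this flow stays silent; the SIR forwards."""
        return flow in self.selected

    def on_peer_failure(self, peer_flows):
        """Take over the flows that the failed peer was serving."""
        self.selected |= set(peer_flows)


ir1 = WarmStandbyIR("IR1", {"(S1,G1)"})
ir2 = WarmStandbyIR("IR2", {"(S2,G2)"})
print(ir2.should_forward("(S1,G1)"))  # False
ir2.on_peer_failure(ir1.selected)     # failure notification arrives
print(ir2.should_forward("(S1,G1)"))  # True
```
]]></artwork>
      </figure>
      <t>How the failure notification reaches the BIR is left open in this sketch, which mirrors the main difficulty of this mode.</t>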
      
    </section>
    
    <section title="Hot Root Standby Mode">
      <t>In Hot Root Standby Mode, the ER signals to both IRs.</t>
      <t>Both IRs send the flow to the ER. The ER must discard the duplicate flow from one of the IRs.</t>
      <t>In this mode, the IRs do not take SIR or BIR roles. Only the ER knows which IR it treats as the SIR.</t>
      <t>
      <list style="symbols">
        <t>For the IRs, an IR need not know the roles of SIR or BIR. 
        It just forwards the flow according to the request received from the ER.</t>

        <t>For the ER, the ER signals to both IRs to get the flow 
        and must discard the duplicate flow from the BIR. 
        When the SIR fails or the path between the SIR and the ER fails, 
        the ER must switch its forwarding plane to forward the flow packets that come from the BIR. 
        Note that different ERs may choose different IRs as their SIR and BIR.</t>
	
        <t>For the intermediate routers, they know nothing about the roles of the IRs 
        and just forward packets. Duplicate packets are forwarded in the domain.</t>
      </list>
      </t>
    
      <t>During an IR switchover, the ER detects the failure of the SIR. 
      Because duplicate flow packets already arrive at the ER, 
      the ER just switches to forwarding the flow that comes from the BIR. 
      There may still be some packet loss during the switching.</t>
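      <t>Selective receiving at the ER can be sketched as follows. 
      The Python code is illustrative only. It assumes that the ER can tell which IR a packet came from, 
      which in practice depends on the deployed technology. The class name HotStandbyER and its methods are hypothetical.</t>
      <figure>
        <artwork><![CDATA[
```python
class HotStandbyER:
    """ER that receives from both IRs but forwards from only one."""

    def __init__(self, sir, bir):
        self.sir = sir
        self.bir = bir

    def accept(self, from_ir):
        """Forward packets from the selected IR; drop the duplicate."""
        return from_ir == self.sir

    def on_sir_failure(self):
        """Switch locally; no new signaling toward the IRs is needed."""
        self.sir, self.bir = self.bir, self.sir


er = HotStandbyER("IR1", "IR2")
print([er.accept(ir) for ir in ("IR1", "IR2")])  # [True, False]
er.on_sir_failure()
print([er.accept(ir) for ir in ("IR1", "IR2")])  # [False, True]
```
]]></artwork>
      </figure>
      <t>Because the switch is a local change at the ER, the interruption is limited to the failure detection time plus the forwarding-plane update.</t>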
    </section>
    
    <section title="Summary"> 
      <t>The following table briefly compares the three modes. 
      'SIR failover' means that the SIR fails or the path between the SIR and the ER fails.</t>
        <texttable anchor="TABLE_1" title="">

     <ttcol align="left">role</ttcol>
     <ttcol align="left">Cold Mode</ttcol>
     <ttcol align="left">Warm Mode</ttcol>
     <ttcol align="left">Hot Mode</ttcol>

     <c>IR</c>
     <c>Forwards the flow according to the request from the ER.</c>
     <c>Takes the role of SIR or BIR. The BIR must not forward the flow to the ER until SIR failover.</c>
     <c>Need not know the roles of SIR or BIR. Forwards the flow according to the request from the ER.</c>
     
     <c>ER</c>
     <c>Must select an IR as the SIR and signal the request to it. Signals to the BIR to request the flow upon SIR failover.</c>
     <c>Does not select the SIR or BIR. Signals to both of them.</c>
     <c>Signals to both the SIR and the BIR. Discards the duplicate flow from the BIR until SIR failover.</c>
  
     <c>Intermediate Router</c>
     <c>Knows nothing about the SIR or BIR. No duplicate flow is forwarded.</c>
     <c>Knows nothing about the SIR or BIR. No duplicate flow is forwarded.</c>
     <c>Knows nothing about the SIR or BIR. A duplicate flow is forwarded.</c>
	</texttable>
  
      <t>The Cold stand-by mode is the easiest to implement, 
      but it has the longest convergence time.</t> 
      <t>The Hot stand-by mode has the least packet loss, 
      but duplicate packets are forwarded in the domain, so more bandwidth is consumed.</t>
      <t>The Warm stand-by mode falls in between for packet loss and convergence time, 
      but it is hard for the BIR to learn of a failure between the SIR and the ERs.</t>
      <t>It is therefore hard to say which mode is best for multicast redundant ingress router failover. 
      The network administrator should select the most suitable mode according to the network deployment.</t>
    </section>
  </section>

  <section title="IANA Considerations">
  <t>  This document does not have any requests for IANA allocation.</t>
  </section>
  
	<section title="Security Considerations"> 
	  <t>This document adds no new security considerations.</t>
	</section>	
	
  </middle>

  <!--  *****BACK MATTER ***** -->

  <back>
    <references title='Normative References'>
    &rfc8279;
    &rfc7761;
    &rfc7450;
    &rfc4875;
    &rfc6388;
    </references>
    
    <references title='Informative References'>
    &rfc9026;
    &rfc5880;
    &rfc5884;
    &rfc8562;
    &rfc0792;
    &rfc4443;
    &rfc8029;
    <?rfc include="reference.I-D.ietf-pim-sr-p2mp-policy"?>
    <?rfc include="reference.I-D.ietf-bier-bfd"?>
    <?rfc include="reference.I-D.ietf-bier-ping"?>
    </references>
	</back>
</rfc>
