<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE rfc SYSTEM "rfc2629-xhtml.ent">
<?rfc toc="yes"?>
<?rfc tocompact="yes"?>
<?rfc tocdepth="3"?>
<?rfc tocindent="yes"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc comments="yes"?>
<?rfc inline="yes"?>
<?rfc compact="yes"?>
<?rfc subcompact="no"?>
<rfc xmlns:xi="http://www.w3.org/2001/XInclude" category="std" docName="draft-mhmcsfh-ippm-pam-02" ipr="trust200902" obsoletes="" updates="" submissionType="IETF" xml:lang="en" tocInclude="true" tocDepth="3" symRefs="true" sortRefs="true" version="3">
  <!-- xml2rfc v2v3 conversion 3.6.0 -->
  <?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>

<front>
    <title abbrev="PAM for Multi-SLO">Precision Availability Metrics for SLO-Governed End-to-End Services</title>
    <seriesInfo name="Internet-Draft" value="draft-mhmcsfh-ippm-pam-02"/>
    <author fullname="Greg Mirsky" initials="G." surname="Mirsky">
      <organization>Ericsson</organization>
      <address>
        <postal>
          <street/>
          <city/>
          <code/>
          <country/>
        </postal>
        <email>gregimirsky@gmail.com</email>
      </address>
    </author>
        <author fullname="Joel Halpern" initials="J." surname="Halpern">
      <organization>Ericsson</organization>
      <address>
        <postal>
          <street/>
          <city/>
          <code/>
          <country/>
        </postal>
        <email>joel.halpern@ericsson.com</email>
      </address>
    </author>
    <author fullname="Xiao Min" initials="X." surname="Min">
      <organization>ZTE Corp.</organization>
      <address>
        <postal>
          <street/>
          <city/>
          <code/>
          <country/>
        </postal>
        <email>xiao.min2@zte.com.cn</email>
      </address>
    </author>
     <author fullname="Alexander Clemm" initials="A." surname="Clemm">
      <organization>Futurewei</organization>
      <address>
        <postal>
          <street>2330 Central Expressway</street>
          <city>Santa Clara</city>
          <code>CA 95050</code>
          <country>USA</country>
        </postal>
        <email>ludwig@clemm.org</email>
      </address>
    </author>
        <author fullname="John Strassner" initials="J." surname="Strassner">
      <organization>Futurewei</organization>
      <address>
        <postal>
          <street>2330 Central Expressway</street>
          <city>Santa Clara</city>
          <code>CA 95050</code>
          <country>USA</country>
        </postal>
        <email>strazpdj@gmail.com</email>
      </address>
    </author>
    <author fullname="Jerome Francois" initials="J." surname="Francois">
      <organization>Inria</organization>
      <address>
        <postal>
          <street>615 Rue du Jardin Botanique</street>
          <city>Villers-les-Nancy</city>
          <code>54600</code>
          <country>France</country>
        </postal>
        <email>jerome.francois@inria.fr</email>
      </address>
      </author>

    <date year="2022"/>
    <area>Transport</area>
    <workgroup>Network Working Group</workgroup>
    <keyword>Internet-Draft</keyword>
    <keyword>IPPM</keyword>
    <keyword>Performance Measurement </keyword>
  
  <abstract>
      <t>
   This document defines a set of metrics for networking services with
   performance requirements expressed as Service Level Objectives (SLO).
   These metrics, referred to as Precision Availability Metrics (PAM),
   are useful for defining and monitoring SLOs.
  Specifically, PAM can be used by providers and/or users of the Network Slice service
  to assess whether the service is provided
  in compliance with its specified quality, i.e., in accordance with its
  defined SLOs.
</t>
    </abstract>
    
  </front>
  <middle>
    <section anchor="intro" numbered="true" toc="default">
      <name>Introduction</name>
      <t>
  Network operators and network users often need to assess the quality with which network services are being provided and delivered.
  In particular, where service level guarantees are given and service level objectives (SLOs) are defined,
  it is essential to measure the degree to which the service levels actually delivered comply with
  the SLOs that were agreed, typically in a contract or agreement. 
  Examples of service levels include service latency and packet loss. Simple examples of SLOs associated
  with such service levels would be target values for the maximum packet delay
  (one-way and/or round trip) or maximum packet loss ratio that would be deemed acceptable. 
  </t>
<t>
   An example of an SLO is one that characterizes the continued ability
   of a particular set of nodes to communicate, that is, essentially the absence
   of what, in other contexts, is called a defect. The SLO would include
   the various time and measurement aspects of what would be
   interpreted as a defect or failure to communicate. It is important to note that such a defect
   is defined as a state, and thus, it has conditions that define
   entry into it and exit out of it.  It is expected that a Service Level Agreement (SLA)
   includes a defect-related SLO, possibly in addition to other SLOs.
</t>
  <t>
  To express the perceived quality of delivered networking services versus their SLOs, a set of metrics
  is needed to characterize the quality of the service being provided.  
  Of concern is not so much the absolute service level (for example, actual latency experienced),
  but whether the service is provided in accordance with the negotiated, and eventually contracted, service levels.
  For instance, this may include whether the packet delay that is experienced falls within
  an acceptable range that has been contracted for the service.
  The specific quality of service depends on the SLO that is in effect.   
  <!-- Different groups of applications set forth requirements for varying sets of service levels with different target values.
  Such applications range from Augmented Reality/Virtual Reality to mission-critical controlling industrial processes. -->
  Non-conformance to an SLO might degrade the quality of experience for gamers
  or even jeopardize the safety of a large geographical area.
  Because applications with such requirements represent clear business opportunities, they demand dependable technical solutions.
  </t>
  <t>
  The same service level may be deemed acceptable for one application, while unacceptable for another,
  depending on the needs of the application.  Hence, it is not sufficient to simply measure service levels over time;
  the quality of the service being provided must be assessed with the applicable SLO in mind. 
However, at this point, there are no standard metrics in place that can be used to account for the quality with which services
are delivered relative to their SLOs, and whether their SLOs are being met at all times. 
Such metrics and the instrumentation to support them are essential
for a number of purposes, including monitoring (to ensure that networking services
are performing according to their objectives) as well as accounting (to maintain a record of service levels delivered, important
for monetization of such services as well as for triaging of problems).
      </t>
      <t>
Metrics available today include, for example,
 interface metrics, which are useful for obtaining data on traffic volume and
 behavior that can be observed at an interface <xref target="RFC2863"/>
 and <xref target="RFC8343"/>, but which are agnostic of actual service levels and not specific to
  distinct flows.  Flow records <xref target="RFC7011"/> and <xref target="RFC7012"/> maintain statistics
 about flows, including flow volume and flow duration, but again
 contain very little information about end-to-end service levels, let
  alone whether the delivered service levels meet their targets, i.e., their associated SLOs.
      </t>
      <t>
  This specification introduces a new set of metrics, Precision Availability Metrics (PAM), aimed at capturing
   end-to-end service levels for a flow, specifically the degree to
   which flows comply with the SLOs that are in effect. 
   PAM can be used to assess whether a service is provided in compliance with its specified quality,
   i.e., in accordance with its defined SLOs. This information can be used in multiple ways, for example,
   to optimize service delivery, take timely counteractions in the event of service degradation,
   or account for the quality of services being delivered.
   </t>
   <t>
   Availability is discussed in Section 3.4 of <xref target="RFC7297"/>.
   In this document, the term "availability" reflects that
   a service that is characterized by its SLOs is considered unavailable whenever those SLOs are violated,
   even if basic connectivity is still working. "Precision" refers to the fact that services
   whose end-to-end service levels are governed by SLOs must be delivered precisely
   according to the associated quality and performance requirements. It should be noted that precision
   refers to what is being assessed, not the mechanism used to measure it; in other words, 
   it does not refer to the precision of the mechanism with which actual service levels are measured. 
   Furthermore, the precision, with respect to the delivery of an SLO, only applies when the metric value
approaches the specified threshold levels in the SLO. The specification and implementation of methods
   that provide for accurate measurements is a separate topic independent of the definition of
   the metrics in which the results of such measurements would be expressed.  
      </t>
      <t>
      Service Level Expectations (SLEs), as defined in Section 4.1 of <xref target="I-D.ietf-teas-ietf-network-slices"/>,
      are outside the scope of this document, because it is in the nature of SLEs that they define parts of the SLA that are not easily measured.
       </t>
      <t>
      [Ed.note: It should be noted that at this point, the set of metrics proposed
   here is a "starter set" intended to spark further
   discussion.  Other metrics are certainly conceivable; we expect that
   the list of metrics will evolve as part of the Working Group discussions.]
      </t>
    </section>
    <section numbered="true" toc="default">
      <name>Conventions and Terminology</name>
            <section numbered="true" toc="default">
        <name>Terminology</name>
        <t>
        In this document, SLA and SLO are used as defined in Section 4.1 of <xref target="I-D.ietf-teas-ietf-network-slices"/>.
        </t>
        </section>
        
      <section numbered="true" toc="default">
        <name>Acronyms</name>
    
        <dl newline="false" spacing="compact">
          <dt>PAM</dt><dd>Precision Availability Metric</dd>
          <dt>OAM</dt><dd>Operations, Administration, and Maintenance</dd>
          <dt>SLA</dt><dd>Service Level Agreement</dd>
          <dt>SLE</dt><dd>Service Level Expectation</dd>
          <dt>SLO</dt><dd>Service Level Objective</dd>
          <dt>VI</dt><dd>Violated Interval</dd>
          <dt>VIR</dt><dd>Violated Interval Ratio</dd>
          <dt>SVI</dt><dd>Severely Violated Interval</dd>
          <dt>SVIR</dt><dd>Severely Violated Interval Ratio</dd>
          <dt>VFI</dt><dd>Violation-Free Interval</dd>
        </dl>

      </section>
      <!--
      <section numbered="true" toc="default">
        <name>Requirements Language</name>
        <t>
   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
   NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED",
   "MAY", and "OPTIONAL" in this document are to be interpreted as
   described in BCP 14 <xref target="RFC2119" format="default"/> <xref target="RFC8174" format="default"/> 
   when, and only when, they appear in all capitals, as shown here.
        </t>
      </section>
      -->
    </section>
    <section anchor="ep-metrics-section" numbered="true" toc="default">
      <name>Precision Availability Metrics</name>
      <section anchor="preliminaries" numbered="true" toc="default">
      <name>Introducing Violated Intervals</name>

      <t>
When analyzing the availability metrics of a service flow between two nodes,
we need to select a time interval as the unit of PAM. In <xref target="ITU.G.826" format="default"/>,
a time interval of one second is used. That is reasonable, but some services may require different granularity.
For that reason, the time interval in PAM is viewed as a variable parameter, though it is constant for a particular measurement session.
Further, for the purpose of PAM, each time interval,
e.g., a second or a decamillisecond, is classified as a Violated Interval (VI),
a Severely Violated Interval (SVI), or a Violation-Free Interval (VFI). These are defined as follows:
</t>
      <ul spacing="normal">
        <li>VI is a time interval during which at least one of the performance
      parameters degraded below its pre-defined optimal level threshold.</li>
        <li>SVI is a time interval during which at least one of the performance
      parameters degraded below its pre-defined critical threshold.</li>
        <li>Consequently, VFI is a time interval during which all performance parameters are
        at or better than their respective pre-defined optimal levels.
        <!-- In such a case, the service is in compliance with its specification. --></li>
      </ul>
      <t>
      Mechanisms for setting the threshold levels of an SLO are outside the scope of this document.
      </t>
<t>
From these definitions, a set of basic metrics can be defined that count the number of time intervals that fall into each category: 
</t>
<ul spacing="normal">
<li>VI count. </li>
<li>SVI count. </li>
<li>VFI  count. </li> 
</ul>
<t>
These count metrics are essential in calculating respective ratios (see <xref target="derived-ep-metrics-section"/>)
that can be used to assess the instability of the service.
</t>
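<t>
The classification and the three counts can be illustrated with a minimal sketch.
It is not part of the metric definitions. It assumes a single performance parameter
(one-way delay), takes the worst delay observed in each time interval as input, and
counts an SVI only as an SVI rather than also as a VI. The names DelaySlo, classify,
and count_intervals, as well as the threshold values, are hypothetical.
</t>
<sourcecode type="python"><![CDATA[
```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Interval(Enum):
    VFI = "violation-free"
    VI = "violated"
    SVI = "severely violated"


@dataclass(frozen=True)
class DelaySlo:
    """Hypothetical single-parameter SLO: one-way delay in ms."""
    optimal_ms: float   # exceeding this makes the interval a VI
    critical_ms: float  # exceeding this makes the interval an SVI


def classify(worst_delay_ms, slo):
    """Classify one time interval by its worst observed delay."""
    if worst_delay_ms > slo.critical_ms:
        return Interval.SVI
    if worst_delay_ms > slo.optimal_ms:
        return Interval.VI
    # At or better than the optimal level: violation-free.
    return Interval.VFI


def count_intervals(delays_ms, slo):
    """Return the VI, SVI, and VFI counts for a series of intervals."""
    counts = Counter(classify(d, slo) for d in delays_ms)
    return {kind: counts[kind] for kind in Interval}


# Illustrative thresholds and samples, one sample per interval.
slo = DelaySlo(optimal_ms=20.0, critical_ms=30.0)
counts = count_intervals([12.0, 21.5, 35.0, 18.0, 29.9], slo)
```
]]></sourcecode>
<t>
With these sample values, two intervals are counted as VFI, two as VI, and one as SVI.
</t>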
</section>


    <section anchor="derived-ep-metrics-section" numbered="true" toc="default">
      <name>Derived Precision Availability Metrics</name>
      <t>
      A set of metrics can be created based on PAM introduced in <xref target="ep-metrics-section"/>. 
      In this document, these metrics are referred to as derived PAM.
      Some of these metrics are modeled after Mean Time Between Failure (MTBF) metrics - a
   "failure" in this context referring to a failure to deliver a packet according to its SLO.
      </t>
      <ul spacing="normal">
      <li>
      Time since the last violated interval (e.g., since last violated ms,
      since last violated second). 
      (This parameter is suitable for monitoring the current compliance status of the service, e.g., for trending analysis.)
      </li>
      <li>
      Packets since the last violated packet.  (This parameter is
     suitable for the monitoring of the current compliance status of the service.)
      </li>
      <li>
      Mean time between VIs (e.g., between violated milliseconds, violated seconds) is the
      arithmetic mean of time between consecutive VIs.
      </li>
      <li>
      Mean packets between VIs is the arithmetic
      mean of the number of SLO-compliant packets between consecutive VIs.
     (Another variation of "MTBF" in a service setting.)
      </li>
      </ul>
      <t>An analogous set of metrics can be produced for SVI:</t>
      <ul spacing="normal">
       <li>
      Time since the last SVI (e.g., since last severely violated ms, since last severely violated second).  (This parameter is suitable
      for the monitoring of the current compliance status of the service.)
      </li>
      <li>
      Packets since the last severely violated packet.  (This parameter is
      suitable for the monitoring of the current compliance status of the service.)
      </li>
      <li>
      Mean time between  SVIs (e.g., between severely violated
      milliseconds, severely violated seconds) is the
      arithmetic mean of time between consecutive SVIs.
      </li>
      <li>
      Mean packets between SVIs is the arithmetic
      mean of the number of SLO-compliant packets between consecutive SVIs.
     (Another variation of "MTBF" in a service setting.)
      </li>
     </ul>
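<t>
The time-based metrics listed above could be computed along the following lines.
This is a sketch under stated assumptions rather than a normative procedure: each
time interval has already been labelled "VFI", "VI", or "SVI", the time between two
consecutive violated intervals is measured from start to start, and the function
names are hypothetical. Packet-based variants would need per-packet data and are
not shown.
</t>
<sourcecode type="python"><![CDATA[
```python
from statistics import mean


def positions(labels, kind):
    """Indices of the intervals labelled `kind` ("VI" or "SVI")."""
    return [i for i, label in enumerate(labels) if label == kind]


def time_since_last(labels, kind, interval_s):
    """Seconds elapsed since the end of the most recent `kind`
    interval, or None if no such interval has been observed."""
    hits = positions(labels, kind)
    if not hits:
        return None
    return (len(labels) - 1 - hits[-1]) * interval_s


def mean_time_between(labels, kind, interval_s):
    """Arithmetic mean of the start-to-start time between
    consecutive `kind` intervals (an assumed reading of "time
    between"), or None if fewer than two were observed."""
    hits = positions(labels, kind)
    if len(hits) < 2:
        return None
    gaps = [b - a for a, b in zip(hits, hits[1:])]
    return mean(gaps) * interval_s


# Illustrative labels for nine one-second intervals.
labels = ["VFI", "VI", "VFI", "VFI", "VI",
          "SVI", "VFI", "VI", "VFI"]
since_vi = time_since_last(labels, "VI", interval_s=1.0)
mtb_vi = mean_time_between(labels, "VI", interval_s=1.0)
```
]]></sourcecode>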
        <t>
Determining the current condition of the monitored service with respect to availability/unavailability is helpful.
But because the transition between service availability/unavailability periods is based on a pre-defined number
of consecutive intervals, e.g., ten, shorter conditions may not be adequately reflected.
Two additional PAMs can be used, and they are defined as follows:
</t>
        <ul spacing="normal">
          <li>
 Violated Interval Ratio (VIR) is the ratio of the combined number of VIs and SVIs to the total number of time intervals
      that fall within availability periods during a fixed measurement interval.
  </li>
          <li>
Severely Violated Interval Ratio (SVIR) is the ratio of the number of SVIs to the total number of time intervals that fall within availability periods
during a fixed measurement interval.
</li>
        </ul>
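<t>
As an illustration, both ratios can be derived from per-interval labels once it is
known which intervals fall within availability periods. The sketch below assumes
that this availability flag is determined separately and supplied as input; the
function name and the sample data are hypothetical.
</t>
<sourcecode type="python"><![CDATA[
```python
def interval_ratios(labels, available):
    """Return (VIR, SVIR) over the intervals in available time.

    labels    -- "VFI", "VI", or "SVI" for each time interval
    available -- True where the interval lies within an
                 availability period (assumed to be given)
    """
    in_scope = [lab for lab, ok in zip(labels, available) if ok]
    if not in_scope:
        return None, None  # no available time in this measurement
    svi = in_scope.count("SVI")
    vi = in_scope.count("VI")
    total = len(in_scope)
    return (vi + svi) / total, svi / total


# Ten intervals, two of which lie in an unavailability period.
labels = ["VFI", "VI", "SVI", "VFI", "VFI",
          "SVI", "SVI", "VI", "VFI", "VFI"]
available = [True, True, True, True, True,
             False, False, True, True, True]
vir, svir = interval_ratios(labels, available)
```
]]></sourcecode>
<t>
In this example, eight intervals fall within available time; three of them are
VIs or SVIs and one is an SVI, giving a VIR of 0.375 and an SVIR of 0.125.
</t>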
      </section>
      
      <section anchor="measure-PAM-section" numbered="true" toc="default">
        <name>Service Availability in PAMs</name>
        <t>
VI, SVI, and VFI characterize the communication between two nodes relative
to the required level of performance: either performance is at an acceptable level, or it has degraded below an acceptable level.
In this document, the former condition is referred to as service availability and the latter as service unavailability.
Based on the definitions in <xref target="preliminaries"/>, an SVI is a single time interval of service
unavailability, while a VI or a VFI represents an interval of service availability.
Since the condition of the service is continually changing, periods of availability
and unavailability need to be defined with a duration longer than one time interval
to reduce the number of state changes while correctly reflecting the service condition.
</t>
<t>
It is worth noting that a composite service might include a set of connectivity
constructs.  An SLO might apply to all the constructs, or some
constructs might be assigned different sets
of SLOs.  For the purpose of PAM, each connectivity construct that
composes the service can be monitored for its own SLO conformance as a 
sub-service.  The composition of PAMs of these
sub-services can be viewed as the PAM of the composite service.    
          </t>
<t>
The method to determine the state of the service in terms of PAM is described below:
</t>
        <ul spacing="normal">
          <li>
If ten consecutive SVIs have been detected, then the PAM state of the service
is defined as unavailable, and that period of unavailability
begins at the start of the first SVI in the sequence of consecutive SVIs.
</li>
          <li>
<!--Similarly, for ten consecutive non-SVIs, i.e., either VIs or VFI s, indicate that the service
is in the availability period, i.e., available.-->
Similarly, for ten consecutive non-SVIs (i.e., either VIs or VFIs), the service is defined to be available.
The start of that period is at the beginning of the first non-SVI.
</li>
          <li>
As a result of these two definitions, a sequence of fewer than ten consecutive
SVIs or non-SVIs does not change the PAM state of the service.
For example, if the PAM state is determined as unavailable, a sequence of seven VFIs
is not viewed as an availability period.
</li>
        </ul>
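<t>
The three rules above amount to a small state machine with hysteresis. The
following sketch is one possible reading and not a normative algorithm. It assumes
per-interval labels as input and an initial state of "available" before the first
interval, which this document does not specify. The function name and the
parameter n (ten in the rules above) are hypothetical.
</t>
<sourcecode type="python"><![CDATA[
```python
def availability_states(labels, n=10, initial=True):
    """Return one flag per interval: True means available.

    A run of n consecutive SVIs makes the service unavailable from
    the first SVI of the run; a run of n consecutive non-SVIs makes
    it available from the first non-SVI of the run. Shorter runs
    leave the state unchanged.
    """
    states = []
    state = initial      # assumed state before the first interval
    run_is_svi = None    # whether the current run consists of SVIs
    run_len = 0
    for i, label in enumerate(labels):
        is_svi = label == "SVI"
        if is_svi == run_is_svi:
            run_len += 1
        else:
            run_is_svi, run_len = is_svi, 1
        states.append(state)
        if run_len == n and state == is_svi:
            # The run reached n intervals and contradicts the
            # current state: flip it, back-dated to the run start.
            state = not is_svi
            for j in range(i - n + 1, i + 1):
                states[j] = state
    return states


# 3 VFIs, 10 SVIs, 7 VFIs, 2 SVIs, then 10 VFIs.
labels = (["VFI"] * 3 + ["SVI"] * 10 + ["VFI"] * 7
          + ["SVI"] * 2 + ["VFI"] * 10)
states = availability_states(labels)
```
]]></sourcecode>
<t>
In this example, the service becomes unavailable at the first of the ten SVIs and
stays unavailable through the seven VFIs and two SVIs that follow, because neither
run reaches ten intervals. It becomes available again at the first of the final
ten VFIs. Because a state change is back-dated to the start of the run, the state
of the most recent intervals is provisional until the run either reaches n
intervals or is broken.
</t>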
      </section>
    </section>
    <!--
    <section anchor="requirements-PAM-section" numbered="true" toc="default">
      <name>Requirements to PAM</name>
      <t>
  TBA
      </t>
    </section>
    -->
    <!--
    <section anchor="epm-candidate-section" numbered="true" toc="default">
      <name>Active OAM Protocol for PAM</name>
      <t>
Digital communication methods characterized as the constant-bit rate digital paths and connections allow
measurement of the PAMs without using an active OAM. That is possible because a predictable flow
of digital signals is expected at an egress system. That is not the case for packet-switched networks that are based on
the principle of statistical multiplexing flows. The latter usually improves the utilization of the communication network's resources,
but it also makes the flow unpredictable for the egress system. For that reason, an active OAM has to be used in measuring the
PAM in a network. A combination of OAM protocols can provide the necessary for PAM functionality. 
For example, Bidirectional Forwarding Detection (BFD) <xref target="RFC5880" format="default"/> can be used to monitor the continuity of a path between
the ingress and egress systems. And STAMP <xref target="RFC8762" format="default"/> can be used to measure and calculate performance metrics that are
used as Service Level Objectives. But using two protocols and correlating the state of the network from them adds to the complexity in network operation.
</t>

      <t>
  The Integrated OAM, described in <xref target="I-D.mmm-rtgwg-integrated-oam" format="default"/>,
  combines lightweight FM OAM with the comprehensive set of performance measurement methods.
  PM component of the Integrated OAM is based on <xref target="RFC6374" format="default"/> that supports,
  among other measurement methods, one-way and two-way
  packet loss and packet delay measurements.
      </t>
  
    </section>
-->
    <section anchor="statistical-slo-section" numbered="true" toc="default">
      <name>Statistical SLO</name>
 <t>
     It should be noted that certain Service Level Agreements (SLAs) may be
   statistical, requiring the service levels of packets in a
   flow to adhere to specific distributions.  For example, an SLA might
   state that any given SLO applies to at least a certain percentage of
   packets, allowing for a certain level of, for example, 
   packet loss and/or exceeding packet delay threshold to take place.
   Each such event, in that case, does not necessarily constitute an
   SLO violation.  However, it is still useful to maintain those
   statistics, as the number of out-of-SLO packets still matters when
   looked at in proportion to the total number of packets.
   </t>
   <t>
   In that vein, an SLA might establish an SLO of, say, end-to-end
   latency to not exceed 20 ms for 99% of packets, to not exceed 25 ms for
   99.999% of packets, and to never exceed 30 ms for any packet.  In
   that case, any individual packet with latency greater than 20 ms
   and lower than 30 ms cannot be considered an SLO violation in itself, but compliance with
   the SLO may need to be assessed after the fact.
   </t>
   <t>
   To support statistical SLOs more directly requires
   additional metrics, such as metrics that represent histograms for
   service level parameters with buckets corresponding to individual
   service level objectives.  For the example just given, a histogram
   for a given flow could be maintained with four buckets: one
   containing the count of packets within 20 ms, a second with a count of
   packets between 20 and 25 ms (or simply all within 25 ms), a third with
   a count of packets between 25 and 30 ms (or merely all packets within
   30 ms), and a fourth with a count of anything beyond (or simply a total
   count).  Of course, the number of buckets and the boundaries between
   those buckets should correspond to the needs of the SLA associated with the application,
   i.e., to the specific guarantees and SLOs that were
   provided.  The definition of histogram metrics is for further study (see <xref target="for-discussion"/>).
   </t>
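<t>
A histogram of this kind could be maintained as in the sketch below, which is
offered only as an illustration since the definition of histogram metrics is left
for further study. It uses the non-cumulative bucket variant with the 20, 25, and
30 ms boundaries from the example, considers only packets that were received, and
checks the three example targets after the fact. The names BOUNDS_MS,
latency_histogram, and meets_slo are hypothetical.
</t>
<sourcecode type="python"><![CDATA[
```python
from bisect import bisect_left

# Upper bucket bounds in ms from the example SLO; the last bucket
# collects every packet with latency above 30 ms.
BOUNDS_MS = (20.0, 25.0, 30.0)


def latency_histogram(latencies_ms, bounds=BOUNDS_MS):
    """Count packets per bucket. Bucket i holds latencies above
    bounds[i - 1] and at or below bounds[i]."""
    counts = [0] * (len(bounds) + 1)
    for latency in latencies_ms:
        counts[bisect_left(bounds, latency)] += 1
    return counts


def meets_slo(counts):
    """Check the example targets: 99% of packets within 20 ms,
    99.999% within 25 ms, and no packet beyond 30 ms."""
    total = sum(counts)
    if total == 0:
        return True  # nothing measured, nothing violated
    within_20 = counts[0]
    within_25 = counts[0] + counts[1]
    return (within_20 >= 0.99 * total
            and within_25 >= 0.99999 * total
            and counts[3] == 0)


# 1000 packets: 995 within 20 ms, 4 at 22 ms, 1 at 27 ms.
counts = latency_histogram([10.0] * 995 + [22.0] * 4 + [27.0])
compliant = meets_slo(counts)
```
]]></sourcecode>
<t>
Here the 99% target is met, but the single packet at 27 ms leaves only 99.9% of
packets within 25 ms, so the flow does not meet the example SLO even though no
packet exceeded 30 ms.
</t>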
   </section>
   
   <!--
    <section anchor="xaas-consider" numbered="true" toc="default">
      <name>Availability of Anything-as-a-Service</name>
       <t>
      Anything as a service (XaaS) describes a general category of services related to cloud computing and remote access.
     These services include the vast number of products, tools, and technologies that are delivered to users as a service over the Internet.
     In this document, the availability of XaaS is viewed as the ability to access the service over a period of time with pre-defined performance objectives.
     The XaaS model enables:
     </t>
           <ul spacing="normal">
        <li>Improving the expense model by purchasing services from providers on a subscription basis rather than buying
        individual products, e.g., software, hardware, servers, security, infrastructure, and install them on-site, and then link everything together to create networks.</li>
        <li>Speeding new apps and business processes by quickly adapting to changing market conditions with new applications or solutions. </li>
        <li>Shifting IT resources to specialized higher-value projects that use the core expertise of the company.</li>
      </ul>
      <t> But XaaS model also has potential challenges:</t>
                 <ul spacing="normal">
        <li>Possible downtime resulting from issues of internet reliability, resilience, provisioning, and managing the infrastructure resources.</li>
        <li>Performance issues caused by depleted resources like bandwidth, computing power,
        inefficiencies of virtualized environments, ongoing management and security of multi-cloud services.</li>
        <li>Complexity impacts enterprise IT team that must remain in the process of the continued learning of the provided services.</li>
      </ul>
     <t>
     The framework and metrics of the PAM defined in <xref target="ep-metrics-section"/> allow a provider of XaaS and their customers to quantify,
     measure, monitor for conformance what is often referred to as an ephemeral - availability of the service to be delivered.
     There are other definitions and methods of expressing availability. For example,  <xref target="HighAvailability-WP"/> uses the following equation:
     </t>
     <ul bare="true" empty="true" indent="6" spacing="compact">
     <li>Availability Average  = MTBF/(MTBF + MTRR),</li>
     <li>where:</li>
     <li>MTBF (Mean Time Between Failures) - mean time between individual component failures. For example, a hard drive malfunction or hypervisor reboot.</li>
     <li>MTTR (Mean Time To Repair) -  refers to how long it takes to fix the broken component or the application to come back online, </li>
     </ul>
     <t>
     While this approach estimates the expected availability of a XaaS, the PAM reflects near-real-time availability of a service as experienced by a user.
     It also provides valuable data for more accurate and realistic MTBF and MTTR in the particular environment, and simplifies comparison of different
     solutions that may use redundant servers (web and database), load balancers.
     </t>
     <t>
     In another field of communication, mobile voice and data services, the definition of service availability is understood as
     "the probability of successful service reception: a given area is declared “in-coverage” if the service in that area is
     available with a pre-specified minimum rate of success. Service availability has the advantage of being more easily
     understandable for consumers and is expressed as a percentage of the number of attempts to access a given service."
     <xref target="BEREC-CP"/>. The definition of the availability used in the PAM throughout this document is close to
     the quoted above. It might be considered as the extension that allows regulators, operators, and consumers to compare
     not only the rate of successfully establishing a connection but the quality of the connection during its lifetime.
     </t>
    </section>
    -->
    
    <section anchor="other" numbered="true" toc="default">
     <name>Other PAM Benefits
     </name>
     <t>
     PAM provides a number of benefits over other, more conventional performance metrics.
     Without PAM, it would be possible to conduct ongoing measurements of service levels
     and maintain a time-series of service level records, then assess compliance with specific
     SLOs after the fact.  However, doing so would require the collection of vast amounts of data
     that would need to be generated, exported, transmitted, collected, and stored. 
     In addition, extensive postprocessing would be required to compare that data against SLOs
     and analyze its compliance.  Being able to perform these tasks at scale
     and in real-time would present significant additional challenges.  
     </t>
     <t>
     Adding PAM allows for a more compact expression of service level compliance.
     In that sense, PAM does not simply represent raw data but expresses actionable information. 
     In conjunction with proper instrumentation, PAM can thus help avoid expensive postprocessing. 
     </t>
   </section>
   
        <section anchor="for-discussion" numbered="true" toc="default">
      <name>Discussion Items</name>
      <t>The following items require further discussion:</t>
          <ul spacing="normal">
      <!--
      <li>Terminology - "Errored" vs. "Violated".  The key metrics defined in
      this draft refer to intervals during which violations of
      objectives for service level parameters occur as "violated". The term "errored" was chosen in continuity with the
      concept of "errored seconds", often used in transmission systems.
      However, "violated" may be a more accurate term, as the metrics
      defined here are not "errors" in an absolute sense, but relative
      to a set of defined objectives. </li>
      -->
     <li>Metrics.  The foundational metrics defined in this draft
      refer to violated intervals.  In addition, counts of
      violations related to individual packets may also need to be
      maintained.  Metrics referring to violated packets (i.e.,
      packets that on an individual basis miss a performance objective)
      may be added in a later revision of this document.</li>
     </ul>
      <t>
      The following is a list of items for which further discussion is
   needed as to whether they should be included in the scope of this
   specification:
</t>
     <ul spacing="normal">
     <li>A YANG data model.</li>
     <li>A set of IPFIX Information Elements.</li>
     <li>Statistical metrics: e.g., histograms/buckets.</li>
     <li>Policies regarding the definition of "violated" and "severely violated" time intervals.</li>
     <li>Additional second-order metrics, such as "longest disruption of service time" (measuring consecutive time units with SVIs).</li>
     </ul>
    </section>
    
    <section anchor="iana-consider" numbered="true" toc="default">
      <name>IANA Considerations</name>
      <t>This document has no IANA actions.</t>
    </section>
    <section anchor="security" numbered="true" toc="default">
      <name>Security Considerations</name>
      <t>
   Instrumentation for metrics that are used to assess compliance with
   SLOs constitutes an attractive target for an attacker.  By interfering
   with the maintenance of such metrics, an attacker could cause services to be falsely
   identified as compliant (when they are not) or vice versa
   (i.e., flagged as non-compliant when they are in fact compliant).  While this
   document does not specify how networks should be instrumented to
   maintain the identified metrics, such instrumentation needs to be
   adequately secured to ensure accurate measurements and prohibit
   tampering with metrics being kept.
      </t>
      <t>
         Where metrics are being defined relative to an SLO, the configuration
   of those SLOs needs to be adequately secured.  Likewise, where
   SLOs can be adjusted, the correlation between any metrics instance
   and a particular SLO must be clear. The same service levels that constitute
   SLO violations for one flow, and that should be counted as part of
   the "violated time units" and related metrics for that flow,
   may be perfectly compliant for another flow.  In cases when it is
   impossible to tie together SLOs and PAM properly, it will
   be preferable to merely maintain statistics about service levels
   delivered (for example, overall histograms of end-to-end
   latency) without assessing which constitutes violations.
      </t>
      <t>
      By the same token, where the definition of what constitutes a
   "severe" or a "significant" violation depends on policy or
   context, the configuration of such policy or context needs to be
   specially secured. Also, the configuration of this policy must be bound to
   the metrics being maintained.  This way, it will be clear which policy
   was in effect when those metrics were being assessed.  An attacker
   who can tamper with such policies will render the
   corresponding metrics useless (in the best case) or misleading (in
   the worst case).
      </t>
    </section>
    <section numbered="true" toc="default">
      <name>Acknowledgments</name>
      <t>
         <!--The authors greatly appreciate review and comments by Mohamed "Med" Boucadair.-->TBA
      </t>
    </section>
  </middle>
  <back>
    <references>
      <name>References</name>
      <!--
      <references>
              
        <name>Normative References</name>

        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.2119.xml"/>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.8174.xml"/>
        -->
        <!--
  <?rfc include="reference.RFC.8126"?>


  <?rfc include="reference.RFC.4656"?>


  <?rfc include="reference.RFC.6038"?>

    </references>
      -->
      <references>
        <name>Informative References</name>
        <!--
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.7799.xml"/>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.5880.xml"/>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.8762.xml"/>
        -->
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.2863.xml"/>
  <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.8343.xml"/>
  <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.7011.xml"/>
  <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.7012.xml"/>
  <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.7297.xml"/>
  
  <xi:include href="https://datatracker.ietf.org/doc/bibxml3/draft-ietf-teas-ietf-network-slices.xml"/>
  
  <!-- <xi:include href="https://datatracker.ietf.org/doc/bibxml3/draft-mmm-rtgwg-integrated-oam.xml"/> -->
        <reference anchor="ITU.G.826">
          <front>
            <title>End-to-end error performance parameters and
objectives for international, constant bit-rate digital paths and connections</title>
            <author>
              <organization>ITU-T</organization>
            </author>
            <date month="December" year="2002"/>
          </front>
          <seriesInfo name="ITU-T" value="G.826"/>
        </reference>
  
  <!--
        <reference anchor="HighAvailability-WP" target="https://www.deft.com/wp-content/uploads/pdf/SCTG-High-Availability-White-Paper-Part-2.pdf">
          <front>
            <title>High Availability in Cloud and Dedicated Infrastructure</title>
            <author>
              <organization>Avi Freedman, Server Central</organization>
            </author>
          </front>
        </reference>
        
          <reference anchor="BEREC-CP" target="https://berec.europa.eu/eng/document_register/subject_matter/berec/regulatory_best_practices/common_approaches_positions/8315-berec-common-position-on-information-to-consumers-on-mobile-coverage">
          <front>
            <title>BEREC Common Position on information to consumers on mobile coverage</title>
            <author>
              <organization>Body of European Regulators for Electronic Communications</organization>
            </author>
              <date month="June" year="2018"/>
          </front>
            <seriesInfo name="Common Approaches/Positions" value="BoR (18) 237"/>
        </reference>
-->

      </references>
    </references>
    
         <section anchor="contr-sec" numbered="false" toc="default">
        <name>Contributors' Addresses</name>
            
    <contact fullname="Liuyan Han" initials="L." surname="Han">
      <organization>China Mobile</organization>
      <address>
        <postal>
          <street>32 XuanWuMenXi Street</street>
          <city>Beijing</city>
          <code>100053</code>
          <country>China</country>
        </postal>
        <email>hanliuyan@chinamobile.com</email>
      </address>
    </contact>
    
      <contact fullname="Mohamed Boucadair" initials="M." surname="Boucadair">
      <organization>Orange</organization>
      <address>
        <postal>
          <street>35000 Rennes</street>
    <city/>
          <code/>
          <country>France</country>
        </postal>
        <email>mohamed.boucadair@orange.com</email>
      </address>
    </contact>
    
        <contact fullname="Adrian Farrel" initials="A." surname="Farrel">
      <organization>Old Dog Consulting</organization>
      <address>
        <postal>
          <street/>
    <city/>
          <code/>
          <country>United Kingdom</country>
        </postal>
        <email>adrian@olddog.co.uk</email>
      </address>
    </contact>
    
        </section>

  </back>
</rfc>
