<?xml version="1.0" encoding="UTF-8"?>
<rfc category="info" docName="draft-zeng-opsawg-llm-netconf-gap-00" ipr="trust200902" submissionType="IETF" xml:lang="en" version="3" xmlns:xi="http://www.w3.org/2001/XInclude">
  <front>
    <title abbrev="LLM Netconf Gap">Gap Analysis of Network Configuration Protocols
           in LLM-Driven Intent-Based Networking</title>
     <author fullname="Guanming Zeng" initials="G." surname="Zeng">
      <organization>Huawei</organization>
      <address><email>zengguanming@huawei.com</email></address>
    </author>
    <author fullname="Jianwei Mao" initials="J." surname="Mao">
      <organization>Huawei</organization>
      <address><email>maojianwei@huawei.com</email></address>
    </author>
    <author fullname="Bing Liu" initials="B." surname="Liu">
      <organization>Huawei</organization>
      <address><email>leo.liubing@huawei.com</email></address>
    </author>
    <author fullname="Nan Geng" initials="N." surname="Geng">
      <organization>Huawei</organization>
      <address><email>gengnan@huawei.com</email></address>
    </author>
    <author fullname="Xiaotong Shang" initials="X." surname="Shang">
      <organization>Huawei</organization>
      <address><email>shangxiaotong@huawei.com</email></address>
    </author>
    <author fullname="Qiangzhou Gao" initials="Q." surname="Gao">
      <organization>Huawei</organization>
      <address><email>gaoqiangzhou@huawei.com</email></address>
    </author>
    <author fullname="Zhenbin Li" initials="Z." surname="Li">
      <organization>Huawei</organization>
      <address><email>robinli314@163.com</email></address>
    </author>       
    <date year="2025"/>
    <abstract>
      <t>Large Language Models (LLMs) are entering network operations through
      natural-language intent interfaces.  Existing configuration and agent
      protocols (NETCONF, RESTCONF, gNMI, MCP, A2A) were not designed for
      conversational, semantically rich, multi-agent orchestration.  This
      document provides a systematic gap analysis and identifies extension
      points for each protocol to meet intent-based networking
      requirements.</t>
    </abstract>
  </front>

  <middle>
    <section anchor="intro" title="Introduction">
      <t>Intent-based networking (IBN) promises to translate high-level
      operator intent into network configuration without low-level syntax
      errors.  With the advent of LLMs, the operator interface moves from
      YAML and CLI to natural language.  However, none of the current
      configuration or agent-to-agent protocols provides the semantic,
      transactional, and multi-agent primitives required by LLM-driven IBN.
      This document analyzes the gaps and outlines extension directions for
      five protocols: MCP, A2A, NETCONF, RESTCONF, and gNMI.</t>
    </section>

    <section anchor="conv" title="Conventions and Definitions">
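      <t>The key words "<bcp14>MUST</bcp14>", "<bcp14>MUST NOT</bcp14>", "<bcp14>REQUIRED</bcp14>",
      "<bcp14>SHALL</bcp14>", "<bcp14>SHALL NOT</bcp14>", "<bcp14>SHOULD</bcp14>",
      "<bcp14>SHOULD NOT</bcp14>", "<bcp14>RECOMMENDED</bcp14>", "<bcp14>MAY</bcp14>", and
      "<bcp14>OPTIONAL</bcp14>" in this document are to be interpreted as
      described in <xref target="RFC2119"/>.</t>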
      <t>IBN, LLM, Agent, Intent, Tool, Artifact, and Task are used as defined
      in <xref target="I-D.ietf-opsawg-ibn-terminology"/>.</t>
    </section>

    <section anchor="gap" title="Gap Analysis per Protocol">

    <section anchor="mcp" title="MCP">         

    <section anchor="mcp-gap" title="Gap Analysis">
  <t>
    The design goal of the Model Context Protocol (MCP)
    <xref target="MCP-spec"/> is to give a single Large Language Model (LLM)
    a "plug-and-play" tool-calling capability. When MCP is deployed directly
    between a network controller and its devices, however, the following
    structural gaps are exposed.
  </t>

  <section title="Lack of Network-Level Transaction Semantics" anchor="gap-txn">
    <t>
      MCP's tools/call is a stateless, one-shot JSON-RPC invocation.
      Network changes normally require the multi-stage semantics
      "candidate → validate → commit → rollback." MCP has neither a
      candidate datastore nor two-phase-commit primitives. Consequently,
      cross-device bulk deployments cannot guarantee "all-or-nothing"
      atomicity. When partial failures occur, the controller must supply
      its own compensation logic, lengthening the LLM's reasoning chain
      and increasing uncertainty.
    </t>
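    <t>
      The compensation burden described above can be illustrated with a
      minimal Python sketch. It assumes a hypothetical Device class that
      stands in for a device reached through per-device tool calls; the names
      Device, apply, restore, and apply_all_or_nothing are illustrative and
      are not part of MCP. The sketch applies a change device by device and,
      on the first failure, restores the devices already changed in reverse
      order.
    </t>
    <sourcecode type="python"><![CDATA[
```python
from dataclasses import dataclass, field


@dataclass
class Device:
    """Hypothetical stand-in for one device behind a tool call."""
    name: str
    config: dict = field(default_factory=dict)
    fail_on_apply: bool = False  # simulate a device that rejects the change

    def apply(self, change: dict) -> dict:
        """Apply a change and return the snapshot needed to undo it."""
        if self.fail_on_apply:
            raise RuntimeError(f"{self.name}: change rejected")
        snapshot = dict(self.config)
        self.config.update(change)
        return snapshot

    def restore(self, snapshot: dict) -> None:
        """Put the device back into a previously captured state."""
        self.config = dict(snapshot)


def apply_all_or_nothing(devices: list, change: dict) -> bool:
    """Apply `change` to every device; undo earlier ones on failure."""
    applied = []  # (device, snapshot) pairs in order of application
    for dev in devices:
        try:
            applied.append((dev, dev.apply(change)))
        except RuntimeError:
            # Compensation: restore already-changed devices, newest first.
            for done, snapshot in reversed(applied):
                done.restore(snapshot)
            return False
    return True


leaf1 = Device("leaf1", {"vlan": 10})
leaf2 = Device("leaf2", {"vlan": 10})
leaf3 = Device("leaf3", {"vlan": 10}, fail_on_apply=True)
ok = apply_all_or_nothing([leaf1, leaf2, leaf3], {"vlan": 20})
print(ok, leaf1.config, leaf2.config)
```
]]></sourcecode>
    <t>
      This is best-effort compensation rather than true atomicity: a restore
      can itself fail, and the network passes through an intermediate state
      while the loop runs. A candidate datastore combined with a two-phase
      commit would avoid both problems.
    </t>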
  </section>

  <section title="No YANG Semantics Discovery Mechanism" anchor="gap-yang">
    <t>
      Today MCP tool descriptors are written by hand. Network-device
      capabilities are authoritatively defined by YANG models; whenever
      a model is updated, the tool list must be manually re-synchronized.
      Without an automated pipeline "YANG → JSON-Schema → tool descriptor,"
      maintaining the tool catalogue in a large multi-vendor environment
      becomes a bottleneck.
    </t>
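    <t>
      The "YANG → JSON-Schema → tool descriptor" pipeline could look roughly
      like the following Python sketch. It assumes that a YANG parser has
      already produced a simplified dictionary per container; that dictionary
      layout, the type mapping table, and the function names are assumptions
      made for illustration. The output uses the name, description, and
      inputSchema fields of an MCP tool descriptor.
    </t>
    <sourcecode type="python"><![CDATA[
```python
import json

# Assumed mapping from a few YANG built-in types to JSON Schema fragments.
YANG_TO_JSON_SCHEMA = {
    "string": {"type": "string"},
    "boolean": {"type": "boolean"},
    "uint16": {"type": "integer", "minimum": 0, "maximum": 65535},
    "uint32": {"type": "integer", "minimum": 0, "maximum": 4294967295},
}


def leaf_to_schema(leaf: dict) -> dict:
    """Translate one simplified YANG leaf into a JSON Schema property."""
    if leaf["type"] == "enumeration":
        schema = {"type": "string", "enum": list(leaf["enum"])}
    else:
        schema = dict(YANG_TO_JSON_SCHEMA[leaf["type"]])
    if "description" in leaf:
        schema["description"] = leaf["description"]
    return schema


def container_to_tool(module: str, container: dict) -> dict:
    """Build an MCP-style tool descriptor for writing one container."""
    leaves = container["leaves"]
    return {
        "name": f"set_{module}_{container['name']}".replace("-", "_"),
        "description": container.get("description", ""),
        "inputSchema": {
            "type": "object",
            "properties": {
                name: leaf_to_schema(leaf) for name, leaf in leaves.items()
            },
            "required": [
                name for name, leaf in leaves.items() if leaf.get("mandatory")
            ],
        },
    }


# Hand-written stand-in for the output of a YANG parser.
interface = {
    "name": "interface",
    "description": "Configure one interface.",
    "leaves": {
        "name": {"type": "string", "mandatory": True},
        "mtu": {"type": "uint16", "description": "MTU in octets."},
        "enabled": {"type": "boolean"},
    },
}

tool = container_to_tool("example-interfaces", interface)
print(json.dumps(tool, indent=2))
```
]]></sourcecode>
    <t>
      Regenerating descriptors this way whenever a YANG revision changes
      would remove the manual re-synchronization step, although a real
      pipeline would also need to handle lists, choices, unions, and
      constraints such as "must" and "when".
    </t>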
  </section>

  <section title="Encoding and Bandwidth Bottlenecks" anchor="gap-bw">
    <t>
      Network operations often involve high-frequency telemetry
      (sampling every 5 to 60 s, about 10,000 metrics per node). MCP
      encodes every message as JSON-RPC text, which is highly redundant
      for numeric time series, and it defines no compact binary encoding
      or telemetry-oriented subscription primitive. When an LLM needs
      real-time anomaly detection, frequent polling consumes excessive
      bandwidth and CPU, which conflicts with data-center goals of low
      latency and high throughput.
    </t>
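    <t>
      A rough estimate shows the scale of the polling load. The Python
      sketch below measures the size of one JSON-encoded sample and scales
      it to the figures given above. The sample layout, the path, and the
      fleet size of 100 nodes are assumptions chosen for illustration, and
      HTTP and JSON-RPC framing overhead is ignored.
    </t>
    <sourcecode type="python"><![CDATA[
```python
import json

# One telemetry sample as it might appear in a JSON tool result.
# The field names and the path are illustrative assumptions.
sample = {
    "path": "/interfaces/interface[name=eth0]/state/counters/in-octets",
    "timestamp": 1700000000000000000,
    "value": 1234567890,
}
BYTES_PER_METRIC = len(json.dumps(sample).encode("utf-8"))


def polling_load_bps(metrics_per_node: int, interval_s: float,
                     nodes: int) -> float:
    """Payload bit rate when every node is polled once per interval."""
    bytes_per_poll = metrics_per_node * BYTES_PER_METRIC
    return nodes * bytes_per_poll * 8 / interval_s


for interval in (5, 60):
    mbps = polling_load_bps(10_000, interval, nodes=100) / 1e6
    print(f"{interval:>2} s interval, 100 nodes: {mbps:.1f} Mbit/s of JSON")
```
]]></sourcecode>
    <t>
      Most of each sample is the repeated path string and field names, which
      a binary encoding with path compression would largely eliminate.
    </t>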
  </section>

  <section title="Missing Multi-Device Context Correlation" anchor="gap-context">
    <t>
      MCP's invocation context is confined to a single connection; it
      cannot natively carry network-level intent such as "change the
      same VLAN across three leaf switches while keeping the STP root
      bridge unchanged." The LLM must repeat the constraints in the
      prompt, wasting tokens and raising the error rate.
    </t>
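    <t>
      One way to picture the missing context is a structured intent object
      that travels with every call instead of being restated in the prompt.
      The Python sketch below is only an illustration: the Intent class, its
      field names, and the observed-state dictionary are hypothetical and do
      not correspond to anything defined by MCP.
    </t>
    <sourcecode type="python"><![CDATA[
```python
from dataclasses import dataclass, field


@dataclass
class Intent:
    """Hypothetical structured carrier for one multi-device intent."""
    intent_id: str
    targets: list   # devices the change applies to
    change: dict    # the same change for every target
    invariants: dict = field(default_factory=dict)  # state that must hold


def violated(intent: Intent, observed: dict) -> list:
    """Return the invariants that the observed network state breaks."""
    return [
        key for key, want in intent.invariants.items()
        if observed.get(key) != want
    ]


intent = Intent(
    intent_id="intent-42",
    targets=["leaf1", "leaf2", "leaf3"],
    change={"vlan": 20},
    invariants={"stp_root_bridge": "spine1"},
)

print(violated(intent, {"stp_root_bridge": "spine1"}))
print(violated(intent, {"stp_root_bridge": "leaf2"}))
```
]]></sourcecode>
    <t>
      With such an object the constraint is stated once and can be checked
      mechanically after each step, rather than depending on the LLM to
      repeat it correctly.
    </t>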
  </section>

  <section title="Lack of Network Rollback and Audit Hooks" anchor="gap-audit">
    <t>
      Network operations require audit logs that "trace down to the
      leaf node." MCP's tool return body contains only a JSON result;
      there are no standardized fields for rollback-point, commit-id,
      or syslog-severity. Root-cause analysis and compliance audits
      therefore require additional integration with device syslog or
      NETCONF logs, increasing cost.
    </t>
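    <t>
      The kind of standardized fields that are missing can be sketched as an
      audit envelope around a tool result. The Python example below is a
      hypothetical layout, not an MCP structure: the AuditedResult class and
      the field names commit_id and rollback_point are assumptions, and the
      identifier values are invented. Only the severity numbering follows
      the syslog convention.
    </t>
    <sourcecode type="python"><![CDATA[
```python
import json
from dataclasses import asdict, dataclass
from enum import IntEnum


class Severity(IntEnum):
    """A subset of the syslog severity levels (numbering per RFC 5424)."""
    ERROR = 3
    WARNING = 4
    NOTICE = 5
    INFORMATIONAL = 6


@dataclass
class AuditedResult:
    """Hypothetical audit envelope around a tool result."""
    device: str
    commit_id: str
    rollback_point: str
    severity: Severity
    result: dict


def to_audit_line(entry: AuditedResult) -> str:
    """Serialize one entry as a single JSON log line with sorted keys."""
    record = asdict(entry)
    record["severity"] = entry.severity.name.lower()
    return json.dumps(record, sort_keys=True)


entry = AuditedResult(
    device="leaf1",
    commit_id="c-1001",        # invented identifier
    rollback_point="c-1000",   # invented identifier
    severity=Severity.NOTICE,
    result={"mtu": 9216},
)
line = to_audit_line(entry)
print(line)
```
]]></sourcecode>
    <t>
      With fields like these in every result, an auditor could correlate a
      change with its rollback point without joining against device logs.
    </t>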
  </section>

  <section title="Incompatibility with Existing Device Security Models" anchor="gap-sec">
    <t>
      Devices commonly enforce certificate-based mutual-TLS plus NACM
      path-level permissions. MCP currently defines only a Bearer-token
      header and offers no mapping between a tool call and the
      read/write/exec permissions on a YANG node. If a tool-descriptor
      file leaks, an LLM could combine calls to bypass existing ACLs,
      creating a privilege-escalation risk.
    </t>
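    <t>
      The missing mapping between tool calls and YANG node permissions can
      be sketched as follows. The Python example assumes that each tool is
      annotated with the YANG paths and operations it touches; that
      annotation, the tool names, and the simplified rule model are
      hypothetical. The rule evaluation is loosely modeled on NACM but
      deliberately simpler: the first matching rule wins and everything
      else is denied.
    </t>
    <sourcecode type="python"><![CDATA[
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Simplified NACM-like rule: path prefix, operations, permit/deny."""
    path_prefix: str
    operations: frozenset
    permit: bool


# Hypothetical annotation: the YANG paths and operations each tool touches.
TOOL_ACCESS = {
    "get_interface_state": [("/interfaces/interface/state", "read")],
    "set_interface_mtu": [("/interfaces/interface/config/mtu", "update")],
    "clear_bgp_session": [("/bgp/neighbors", "exec")],
}


def is_permitted(rules: list, path: str, operation: str) -> bool:
    """First matching rule wins; anything unmatched is denied."""
    for rule in rules:
        if path.startswith(rule.path_prefix) and operation in rule.operations:
            return rule.permit
    return False


def authorize_tool_call(rules: list, tool: str) -> bool:
    """Permit a tool call only if every path it touches is permitted."""
    if tool not in TOOL_ACCESS:
        return False  # unknown tools are never authorized
    return all(
        is_permitted(rules, path, op) for path, op in TOOL_ACCESS[tool]
    )


READ_ONLY = [Rule("/interfaces", frozenset({"read"}), True)]

for tool in ("get_interface_state", "set_interface_mtu"):
    print(tool, authorize_tool_call(READ_ONLY, tool))
```
]]></sourcecode>
    <t>
      Checking every call against the device's own access rules in this way
      would prevent a combination of individually harmless tools from
      reaching nodes that the calling identity may not modify.
    </t>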
  </section>
  <section title="Lifecycle and State-Management Gap" anchor="gap-lifecycle">
  <t>
    Network changes often last several minutes (waiting for BGP
    convergence or MAC migration). Once an MCP call completes, its
    context is discarded immediately, so there is no way to stream
    intermediate updates such as "convergence 90%." The LLM has no
    choice but to poll repeatedly, increasing load on both itself and
    the device while still failing to achieve a true
    state-machine-driven closed loop.
  </t>
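  <t>
    The interim updates that a long-running change would ideally push can
    be sketched with a Python generator. The example simulates convergence
    in batches; the function name, the state names, and the update layout
    are assumptions for illustration and do not represent MCP messages.
  </t>
  <sourcecode type="python"><![CDATA[
```python
from typing import Iterator


def converge(total_prefixes: int, batch: int) -> Iterator[dict]:
    """Yield interim progress while a simulated change converges."""
    done = 0
    while done < total_prefixes:
        done = min(done + batch, total_prefixes)
        yield {
            "state": "converging",
            "percent": 100 * done // total_prefixes,
        }
    yield {"state": "converged", "percent": 100}


# The consumer reacts to each pushed update instead of polling.
updates = list(converge(total_prefixes=1000, batch=300))
for update in updates:
    print(update["state"], update["percent"])
```
]]></sourcecode>
  <t>
    A consumer of such a stream can drive its state machine from each
    update as it arrives, which is the behavior that repeated polling only
    approximates.
  </t>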
</section>


</section>

        <section anchor="mcp-solution" title="Solution Considerations">
          <t>
    For MCP to serve as "the universal glue between LLMs
    and devices" in production networks, an upper layer must supply a
    transactional state machine, a YANG self-description channel,
    streaming encodings, and fine-grained audit semantics. Without
    these additions, MCP will remain confined to labs or single-device
    scripting and will be unable to close the loop on production-grade
    intent.
  </t>
          <!-- <t>Define MCP profile mcp-network-2025 with auto-generated YANG tool descriptors.</t>
          <t>Add ToolRequest field transaction-id and Commit/Rollback tools.</t>
          <t>Provide streaming transport mapping using gNMI Subscribe.</t>
          <t>Provide A2A gateway exposing MCP tools as Agent Cards.</t> -->
        </section>

      </section>

      <section anchor="a2a" title="A2A">
        <section anchor="a2a-problem" title="Gap Analysis">
  <t>
    Positioned as a "multi-agent collaboration layer," the Agent-to-Agent (A2A) protocol <xref target="A2A-spec"/> was created so that any two LLM agents can discover each other, negotiate, and jointly complete long-running tasks. When it is applied directly to "network-controller ↔ network-device" or "controller ↔ controller" settings, however, the following gaps surface:
  </t>

  <section title="Task Granularity Mismatch with Network Atomic Operations" anchor="a2a-granularity">
    <t>
      A2A Tasks target macro-level business goals (e.g., "relocate a DC"). The smallest deliverable is an Artifact. Network changes, by contrast, must touch a single YANG leaf (e.g., "set interface X MTU = 9216"). The spec offers no "micro-task" primitive, so one Task either carries thousands of lines and becomes bloated, or is split into hundreds of Tasks that explode the state machine and raise LLM-orchestration complexity.
    </t>
  </section>

  <section title="No Network-Wide Transaction or Roll-back Semantics" anchor="a2a-txn">
    <t>
      A2A's task state machine is essentially submitted → working → completed/failed. On failure, the controller receives only a free-text status message for the Task. Network operations demand a cross-device atomic commit plus a rollback tag. The protocol today defines no:</t>
      <ul spacing="normal">
        <li>two-phase-commit token (transaction-id),</li>
        <li>distributed lock or conflict detection,</li>
        <li>unified rollback API (rollback-on-failure).</li>
      </ul>
    <t>
      Controllers must therefore implement compensation themselves, forcing the LLM to reason about "how to write a rollback script," which violates intent-based principles.
    </t>
  </section>

  <section title="Poor Encoding and Bandwidth Efficiency" anchor="a2a-encoding">
    <t>
      A2A mandates JSON for Artifact payloads and runs over HTTP. For high-frequency telemetry (5 s interval, about 10,000 metrics per node) or bulk configuration pushes, JSON's textual redundancy causes:</t>
      <ul spacing="normal">
        <li>controller-device link congestion,</li>
        <li>wasted LLM-context tokens,</li>
        <li>repetitive header parsing and higher CPU load.</li>
      </ul>
      <t>The protocol lacks a binary or streaming encoding option and offers no back-pressure mechanism.</t>
    
  </section>

  <section title="Missing Multi-Device Context Correlation" anchor="a2a-context">
    <t>
      A2A Task context is scoped to a single "conversation"; there is no standard field to express topology-level constraints such as "change the same VLAN on three leaf switches while keeping the STP root bridge unchanged." The LLM must repeat inter-device relations in the prompt, burning tokens and risking truncation that produces configurations which are syntactically valid but topologically wrong.
    </t>
  </section>

  <section title="Incompatibility with Existing Device Security Models" anchor="a2a-sec">
    <t>
      Devices generally enforce certificate-based mutual TLS plus NACM path-level access control. A2A currently specifies only an OAuth2 delegation token and provides no mapping from "Task-level role" to YANG node read/write/exec permissions, nor per-Artifact fine-grained ACLs. Once an Artifact is cached or forwarded it may bypass the certificate chain, leading to privilege escalation or configuration pollution.
    </t>
  </section>

  <section title="No Network-Semantics Discovery Mechanism" anchor="a2a-discovery">
    <t>
      Skills are advertised in the Agent Card, but each skill is described in free text. There are no standard fields for stating "I support OpenConfig BGP 4.0 YANG" or "I manage AS 65001-65500." LLMs must therefore rely on fuzzy matching, which can select the wrong partner and raise the Task failure rate.
    </t>
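    <t>
      Structured capability fields would turn partner selection into an
      exact match. The Python sketch below assumes a hypothetical capability
      block attached to each Agent Card; the AgentCapabilities class, its
      field names, the agent names, and the openconfig-acl entry are
      invented for illustration and are not defined by A2A.
    </t>
    <sourcecode type="python"><![CDATA[
```python
from dataclasses import dataclass


@dataclass
class AgentCapabilities:
    """Hypothetical structured capability block for an Agent Card."""
    agent: str
    yang_modules: dict  # module name -> supported version string
    as_ranges: list     # inclusive (low, high) AS number ranges


def manages_as(caps: AgentCapabilities, asn: int) -> bool:
    """True if the AS number falls inside one of the agent's ranges."""
    return any(low <= asn <= high for low, high in caps.as_ranges)


def select_agents(cards: list, module: str, asn: int) -> list:
    """Return names of agents that support `module` and manage `asn`."""
    return [
        card.agent for card in cards
        if module in card.yang_modules and manages_as(card, asn)
    ]


cards = [
    AgentCapabilities("bgp-east", {"openconfig-bgp": "4.0"},
                      [(65001, 65500)]),
    AgentCapabilities("bgp-west", {"openconfig-bgp": "4.0"},
                      [(64512, 65000)]),
    AgentCapabilities("acl-agent", {"openconfig-acl": "1.0"},
                      [(65001, 65500)]),
]

print(select_agents(cards, "openconfig-bgp", 65010))
```
]]></sourcecode>
    <t>
      Because the match is exact, an agent that manages the right AS range
      but lacks the required YANG module is never selected by accident.
    </t>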
  </section>

  <section title="Life-Cycle and Intermediate-State Reporting Gap" anchor="a2a-lifecycle">
    <t>
      Network changes can last minutes (waiting for BGP convergence, MAC moves). After a Task enters "working," A2A only mandates a final Artifact; there is no standard way to push interim states such as "convergence 70 %" or "MTU changed, waiting for LLDP neighbor re-discovery." The LLM must poll or wait until timeout, increasing load and preventing a true state-machine-driven closed loop.
    </t>
  </section>


</section>
        <section anchor="a2a-solution" title="Solution Considerations">
          <t>
    To act as a "multi-agent collaboration bus" in network environments, A2A must be systematically extended in task granularity, transaction semantics, binary encoding, topology context, security mapping, life-cycle management, and intermediate-state push. Otherwise it will remain suited only to macro-level business flows and will be unable to close the fine-grained, reliable, and reversible network-intent loop required in production.
  </t>
          <!-- <t>Define micro-task type with YANG-patch Artifact.</t>
          <t>Register media-type application/yang-patch+protobuf.</t>
          <t>Add extension field underlying-protocol.</t>
          <t>Allow certificate-bound access tokens (RFC 8705).</t> -->
        </section>
      </section>


      <section anchor="netconf" title="NETCONF">
        <section anchor="netconf-problem" title="Gap Analysis">
          <t>NETCONF <xref target="RFC6241"/> provides transactional, XML-encoded RPCs over SSH.
          It lacks:</t>
          <ul spacing="normal">
            <li>Semantic discovery: YANG models are not self-describing for LLMs; no runtime tool list.</li>
            <li>Session context: no standard place to store intent-id, LLM prompt, or multi-device correlation.</li>
            <li>Streaming telemetry: &lt;notification&gt; is push-style but insufficient for high-frequency KPIs.</li>
            <li>Function-level audit: &lt;commit&gt; is atomic, but per-leaf authorization is out of scope.</li>
          </ul>
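          <t>
            For comparison with the gaps above, the transactional sequence
            that NETCONF already offers can be sketched in Python using only
            the standard library. The example builds the
            &lt;edit-config&gt;, &lt;validate&gt;, and &lt;commit&gt; requests
            against the candidate datastore, plus the
            &lt;discard-changes&gt; request used on failure. It assumes a
            server that advertises the optional :candidate and :validate
            capabilities; the payload namespace urn:example:interfaces and
            the helper names are hypothetical, and no session handling is
            shown.
          </t>
          <sourcecode type="python"><![CDATA[
```python
import xml.etree.ElementTree as ET

NC = "urn:ietf:params:xml:ns:netconf:base:1.0"
ET.register_namespace("nc", NC)


def op(name: str, *children: ET.Element) -> ET.Element:
    """Create a NETCONF element in the base namespace."""
    elem = ET.Element(f"{{{NC}}}{name}")
    elem.extend(children)
    return elem


def rpc(message_id: int, operation: ET.Element) -> str:
    """Wrap an operation in an rpc element and serialize it."""
    root = ET.Element(f"{{{NC}}}rpc", {"message-id": str(message_id)})
    root.append(operation)
    return ET.tostring(root, encoding="unicode")


# Hypothetical payload: set the MTU of one interface.
EX = "urn:example:interfaces"
payload = ET.Element(f"{{{EX}}}interface")
ET.SubElement(payload, f"{{{EX}}}name").text = "eth0"
ET.SubElement(payload, f"{{{EX}}}mtu").text = "9216"

steps = [
    op("edit-config", op("target", op("candidate")), op("config", payload)),
    op("validate", op("source", op("candidate"))),
    op("commit"),
]
on_error = rpc(199, op("discard-changes"))  # sent if any step fails

messages = [rpc(i, step) for i, step in enumerate(steps, start=101)]
for message in messages:
    print(message)
```
]]></sourcecode>
          <t>
            The sequence is transactional only within one session to one
            device; correlating it with an intent or with the same change on
            other devices is left to the client, which is the session-context
            gap listed above.
          </t>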
        </section>
        <section anchor="netconf-solution" title="Solution Considerations">
        <t> TBD </t>
          <!-- <t>Define a YANG module ietf-llm-netconf@2025 that augments &lt;capabilities&gt; with tool-descriptor leaves.</t>
          <t>Add an &lt;intent-id&gt; RPC element and a new &lt;intent-session&gt; subtree.</t>
          <t>Profile existing &lt;notification&gt; to support protobuf encoding.</t>
          <t>Re-use NACM with LLM-specific tool path prefix.</t> -->
        </section>

      </section>

      <section anchor="restconf" title="RESTCONF">
        <section anchor="restconf-problem" title="Gap Analysis">
          <t>RESTCONF <xref target="RFC8040"/> maps YANG to HTTP URIs.  Gaps include:</t>
          <ul spacing="normal">
            <li>No candidate datastore—every PUT/PATCH is immediate.</li>
            <li>No server-side discovery document for LLMs.</li>
            <li>Stateless: no place to store multi-request intent.</li>
            <li>Encoding flexibility may confuse LLM prompt consistency.</li>
          </ul>
        </section>
        <section anchor="restconf-solution" title="Solution Considerations">
        <t> TBD </t>
          <!-- <t>Define query parameter ?dry-run=true returning validation report.</t>
          <t>Register well-known URI /.well-known/llm-tools.json.</t>
          <t>Add Intent-Id header; servers MAY maintain 30-second intent context.</t>
          <t>Recommend JSON as default and publish JSON-Schema for each path.</t> -->
        </section>

      </section>

      <section anchor="gnmi" title="gNMI">
        <section anchor="gnmi-problem" title="Gap Analysis">
          <t>gNMI <xref target="OpenConfig-gNMI"/> delivers high-speed telemetry, but it has the following gaps:</t>
          <ul spacing="normal">
            <li>No semantic metadata for LLMs.</li>
            <li>Set() is transactional only within a single SetRequest to one target; there is no candidate or confirm stage and no cross-device transaction.</li>
            <li>No multi-agent signalling—gNMI is 1:1.</li>
            <li>No standardized error ontology.</li>
          </ul>
        </section>
        <section anchor="gnmi-solution" title="Solution Considerations">
        <t> TBD </t>
          <!-- <t>Define gNMI extension GetSchemaDescription() returning JSON-Schema.</t>
          <t>Add Transaction-extension message for two-phase-commit.</t>
          <t>Use gRPC reflection to expose tool names to MCP registries.</t>
          <t>Map gNMI error codes to RFC 9457 Problem Details.</t> -->
        </section>
      </section>

      
    </section>

    <section anchor="summary" title="Summary">
      <t>No single protocol satisfies all IBN-LLM requirements.  NETCONF/RESTCONF/gNMI need semantic and transactional extensions; MCP/A2A need networking-specific profiling.  A companion document will define unified data models and security frameworks to close the identified gaps.</t>
    </section>
  </middle>

  <back>
    <references title="Normative References">
      <reference anchor="RFC2119"><front><title>Key words for use in RFCs to Indicate Requirement Levels</title>
        <author initials="S." surname="Bradner"/><date year="1997"/></front></reference>
      <reference anchor="RFC6241"><front><title>Network Configuration Protocol (NETCONF)</title>
        <author initials="R." surname="Enns"/><date year="2011"/></front></reference>

      <reference anchor="RFC8040"><front><title>RESTCONF Protocol</title>
        <author initials="A." surname="Bierman"/><date year="2017"/></front></reference>
      <reference anchor="RFC9457"><front><title>Problem Details for HTTP APIs</title>
        <author initials="M." surname="Nottingham"/><date year="2023"/></front></reference>
      <reference anchor="OpenConfig-gNMI"><front><title>gNMI Specification</title>
        <author initials="OpenConfig" surname="Team"/><date year="2022"/></front></reference>
      <reference anchor="MCP-spec"><front><title>Model Context Protocol</title>
        <author initials="Anthropic" surname="Inc"/><date year="2024"/></front></reference>
      <reference anchor="A2A-spec"><front><title>Agent-to-Agent Protocol</title>
        <author initials="Google" surname="LLC"/><date year="2025"/></front></reference>
    </references>

    <references title="Informative References">
      <reference anchor="I-D.ietf-opsawg-ibn-terminology">
    <front>
      <title>Intent-Based Networking Terminology</title>
      <author><organization>IETF</organization></author>
      <date year="2025"/>
    </front>
  </reference>
    </references>
  </back>
</rfc>