<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">

<rfc category="exp"
     ipr="trust200902"
     submissionType="independent"
     docName="draft-gaikwad-aps-profile-00">

  <front>
    <title>Agent Persistent State Profile</title>

    <author fullname="Madhava Gaikwad"
            initials="M."
            surname="Gaikwad">
      <address>
        <email>gaikwad.madhav@gmail.com</email>
      </address>
    </author>

    <date year="2025" month="December" day="01"/>

    <abstract>
      <t>
        Autonomous agents increasingly maintain durable persistent state
        containing user preferences, embedding vectors, safety logs,
        intermediate reasoning steps, and audit traces. Today, agent
        frameworks treat storage as a generic file system, while storage
        administrators treat agents as stateless virtual machines. This
        "layer mismatch" leads to fragility, poor performance, and privacy
        risks.
      </t>

      <t>
        The Agent Persistent State (APS) Profile defines an experimental,
        vendor-neutral storage service class for durable agent state.
        APS emphasizes <spanx style="strong">compliance</spanx>: ensuring
        that memory associated with a specific user or agent identity can
        be retained, audited, and cryptographically erased. APS also
        addresses high-frequency small I/O, vector index workloads, crash
        consistency, and Kubernetes/CSI <xref target="CSI"/> integration.
      </t>

      <t>
        APS introduces a Usage Class ("AgentPersistentState"), a
        versioned PersistentStateLineOfService schema, guidance for
        container orchestration systems, non-normative bindings for
        Swordfish <xref target="Swordfish"/> and Redfish
        <xref target="Redfish"/>, and considerations for multi-tenancy.
        APS is intended as an Experimental RFC to gather implementation
        feedback prior to any standards-track work.
      </t>
    </abstract>
  </front>

  <middle>

    <!-- ========================================================== -->
    <!-- 1. Introduction                                            -->
    <!-- ========================================================== -->
    <section anchor="intro" title="Introduction">
      <t>
        AI agents are entering a new operational phase. Early systems focused
        on prompt engineering and model context. Modern deployments involve
        many autonomous agents with distinct identities and long-lived
        state. Persistent state stores embeddings, preferences, memory,
        safety constraints, and reasoning artifacts that must remain durable
        and compliant across failures.
      </t>

      <t>
        Current practice reveals a structural mismatch: agent developers assume
        implicitly that "somewhere" there is a low-latency, compliant store;
        storage administrators assume short-lived, stateless applications.
        This mismatch leads to data loss, privacy violations, tail-latency
        spikes, and unpredictable behavior.
      </t>

      <t>
        Reviewers have observed that this draft may be premature but is
        potentially timely: as deployments scale to thousands of agents,
        storage semantics will become critical infrastructure. By the time
        APS reaches maturity, the need is expected to be urgent.
      </t>

      <t>
        The key words <spanx style="verb">MUST</spanx>,
        <spanx style="verb">MUST NOT</spanx>, <spanx style="verb">SHOULD</spanx>,
        <spanx style="verb">SHOULD NOT</spanx>, and <spanx style="verb">MAY</spanx>
        in this document are to be interpreted as described in <xref target="RFC2119"/> and
        <xref target="RFC8174"/> when, and only when, they appear in all capitals.
      </t>
    </section>


    <!-- ========================================================== -->
    <!-- 2. Terminology                                             -->
    <!-- ========================================================== -->
    <section anchor="terminology" title="Terminology">
      <t><list style="hanging">

        <t hangText="Agent:">
          An autonomous software entity with a distinct identity and a
          multi-session lifecycle. Agents maintain durable state across
          restarts or container rescheduling, motivating specialized
          persistent storage semantics.
        </t>

        <t hangText="Persistent State:">
          Durable data associated with an agent identity, including
          embeddings, preferences, caches, audit logs, and vector indexes.
        </t>

        <t hangText="APS:">
          The Agent Persistent State Profile defined in this document.
        </t>

        <t hangText="Usage Class:">
          A tag applied to storage resources indicating the expected
          workload. This document defines
          <spanx style="verb">AgentPersistentState</spanx>.
        </t>

        <t hangText="Line of Service (LoS):">
          A structured description of the storage behavior a profile
          advertises, such as performance, durability, retention, and
          erasure characteristics.
        </t>

        <t hangText="Forgetting Policy:">
          A policy that causes persistent state to be deleted or scrubbed
          according to age, event triggers, or external legal/compliance
          requests (for example, scheduled deletion of records older than
          a given retention period).
        </t>

      </list></t>
    </section>


    <!-- ========================================================== -->
    <!-- 3. Landscape of Agent Frameworks                           -->
    <!-- ========================================================== -->
    <section anchor="agent-landscape"
             title="Landscape of Agent Frameworks and Persistent State Needs">

      <t>
        Agent frameworks today implement persistence at the application layer,
        leading to fragility. The following table summarizes common patterns
        as of 2025; it is illustrative and non-exhaustive.
      </t>

      <texttable anchor="frameworks-table">
        <ttcol>Framework</ttcol>
        <ttcol>Current Mechanism</ttcol>
        <ttcol>Pain Point</ttcol>
        <ttcol>APS Benefit</ttcol>

        <c>LangChain (LangGraph)</c>
        <c>Postgres/SQLite/Redis checkpoints</c>
        <c>Locking, tail latency from high-frequency checkpointing</c>
        <c>SmallRandom profile; advisory locking hints</c>

        <c>AutoGPT/BabyAGI</c>
        <c>JSON files and local workspace directories</c>
        <c>Crash can corrupt state; agent "forgets" everything</c>
        <c>CrashConsistency and journaling recommendations</c>

        <c>LlamaIndex</c>
        <c>Vector stores (e.g., HNSW) plus document stores</c>
        <c>Noisy-neighbor throttling during bulk index I/O</c>
        <c>Vector-phase-aware profiles; tail-latency awareness</c>

        <c>CrewAI</c>
        <c>Local SQLite or similar shared state</c>
        <c>Multi-agent persistence fragile across failures</c>
        <c>ReplicationScope for AZ/region resilience</c>
      </texttable>

      <t>
        The structural problems shared across these frameworks motivate APS:
        requirements for durability, performance, and compliance must move
        downward into the infrastructure layer in a standard form.
      </t>
    </section>


    <!-- ========================================================== -->
    <!-- 4. APS Data Model                                          -->
    <!-- ========================================================== -->
    <section anchor="datamodel" title="APS Data Model">
      <t>
        APS defines a new Usage Class:
      </t>

      <figure><artwork><![CDATA[
UsageClass: "AgentPersistentState"
      ]]></artwork></figure>

      <t>
        APS applies to storage abstractions that present block or filesystem
        semantics to agents and their orchestrators, including volumes
        provisioned via Kubernetes PersistentVolumes or surfaces exposed
        through SNIA/DMTF management schemas. Applying APS directly to
        pure object stores or HTTP-based APIs is outside the scope of
        the normative requirements in this document.
      </t>
    </section>


    <!-- ========================================================== -->
    <!-- 5. PersistentStateLineOfService Schema                     -->
    <!-- ========================================================== -->
    <section anchor="los" title="PersistentStateLineOfService Schema">
      <t>
        The PersistentStateLineOfService structure describes a single
        APS profile. Fields marked OPTIONAL are advisory and MAY be
        omitted. A subset of fields constitutes the core APS compliance
        baseline and MUST be present and honored for any profile that
        claims APS compliance.
      </t>

      <figure>
        <artwork><![CDATA[
PersistentStateLineOfService {

  ApsVersion: string           # e.g., "1.0"

  Id: string
  Name: string
  Purpose: "AgentPersistentState"

  IOProfile: enum {
      SmallRandom,
      MixedRandom,
      MetadataHeavy,
      VectorIndex
  }

  # Optional refinement for vector workloads:
  VectorPhase: enum {
      Ingest,
      Query,
      Mixed
  } OPTIONAL

  CapacityRangeGiB {
    MinGiB: integer
    MaxGiB: integer
    RecommendedStepGiB: integer
  }

  Performance {
    MinIOPS: integer
    MaxLatencyMsP95: number
    MaxLatencyMsP99: number

    RecommendedIOSizeKiB: integer
        # Embeddings and vectors may use 16-64 KiB I/O sizes
  }

  Durability {
    DurabilityClass: enum { Standard, High, VeryHigh }
    ReplicationScope: enum { LocalAZ, MultiAZ, MultiRegion }
  }

  CrashConsistency {
    RequiresWriteOrdering: boolean
  }

  Retention {
    RetentionDays: integer
    ForgettingPolicySupported: boolean
  }

  Privacy {
    ContainsUserFacingDataExpected: boolean

    MinDeletionSemantics: enum {
      BestEffort,
      CryptographicErase,
      MediaScrub
    }

    ErasureScope: enum {
      Volume,          # Whole LUN / filesystem
      FilesystemTree,  # Directory / inode subtree
      ObjectPrefix,    # Prefix subset for object-backed systems
      ApplicationKey   # Crypto handled above storage
    }

    MaxEraseLatencyHours: integer
  }

  ConcurrencyHints {        # OPTIONAL, advisory only
    AdvisoryLockingSupported: boolean
  }
}
        ]]></artwork>
      </figure>

      <t>
        The core APS compliance baseline consists of the following fields,
        which <spanx style="strong">MUST</spanx> be present and reflect
        behavior meaningful to the abstraction being exposed:
      </t>

      <t>
        <list style="symbols">
          <t><spanx style="verb">IOProfile</spanx></t>
          <t><spanx style="verb">CapacityRangeGiB</spanx> (all members)</t>
          <t><spanx style="verb">Performance.MinIOPS</spanx></t>
          <t><spanx style="verb">Performance.MaxLatencyMsP95</spanx></t>
          <t><spanx style="verb">Durability.DurabilityClass</spanx></t>
          <t><spanx style="verb">Durability.ReplicationScope</spanx></t>
          <t><spanx style="verb">CrashConsistency.RequiresWriteOrdering</spanx></t>
          <t><spanx style="verb">Retention.RetentionDays</spanx></t>
          <t><spanx style="verb">Retention.ForgettingPolicySupported</spanx></t>
          <t><spanx style="verb">Privacy.MinDeletionSemantics</spanx></t>
          <t><spanx style="verb">Privacy.ErasureScope</spanx></t>
          <t><spanx style="verb">Privacy.MaxEraseLatencyHours</spanx></t>
        </list>
      </t>

      <t>
        Providers <spanx style="strong">MUST NOT</spanx> advertise an
        APS-compliant profile that omits or ignores these core fields.
        Additional fields MAY be omitted if they are not meaningful for
        the abstraction (for example, advisory locking hints on raw block
        volumes).
      </t>
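
      <t>
        As a non-normative illustration, the following Python sketch checks
        that a profile carries every core field listed above. The nested
        dictionary representation, the
        <spanx style="verb">CORE_FIELDS</spanx> tuple, the
        <spanx style="verb">missing_core_fields</spanx> helper, and all
        example values are assumptions made for this example; APS does not
        define a serialization format or a validation API.
      </t>

      <figure>
        <artwork><![CDATA[
```python
# Hypothetical helper: APS does not define a validation API.
# Dotted paths mirror the core compliance baseline listed above.
CORE_FIELDS = (
    "IOProfile",
    "CapacityRangeGiB.MinGiB",
    "CapacityRangeGiB.MaxGiB",
    "CapacityRangeGiB.RecommendedStepGiB",
    "Performance.MinIOPS",
    "Performance.MaxLatencyMsP95",
    "Durability.DurabilityClass",
    "Durability.ReplicationScope",
    "CrashConsistency.RequiresWriteOrdering",
    "Retention.RetentionDays",
    "Retention.ForgettingPolicySupported",
    "Privacy.MinDeletionSemantics",
    "Privacy.ErasureScope",
    "Privacy.MaxEraseLatencyHours",
)


def missing_core_fields(profile):
    """Return the dotted paths of core fields absent from profile."""
    missing = []
    for path in CORE_FIELDS:
        node = profile
        for key in path.split("."):
            if not isinstance(node, dict) or key not in node:
                missing.append(path)
                break
            node = node[key]
    return missing


# Example profile; all values are illustrative, not recommendations.
example_profile = {
    "ApsVersion": "1.0",
    "Id": "aps-example",
    "Name": "Example checkpoint profile",
    "Purpose": "AgentPersistentState",
    "IOProfile": "SmallRandom",
    "CapacityRangeGiB": {
        "MinGiB": 10,
        "MaxGiB": 100,
        "RecommendedStepGiB": 10,
    },
    "Performance": {"MinIOPS": 3000, "MaxLatencyMsP95": 5.0},
    "Durability": {
        "DurabilityClass": "High",
        "ReplicationScope": "MultiAZ",
    },
    "CrashConsistency": {"RequiresWriteOrdering": True},
    "Retention": {
        "RetentionDays": 30,
        "ForgettingPolicySupported": True,
    },
    "Privacy": {
        "MinDeletionSemantics": "CryptographicErase",
        "ErasureScope": "FilesystemTree",
        "MaxEraseLatencyHours": 24,
    },
}
```
        ]]></artwork>
      </figure>

      <t>
        A provider-side tool could refuse to advertise a profile for which
        this helper returns a non-empty list. Presence is only part of the
        requirement: the check does not establish that the advertised
        values are actually honored.
      </t>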

      <t>
        In environments that provision storage in terms of throughput
        (for example, MiB/s) rather than IOPS, implementers MAY derive
        an effective <spanx style="verb">MinIOPS</spanx> from documented
        throughput and average I/O size. One reasonable approach is:
      </t>

      <figure>
        <artwork><![CDATA[
MinIOPS ≈ (ThroughputMiBps * 1024) / RecommendedIOSizeKiB
        ]]></artwork>
      </figure>

      <t>
        APS does not constrain how this effective value is calculated, but
        providers SHOULD document the relationship between throughput and
        MinIOPS.
      </t>
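
      <t>
        The derivation above can be worked through in a few lines of
        Python. This is a sketch only: the function name
        <spanx style="verb">effective_min_iops</spanx> is hypothetical, and
        rounding down is a conservative choice made for this example, not
        something APS requires.
      </t>

      <figure>
        <artwork><![CDATA[
```python
def effective_min_iops(throughput_mib_per_s, io_size_kib):
    """Derive an effective MinIOPS from throughput and I/O size."""
    if throughput_mib_per_s <= 0 or io_size_kib <= 0:
        raise ValueError("throughput and I/O size must be positive")
    # 1 MiB = 1024 KiB, so MiB/s * 1024 / (KiB per I/O) = I/Os per
    # second. Floor division avoids overstating the minimum.
    return int(throughput_mib_per_s * 1024 // io_size_kib)


# 125 MiB/s of provisioned throughput at two I/O sizes:
iops_16k = effective_min_iops(125, 16)   # 8000
iops_64k = effective_min_iops(125, 64)   # 2000
```
        ]]></artwork>
      </figure>

      <t>
        The same throughput yields a fourfold difference in effective
        MinIOPS between 16 KiB and 64 KiB I/O, which is why the assumed
        I/O size needs to be documented alongside the derived value.
      </t>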

      <section anchor="aps-ioprofile-quant"
               title="IOProfile Behavior (Non-Normative)">
        <t>
          IOProfile values describe expected workload envelopes. This section
          is non-normative guidance for implementers and consumers. APS
          profiles are provisioned at volume-creation time, while agent
          workloads may evolve dynamically. Consumers SHOULD treat IOProfile
          as a coarse signal for provisioning decisions, not a real-time
          optimization mechanism.
        </t>

        <t>
          Some IOProfile values (for example, VectorIndex) include an
          OPTIONAL VectorPhase field. VectorPhase refines expected workload
          behavior but does not override the baseline characteristics of
          the underlying IOProfile.
        </t>

        <t>
          <list style="hanging">

            <t hangText="SmallRandom:">
              Dominated by 4-16 KiB random reads and writes, typical for
              agent checkpointing, key-value state blocks, preference updates,
              and log fragments. Storage SHOULD provision enough low-latency
              IOPS to avoid tail-latency amplification during frequent agent
              state flushes.
            </t>

            <t hangText="MetadataHeavy:">
              Characterized by high rates of metadata changes (directory
              operations, inode updates, journal commits). Suitable for agents
              producing many small files or maintaining structured reasoning
              traces. Storage SHOULD support efficient metadata journaling and
              crash-consistent mounts.
            </t>

            <t hangText="MixedRandom:">
              A balanced workload combining small random I/O with occasional
              larger sequential batches. Appropriate when agents interleave
              checkpointing, replay logs, and batched vector updates.
            </t>

            <t hangText="VectorIndex:">
              Represents workloads dominated by embedding ingestion, vector
              construction, nearest-neighbor search, or index compaction.
              Providers SHOULD document which vector phase(s) their
              configuration optimizes. The OPTIONAL VectorPhase enum has the
              following meanings:

              <list style="symbols">
                <t><spanx style="verb">Ingest</spanx> — optimized for
                   high-throughput sequential or semi-sequential writes during
                   bulk embedding generation, HNSW edge linking, or batch
                   import.</t>

                <t><spanx style="verb">Query</spanx> — optimized for
                   low-latency random reads and small-range fetches typical of
                   approximate nearest neighbor search workloads such as
                   DiskANN-style index queries.</t>

                <t><spanx style="verb">Mixed</spanx> — balanced for deployments
                   where ingest, query, and compaction phases interleave or
                   shift over time. Providers SHOULD state which tradeoffs are
                   made (write-friendly vs. read-friendly).</t>
              </list>

              Because vector workloads are phase-shifting, consumers SHOULD NOT
              assume that a single VectorPhase value can optimize all stages of
              the index lifecycle.
            </t>

          </list>
        </t>
      </section>

      <section anchor="durability-guidance"
               title="DurabilityClass Guidance (Non-Normative)">
        <t>
          DurabilityClass values are intentionally abstract. As non-normative
          guidance, providers might interpret them as:
        </t>

        <t>
          <list style="symbols">
            <t><spanx style="verb">Standard</spanx>: durability comparable
              to typical single-region cloud block volumes.</t>
            <t><spanx style="verb">High</spanx>: durability comparable to
              replicated volumes or erasure-coded storage (for example, multiple
              independent copies within a region or across zones).</t>
            <t><spanx style="verb">VeryHigh</spanx>: durability approaching
              or exceeding widely replicated object storage classes.</t>
          </list>
        </t>

        <t>
          Exact "nines" targets are out of scope and MAY vary by provider.
        </t>
      </section>

    </section>


    <!-- ========================================================== -->
    <!-- 6. Crash Consistency                                       -->
    <!-- ========================================================== -->
    <section anchor="crash" title="Crash Consistency Requirements">
      <t>
        All APS-compliant resources <spanx style="strong">MUST</spanx> provide
        metadata consistency after power loss or crash. For the purposes of
        this document, metadata consistency means that the filesystem or
        block namespace remains mountable without manual repair tools and
        that directory and inode structures do not exhibit silent corruption
        that would cause loss of reachability for existing files.
      </t>

      <t>
        The <spanx style="verb">RequiresWriteOrdering</spanx> flag indicates
        that applications rely on storage honoring write ordering or barrier
        semantics (for example, flush/fence operations) to maintain
        write-ahead logging or similar invariants. When true, providers
        <spanx style="strong">MUST</spanx> document the ordering guarantees
        and the mechanisms available to applications.
      </t>

      <t>
        Implementations supporting high-frequency agent checkpointing SHOULD
        document whether O_DIRECT, O_SYNC, or equivalent durability hints are
        honored by the underlying storage stack.
      </t>

      <t>
        Applications are expected to rely on standard filesystem atomic
        operations (such as POSIX rename) for batched updates. APS does not
        define or negotiate additional transactional semantics.
      </t>
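
      <t>
        As an illustration of the rename-based pattern mentioned above, the
        following Python sketch writes a checkpoint to a temporary file,
        flushes it, and then atomically renames it over the previous
        checkpoint. It assumes a POSIX system and a storage stack that
        honors <spanx style="verb">fsync</spanx>; the function name and the
        <spanx style="verb">.ckpt-</spanx> temporary-file prefix are
        hypothetical.
      </t>

      <figure>
        <artwork><![CDATA[
```python
import os
import tempfile


def write_checkpoint_atomically(path, data):
    """Replace path with data; readers see old or new bytes only."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so that the
    # rename stays within a single filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".ckpt-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # Persist contents before rename.
        os.replace(tmp_path, path)  # Atomic rename on POSIX.
    except BaseException:
        # Best-effort cleanup of the temporary file on failure.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # Persist the directory entry so the rename survives a crash.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```
        ]]></artwork>
      </figure>

      <t>
        Whether the two <spanx style="verb">fsync</spanx> calls provide the
        intended durability depends on the ordering guarantees the provider
        documents for the profile.
      </t>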
    </section>


    <!-- ========================================================== -->
    <!-- 7. Privacy, Replication, and Erasure                       -->
    <!-- ========================================================== -->
    <section anchor="privacy" title="Privacy, Replication, and Erasure">
      <t>
        The <spanx style="verb">MinDeletionSemantics</spanx>,
        <spanx style="verb">ErasureScope</spanx>, and
        <spanx style="verb">MaxEraseLatencyHours</spanx> fields describe the
        minimum deletion and erasure behavior for data stored under the
        profile. When combined with <spanx style="verb">ReplicationScope</spanx>,
        providers <spanx style="strong">MUST</spanx> interpret these fields
        as applying to all replicas of the primary data in scope.
      </t>

      <t>
        The <spanx style="verb">MaxEraseLatencyHours</spanx> field
        <spanx style="strong">MUST</spanx> be a strictly positive integer.
        A value of zero is reserved and MUST NOT be used. Deployments that
        effectively provide synchronous erasure MAY represent this with a
        small non-zero value and SHOULD document the expected behavior.
      </t>

      <t>
        For example, if <spanx style="verb">MinDeletionSemantics</spanx> is
        set to <spanx style="verb">CryptographicErase</spanx> and
        <spanx style="verb">ReplicationScope</spanx> is
        <spanx style="verb">MultiRegion</spanx>, then cryptographic erasure
        of all regions <spanx style="strong">MUST</spanx> complete within
        <spanx style="verb">MaxEraseLatencyHours</spanx>, unless the
        provisioning request explicitly accepts weaker behavior.
      </t>
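
      <t>
        The deadline rule in the preceding example can be sketched as a
        check over per-replica completion times. The function name, the
        replica labels, and the idea of recording one completion timestamp
        per replica are assumptions for illustration; APS does not define
        how erase completion is reported.
      </t>

      <figure>
        <artwork><![CDATA[
```python
from datetime import datetime, timedelta, timezone


def erase_deadline_met(requested_at, replica_erased_at, max_hours):
    """Return True if every replica was erased within max_hours.

    replica_erased_at maps a replica label to its erase completion
    time, or None if erasure has not completed there yet.
    """
    if (isinstance(max_hours, bool)
            or not isinstance(max_hours, int)
            or max_hours <= 0):
        raise ValueError(
            "MaxEraseLatencyHours must be a strictly positive integer")
    if not replica_erased_at:
        raise ValueError("at least one replica must be listed")
    deadline = requested_at + timedelta(hours=max_hours)
    # Every replica in scope must be done by the deadline; a single
    # pending or late replica fails the whole check.
    return all(
        done is not None and done <= deadline
        for done in replica_erased_at.values()
    )
```
        ]]></artwork>
      </figure>

      <t>
        For a MultiRegion profile, the mapping would contain one entry per
        region, so the slowest region determines whether the profile's
        commitment was met.
      </t>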

      <t>
        APS describes behavior for primary data under the profile. Snapshots,
        backups, and other disaster recovery mechanisms MAY retain older
        copies of data beyond <spanx style="verb">MaxEraseLatencyHours</spanx>.
        End-to-end compliance for such copies is outside the scope of this
        document and MUST be addressed by higher-level data lifecycle
        policies.
      </t>

      <t>
        The <spanx style="verb">ErasureScope</spanx> field indicates the
        granularity at which cryptographic or destructive erase is expected
        to act:
      </t>

      <t>
        <list style="symbols">
          <t><spanx style="verb">Volume</spanx>: erasure is performed at the
            level of a whole volume, LUN, or filesystem. Destroying the
            associated key or media affects all data hosted on that volume.</t>

          <t><spanx style="verb">FilesystemTree</spanx>: erasure is performed
            at the level of a directory or inode subtree. For example,
            implementations MAY use per-directory filesystem encryption
            mechanisms (such as Linux fscrypt) where destroying the key
            renders only that subtree unreadable.</t>

          <t><spanx style="verb">ObjectPrefix</spanx>: erasure is scoped to
            a subset of objects under a prefix or bucket namespace. This is
            primarily informative for object-backed systems that expose
            filesystem views or may be the basis of future APS profiles for
            object-native agent state. Profiles that target purely block or
            filesystem abstractions MUST NOT use ObjectPrefix.</t>

          <t><spanx style="verb">ApplicationKey</spanx>: cryptographic erase
            is implemented above the storage layer. The storage system
            provides generic persistence, and an application, sidecar, or
            agent runtime manages keys and performs crypto-shredding by
            destroying them.</t>
        </list>
      </t>

      <t>
        When <spanx style="verb">MinDeletionSemantics</spanx> is set to
        <spanx style="verb">CryptographicErase</spanx> and
        <spanx style="verb">ErasureScope</spanx> is
        <spanx style="verb">Volume</spanx>,
        <spanx style="verb">FilesystemTree</spanx>, or
        <spanx style="verb">ObjectPrefix</spanx>, the storage or filesystem
        layer is responsible for ensuring that destruction of the relevant
        cryptographic material renders the targeted scope unreadable without
        affecting adjacent scopes beyond what is documented.
      </t>

      <t>
        When <spanx style="verb">MinDeletionSemantics</spanx> is
        <spanx style="verb">CryptographicErase</spanx> and
        <spanx style="verb">ErasureScope</spanx> is
        <spanx style="verb">ApplicationKey</spanx>, the storage layer is not
        aware of per-agent keys. In this case, APS simply records that
        cryptographic erase is expected to be achieved via application- or
        sidecar-managed key lifecycles. Operators MUST NOT interpret such a
        profile as evidence of storage-controller-level key separation.
      </t>
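
      <t>
        The division of responsibility described above can be summarized
        in a short Python sketch. The function name, the
        <spanx style="verb">abstraction</spanx> labels ("block",
        "filesystem", "object-backed"), and the returned strings are
        hypothetical and are not part of the APS schema.
      </t>

      <figure>
        <artwork><![CDATA[
```python
STORAGE_LAYER_SCOPES = {"Volume", "FilesystemTree", "ObjectPrefix"}


def erase_responsibility(erasure_scope, abstraction):
    """Return which layer performs cryptographic erase.

    abstraction is a hypothetical label for what the profile
    exposes: "block", "filesystem", or "object-backed".
    """
    if erasure_scope == "ApplicationKey":
        # Keys are managed above storage; the storage layer is not
        # aware of per-agent keys.
        return "application"
    if erasure_scope not in STORAGE_LAYER_SCOPES:
        raise ValueError("unknown ErasureScope: " + erasure_scope)
    if (erasure_scope == "ObjectPrefix"
            and abstraction in ("block", "filesystem")):
        raise ValueError(
            "ObjectPrefix is not valid for pure block or "
            "filesystem profiles")
    return "storage"
```
        ]]></artwork>
      </figure>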

      <t>
        Storage backing APS MAY be multi-tenant. Providers
        <spanx style="strong">SHOULD</spanx> document whether
        cryptographic erase is implemented via volume-level keys,
        per-tenant keys, filesystem-level keys, or other mechanisms, and
        what the effective erase granularity is.
      </t>

      <t>
        There is intentionally no single mandatory minimum for
        <spanx style="verb">MinDeletionSemantics</spanx>. A provider that
        cannot implement <spanx style="verb">CryptographicErase</spanx>
        MAY still advertise an APS profile using
        <spanx style="verb">BestEffort</spanx> semantics, but such profiles
        are generally unsuitable for compliance-critical deployments and
        SHOULD be clearly documented as such.
      </t>

      <section anchor="erasure-patterns"
               title="Non-Normative Implementation Patterns">
        <t>
          The following patterns illustrate how different implementations
          might realize finer-grained erasure scopes in practice:
        </t>

        <t>
          <list style="hanging">

            <t hangText="FilesystemTree via fscrypt:">
              A CSI driver or agent runtime mounts a shared volume and
              creates per-agent directories (for example, /data/agent-a,
              /data/agent-b). Each directory is protected with a distinct
              filesystem encryption key (such as a Linux fscrypt policy).
              To erase a specific agent's data, the orchestrator destroys
              that directory's key in a KMS. The APS profile would set
              <spanx style="verb">ErasureScope=FilesystemTree</spanx> and
              <spanx style="verb">MinDeletionSemantics=CryptographicErase</spanx>.
            </t>

            <t hangText="Sub-volume Constructs (vVols, Qtrees):">
              Enterprise arrays that support lightweight logical containers
              inside a physical pool can assign unique keys to those
              constructs. An APS profile might then represent each logical
              container as having <spanx style="verb">ErasureScope=Volume</spanx>
              from the perspective of the host filesystem, while the array
              internally multiplexes many such logical volumes.
            </t>

            <t hangText="ApplicationKey Sidecar:">
              When the underlying storage provides only coarse-grained
              encryption, an application or sidecar may intercept I/O,
              encrypt data with per-agent keys, and write to a shared
              volume. Destroying the per-agent key achieves cryptographic
              erase for that agent's data. An APS profile for such a
              deployment would use
              <spanx style="verb">ErasureScope=ApplicationKey</spanx>.
            </t>

          </list>
        </t>

        <t>
          These patterns are examples only. APS does not mandate a particular
          implementation, but seeks to make the erasure scope explicit so
          operators and frameworks understand where compliance responsibilities
          sit.
        </t>
      </section>
    </section>


    <!-- ========================================================== -->
    <!-- 8. Multi-Tenancy and Volume Scalability                    -->
    <!-- ========================================================== -->
    <section anchor="multi-tenancy" title="Multi-Tenancy and Volume Scalability">
      <t>
        APS does <spanx style="strong">not</spanx> require a 1:1 mapping
        between agents and volumes. Volume-per-agent designs risk exhausting
        controller limits at scale. Instead, APS is intended to describe the
        behavior of shared storage classes that MAY host many agents.
      </t>

      <t>
        Implementations are encouraged to separate cryptographic and policy
        granularity from volume granularity. For example, a single APS
        volume may host multiple agents while still satisfying erasure and
        isolation requirements at a filesystem-tree or application-key
        level, as indicated by <spanx style="verb">ErasureScope</spanx>.
      </t>
    </section>


    <!-- ========================================================== -->
    <!-- 9. Block and File Applicability                            -->
    <!-- ========================================================== -->
    <section anchor="layers" title="Block and File Applicability">
      <t>
        APS is intended for storage abstractions that present block or
        filesystem semantics to agents. Some fields are primarily relevant
        to block devices (for example,
        <spanx style="verb">RecommendedIOSizeKiB</spanx>), and others are
        primarily relevant to filesystems (for example,
        <spanx style="verb">AdvisoryLockingSupported</spanx>).
      </t>

      <t>
        Providers <spanx style="strong">MAY</spanx> omit non-core fields
        that are not meaningful for the abstraction they expose. Core fields
        listed in <xref target="los"/> MUST be honored.
      </t>
    </section>


    <!-- ========================================================== -->
    <!-- 10. Identity and Control Plane Role                        -->
    <!-- ========================================================== -->
    <section anchor="identity" title="Identity and Control Plane Role">
      <t>
        APS discusses persistent state in terms of "agent identity", but does
        not require storage protocols (for example, NVMe, iSCSI, NFS) to
        carry per-agent identifiers on individual I/O operations. In most
        deployments, storage controllers see host or initiator identifiers,
        not agent IDs.
      </t>

      <t>
        Identity mapping and scoping are therefore the responsibility of the
        control plane and agent platform (for example, an orchestrator that
        binds particular agents or pods to specific volumes, filesystems, or
        filesystem subtrees). APS describes the properties of those storage
        constructs; it does not define new protocol headers or on-the-wire
        identity fields.
      </t>
    </section>


    <!-- ========================================================== -->
    <!-- 11. Kubernetes and CSI Mapping                             -->
    <!-- ========================================================== -->
    <section anchor="k8s-mapping"
             title="Kubernetes and CSI Mapping">
      <t>
        Kubernetes and CSI are expected to be primary consumers of APS.
        This section provides guidance for CSI driver integrations. It is
        informative with respect to the CSI specification itself and does
        not modify it.
      </t>

      <t>
        A CSI driver that claims to be "APS-aware"
        <spanx style="strong">MUST</spanx> support a documented mechanism
        to pass APS hints from higher-level configuration (for example,
        StorageClass parameters) to backend profiles. One possible mapping
        is:
      </t>

      <t>
        <list style="symbols">
          <t><spanx style="verb">parameters["aps.storage.k8s.io/ioProfile"]</spanx> →
              <spanx style="verb">IOProfile</spanx></t>
          <t><spanx style="verb">parameters["aps.storage.k8s.io/privacy.minDeletion"]</spanx> →
              <spanx style="verb">MinDeletionSemantics</spanx></t>
          <t><spanx style="verb">parameters["aps.storage.k8s.io/privacy.erasureScope"]</spanx> →
              <spanx style="verb">ErasureScope</spanx></t>
          <t><spanx style="verb">parameters["aps.storage.k8s.io/privacy.maxEraseHours"]</spanx> →
              <spanx style="verb">MaxEraseLatencyHours</spanx></t>
          <t><spanx style="verb">parameters["aps.storage.k8s.io/durability.replicationScope"]</spanx> →
              <spanx style="verb">ReplicationScope</spanx></t>
        </list>
      </t>

      <t>
        The recommended "aps.storage.k8s.io/" prefix reduces the likelihood
        of collisions with vendor-defined parameters. Alternative key names
        MAY be used if they are clearly documented by the CSI driver and its
        consumers.
      </t>
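
      <t>
        As a non-normative sketch, a driver or operator might translate
        StorageClass parameters into APS hints as follows. The function
        name and the choice to reject unknown keys under the APS prefix are
        assumptions of this example, and the
        <spanx style="verb">csi.storage.k8s.io/fstype</spanx> entry in the
        sample input stands in for any non-APS parameter. The sketch
        relies on StorageClass parameter values being strings.
      </t>

      <figure>
        <artwork><![CDATA[
```python
APS_PREFIX = "aps.storage.k8s.io/"

# StorageClass parameter suffix -> APS field, per the list above.
PARAMETER_TO_FIELD = {
    "ioProfile": "IOProfile",
    "privacy.minDeletion": "MinDeletionSemantics",
    "privacy.erasureScope": "ErasureScope",
    "privacy.maxEraseHours": "MaxEraseLatencyHours",
    "durability.replicationScope": "ReplicationScope",
}


def aps_hints_from_parameters(parameters):
    """Extract APS hints from a StorageClass parameters mapping."""
    hints = {}
    for key, value in parameters.items():
        if not key.startswith(APS_PREFIX):
            continue  # Vendor-defined or unrelated parameter.
        field = PARAMETER_TO_FIELD.get(key[len(APS_PREFIX):])
        if field is None:
            raise ValueError("unknown APS parameter: " + key)
        # Parameter values arrive as strings; convert the one
        # integer-valued field.
        if field == "MaxEraseLatencyHours":
            hints[field] = int(value)
        else:
            hints[field] = value
    return hints


sample = {
    "aps.storage.k8s.io/ioProfile": "VectorIndex",
    "aps.storage.k8s.io/privacy.maxEraseHours": "24",
    "csi.storage.k8s.io/fstype": "ext4",
}
hints = aps_hints_from_parameters(sample)
```
        ]]></artwork>
      </figure>

      <t>
        Rejecting unknown keys under the APS prefix surfaces typos at
        provisioning time; a driver that prefers forward compatibility
        might log and ignore them instead.
      </t>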
    </section>


    <!-- ========================================================== -->
    <!-- 12. Bindings and Profile Discovery                         -->
    <!-- ========================================================== -->
    <section anchor="bindings" title="Bindings and Profile Discovery">
      <t>
        APS MAY be expressed in existing schemas as follows:
      </t>

      <t>
        <list style="hanging">

          <t hangText="Swordfish:">
            Implementations expressing APS profiles in Swordfish SHOULD use
            ClassOfService with UsageClass="AgentPersistentState", and attach
            a vendor-defined or standardized PersistentStateLineOfService
            extension carrying APS fields. Discovery of available APS profiles
            then follows Swordfish mechanisms for enumerating lines of service.
          </t>

          <t hangText="Redfish:">
            Implementations exposing APS via Redfish SHOULD use OEM extensions
            on StorageService and Volume resources to represent
            PersistentStateLineOfService fields, together with
            UsageClass="AgentPersistentState" where applicable. Discovery
            would use Redfish resource enumeration.
          </t>

          <t hangText="CSI:">
            APS hints MAY be conveyed through StorageClass parameters by
            convention, as described in <xref target="k8s-mapping"/>.
            A higher-level control plane (for example, a storage operator)
            is expected to map available backend profiles to supported
            StorageClasses.
          </t>

        </list>
      </t>

      <t>
        The details of profile discovery are intentionally left to existing
        management and orchestration mechanisms; APS focuses on the shape
        and semantics of individual profiles.
      </t>

      <t>
        Future work may explore bindings for object-native stores, for
        example, using object prefixes and S3 Select-like mechanisms to
        host agent state directly in object storage, based on the
        ObjectPrefix erasure scope.
      </t>
    </section>


    <!-- ========================================================== -->
    <!-- 13. Versioning and Extensibility                           -->
    <!-- ========================================================== -->
    <section anchor="versioning" title="Versioning and Extensibility">
      <t>
        The <spanx style="verb">ApsVersion</spanx> field in
        <spanx style="verb">PersistentStateLineOfService</spanx> identifies
        the APS version the profile conforms to (for example, "1.0").
      </t>

      <t>
        Future versions of APS <spanx style="strong">MAY</spanx> add new
        OPTIONAL fields. Profiles that set <spanx style="verb">ApsVersion</spanx>
        to a "1.x" value (any minor revision of version 1)
        <spanx style="strong">MUST</spanx> remain backward compatible with
        all fields defined in version 1.0, and consumers
        <spanx style="strong">MUST</spanx> ignore unknown fields rather
        than treating them as errors.
      </t>
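
      <t>
        The following Python sketch illustrates one way a consumer might
        apply these rules. It is non-normative and rests on several
        assumptions: that version strings take a "major.minor" form, that
        a consumer rejects major versions it does not implement (this
        document does not specify that behavior), and that
        <spanx style="verb">KNOWN_FIELDS</spanx> is only an illustrative
        subset of the version 1.0 fields. The function name
        <spanx style="verb">load_profile</spanx> is hypothetical.
      </t>

      <figure>
        <artwork><![CDATA[
```python
# Non-normative sketch of a version-tolerant consumer.
# KNOWN_FIELDS is an illustrative subset, not the full 1.0 schema.
KNOWN_FIELDS = {"ApsVersion", "Id", "Name", "Purpose", "IOProfile"}
SUPPORTED_MAJOR = 1


def load_profile(raw):
    """Keep known fields of a 1.x profile; ignore unknown ones."""
    version = raw.get("ApsVersion")
    if not isinstance(version, str):
        raise ValueError("ApsVersion is missing or not a string")
    major = version.split(".", 1)[0]
    if major != str(SUPPORTED_MAJOR):
        # Assumption: other major versions are rejected.
        raise ValueError("unsupported APS version: " + version)
    # Unknown fields are dropped silently rather than rejected.
    return {k: v for k, v in raw.items() if k in KNOWN_FIELDS}
```
        ]]></artwork>
      </figure>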
    </section>


    <!-- ========================================================== -->
    <!-- 14. Security Considerations                                -->
    <!-- ========================================================== -->
    <section anchor="security" title="Security Considerations">
      <t>
        APS-compliant resources are expected to host sensitive agent state,
        including user-identifiable data and behavior-derived embeddings.
        Implementations <spanx style="strong">SHOULD</spanx> support
        encryption at rest, strong identity scoping in the control plane,
        and cryptographic erasure according to policy.
      </t>
    </section>


    <!-- ========================================================== -->
    <!-- 15. Privacy Considerations                                 -->
    <!-- ========================================================== -->
    <section anchor="privacy-considerations" title="Privacy Considerations">
      <t>
        Enterprises may need to demonstrate that agent memory for a specific
        user was deleted in accordance with regulation (for example, GDPR).
        APS provides explicit deletion semantics, an erasure scope, and a
        latency bound to support such compliance, but implementation details
        and audits remain the responsibility of the operator and any
        higher-level governance systems.
      </t>
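
      <t>
        As a non-normative illustration of how an operator might check the
        latency bound during an audit, the Python sketch below computes an
        erase deadline from <spanx style="verb">MaxEraseLatencyHours</spanx>.
        It assumes the bound is measured from the time the erasure request
        is received; the definition of the field given earlier in this
        document takes precedence. The function names are hypothetical.
      </t>

      <figure>
        <artwork><![CDATA[
```python
from datetime import datetime, timedelta


def erase_deadline(requested_at, max_erase_latency_hours):
    """Latest time by which erasure is expected to complete."""
    return requested_at + timedelta(hours=max_erase_latency_hours)


def erased_within_bound(requested_at, completed_at,
                        max_erase_latency_hours):
    """True if erasure completed no later than the deadline."""
    deadline = erase_deadline(requested_at, max_erase_latency_hours)
    return completed_at <= deadline
```
        ]]></artwork>
      </figure>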
    </section>


    <!-- ========================================================== -->
    <!-- 16. Limitations and Future Work                            -->
    <!-- ========================================================== -->
    <section anchor="limitations" title="Limitations and Future Work">
      <t>
        This document does not define a conformance test suite or reference
        implementation. Future work may include conformance tests for
        APS-aware CSI drivers, managed vector databases, and storage arrays.
      </t>

      <t>
        APS does not provide guidance on how agents should degrade gracefully
        when APS-compliant storage is unavailable. Agent frameworks are
        responsible for fallback behavior (for example, operating in a
        stateless mode).
      </t>

      <t>
        APS does not model cost trade-offs explicitly. Operators are expected
        to balance erase latency, replication scope, erasure scope, and
        storage-class cost.
      </t>

      <t>
        APS does not attempt to model differential privacy, federated
        learning, or higher-level alignment techniques. Those may rely on
        APS semantics but are outside the scope of this profile.
      </t>

      <t>
        A normative JSON Schema for PersistentStateLineOfService is intended
        for a future revision to enable automated validation and tooling.
      </t>
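
      <t>
        Until such a schema exists, tooling might apply ad hoc checks. The
        Python sketch below is non-normative and is not a substitute for a
        schema. It assumes a profile is held as nested dictionaries that
        mirror the block layout of the appendix examples, and it uses the
        IOProfile and ErasureScope values listed in
        <xref target="iana"/>. The ordering checks (minimum capacity not
        above maximum, P95 bound not above P99 bound) are plausibility
        checks assumed for illustration; this document does not state
        them as requirements. The function name
        <spanx style="verb">validate_profile</spanx> is hypothetical.
      </t>

      <figure>
        <artwork><![CDATA[
```python
# Non-normative validation sketch; not a substitute for a schema.
IO_PROFILES = {"SmallRandom", "MixedRandom", "MetadataHeavy",
               "VectorIndex"}
ERASURE_SCOPES = {"Volume", "FilesystemTree", "ObjectPrefix",
                  "ApplicationKey"}


def validate_profile(profile):
    """Return a list of problems found; empty means none found."""
    problems = []
    if profile.get("Purpose") != "AgentPersistentState":
        problems.append("Purpose is not AgentPersistentState")
    if profile.get("IOProfile") not in IO_PROFILES:
        problems.append("unknown IOProfile")
    scope = profile.get("Privacy", {}).get("ErasureScope")
    if scope not in ERASURE_SCOPES:
        problems.append("unknown ErasureScope")
    cap = profile.get("CapacityRangeGiB", {})
    if cap.get("MinGiB", 0) > cap.get("MaxGiB", float("inf")):
        problems.append("MinGiB exceeds MaxGiB")
    perf = profile.get("Performance", {})
    p95 = perf.get("MaxLatencyMsP95")
    p99 = perf.get("MaxLatencyMsP99")
    if p95 is not None and p99 is not None and p95 > p99:
        problems.append("P95 latency bound exceeds P99 bound")
    return problems
```
        ]]></artwork>
      </figure>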
    </section>


    <!-- ========================================================== -->
    <!-- 17. IANA Considerations                                    -->
    <!-- ========================================================== -->
    <section anchor="iana" title="IANA Considerations">
      <t>
        This document requests that IANA create an "APS UsageClass and
        Profile Registry" containing the following initial values:
      </t>

      <t>
        <list style="hanging">
          <t hangText="UsageClass:">
            <list style="symbols">
              <t>AgentPersistentState</t>
            </list>
          </t>

          <t hangText="IOProfile:">
            <list style="symbols">
              <t>SmallRandom</t>
              <t>MixedRandom</t>
              <t>MetadataHeavy</t>
              <t>VectorIndex</t>
            </list>
          </t>

          <t hangText="ErasureScope:">
            <list style="symbols">
              <t>Volume</t>
              <t>FilesystemTree</t>
              <t>ObjectPrefix</t>
              <t>ApplicationKey</t>
            </list>
          </t>
        </list>
      </t>

      <t>
        No additional IANA actions are required by this document.
      </t>
    </section>

  </middle>


  <back>

    <references title="Normative References">
      <reference anchor="RFC2119">
        <front>
          <title>Key words for use in RFCs to Indicate Requirement Levels</title>
          <author initials="S." surname="Bradner" fullname="Scott Bradner">
            <organization>Harvard University</organization>
          </author>
          <date year="1997" month="March"/>
        </front>
        <seriesInfo name="BCP" value="14"/>
        <seriesInfo name="RFC" value="2119"/>
      </reference>
      <reference anchor="RFC8174">
        <front>
          <title>Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words</title>
          <author initials="B." surname="Leiba" fullname="Barry Leiba">
            <organization>Huawei Technologies</organization>
          </author>
          <date year="2017" month="May"/>
        </front>
        <seriesInfo name="BCP" value="14"/>
        <seriesInfo name="RFC" value="8174"/>
      </reference>
    </references>

    <references title="Informative References">
      <reference anchor="Swordfish">
        <front>
          <title>Swordfish Storage Management Specification</title>
          <author>
            <organization>SNIA</organization>
          </author>
          <date year="2024"/>
        </front>
      </reference>

      <reference anchor="Redfish">
        <front>
          <title>Redfish Scalable Platforms Management API</title>
          <author>
            <organization>DMTF</organization>
          </author>
          <date year="2024"/>
        </front>
      </reference>

      <reference anchor="CSI">
        <front>
          <title>Container Storage Interface (CSI)</title>
          <author>
            <organization>Kubernetes Community</organization>
          </author>
          <date year="2024"/>
        </front>
      </reference>
    </references>

    <!-- ========================================================== -->
    <!-- Appendix A: Example APS Profiles                           -->
    <!-- ========================================================== -->
    <section anchor="appendix-profiles" title="Example APS Profiles (Non-Normative)">
      <t>
        This appendix provides three example PersistentStateLineOfService
        instances. These are illustrative and non-normative. They are
        intended to show how capacity, performance, durability, and privacy
        settings can be combined for common agent scenarios.
      </t>

      <section anchor="example-standard" title="APS-Standard (Checkpoint-Oriented)">
        <t>
          APS-Standard is intended for general agent checkpointing and
          short- to medium-term memory. It favors small random I/O, moderate
          capacity, and relatively tight tail latency so that frequent
          state saves do not break conversational responsiveness. The 8 KiB
          recommended I/O size and 3000 IOPS target are representative of
          a busy checkpointing workload where many agents periodically flush
          small state blobs or logs.
        </t>

        <figure>
          <artwork><![CDATA[
PersistentStateLineOfService {
  ApsVersion: "1.0"
  Id: "aps-standard-1"
  Name: "APS-Standard-Checkpoint"
  Purpose: "AgentPersistentState"

  IOProfile: SmallRandom

  CapacityRangeGiB {
    MinGiB: 5
    MaxGiB: 100
    RecommendedStepGiB: 5
  }

  Performance {
    MinIOPS: 3000
    MaxLatencyMsP95: 20
    MaxLatencyMsP99: 50
    RecommendedIOSizeKiB: 8
  }

  Durability {
    DurabilityClass: High
    ReplicationScope: MultiAZ
  }

  CrashConsistency {
    RequiresWriteOrdering: true
  }

  Retention {
    RetentionDays: 30
    ForgettingPolicySupported: true
  }

  Privacy {
    ContainsUserFacingDataExpected: true
    MinDeletionSemantics: CryptographicErase
    ErasureScope: Volume
    MaxEraseLatencyHours: 24
  }

  ConcurrencyHints {
    AdvisoryLockingSupported: true
  }
}
          ]]></artwork>
        </figure>
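
        <t>
          The capacity range above can be applied when sizing a volume.
          The Python sketch below is non-normative and assumes that
          <spanx style="verb">RecommendedStepGiB</spanx> means provisioned
          sizes are whole multiples of the step, and that requests below
          <spanx style="verb">MinGiB</spanx> are raised to the minimum;
          neither behavior is mandated by this example. The function name
          <spanx style="verb">snap_capacity_gib</spanx> is hypothetical.
        </t>

        <figure>
          <artwork><![CDATA[
```python
import math


def snap_capacity_gib(requested, min_gib, max_gib, step_gib):
    """Round a request up to the next step within [min, max]."""
    if requested > max_gib:
        raise ValueError("request exceeds MaxGiB")
    size = max(requested, min_gib)
    # Round up to a whole multiple of the recommended step.
    size = math.ceil(size / step_gib) * step_gib
    return min(size, max_gib)
```
          ]]></artwork>
        </figure>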
      </section>

      <section anchor="example-vector" title="APS-Vector (Embedding-Heavy)">
        <t>
          APS-Vector is intended for embedding-heavy and index-heavy agent
          workloads. It assumes larger capacity, higher IOPS, and a
          recommended I/O size of 32 KiB to reflect typical vector
          segment sizes during scan-oriented operations. It sets
          AdvisoryLockingSupported to false because many vector index
          implementations use append-only or copy-on-write designs and do
          not rely on traditional filesystem locking semantics for
          concurrency control. The VectorPhase field is set to Query to
          indicate that this profile is tuned primarily for read-heavy
          query traffic.
        </t>

        <figure>
          <artwork><![CDATA[
PersistentStateLineOfService {
  ApsVersion: "1.0"
  Id: "aps-vector-1"
  Name: "APS-Vector-Index"
  Purpose: "AgentPersistentState"

  IOProfile: VectorIndex
  VectorPhase: Query

  CapacityRangeGiB {
    MinGiB: 50
    MaxGiB: 2000
    RecommendedStepGiB: 50
  }

  Performance {
    MinIOPS: 8000
    MaxLatencyMsP95: 25
    MaxLatencyMsP99: 80
    RecommendedIOSizeKiB: 32
  }

  Durability {
    DurabilityClass: VeryHigh
    ReplicationScope: MultiRegion
  }

  CrashConsistency {
    RequiresWriteOrdering: true
  }

  Retention {
    RetentionDays: 90
    ForgettingPolicySupported: true
  }

  Privacy {
    ContainsUserFacingDataExpected: true
    MinDeletionSemantics: CryptographicErase
    ErasureScope: FilesystemTree
    MaxEraseLatencyHours: 48
  }

  ConcurrencyHints {
    AdvisoryLockingSupported: false
  }
}
          ]]></artwork>
        </figure>
      </section>

      <section anchor="example-best-effort"
               title="APS-Light (BestEffort / ApplicationKey)">
        <t>
          APS-Light illustrates deployments that rely on application-managed
          keys or do not require strict retention or erase guarantees.
          It is suitable for experimentation, low-risk workloads, or
          sidecar-based encryption designs where the storage layer itself
          does not enforce cryptographic erase.
        </t>

        <figure>
          <artwork><![CDATA[
PersistentStateLineOfService {
  ApsVersion: "1.0"
  Id: "aps-light-1"
  Name: "APS-Light-Experiment"
  Purpose: "AgentPersistentState"

  IOProfile: MixedRandom

  CapacityRangeGiB {
    MinGiB: 1
    MaxGiB: 200
    RecommendedStepGiB: 1
  }

  Performance {
    MinIOPS: 500
    MaxLatencyMsP95: 40
    MaxLatencyMsP99: 120
    RecommendedIOSizeKiB: 4
  }

  Durability {
    DurabilityClass: Standard
    ReplicationScope: LocalAZ
  }

  CrashConsistency {
    RequiresWriteOrdering: false
  }

  Retention {
    RetentionDays: 7
    ForgettingPolicySupported: false
  }

  Privacy {
    ContainsUserFacingDataExpected: false
    MinDeletionSemantics: BestEffort
    ErasureScope: ApplicationKey
    MaxEraseLatencyHours: 24
  }

  ConcurrencyHints {
    AdvisoryLockingSupported: false
  }
}
          ]]></artwork>
        </figure>
      </section>
    </section>

  </back>

</rfc>

