<?xml version="1.0" encoding="utf-8"?>
<?xml-model href="rfc7991bis.rnc"?>
<!DOCTYPE rfc [
  <!ENTITY nbsp    "&#160;">
  <!ENTITY zwsp   "&#8203;">
  <!ENTITY nbhy   "&#8209;">
  <!ENTITY wj     "&#8288;">
]>
<rfc
  xmlns:xi="http://www.w3.org/2001/XInclude"
  category="info"
  docName="draft-ramadan-mboned-sonar-01"
  ipr="trust200902"
  obsoletes=""
  updates=""
  submissionType="IETF"
  xml:lang="en"
  version="3">

  <front>
    <title abbrev="SONAR">SONAR: Statistical Observation Network for Attestation and Reach</title>
    <seriesInfo name="Internet-Draft" value="draft-ramadan-mboned-sonar-01"/>
    <author fullname="Omar Ramadan" initials="O." surname="Ramadan">
      <organization>Blockcast Inc</organization>
      <address>
        <postal>
          <street></street>
          <city>Berkeley</city>
          <region>CA</region>
          <country>US</country>
        </postal>
        <email>omar@blockcast.network</email>
        <uri>https://blockcast.network</uri>
      </address>
    </author>
    <date year="2025" month="November" day="2"/>
    <area>ops</area>
    <workgroup>mboned</workgroup>
    <keyword>multicast</keyword>
    <keyword>authentication</keyword>
    <keyword>coverage</keyword>
    <keyword>proof of delivery</keyword>
    
    <abstract>
      <t>This document specifies SONAR (Statistical Observation Network for Attestation and Reach), a protocol for verifiable multicast delivery claims without trusted intermediaries. SONAR combines three mechanisms: (1) the O(1) bandwidth cost of IP multicast versus the O(N) cost of unicast replication, which is used to detect cheating; (2) cryptoeconomic accountability via on-chain stake deposits, VRF-based unpredictable sampling, and blockchain attestations; and (3) ALTA-based real-time multicast authentication.</t>
      <t>SONAR separates content authentication from coverage verification: ALTA authenticates all packets with approximately 6% bandwidth overhead, while statistical coverage verification adds minimal overhead (one 320 KB challenge message per 15-60 minute test period, or 0.7-2.8 Kbps). Coverage estimation samples 0.1% of receivers using German Tank Problem inference. For privacy and cost efficiency at scale, zkSNARK proof aggregation (recommended for more than 1,000 sampled users) maintains O(1) on-chain verification cost, enabling populations exceeding 10^8 receivers.</t>
    </abstract>
  </front>

  <middle>
    <section anchor="introduction">
      <name>Introduction</name>
      <t>Multicast distribution offers significant efficiency advantages over unicast 
      for large-scale content delivery, reducing bandwidth costs by 99.99% or more. 
      However, the lack of verifiable delivery mechanisms prevents widespread commercial 
      adoption. Content providers cannot verify that infrastructure operators actually 
      delivered content to claimed receivers, while infrastructure operators cannot prove 
      delivery to enable billing. This bilateral trust deficit blocks the formation of 
      liquid markets for multicast capacity.</t>
      
      <t>Existing multicast authentication schemes (<xref target="RFC4082"/>, 
      <xref target="I-D.ietf-mboned-ambi"/>, <xref target="I-D.krose-mboned-alta"/>) 
      address content authentication but do not provide per-receiver coverage proof. 
      Per-receiver encryption defeats multicast efficiency by requiring O(N) bandwidth 
      where N is the number of receivers.</t>
      
      <t>SONAR solves this problem through statistical sampling: rather than proving 
      delivery to every receiver, SONAR proves delivery to a random sample with known 
      statistical confidence. This enables verification of populations exceeding 10^7 
      receivers with constant bandwidth overhead.</t>

      <section anchor="requirements">
        <name>Requirements Notation</name>
        <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", 
        "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this 
        document are to be interpreted as described in BCP 14 <xref target="RFC2119"/> 
        <xref target="RFC8174"/> when, and only when, they appear in all capitals, as 
        shown here.</t>
      </section>
    </section>

    <section anchor="terminology">
      <name>Terminology</name>
      <t>This document uses the following terms:</t>
      <dl>
        <dt>Coverage Group:</dt>
        <dd>A set of receivers that can be addressed with a single multicast 
        transmission, typically defined by geographic location or network topology.</dd>
        
        <dt>Attestation:</dt>
        <dd>A cryptographically signed statement by a receiver asserting successful 
        reception of specific content with packet statistics.</dd>
        
        <dt>Sample Size:</dt>
        <dd>The number of receivers randomly selected to provide attestations in a 
        given test window. Denoted as m.</dd>
        
        <dt>Population Size:</dt>
        <dd>The total number of receivers in the coverage group. Denoted as N.</dd>
        
        <dt>German Tank Estimate:</dt>
        <dd>A statistical estimator for population size based on maximum observation 
        in a sampled sequence.</dd>
        
        <dt>Test Window:</dt>
        <dd>A time period during which packet reception is tracked for coverage 
        verification. Duration denoted as P.</dd>
        
        <dt>zkSNARK:</dt>
        <dd>Zero-Knowledge Succinct Non-Interactive Argument of Knowledge. A 
        cryptographic proof system enabling verification of aggregate statements 
        with constant-size proofs.</dd>
      </dl>
    </section>

    <section anchor="architecture">
      <name>Architecture Overview</name>
      
      <section anchor="design-principles">
        <name>Design Principles</name>
        <t>SONAR is designed around the following principles:</t>
        <ul>
          <li>Content authentication is independent of coverage verification</li>
          <li>Statistical sampling provides O(1) verification cost regardless of population size</li>
          <li>Broadcast bandwidth overhead MUST be &lt;10% to preserve multicast efficiency</li>
          <li>Return path (user attestations) uses existing internet infrastructure</li>
          <li>Cryptographic security complemented by economic incentives</li>
        </ul>
      </section>

      <section anchor="protocol-layers">
        <name>Protocol Layers</name>
        
        <section anchor="layer1-content-auth">
          <name>Layer 1: Content Authentication</name>
          <t>All receivers verify content authenticity using the ALTA protocol 
          <xref target="I-D.krose-mboned-alta"/>. This provides:</t>
          <ul>
            <li>Non-repudiation: Content provider cannot deny transmission</li>
            <li>Source authentication: Receivers verify authorized origin</li>
            <li>Integrity: Content not modified in transit</li>
            <li>Low latency: 1-10ms authentication delay</li>
          </ul>
          <t>ALTA is chosen over alternatives (TESLA, AMBI) because:</t>
          <ul>
            <li>No time synchronization required (vs TESLA)</li>
            <li>Real-time authentication (vs TESLA 100-6000ms delay)</li>
            <li>Broadcast-only (vs AMBI unicast manifests)</li>
            <li>Strong non-repudiation (periodic Ed25519 signatures)</li>
          </ul>
          <t>Bandwidth overhead: Approximately 6% for typical configurations.</t>
        </section>

        <section anchor="layer2-statistical">
          <name>Layer 2: Statistical Coverage Verification</name>
          <t>A random sample of m receivers (typically 0.1% of the population) provides 
          attestations via an Internet return path. Statistical inference then yields a 
          population coverage estimate with a confidence interval.</t>
          <t>Sample selection uses a Verifiable Random Function (VRF) to prevent 
          adversarial selection. Attestations include packet statistics, enabling 
          loss rate estimation and fraud detection.</t>
          <t>Broadcast overhead: Only the sample selection message (320 KB per test).</t>
          <t>Return path overhead: Distributed across m users (128 bps per selected user).</t>
        </section>

        <section anchor="layer3-aggregation">
          <name>Layer 3: Zero-Knowledge Aggregation (Recommended for m &gt; 1,000)</name>
          <t>zkSNARK proofs SHOULD be employed when sample size m exceeds 1,000 users, 
          primarily for privacy protection. Without aggregation, individual attestations 
          posted on-chain make viewing patterns correlatable, enabling de-anonymization 
          attacks. zkSNARKs provide an aggregated proof of coverage statistics while 
          hiding individual user attestations.</t>
          <t>Additional benefits: 80-90% cost reduction via off-chain storage, a constant-size 
          on-chain submission (328 bytes regardless of m), and scalability to populations exceeding 10^8.</t>
          <t>A challenge protocol enables spot-checking of individual attestations via Merkle 
          proofs while maintaining aggregate privacy.</t>
        </section>
      </section>
    </section>

    <section anchor="content-authentication">
      <name>Content Authentication</name>
      
      <section anchor="alta-configuration">
        <name>ALTA Protocol Configuration</name>
        <t>SONAR employs ALTA with the following parameters:</t>
        <dl>
          <dt>Scheme Parameters:</dt>
          <dd>
            <ul>
              <li>a = 3 (backward reference interval)</li>
              <li>p = 5 (redundancy factor)</li>
              <li>K = 50 (signature interval)</li>
            </ul>
          </dd>
          
          <dt>Algorithms:</dt>
          <dd>
            <ul>
              <li>MAC: HMAC-SHA256 truncated to 128 bits</li>
              <li>Signature: Ed25519 (64 bytes)</li>
            </ul>
          </dd>
        </dl>
      </section>

      <section anchor="packet-format">
        <name>Packet Format</name>
        <t>Each multicast packet MUST include ALTA authentication data:</t>
        <artwork><![CDATA[
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Sequence Number (32 bits)                            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|S| Reserved  |  MAC Count    |                                 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               +
|                                                               |
+                   Previous Packet Hash (256 bits)            +
|                                                               |
+                                                               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                   MAC 1 (128 bits)                           +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                   MAC 2 (128 bits)                           +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
~                   ... (additional MACs)                       ~
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                                                               +
|                   Ed25519 Signature (512 bits)                |
+                   (present if S=1, every Kth packet)          +
|                                                               |
+                                                               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
~                   Content Payload                             ~
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        ]]></artwork>
        <dl>
          <dt>Sequence Number:</dt>
          <dd>Monotonically increasing packet identifier</dd>
          
          <dt>S (Signature Present):</dt>
          <dd>1-bit flag indicating Ed25519 signature included</dd>
          
          <dt>Reserved:</dt>
          <dd>7 bits reserved for future use, MUST be zero</dd>
          
          <dt>MAC Count:</dt>
          <dd>Number of MACs included (typically 3-5)</dd>
          
          <dt>Previous Packet Hash:</dt>
          <dd>SHA-256 hash of previous packet for chain verification</dd>
          
          <dt>MAC n:</dt>
          <dd>HMAC-SHA256 of packet at offset (i - n*a) truncated to 128 bits</dd>
          
          <dt>Ed25519 Signature:</dt>
          <dd>Signature of packets i through i-K+1 (when S=1, every Kth packet)</dd>
          
          <dt>Content Payload:</dt>
          <dd>Application data</dd>
        </dl>
        
        <t>Total overhead calculation:</t>
        <ul>
          <li>Base: 4 + 2 + 32 = 38 bytes (sequence number, flags and MAC Count, previous packet hash)</li>
          <li>MACs: 16 * MAC_Count bytes</li>
          <li>Signature (amortized): 64 / K bytes</li>
          <li>For MAC_Count=4, K=50: 38 + 64 + 1.28 = 103.28 bytes</li>
          <li>Percentage for 1500-byte packets: 6.89%</li>
        </ul>
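        <t>The overhead arithmetic can be reproduced with a short Python sketch. It counts 
        every fixed field shown in the packet diagram, including the 2 bytes that carry the 
        S flag, the Reserved bits, and MAC Count. The constant and function names are 
        illustrative and are not defined by this document.</t>
        <sourcecode type="python"><![CDATA[
```python
# Field sizes in bytes, read off the packet diagram.
SEQ_BYTES = 4          # Sequence Number (32 bits)
FLAGS_BYTES = 1        # S flag (1 bit) + Reserved (7 bits)
MAC_COUNT_BYTES = 1    # MAC Count (8 bits)
PREV_HASH_BYTES = 32   # Previous Packet Hash (256 bits)
MAC_BYTES = 16         # each truncated HMAC-SHA256 (128 bits)
SIG_BYTES = 64         # Ed25519 signature (512 bits)


def alta_overhead_bytes(mac_count: int, k: int) -> float:
    """Average per-packet overhead, signature amortized over k packets."""
    fixed = SEQ_BYTES + FLAGS_BYTES + MAC_COUNT_BYTES + PREV_HASH_BYTES
    return fixed + MAC_BYTES * mac_count + SIG_BYTES / k


def overhead_percent(mac_count: int, k: int,
                     packet_bytes: int = 1500) -> float:
    """Overhead as a percentage of the total packet size."""
    return 100.0 * alta_overhead_bytes(mac_count, k) / packet_bytes
```
]]></sourcecode>
        <t>Calling overhead_percent(4, 50) evaluates the MAC_Count=4, K=50 case for 
        1500-byte packets; other MAC counts or signature intervals can be compared the same way.</t>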
      </section>

      <section anchor="verification-procedure">
        <name>Verification Procedure</name>
        <t>Upon receiving packet i, receiver performs the following steps:</t>
        <ol>
          <li>Verify sequence number is monotonically increasing</li>
          <li>Compute SHA-256(packet_{i-1}) and compare with Previous Packet Hash field</li>
          <li>For each MAC_j in packet, retrieve stored packet at offset (i - j*a)</li>
          <li>Recompute HMAC-SHA256 for each referenced packet and compare</li>
          <li>If S=1, verify Ed25519 signature over packets [i-K+1, i]</li>
          <li>If all verifications pass, accept packet as authentic</li>
        </ol>
        <t>The receiver MUST buffer packets until sufficient MACs have been received for 
        verification (depth = p packets).</t>
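        <t>As a minimal sketch, steps 1 and 2 of the procedure can be expressed in Python as 
        follows. The class name and its interface are hypothetical. The sketch assumes no 
        packet loss (so the previous packet is always available), accepts the first packet 
        without a chain check, ignores 32-bit sequence wraparound, and leaves the MAC and 
        signature steps out because their key handling is defined by ALTA.</t>
        <sourcecode type="python"><![CDATA[
```python
import hashlib
import hmac
from typing import Optional


class ChainVerifier:
    """Tracks the last accepted packet and checks steps 1 and 2."""

    def __init__(self) -> None:
        self.last_seq: Optional[int] = None
        self.last_packet: Optional[bytes] = None

    def check(self, seq: int, prev_hash: bytes, packet: bytes) -> bool:
        if self.last_seq is not None:
            # Step 1: sequence numbers must strictly increase.
            if seq <= self.last_seq:
                return False
            # Step 2: the Previous Packet Hash field must equal the
            # SHA-256 digest of the packet accepted just before.
            expected = hashlib.sha256(self.last_packet).digest()
            if not hmac.compare_digest(expected, prev_hash):
                return False
        # Only accepted packets advance the verifier state.
        self.last_seq, self.last_packet = seq, packet
        return True
```
]]></sourcecode>
        <t>A rejected packet leaves the verifier state unchanged, so a later packet that 
        correctly chains to the last accepted one can still be verified.</t>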
      </section>
    </section>

    <section anchor="statistical-sampling">
      <name>Statistical Coverage Verification</name>
      
      <section anchor="sample-selection">
        <name>Sample Selection Protocol</name>
        
        <section anchor="vrf-selection">
          <name>VRF-Based Random Selection</name>
          <t>Content provider generates verifiable random sample using VRF to prevent 
          adversarial selection:</t>
          <ol>
            <li>Obtain blockchain randomness source (e.g., block hash at height H)</li>
            <li>Apply VRF with content provider private key: seed = VRF_prove(sk, blockhash || session_id)</li>
            <li>Use seed for Fisher-Yates shuffle of registered user public keys</li>
            <li>Select first m users from shuffled list</li>
          </ol>
          
          <t>VRF properties ensure:</t>
          <ul>
            <li>Unpredictability: Adversary cannot predict selection before blockhash revealed</li>
            <li>Verifiability: Anyone can verify selection was computed correctly</li>
            <li>Uniqueness: Only one valid output for given input</li>
          </ul>
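          <t>The shuffle-and-select steps can be sketched in Python as shown here. The seed 
          is taken as an input: in SONAR it would be the VRF output, which this sketch does 
          not compute. Deriving shuffle indices from SHA-256 over the seed and a counter, and 
          sorting the keys to obtain a canonical starting order, are assumptions of the 
          sketch and are not specified by this document.</t>
          <sourcecode type="python"><![CDATA[
```python
import hashlib
from typing import List


def _draw(seed: bytes, counter: int, bound: int) -> int:
    """Derive an integer in [0, bound) from the seed and a counter."""
    digest = hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
    # Modulo bias is negligible because the digest is 256 bits wide.
    return int.from_bytes(digest, "big") % bound


def select_sample(seed: bytes, user_keys: List[bytes],
                  m: int) -> List[bytes]:
    """Fisher-Yates shuffle of registered keys, then take the first m."""
    keys = sorted(user_keys)            # canonical starting order
    for i in range(len(keys) - 1, 0, -1):
        j = _draw(seed, i, i + 1)       # j is in [0, i]
        keys[i], keys[j] = keys[j], keys[i]
    return keys[:m]
```
]]></sourcecode>
          <t>Because every step is a deterministic function of the seed and the registered 
          key set, any verifier holding the same inputs can recompute the selection and 
          compare it with the list in the challenge message.</t>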
        </section>

        <section anchor="challenge-message">
          <name>Challenge Message Format</name>
          <artwork><![CDATA[
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   Message Type = 0x01         |        Reserved               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   Session ID (64 bits)                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   Test Window Start (32 bits)                 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   Test Window End (32 bits)                   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   Number of Selected Users (32 bits)          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                   VRF Proof (80 bytes)                        +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
~      Selected User Public Keys (32 bytes each, m total)      ~
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                                                               +
|                   Ed25519 Signature (512 bits)                |
+                   (signs entire message)                     +
|                                                               |
+                                                               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
          ]]></artwork>
          
          <t>Size calculation for m=10,000 users:</t>
          <ul>
            <li>Header: 24 bytes</li>
            <li>VRF Proof: 80 bytes</li>
            <li>User pubkeys: 32 * 10,000 = 320,000 bytes</li>
            <li>Signature: 64 bytes</li>
            <li>Total: 320,168 bytes ≈ 320 KB</li>
          </ul>
          
          <t>Broadcast frequency: Once per test period P (recommended: P = 900-7200 seconds)</t>
          <t>Bandwidth: 320 KB / P seconds</t>
          <ul>
            <li>P = 900s (15 min): 356 bytes/s = 2.8 Kbps</li>
            <li>P = 3600s (1 hour): 89 bytes/s = 0.7 Kbps</li>
          </ul>
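          <t>The size and bandwidth figures follow directly from the field sizes in the 
          diagram. A small Python sketch (function and constant names are illustrative) 
          makes the dependence on m and P explicit:</t>
          <sourcecode type="python"><![CDATA[
```python
HEADER_BYTES = 24       # type/reserved 4 + session ID 8 + window 8 + count 4
VRF_PROOF_BYTES = 80
PUBKEY_BYTES = 32
SIGNATURE_BYTES = 64


def challenge_size_bytes(m: int) -> int:
    """Total challenge message size for m selected users."""
    return (HEADER_BYTES + VRF_PROOF_BYTES
            + PUBKEY_BYTES * m + SIGNATURE_BYTES)


def challenge_bandwidth_bps(m: int, period_s: float) -> float:
    """Average broadcast bandwidth in bits per second."""
    return 8 * challenge_size_bytes(m) / period_s


for period in (900, 3600):
    kbps = challenge_bandwidth_bps(10_000, period) / 1000
    print(f"P={period}s: {kbps:.2f} Kbps")
```
]]></sourcecode>
          <t>The public key list dominates the message, so broadcast overhead scales 
          linearly with m and inversely with P.</t>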
        </section>
      </section>

      <section anchor="user-attestation">
        <name>User Attestation Protocol</name>
        
        <section anchor="attestation-format">
          <name>Attestation Message Format</name>
          <t>Selected users MUST respond within response window T_response (recommended: 60 seconds):</t>
          <artwork><![CDATA[
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   Message Type = 0x02         |        Reserved               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                   User Public Key (256 bits)                 +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   Session ID (64 bits)                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   Packet Min Observed (32 bits)               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   Packet Max Observed (32 bits)               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   Packets Received (32 bits)                  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                   Sample Content Hash (256 bits)             +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   Response Timestamp (64 bits)                |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                                                               +
|                   Ed25519 Signature (512 bits)                |
+                   (signs entire message)                     +
|                                                               |
+                                                               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
          ]]></artwork>
          
          <dl>
            <dt>Packet Min/Max Observed:</dt>
            <dd>First and last sequence numbers received in test window</dd>
            
            <dt>Packets Received:</dt>
            <dd>Total count of packets successfully received and authenticated</dd>
            
            <dt>Sample Content Hash:</dt>
            <dd>SHA-256 hash of concatenated payload from sampled packets (e.g., every 100th packet) 
            to prove actual content reception</dd>
            
            <dt>Response Timestamp:</dt>
            <dd>Unix timestamp at which the attestation was created</dd>
          </dl>
          
          <t>Total size: 160 bytes per attestation</t>
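          <t>One way to check the 160-byte total is to serialize the fields with Python's 
          struct module. Network byte order and the helper name pack_attestation are 
          assumptions of this sketch; the document does not state a byte order. The 
          signature is passed in as opaque bytes because Ed25519 signing is outside the 
          Python standard library.</t>
          <sourcecode type="python"><![CDATA[
```python
import struct

# Field order follows the diagram: type, reserved, public key,
# session ID, packet min, packet max, packets received,
# sample content hash, response timestamp, signature.
ATTESTATION = struct.Struct(">HH32sQIII32sQ64s")

MESSAGE_TYPE_ATTESTATION = 0x02


def pack_attestation(pubkey: bytes, session_id: int, pkt_min: int,
                     pkt_max: int, pkt_received: int,
                     content_hash: bytes, timestamp: int,
                     signature: bytes) -> bytes:
    """Serialize one attestation; the caller supplies the signature."""
    return ATTESTATION.pack(MESSAGE_TYPE_ATTESTATION, 0, pubkey,
                            session_id, pkt_min, pkt_max, pkt_received,
                            content_hash, timestamp, signature)
```
]]></sourcecode>
          <t>ATTESTATION.size evaluates to 160, matching the total given for the message.</t>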
        </section>

        <section anchor="attestation-submission">
          <name>Submission Methods</name>
          
          <section anchor="direct-submission">
            <name>Direct Blockchain Submission</name>
            <t>Each selected user submits attestation as blockchain transaction:</t>
            <ul>
              <li>Attestation payload: 160 bytes</li>
              <li>Transaction overhead: ~40 bytes</li>
              <li>Total: ~200 bytes per user</li>
              <li>For m=10,000: 2 MB per test</li>
              <li>Cost at $0.0001/tx: $1.00 per test</li>
            </ul>
            <t>Advantages: Simple, immediate verification</t>
            <t>Disadvantages: High on-chain cost for large m</t>
          </section>

          <section anchor="aggregated-submission">
            <name>Aggregated Submission via zkSNARK</name>
            <t>Users submit to off-chain aggregator that creates zkSNARK proof:</t>
            <ol>
              <li>Users submit 160-byte attestations to off-chain storage</li>
              <li>Aggregator collects m attestations</li>
              <li>Aggregator builds Merkle tree with root R</li>
              <li>Aggregator computes aggregate statistics</li>
              <li>Aggregator generates zkSNARK proof π</li>
              <li>Aggregator submits {R, statistics, π} on-chain</li>
            </ol>
            <t>On-chain size: 328 bytes (constant regardless of m)</t>
            <t>On-chain data reduction: approximately 99.98% versus direct submission for m=10,000 (328 bytes versus 2 MB)</t>
          </section>
        </section>
      </section>

      <section anchor="coverage-estimation">
        <name>Coverage Estimation</name>
        
        <section anchor="german-tank">
          <name>German Tank Problem Estimator</name>
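          <t>Given that the sender transmitted N packets during the test window (here N 
          denotes the packet count, not the receiver population) and user j reports 
          packet_max_j, the German Tank estimate of N from the m sampled reports is:</t>
          <artwork><![CDATA[
M     = max(packet_max_1, ..., packet_max_m)
N_hat = M + M / m - 1
          ]]></artwork>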
          
          <t>Loss rate estimate for user j:</t>
          <artwork><![CDATA[
span_j = packet_max_j - packet_min_j + 1
loss_j = 1 - (packets_received_j / span_j)
          ]]></artwork>
          
          <t>Aggregate loss rate:</t>
          <artwork><![CDATA[
loss_avg = (1/m) * sum(loss_j for j in sample)
          ]]></artwork>
          
          <t>If packet_max_j ≈ N, user kept pace with real-time stream (multicast reception). 
          If packet_max_j &lt;&lt; N, user lagged significantly (potential unicast forwarding).</t>
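          <t>The estimators in this section translate directly into code. The following 
          Python sketch uses illustrative function names and represents each report as a 
          (packet_min, packet_max, packets_received) tuple, which is a choice made for the 
          example only.</t>
          <sourcecode type="python"><![CDATA[
```python
from typing import List, Tuple

Report = Tuple[int, int, int]  # (packet_min, packet_max, packets_received)


def german_tank_estimate(packet_maxes: List[int]) -> float:
    """N_hat = M + M/m - 1, with M the largest reported packet_max."""
    m = len(packet_maxes)
    largest = max(packet_maxes)
    return largest + largest / m - 1


def loss_rate(packet_min: int, packet_max: int,
              packets_received: int) -> float:
    """Fraction of packets missing within the observed span."""
    span = packet_max - packet_min + 1
    return 1 - packets_received / span


def average_loss(reports: List[Report]) -> float:
    """Unweighted mean of the per-user loss rates."""
    return sum(loss_rate(*r) for r in reports) / len(reports)
```
]]></sourcecode>
          <t>For example, four reports with packet_max values 60, 80, 100, and 90 give 
          N_hat = 100 + 100/4 - 1 = 124.</t>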
        </section>

        <section anchor="confidence-intervals">
          <name>Confidence Intervals</name>
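          <t>For a sample of size m drawn from a population of N receivers, with p_hat 
          denoting the fraction of sampled receivers whose attestations confirm reception, 
          the coverage estimate has the following confidence interval:</t>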
          <artwork><![CDATA[
Standard error: SE = sqrt(p_hat * (1 - p_hat) / m)
95% confidence: CI_95 = p_hat ± 1.96 * SE
Population coverage: Coverage = N * p_hat
                     Coverage_CI = N * (p_hat ± 1.96 * SE)
          ]]></artwork>
          
          <t>Example: N = 10,000,000 users, m = 10,000 sample, p_hat = 0.95:</t>
          <artwork><![CDATA[
SE = sqrt(0.95 * 0.05 / 10000) = 0.00218
CI_95 = 0.95 ± 0.00427 = [0.946, 0.954]
Coverage = 9,500,000 ± 42,700 users
          ]]></artwork>
          
          <t>Minimum sample size for desired margin of error E:</t>
          <artwork><![CDATA[
m_min = (1.96^2 * p_hat * (1 - p_hat)) / E^2
          ]]></artwork>
          
          <t>For E = 0.001 (0.1% margin) with p_hat = 0.95:</t>
          <artwork><![CDATA[
m_min = (3.84 * 0.95 * 0.05) / 0.000001 = 182,400 samples
          ]]></artwork>
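          <t>The interval and sample-size formulas can be evaluated with a few lines of 
          Python. The function names are illustrative, and the sketch uses the same normal 
          approximation as the formulas in this section.</t>
          <sourcecode type="python"><![CDATA[
```python
import math
from typing import Tuple


def coverage_interval(p_hat: float, m: int, population: int,
                      z: float = 1.96) -> Tuple[float, float]:
    """Confidence interval for covered receivers (z=1.96 for 95%)."""
    se = math.sqrt(p_hat * (1 - p_hat) / m)
    margin = z * se
    return (population * (p_hat - margin),
            population * (p_hat + margin))


def min_sample_size(p_hat: float, margin: float,
                    z: float = 1.96) -> int:
    """Smallest m whose margin of error does not exceed `margin`."""
    return math.ceil(z * z * p_hat * (1 - p_hat) / (margin * margin))


low, high = coverage_interval(0.95, 10_000, 10_000_000)
print(f"coverage between {low:,.0f} and {high:,.0f} receivers")
```
]]></sourcecode>
          <t>With p_hat = 0.95 and a margin of 0.001, min_sample_size returns roughly 
          182,500, since it uses the unrounded 1.96^2 = 3.8416. The required sample grows 
          with the inverse square of the margin, so halving E quadruples m_min.</t>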
        </section>
      </section>
    </section>

    <section anchor="zksnark-aggregation">
      <name>Zero-Knowledge Proof Aggregation</name>
      
      
      <t>Zero-knowledge proof aggregation via zkSNARKs SHOULD be employed when sample 
      size m exceeds 1,000 users. This threshold is determined by three factors:</t>
      
      <ol>
        <li>Privacy Protection: Individual attestations become correlatable on-chain, 
        enabling de-anonymization attacks. For small communities (N &lt; 100,000), users 
        are more easily identifiable, making privacy protection critical.</li>
        
        <li>Cost Efficiency: Off-chain storage via Data Anchor costs $0.00001 per 
        attestation versus $0.0001 for direct on-chain submission. For m=1,000, this 
        represents 80-90% cost reduction: $0.02 (zkSNARK aggregated) versus $0.10 (direct).</li>
        
        <li>Constant Verification: zkSNARK proofs maintain 200-byte size and O(1) 
        verification cost regardless of m, enabling scalability to populations exceeding 10^8.</li>
      </ol>
      
      <t>Implementation:</t>
      <ul>
        <li>User attestations sent to aggregator (off-chain)</li>
        <li>zkSNARK proof generated in Trusted Execution Environment (TEE)</li>
        <li>Proof verified on-chain via smart contract</li>
        <li>Challenge protocol enables spot-checking via Merkle proofs</li>
      </ul>
      <section anchor="merkle-construction">
        <name>Merkle Tree Construction</name>
        <t>Aggregator constructs binary Merkle tree from attestations:</t>
        <ol>
          <li>Collect m attestations: A_1, A_2, ..., A_m</li>
          <li>Compute leaf hashes: L_j = SHA256(A_j)</li>
          <li>Build tree bottom-up: H_parent = SHA256(H_left || H_right)</li>
          <li>Compute root: R</li>
        </ol>
        
        <t>Merkle proof for attestation A_j:</t>
        <ul>
          <li>Path: Sibling hashes from leaf to root</li>
          <li>Length: ceil(log2(m)) hashes</li>
          <li>For m=10,000: 14 hashes * 32 bytes = 448 bytes</li>
        </ul>
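        <t>A compact Python sketch of tree construction, proof extraction, and proof 
        verification follows. Two details are assumptions of the sketch rather than 
        requirements of this document: an odd-sized level is padded by duplicating its 
        last node, and the verifier is given the leaf index so it knows on which side each 
        sibling belongs.</t>
        <sourcecode type="python"><![CDATA[
```python
import hashlib
from typing import List


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(attestations: List[bytes]) -> List[List[bytes]]:
    """Return every tree level: leaf hashes first, root level last."""
    level = [_h(a) for a in attestations]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:                  # pad odd levels (assumption)
            level = level + [level[-1]]
            levels[-1] = level
        level = [_h(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def merkle_proof(levels: List[List[bytes]], index: int) -> List[bytes]:
    """Sibling hashes from leaf to root for the leaf at `index`."""
    path = []
    for level in levels[:-1]:
        path.append(level[index ^ 1])       # sibling shares the parent
        index //= 2
    return path


def merkle_verify(attestation: bytes, index: int,
                  path: List[bytes], root: bytes) -> bool:
    """Recompute the root from one attestation and its sibling path."""
    node = _h(attestation)
    for sibling in path:
        if index % 2 == 0:
            node = _h(node + sibling)
        else:
            node = _h(sibling + node)
        index //= 2
    return node == root
```
]]></sourcecode>
        <t>For 10,000 attestations, build_levels yields 15 levels, so each proof carries 
        14 sibling hashes, in line with the ceil(log2(m)) length given above.</t>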
      </section>

      <section anchor="zksnark-generation">
        <name>zkSNARK Proof Generation</name>
        <t>Aggregator generates proof π for statement S:</t>
        <artwork><![CDATA[
Statement S:
  "I know m attestations {A_1, ..., A_m} such that:
   1. MerkleRoot({SHA256(A_j)}) = R
   2. Each A_j contains valid Ed25519 signature
   3. Aggregate loss rate < threshold (e.g., 0.05)
   4. Median(packet_max) > threshold (e.g., 0.95 * N)"
        ]]></artwork>
        
        <t>Public inputs: {R, m, aggregate_stats, thresholds}</t>
        <t>Witness: {A_1, ..., A_m, Merkle_paths, signatures}</t>
        <t>Proof size: Approximately 200 bytes (constant regardless of m)</t>
      </section>

      <section anchor="on-chain-verification">
        <name>On-Chain Verification</name>
        <t>Smart contract verifies zkSNARK proof:</t>
        <ol>
          <li>Extract public inputs {R, m, statistics, thresholds} and proof π</li>
          <li>Verify proof: valid = Verify(vk, public_inputs, π)</li>
          <li>If valid: Accept coverage claim for m users</li>
          <li>If invalid: Reject and slash aggregator stake</li>
        </ol>
        
        <t>Verification cost: ~100,000 gas (constant regardless of m)</t>
      </section>

      <section anchor="challenge-protocol">
        <name>Challenge Protocol</name>
        <t>Any party may challenge the aggregator by requesting a Merkle proof for a specific user:</t>
        <ol>
          <li>Challenger submits challenge: "Prove user_j is in tree R"</li>
          <li>Aggregator MUST respond within T_challenge (recommended: 24 hours)</li>
          <li>Aggregator provides: {A_j, merkle_path}</li>
          <li>Challenger verifies: MerkleVerify(A_j, path, R) and signature validity</li>
          <li>If verification fails: Aggregator stake slashed, challenger rewarded</li>
          <li>If verification succeeds: Challenge bond returned to challenger</li>
        </ol>
      </section>
    </section>

    <section anchor="test-period-optimization">
      <name>Test Period Optimization</name>
      
      <section anchor="unicast-detection">
        <name>Unicast Replication Detection</name>
        <t>A malicious relay might receive content via unicast forwarding and claim 
        multicast reception. Detection relies on bandwidth constraints:</t>
        
        <t>For unicast replication to B recipients:</t>
        <artwork><![CDATA[
Required bandwidth: B * R_content
If B * R_content > R_relay, relay must lag
Lag accumulates at rate: (B * R_content - R_relay)
        ]]></artwork>
        
        <t>Minimum test period for detection:</t>
        <artwork><![CDATA[
P_min = L / (alpha - 1)
where alpha = (B * R_content) / R_relay
      L = acceptable loss rate
        ]]></artwork>
        
        <t>Example: B=1M recipients, R_content=25 Mbps, R_relay=10 Gbps, L=0.05:</t>
        <artwork><![CDATA[
alpha = (1,000,000 * 25 Mbps) / 10,000 Mbps = 2,500
P_min = 0.05 / (2,500 - 1) ≈ 0.00002 seconds
        ]]></artwork>
        
        <t>Conclusion: For typical broadcast scenarios, any test period P &gt; 1 second 
        provides overwhelming detection certainty. Optimal P is determined by cost-benefit 
        tradeoff, not detection requirements.</t>
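        <t>The detection bound can be evaluated for other deployment sizes with a short 
        Python sketch. The function names are illustrative; both rates must be given in the 
        same unit (Mbps here) so that alpha is a pure ratio.</t>
        <sourcecode type="python"><![CDATA[
```python
def replication_ratio(recipients: int, content_mbps: float,
                      relay_mbps: float) -> float:
    """alpha = (B * R_content) / R_relay."""
    return recipients * content_mbps / relay_mbps


def min_test_period(alpha: float, acceptable_loss: float) -> float:
    """P_min = L / (alpha - 1); only defined when the relay must lag."""
    if alpha <= 1:
        raise ValueError("relay can keep up; no lag accumulates")
    return acceptable_loss / (alpha - 1)


alpha = replication_ratio(1_000_000, 25, 10_000)   # 10 Gbps relay
print(alpha, min_test_period(alpha, 0.05))
```
]]></sourcecode>
        <t>When alpha is at most 1 the relay has enough capacity to replicate by unicast 
        without falling behind, so the bound does not apply and the sketch raises an error 
        instead of returning a period.</t>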
      </section>

      <section anchor="recommended-periods">
        <name>Recommended Test Periods</name>
        <t>Test period selection based on use case:</t>
        <table>
          <thead>
            <tr>
              <th>Use Case</th>
              <th>Test Period P</th>
              <th>Tests/Hour</th>
              <th>Cost/Hour*</th>
              <th>Detection Latency</th>
            </tr>
          </thead>
          <tbody>
            <tr>
              <td>Live Events</td>
              <td>300s (5 min)</td>
              <td>12</td>
              <td>$12</td>
              <td>&lt;5 min</td>
            </tr>
            <tr>
              <td>Prime Time TV</td>
              <td>900s (15 min)</td>
              <td>4</td>
              <td>$4</td>
              <td>&lt;15 min</td>
            </tr>
            <tr>
              <td>Off-Peak Content</td>
              <td>3600s (1 hour)</td>
              <td>1</td>
              <td>$1</td>
              <td>&lt;1 hour</td>
            </tr>
            <tr>
              <td>ISP SLA Reporting</td>
              <td>7200s (2 hours)</td>
              <td>0.5</td>
              <td>$0.50</td>
              <td>&lt;2 hours</td>
            </tr>
          </tbody>
        </table>
        <t>*Assumes m=10,000 users, $0.0001 per transaction</t>
        
        <t>Maximum recommended P: 7200 seconds (2 hours)</t>
        <t>Rationale: Beyond 2 hours, network state staleness reduces actionable value 
        of coverage data.</t>
      </section>
    </section>

    <section anchor="security">
      <name>Security Considerations</name>
      
      <section anchor="threat-model">
        <name>Threat Model</name>
        <t>SONAR must resist the following adversarial behaviors:</t>
        <dl>
          <dt>Sybil Attacks:</dt>
          <dd>Attacker creates multiple fake receiver identities to inflate coverage claims. 
          Mitigated by stake requirements and VRF-based random sampling.</dd>
          
          <dt>Replay Attacks:</dt>
          <dd>Adversary captures authenticated content and replays it to different receivers. 
          Mitigated by sequence numbers, timestamps, and hash chaining.</dd>
          
          <dt>Man-in-the-Middle:</dt>
          <dd>Intermediate node modifies content while maintaining valid authentication. 
          Prevented by ALTA MAC chains and periodic signatures.</dd>
          
          <dt>DoS Attacks:</dt>
          <dd>Attacker floods network with fake packets to exhaust receiver buffers. 
          Mitigated by instant ALTA authentication enabling immediate rejection.</dd>
          
          <dt>Sample Manipulation:</dt>
          <dd>Adversary attempts to influence which users are selected for sampling. 
          Prevented by VRF unpredictability.</dd>
        </dl>
      </section>

      <section anchor="economic-security">
        <name>Economic Security</name>
        <t>Cryptographic security is complemented by economic incentives:</t>
        <ul>
          <li>Stake requirements: $10-$100K deposits create accountability</li>
          <li>Slashing penalties: Fraudulent attestations result in stake loss</li>
          <li>Fraud detection rewards: 5-10x multiplier encourages honest reporting</li>
          <li>Rational behavior: Cost of fraud exceeds expected benefit</li>
        </ul>
        
        <t>Game-theoretic analysis shows that honest participation is a Nash equilibrium 
        when the fraud detection probability exceeds 0.001%, a threshold easily achieved 
        through random spot checks.</t>
      </section>

      <section anchor="privacy">
        <name>Privacy Considerations</name>
        <t>SONAR reveals the following information:</t>
        <ul>
          <li>Which users are registered for coverage verification (public keys on-chain)</li>
          <li>Which users were selected for sampling (challenge message)</li>
          <li>Aggregate statistics about selected users (loss rates, packet counts)</li>
        </ul>
        
        <t>SONAR does NOT reveal:</t>
        <ul>
          <li>Content of multicast stream (encrypted separately if needed)</li>
          <li>Individual user consumption patterns (when zkSNARK aggregation used)</li>
          <li>Non-selected users' reception status</li>
        </ul>
        
        <t>Privacy Attack Vector for Small Communities:</t>
        <t>Without zkSNARK aggregation, direct on-chain attestations enable correlation attacks. 
        For small communities (N &lt; 100,000), attackers can:</t>
        <ul>
          <li>Link public keys to known wallet addresses</li>
          <li>Correlate viewing times with other on-chain activity</li>
          <li>Cross-reference with geographic or demographic data</li>
        </ul>
        
        <t>The risk of re-identification grows as the community shrinks. For N=5,000, 
        individual identification probability exceeds 75% through cross-referencing. 
        For N=10,000,000, crowd anonymity provides natural protection.</t>
        
        <t>RECOMMENDATION: zkSNARK aggregation MUST be used when sample size m &gt; 1,000 
        and SHOULD be used when m &gt; 100. This protects viewers in small communities from 
        de-anonymization while maintaining a cryptographic proof of coverage.</t>
      </section>
    </section>

    <section anchor="iana">
      <name>IANA Considerations</name>
      <t>This document requests IANA to create a new registry for SONAR message types:</t>
      
      <t>Registry Name: SONAR Message Types</t>
      <t>Registration Procedure: IETF Review</t>
      <t>Reference: This document</t>
      
      <t>Initial allocations:</t>
      <table>
        <thead>
          <tr>
            <th>Value</th>
            <th>Description</th>
            <th>Reference</th>
          </tr>
        </thead>
        <tbody>
          <tr>
            <td>0x01</td>
            <td>Sample Challenge</td>
            <td>Section 5.1.2</td>
          </tr>
          <tr>
            <td>0x02</td>
            <td>User Attestation</td>
            <td>Section 5.2.1</td>
          </tr>
          <tr>
            <td>0x03</td>
            <td>zkSNARK Aggregated Proof</td>
            <td>Section 6.3</td>
          </tr>
        </tbody>
      </table>
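      <t>An implementation might represent the initial allocations as an enumeration. 
      The following Python sketch is illustrative only: the class and function names are 
      hypothetical, and nothing is assumed here about how the type value is encoded on 
      the wire.</t>
      <sourcecode type="python"><![CDATA[
```python
from enum import IntEnum


class SonarMessageType(IntEnum):
    """Initial SONAR Message Types allocations (illustrative)."""
    SAMPLE_CHALLENGE = 0x01
    USER_ATTESTATION = 0x02
    ZKSNARK_AGGREGATED_PROOF = 0x03


def parse_message_type(value):
    """Return the matching type, or None if the value is unassigned."""
    try:
        return SonarMessageType(value)
    except ValueError:
        return None


print(parse_message_type(0x02))
print(parse_message_type(0x7F))
```
]]></sourcecode>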
    </section>
  </middle>

  <back>
    <references>
      <name>References</name>
      <references>
        <name>Normative References</name>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.2119.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8174.xml"/>
        
        <reference anchor="I-D.krose-mboned-alta" target="https://datatracker.ietf.org/doc/html/draft-krose-mboned-alta-01">
          <front>
            <title>Asymmetric Loss-Tolerant Authentication</title>
            <author initials="K." surname="Rose" fullname="Kyle Rose">
              <organization>Akamai Technologies</organization>
            </author>
            <author initials="J." surname="Holland" fullname="Jake Holland">
              <organization>Akamai Technologies</organization>
            </author>
            <date year="2019" month="July"/>
          </front>
        </reference>
      </references>
      
      <references>
        <name>Informative References</name>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.4082.xml"/>
        
        <reference anchor="I-D.ietf-mboned-ambi" target="https://datatracker.ietf.org/doc/html/draft-ietf-mboned-ambi-04">
          <front>
            <title>Asymmetric Manifest Based Integrity</title>
            <author initials="J." surname="Holland" fullname="Jake Holland">
              <organization>Akamai Technologies</organization>
            </author>
            <author initials="K." surname="Rose" fullname="Kyle Rose">
              <organization>Akamai Technologies</organization>
            </author>
            <author initials="M." surname="Franke" fullname="Max Franke">
              <organization>TU Berlin</organization>
            </author>
            <date year="2022" month="March"/>
          </front>
        </reference>
      </references>
    </references>

    <section anchor="appendix-example">
      <name>Example Deployment</name>
      
      <section anchor="nyc-scenario">
        <name>NYC Television Station Scenario</name>
        <t>This appendix provides a concrete deployment example for a New York City 
        television station broadcasting to 10 million concurrent viewers.</t>
        
        <section anchor="network-config">
          <name>Network Configuration</name>
          <ul>
            <li>Content Provider: NBC New York</li>
            <li>Content: Live sports broadcast</li>
            <li>Bitrate: 25 Mbps H.265 video</li>
            <li>Target Coverage: 10,000,000 concurrent viewers</li>
            <li>Geographic Area: NYC metropolitan area</li>
          </ul>
        </section>

        <section anchor="sonar-config">
          <name>SONAR Configuration</name>
          <t>Content Authentication (ALTA):</t>
          <ul>
            <li>MAC algorithm: HMAC-SHA256 (128-bit)</li>
            <li>Signature: Ed25519 every 50th packet</li>
            <li>Overhead: 6.75% (1.69 Mbps)</li>
          </ul>
          
          <t>Statistical Sampling:</t>
          <ul>
            <li>Sample size: m = 10,000 users (0.1%)</li>
            <li>Test period: P = 900 seconds (15 minutes)</li>
            <li>Confidence: 95%</li>
            <li>Margin of error: ±0.3%</li>
          </ul>
          
          <t>Broadcast Overhead:</t>
          <ul>
            <li>ALTA: 1.69 Mbps</li>
            <li>Challenge message: 320 KB / 900s = 2.8 Kbps</li>
            <li>Total: 1.69 Mbps (6.76%)</li>
          </ul>
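          <t>The overhead figures above can be checked with a short calculation. This 
          is a sketch under stated assumptions: 1 KB is taken as 1000 bytes, the ALTA 
          overhead is taken as 6.75% of the 25 Mbps stream, and the function name is 
          illustrative.</t>
          <sourcecode type="python"><![CDATA[
```python
CONTENT_MBPS = 25.0        # stream bitrate
ALTA_OVERHEAD = 0.0675     # 6.75% ALTA overhead
CHALLENGE_BYTES = 320_000  # 320 KB, assuming 1 KB = 1000 bytes
TEST_PERIOD_S = 900        # P = 15 minutes


def broadcast_overhead(content_mbps, alta_overhead,
                       challenge_bytes, period_s):
    """Return (ALTA Mbps, challenge Kbps, total Mbps, total %)."""
    alta_mbps = content_mbps * alta_overhead
    challenge_kbps = challenge_bytes * 8 / period_s / 1000
    total_mbps = alta_mbps + challenge_kbps / 1000
    total_pct = 100 * total_mbps / content_mbps
    return alta_mbps, challenge_kbps, total_mbps, total_pct


alta, chal, total, pct = broadcast_overhead(
    CONTENT_MBPS, ALTA_OVERHEAD, CHALLENGE_BYTES, TEST_PERIOD_S)
print(f"ALTA: {alta:.2f} Mbps, challenge: {chal:.1f} Kbps")
print(f"Total: {total:.2f} Mbps ({pct:.2f}%)")
```
]]></sourcecode>
          <t>The challenge message contributes only a few Kbps, so the total is 
          dominated by the ALTA overhead.</t>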
        </section>

        <section anchor="cost-analysis">
          <name>Cost-Benefit Analysis</name>
          <t>Per-Hour Costs:</t>
          <ul>
            <li>User sampling: 10,000 users × 4 tests/hour × $0.0001 = $4.00</li>
            <li>zkSNARK aggregation: 4 tests/hour × $0.0001 = $0.0004</li>
            <li>Total: $4.00 per hour</li>
          </ul>
          
          <t>Revenue Impact:</t>
          <ul>
            <li>Traditional CPM (unverified): $10 per 1000 impressions</li>
            <li>Verified CPM: $15 per 1000 impressions (50% premium)</li>
            <li>Traditional revenue: 10M × $10/1000 = $100,000/hour</li>
            <li>Verified revenue: 10M × $15/1000 = $150,000/hour</li>
            <li>Additional revenue: $50,000/hour</li>
            <li>Net benefit: $50,000 - $4 = $49,996/hour</li>
            <li>ROI: 1,249,900%</li>
          </ul>
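          <t>The figures above can be reproduced as follows. This sketch mirrors the 
          arithmetic of the lists, including two simplifications they make implicitly: 
          each viewer counts as one impression per hour, and the total cost is rounded 
          to the nearest cent before the net benefit and ROI are computed. Variable 
          names are illustrative.</t>
          <sourcecode type="python"><![CDATA[
```python
VIEWERS = 10_000_000
SAMPLED_USERS = 10_000
TESTS_PER_HOUR = 4
FEE_USD = 0.0001
CPM_UNVERIFIED = 10.0   # USD per 1000 impressions
CPM_VERIFIED = 15.0     # USD per 1000 impressions

sampling_cost = SAMPLED_USERS * TESTS_PER_HOUR * FEE_USD
aggregation_cost = TESTS_PER_HOUR * FEE_USD
# Rounded to cents, as in the "Total" line of the list.
total_cost = round(sampling_cost + aggregation_cost, 2)

# Assumes one impression per viewer per hour.
revenue_unverified = VIEWERS * CPM_UNVERIFIED / 1000
revenue_verified = VIEWERS * CPM_VERIFIED / 1000
additional_revenue = revenue_verified - revenue_unverified

net_benefit = additional_revenue - total_cost
roi_pct = 100 * net_benefit / total_cost

print(f"Cost: ${total_cost:.2f}/hour")
print(f"Net benefit: ${net_benefit:,.0f}/hour")
print(f"ROI: {roi_pct:,.0f}%")
```
]]></sourcecode>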
        </section>

        <section anchor="performance-metrics">
          <name>Performance Metrics</name>
          <ul>
            <li>Broadcast bandwidth: 25 Mbps + 1.69 Mbps = 26.69 Mbps</li>
            <li>Overhead: 6.76%</li>
            <li>Authentication latency: 1-10ms (ALTA)</li>
            <li>Detection latency: &lt;15 minutes</li>
            <li>Coverage confidence: 95% (9,458,000 - 9,542,000 users)</li>
            <li>Blockchain TPS: 0.011 (well below capacity)</li>
          </ul>
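          <t>As a rough cross-check of the coverage interval, the sketch below applies 
          a normal-approximation confidence interval for a sampled proportion. It is 
          not necessarily the estimator SONAR uses. It assumes an observed delivery 
          fraction of 0.95 (the midpoint of the range listed above), a sample of 
          m = 10,000, and z = 1.96 for 95% confidence; the function name is 
          hypothetical. Under these assumptions the bounds fall within about 1,000 
          users of the listed range.</t>
          <sourcecode type="python"><![CDATA[
```python
import math


def coverage_interval(p_hat, m, population, z=1.96):
    """Normal-approximation interval for covered users.

    p_hat: fraction of sampled users reporting delivery (assumed)
    m: sample size
    population: total number of receivers
    """
    margin = z * math.sqrt(p_hat * (1 - p_hat) / m)
    low = (p_hat - margin) * population
    high = (p_hat + margin) * population
    return margin, low, high


margin, low, high = coverage_interval(0.95, 10_000, 10_000_000)
print(f"margin = {100 * margin:.2f} percentage points")
print(f"covered users: {low:,.0f} to {high:,.0f}")
```
]]></sourcecode>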
        </section>
      </section>
    </section>

    <section anchor="acknowledgments" numbered="false">
      <name>Acknowledgments</name>
      <t>The author thanks Jake Holland and Kyle Rose (Akamai) for the ALTA protocol 
      specification and insights on multicast authentication. Special thanks to Lenny 
      Giuliano (Juniper Networks), Chris Lenart (Verizon), and Neil Chatterjee (DAWN 
      Internet) for sharing real-world deployment experience with decentralized multicast 
      networks and for feedback on earlier revisions of this work.</t>
    </section>
  </back>
</rfc>
