<?xml version="1.0" encoding="utf-8"?>
<?xml-model href="rfc7991bis.rnc"?>  <!-- Required for schema validation and schema-aware editing -->
<!-- <?xml-stylesheet type="text/xsl" href="rfc2629.xslt" ?> -->
<!-- This third-party XSLT can be enabled for direct transformations in XML processors, including most browsers -->
<!DOCTYPE rfc [
  <!ENTITY nbsp    "&#160;">
  <!ENTITY zwsp   "&#8203;">
  <!ENTITY nbhy   "&#8209;">
  <!ENTITY wj     "&#8288;">
]>
<!-- If further character entities are required then they should be added to the DOCTYPE above.
     Use of an external entity file is not recommended. -->
<rfc
  xmlns:xi="http://www.w3.org/2001/XInclude"
  category="info"
  docName="draft-nurpmeso-dkim-hash-adaptivity-02"
  ipr="trust200902"
  obsoletes=""
  updates="6376"
  submissionType="IETF"
  xml:lang="en"
  version="3">
<!--
     [CHECK]  FIXME
       * category should be one of std, bcp, info, exp, historic
       * ipr should be one of trust200902, noModificationTrust200902,
         noDerivativesTrust200902, pre5378Trust200902
       * updates can be an RFC number as NNNN
       * obsoletes can be an RFC number as NNNN
-->
  <front>

   <title>DKIM Hash Algorithm Adaptivity</title>

   <seriesInfo name="Internet-Draft" value="draft-nurpmeso-dkim-hash-adaptivity-02"/>

    <author fullname="Steffen Nurpmeso" initials="S" role="editor" surname="Nurpmeso">
      <address><email>steffen@sdaoden.eu</email></address>
    </author>

    <date year="2025" month="02" day="03"/>

    <area>General</area>
    <workgroup>Internet Engineering Task Force</workgroup>

    <keyword>DKIM</keyword>

    <abstract><t>
      DKIM (RFC 6376, section 3.7)
      defines how "data-hash" is generated as input to a "sig-alg" for the
      purpose of generating a cryptographic signature.
      Unlike the
      RSA algorithm (RFC 8017),
      the only signature algorithm defined for DKIM at the time of its creation,
      modern signature algorithms, for example
      EdDSA (RFC 8032),
      include extensive data hashing as part of the signing process.
      For these algorithms it may make sense not to create a "data-hash",
      but to use the entire data as input to "sig-alg".
      This specification allows DKIM signing algorithms to adapt the "data-hash" step,
      taking advantage of algorithm design and of the reality of digital signature APIs.
    </t></abstract>

  </front>
  <middle>

    <section>
      <name>Introduction</name>
      <t>
        In Section 3.7, "Computing the Message Hashes",
        DKIM <xref target="RFC6376"/>
        specifies how the message hashes of
        IMF <xref target="RFC5322"/>
        messages are computed,
        and notes that real-life digital signature APIs often combine hashing
        and signing into a single call that performs both
        "hash-alg" and "sig-alg"
        in what appears to be a single operation.
        However, this is only an appearance:
        these APIs are interfaces designed for convenience.
        They allow the message digest algorithm to be specified, and
        they can be called repeatedly until the input data is fully consumed,
        then finalized in order to create the signature;
        under the hood, both the complete message digest calculation and the
        signature creation are performed in full.
      </t><t>
        Modern algorithms such as
        EdDSA <xref target="RFC8032"/>,
        however, include extensive data hashing via an algorithm-internal
        message digest as part of digital signature creation.
        They perform several passes over the entire data,
        which needs to be available in full:
        the input data cannot be fed in incrementally.
        The algorithms may use the internal message digest multiple times
        at different steps of result creation,
        incorporating secret key data in later steps, for example.
        The final digest may be the signature.
      </t><t>
        The above convenience interfaces cannot be used for these
        algorithms, or only in adapted form, depending on the
        cryptographic library used.
        Specification of an additional message digest algorithm is impossible.
        Therefore the introduction of the DKIM algorithm
        Ed25519-SHA256 <xref target="RFC8463"/>
        required code path changes, because the DKIM "data-hash" (SHA-256)
        now needed to be created first, in an extra step, so that
        the generated digest could be fed as input to "sig-alg".
      </t><blockquote>
        INFORMATIVE NOTE:
        EdDSA was adapted to DKIM as Ed25519-SHA256 in 2018,
        but has not gained much traction in the seven years since its
        introduction.
        A survey of DKIM implementations that adopted it revealed enthusiastic
        code comments along the extra code paths that had to be introduced.
      </blockquote><t>
        <em>Analysis:</em>
        in its current form DKIM defines the generated "data-hash" as the sole
        input of "sig-alg".
        Modern signing algorithms perform one or more digest operations on
        their input data, which must therefore be available in full for the
        single invocation of the cryptographic operation;
        the DKIM "data-hash" must thus be computed beforehand, in a separate step.
        Also, the DKIM "data-hash" algorithm may be weaker than the one used by
        the signature algorithm:
        with the mentioned Ed25519-SHA256, for example, the 32-byte SHA-256
        "data-hash" is hashed again internally with SHA-512, whose digests are 64 bytes long.
        The conclusion is that the standard currently
        complicates implementations,
        fosters redundant data processing,
        and potentially weakens the security properties of algorithms by feeding
        them only a digest of the data, precomputed by a potentially weaker algorithm.
      </t>
    </section>

    <section>
      <name>Algorithm Adaptivity</name>
      <t>
        The computation described in Section 3.7 of
        DKIM <xref target="RFC6376"/>
        is modified so that the described input to "sig-alg",
        the "data-hash",
        can adapt to standardized algorithms as appropriate.
        If an algorithm chooses adaptation,
        "hash-alg" is used only to produce the "body-hash",
        whereas the input formerly used to create the "data-hash"
        is fed in full into "sig-alg" instead of into "hash-alg".
        More formally,
        the new pseudo-code for the signature algorithm is as follows,
        where "/" separates the two alternative inputs to "sig-alg":
      </t><sourcecode><![CDATA[
body-hash = hash-alg (canon-body, l-param)
data-hash = hash-alg (h-headers, D-SIG, body-hash)
signature = sig-alg (d-domain, selector,
                     data-hash / (h-headers, D-SIG, body-hash))
      ]]></sourcecode>
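      <t>
        As a non-normative illustration, the two alternatives can be written
        out for one concrete algorithm.
        The sketch below uses
        Ed25519-SHA256 <xref target="RFC8463"/>
        as the example and assumes, purely for illustration, that this
        algorithm chose adaptation.
        "SHA-256" and "Ed25519" stand in for "hash-alg" and "sig-alg";
        lines starting with ";" are comments and not part of the computation.
      </t><sourcecode><![CDATA[
; without adaptation: "sig-alg" signs the "data-hash"
body-hash = SHA-256 (canon-body, l-param)
data-hash = SHA-256 (h-headers, D-SIG, body-hash)
signature = Ed25519 (d-domain, selector, data-hash)

; with adaptation (assumed here): no "data-hash" is computed,
; "sig-alg" signs the former "data-hash" input in full
body-hash = SHA-256 (canon-body, l-param)
signature = Ed25519 (d-domain, selector,
                     (h-headers, D-SIG, body-hash))
]]></sourcecode>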
    </section>

    <section anchor="IANA">
      <name>IANA Considerations</name>
      <t>This memo includes no request to IANA.</t>
    </section>

    <section anchor="Security">
      <name>Security Considerations</name>
      <t>
        This specification should reduce implementation burden and complexity,
        aid in hash hardening of affected algorithms to a certain extent,
        and potentially increase processing performance, depending on the
        algorithm, data volume, and API optimization efforts.
      </t>
    </section>

  </middle>
  <back>

    <references>
      <name>References</name>
      <references>
        <name>Normative References</name>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.6376.xml"/>
      </references>

      <references>
        <name>Informative References</name>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.5322.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8032.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8463.xml"/>
      </references>
    </references>

 </back>
</rfc>
<!-- vim:set tw=1000:s-ts-mode -->
