DKIM Hash Algorithm Adaptivity

Introduction: Section 3.7 of DKIM specifies how "Computing the Message Hashes" is to be performed for IMF messages, and notes that real-life digital signature APIs often combine hashing and signing, that is "hash-alg" as well as "sig-alg", into what appears to be a single operation. This is appearance only: these APIs are interfaces designed for convenience. They allow the message digest algorithm to be specified, they can be called repeatedly until the input data is fully consumed, and they are then finalized in order to create the signature; under the hood, both the complete message digest calculation and the signature creation are performed.

Modern algorithms such as EdDSA, however, hash their input extensively with an algorithm-internal message digest as part of digital signature creation. They perform several passes over the entire data, which must be available in full: the input cannot be fed in piece by piece. The algorithms may use the internal message digest multiple times at different steps of result creation, for example incorporating secret key data in later steps. The final digest may be the signature.

The convenience interfaces described above cannot be used for these algorithms, or only in adapted form, depending on the cryptographic library used. Specifying an additional message digest algorithm is impossible. The introduction of the DKIM algorithm Ed25519-SHA256 therefore required code path changes, because the DKIM "data-hash" (SHA-256) now had to be created first, in a separate step, so that it could be passed as input to "sig-alg".
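The two-step flow described above can be sketched as follows. This is a minimal illustration under stated assumptions, not code from any DKIM implementation: the names dkim_data_hash, sign_in_two_steps and stand_in_signer are hypothetical, the sample header lines are made up, and the signer is an HMAC-SHA512 placeholder that merely stands in for a one-shot primitive such as Ed25519, which would come from a cryptographic library.

```python
import hashlib
import hmac
from typing import Callable, Iterable

# Hypothetical type: a one-shot primitive that needs its whole input at once.
OneShotSigner = Callable[[bytes], bytes]


def dkim_data_hash(chunks: Iterable[bytes]) -> bytes:
    """Stream already canonicalized data into SHA-256, return the digest."""
    h = hashlib.sha256()
    for chunk in chunks:
        h.update(chunk)  # incremental feeding works for a plain digest
    return h.digest()


def sign_in_two_steps(chunks: Iterable[bytes], signer: OneShotSigner) -> bytes:
    """Step 1: build the data-hash. Step 2: hand it, in full, to the signer."""
    return signer(dkim_data_hash(chunks))


def stand_in_signer(message: bytes) -> bytes:
    # ASSUMPTION: placeholder only. This is HMAC-SHA512 with a demo key,
    # NOT Ed25519; it just models "one call, complete input, 64-byte result".
    return hmac.new(b"demo-key", message, hashlib.sha512).digest()


# Made-up sample input, fed in pieces as a streaming parser would deliver it.
chunks = [b"from:alice@example.org\r\n", b"subject:hello\r\n"]
data_hash = dkim_data_hash(chunks)
signature = sign_in_two_steps(chunks, stand_in_signer)
```

The point of the sketch is the shape of the code path: the streaming loop lives entirely in the digest step, while the signing step receives one complete byte string and offers no update or finalize calls.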

INFORMATIVE NOTE: EdDSA was adapted for DKIM as Ed25519-SHA256 in 2018, but has not gained much traction in the seven years since its introduction. A survey of the DKIM implementations that adopted it revealed enthusiastic code comments along the extra code paths that had to be introduced.

Analysis: In its current form DKIM defines the generated "data-hash" as the sole input of "sig-alg". Modern signing algorithms perform one or more digest operations on their input data, which must therefore be available in full for the single invocation of the cryptographic operation; the DKIM "data-hash" must therefore be created in a separate, dedicated step. In addition, the DKIM "data-hash" algorithm may be weaker than the digest used inside the signature algorithm: with Ed25519-SHA256, for example, the 32-byte SHA-256 digest (64 hexadecimal characters) is hashed again internally with SHA-512, which produces a 64-byte output (128 hexadecimal characters). The conclusion is that the standard currently complicates implementations, fosters redundant data processing, and potentially weakens the security properties of algorithms by feeding them only a reduced form of the data, prefiltered by a potentially weaker algorithm.
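The size relationship in this argument can be checked with a few lines of Python. The sketch below is a simplification under stated assumptions: it applies SHA-512 to the data-hash alone, whereas the real Ed25519 computation mixes key-derived values into its SHA-512 inputs, and the 100000-byte message is an arbitrary stand-in for a canonicalized mail.

```python
import hashlib

# Arbitrary stand-in for a canonicalized message of realistic size.
message = b"x" * 100_000

# DKIM "data-hash": whatever the message size, "sig-alg" sees only this.
data_hash = hashlib.sha256(message).digest()

# SIMPLIFICATION: the internal SHA-512 of Ed25519 also covers key-derived
# values; hashing the data-hash alone is enough to show the sizes involved.
inner = hashlib.sha512(data_hash).digest()

print(len(message), len(data_hash), len(inner))   # 100000 32 64
print(len(data_hash.hex()), len(inner.hex()))     # 64 128
```

The stronger internal digest thus operates on a 32-byte value that SHA-256 has already produced, so the effective collision resistance of the whole chain is bounded by SHA-256, not by SHA-512.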