<?xml version="1.0" encoding="utf-8"?>
<!-- name="GENERATOR" content="github.com/mmarkdown/mmark Mmark Markdown Processor - mmark.miek.nl" -->
<rfc version="3" ipr="trust200902" docName="draft-newton-how-do-you-do-00" submissionType="IETF" category="exp" xml:lang="en" xmlns:xi="http://www.w3.org/2001/XInclude" indexInclude="true" consensus="true">

<front>
<title abbrev="howdy">The How Do You Do Protocol</title><seriesInfo value="draft-newton-how-do-you-do-00" stream="IETF" status="experimental" name="Internet-Draft"></seriesInfo>
<author initials="A." surname="Newton" fullname="Andy Newton"><organization>ICANN</organization><address><postal><street></street>
</postal><email>andy@hxr.us</email>
</address></author><date/>
<area>Applications and Real-Time Area (ART)</area>
<workgroup>Network Working Group</workgroup>

<abstract>
<t>This document describes a system to discover the identifiers of natural
persons while preserving their privacy.</t>
</abstract>

</front>

<middle>

<section anchor="background"><name>Background</name>
<t>In a song written by children's author Shel Silverstein and made famous by
Johnny Cash, a young man seeks out and identifies his long-estranged father
using an old, worn photograph. Upon finding his father, the young man gives his name
and asks of his father, &quot;how do you do?&quot;</t>
<t>This document describes a system, called the &quot;How Do You Do&quot; protocol or &quot;Howdy&quot; for brevity,
to aid in the discovery of identifiers for
natural persons while preserving their privacy. The system uses a publicly
visible <xref target="Bloom"></xref> filter along with an exchange of messages between agents acting
on behalf of users. When an agent suspects that another agent possesses the
identifier of a user, the agents may interrogate each other to either confirm
or deny the suspicion. Upon confirmation, the agents notify their respective
users so that the users may exchange other information, such as new or different
identifiers.</t>
<t>Confirmation of identifiers can either be one-way or two-way. Consider the
following scenario.</t>
<t>User Alice has an old acquaintance, User Bob. Many years ago, Bob was a frequent
reader and commenter on Alice's blog at <eref target="https://example.com/alices_restaurant">https://example.com/alices_restaurant</eref>.</t>
<t>Alice has placed into her agent the hashes of several of her identifiers. The
hash value for her blog's URL is 12345. Bob has placed into his agent the hash
values for identifiers of many of his acquaintances, among them that of Alice's
blog URL. Both Alice and Bob are using a hash algorithm prone to collision, and
therefore the hash value 12345 may also be the hash value for another identifier
probably unrelated to either Alice or Bob.</t>
<t>Bob's agent obtains the list of hash values from Alice's agent, and discovers
that the hash value 12345 exists in both Bob's set and Alice's set.</t>
<t>If Alice desires contact from all readers of her blog, even the ones whom
she may not know, Alice's agent lets this be known. Here, Bob's agent can seek
one-way confirmation of Alice's identifier
through a series of further exchanges using different hash values and/or stronger
hash algorithms.</t>
<t>If Alice desires contact from only known commenters of her blog, then
Bob's agent must engage in two-way confirmation. Here, Bob has placed into
his agent the hash value for his email address, bob@example.org, which is 67890.
As Bob was a known commenter on Alice's blog, Alice has also placed
this hash value in her agent. The hash algorithm used to calculate this value
is the same as the one used to calculate the hash value of Alice's blog URL,
and therefore has a probability of generating the same value for
other, unrelated identifiers. When Bob's agent initiates confirmation, it
does so by offering the hash value 67890. As with one-way confirmation, the agents
exchange a series of messages to seek further confirmation of the identifiers using
different hash values and/or stronger hash algorithms.</t>
</section>

<section anchor="functional-components"><name>Functional Components</name>
<t>The agents holding the hash values and conducting identifier confirmation
communications are known as Exchange Agents (EA). Users interact with
User Agents (UA). User Agents hold the identifiers of the user and the identifiers
being sought by the user, and User Agents calculate the hash values for
identifiers which are then placed into Exchange Agents.</t>
<t>For convenience and quicker exchanges, User Agents and Exchange Agents
may be combined into one operating entity.  Conversely, there is no explicit
one-to-one match between User Agents and Exchange Agents. Exchange Agents
may serve many User Agents, and User Agents may utilize more than one
Exchange Agent.  This document does not define the communications between User Agents
and Exchange Agents.</t>
<t>An Exchange Agent initiating a one-way (see <xref target="one_way"></xref>) or two-way (see <xref target="two_way"></xref>)
confirmation communication flow is known as a Requesting Exchange Agent (REA), and
the other agent is known as a Confirming Exchange Agent (CEA).</t>
<table>
<thead>
<tr>
<th>Action</th>
<th>Type</th>
<th>Name</th>
</tr>
</thead>

<tbody>
<tr>
<td>Requesting</td>
<td>User Agent</td>
<td>RUA</td>
</tr>

<tr>
<td>Requesting</td>
<td>Exchange Agent</td>
<td>REA</td>
</tr>

<tr>
<td>Confirming</td>
<td>User Agent</td>
<td>CUA</td>
</tr>

<tr>
<td>Confirming</td>
<td>Exchange Agent</td>
<td>CEA</td>
</tr>
</tbody>
</table><t>The identifier of a user is called a UID. For each UID, the UA calculates a hash value
from the UID using a collision-prone hash algorithm such as FNV (see <xref target="I-D.eastlake-fnv"></xref>). This hash value is called
UH1.</t>
<t>A second hash value, called UH2 is also calculated for each UID. This hash value uses a
separate, different collision-prone hash algorithm such as Murmur (see <xref target="Murmur3"></xref>).</t>
<t>Finally, for each UID a third hash value is also calculated. This value is called UH3 and is
calculated with a hash algorithm that is more collision-resistant such as SHA-512 (<xref target="RFC6234"></xref>).</t>
<t>All three hash values, UH1, UH2, and UH3, are provisioned into the CEA by the CUA. The UID is not.</t>
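<t>As a non-normative illustration, the three hash values might be computed as in the following Python sketch. It assumes the 32-bit FNV-1a variant for UH1, the 32-bit x86 variant of MurmurHash3 with seed 0 for UH2, UTF-8 encoding of the identifier, and standard Base64 for UH3. None of these choices is fixed by this document, and the function names are purely illustrative.</t>
<sourcecode type="python"><![CDATA[
```python
import base64
import hashlib


def fnv1a_32(data: bytes) -> int:
    """32-bit FNV-1a (assumed variant for UH1/AH1)."""
    h = 0x811C9DC5  # FNV offset basis
    for byte in data:
        h ^= byte
        h = (h * 0x01000193) & 0xFFFFFFFF  # FNV prime, kept to 32 bits
    return h


def _mix_block(k: int) -> int:
    """MurmurHash3 per-block scrambling of one 32-bit word."""
    k = (k * 0xCC9E2D51) & 0xFFFFFFFF
    k = ((k << 15) | (k >> 17)) & 0xFFFFFFFF  # rotate left by 15
    return (k * 0x1B873593) & 0xFFFFFFFF


def murmur3_32(data: bytes, seed: int = 0) -> int:
    """32-bit MurmurHash3, x86 variant (assumed variant for UH2/AH2)."""
    h = seed & 0xFFFFFFFF
    n = len(data)
    rounded = n - (n % 4)
    for i in range(0, rounded, 4):
        h ^= _mix_block(int.from_bytes(data[i:i + 4], "little"))
        h = ((h << 13) | (h >> 19)) & 0xFFFFFFFF  # rotate left by 13
        h = (h * 5 + 0xE6546B64) & 0xFFFFFFFF
    tail = data[rounded:]
    if tail:
        h ^= _mix_block(int.from_bytes(tail, "little"))
    # Finalization: fold in the length and avalanche the bits.
    h ^= n
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & 0xFFFFFFFF
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & 0xFFFFFFFF
    h ^= h >> 16
    return h


def user_hashes(uid: str) -> dict:
    """Return UH1, UH2 and UH3 for an already canonicalized UID."""
    data = uid.encode("utf-8")  # assumption: identifiers are hashed as UTF-8
    digest = hashlib.sha512(data).digest()
    return {
        "uh1_u32": fnv1a_32(data),
        "uh1_alg": "FNV",
        "uh2_u32": murmur3_32(data),
        "uh2_alg": "Murmur3",
        "uh3_base64": base64.b64encode(digest).decode("ascii"),
        "uh3_alg": "SHA-512",
    }
```
]]></sourcecode>
<t>The same function could be applied to an AID to obtain AH1, AH2, and AH3. Only the resulting hash values, never the identifier itself, would be handed to the Exchange Agent.</t>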
<table>
<thead>
<tr>
<th>Hash of</th>
<th>Type</th>
<th>Name</th>
<th>Example</th>
</tr>
</thead>

<tbody>
<tr>
<td>UID</td>
<td>Collision Prone</td>
<td>UH1</td>
<td>fnv( UID )</td>
</tr>

<tr>
<td>UID</td>
<td>Collision Prone</td>
<td>UH2</td>
<td>murmur3( UID )</td>
</tr>

<tr>
<td>UID</td>
<td>Collision Resistant</td>
<td>UH3</td>
<td>sha512( UID )</td>
</tr>
</tbody>
</table><t>The identifiers of a user's acquaintances, perhaps held in a contact database,
have the same set of values generated by the UA and provisioned into the EA. Each
acquaintance identifier is called an AID, and the hash values
are AH1, AH2, and AH3.</t>
<table>
<thead>
<tr>
<th>Hash of</th>
<th>Type</th>
<th>Name</th>
<th>Example</th>
</tr>
</thead>

<tbody>
<tr>
<td>AID</td>
<td>Collision Prone</td>
<td>AH1</td>
<td>fnv( AID )</td>
</tr>

<tr>
<td>AID</td>
<td>Collision Prone</td>
<td>AH2</td>
<td>murmur3( AID )</td>
</tr>

<tr>
<td>AID</td>
<td>Collision Resistant</td>
<td>AH3</td>
<td>sha512( AID )</td>
</tr>
</tbody>
</table></section>

<section anchor="flows"><name>Flows</name>
<t>Before any confirmation flow, the CEA should make its UH1 values known through a publication
mechanism. Confirmation may then take place when an REA suspects that a CEA
holds the same UID. Publication of a UH1 MUST include a signal indicating the type
of confirmation to be used with that value.</t>
<figure><name>High Level Flow
</name>
<sourcecode type="ascii-art">SEQUENCE: 1. Provision Hashes

+-----+       +-----+           +-----+       +-----+
| CUA |       | CEA |           | RUA |       | REA |
+-----+       +-----+           +-----+       +-----+
   |             |                 |             |
   |~~~hashes~~~&gt;|                 |             |
   |             |                 |             |
   |&lt;------------|                 |             |
   |             |                 |             |
   |             |                 |~~~hashes~~~&gt;|
   |             |                 |             |
   |             |                 |&lt;------------|
   |             |                 |             |
   |          ------------------   |             |
   |          | Publishes UH1s |   |             |
   |          ------------------   |             |
   |             |                 |             |
+-----+       +-----+           +-----+       +-----+
| CUA |       | CEA |           | RUA |       | REA |
+-----+       +-----+           +-----+       +-----+


SEQUENCE: 2. Confirmation

+-----+  +-----+
| REA |  | CEA |
+-----+  +-----+
   |        |
----------------
| confirmation |
----------------
   |        |
   |~~~~~~~&gt;|
   |        |
   |&lt;-------|
   |        |
+-----+  +-----+
| REA |  | CEA |
+-----+  +-----+
</sourcecode>
</figure>
<t>Identifier confirmation occurs in two types of communication flows: one-way and two-way.
Each flow type has a set of steps specific to that flow. One-way confirmation is a flow
in which a UID is confirmed. Two-way confirmation is a flow in which both a UID and an AID
are confirmed.</t>

<section anchor="one_way"><name>One Way Confirmation</name>
<t>These are the steps for one-way confirmation:</t>
<t><em>Step 1:</em> The REA sends a message containing a UH1 and UH2.
Preceding this step, the REA could have learned of the UH1 value via publication by the CEA.
UH1 and UH2 MUST be hashed with different algorithms, both being collision prone. By providing
a message with UH1 and UH2, the REA is suggesting it may possess the value of UID.</t>
<t><em>Step 2:</em> If both the UH1 and UH2 are associated with the same UID, the CEA confirms this assertion
by responding with the UH3 of that UID. As UH3 is more collision resistant, the value of UID is confirmed.
This step also includes a message to the user associated with the AID.</t>
<t><em>Step 3:</em> Having established confirmation, subsequent messages are possible.
Optionally, the REA may follow up with messages for the user using UH1 and UH3 as credentials.
These messages would then be passed by the CEA to the CUA.</t>
<figure><name>One Way Confirmation
</name>
<sourcecode type="ascii-art">SEQUENCE: One-Way Confirmation

+-----+                                +-----+
| REA |                                | CEA |
+-----+                                +-----+
   |                                      |
   |~~~step 1: UH1 + UH2~~~~~~~~~~~~~~~~~&gt;|
   |                                      |
   |&lt;--step 2: UH3 + msg_for_AID----------|
   |                                      |
   |~~~step 3: UH1 + UH3 + msg_for_UID~~~&gt;|
   |                                      |
   |&lt;--OK---------------------------------|
   |                                      |
+-----+                                +-----+
| REA |                                | CEA |
+-----+                                +-----+


SEQUENCE: Message to Users

+-----+            +-----+ +-----+            +-----+
| REA |            | RUA | | CEA |            | CUA |
+-----+            +-----+ +-----+            +-----+
   |                  |       |                  |
   |~~~msg_for_AID~~~&gt;|       |                  |
   |                  |       |                  |
   |&lt;-----------------|       |                  |
   |                  |       |                  |
   |                  |       |~~~msg_for_UID~~~&gt;|
   |                  |       |                  |
   |                  |       |&lt;-----------------|
   |                  |       |                  |
+-----+            +-----+ +-----+            +-----+
| REA |            | RUA | | CEA |            | CUA |
+-----+            +-----+ +-----+            +-----+
</sourcecode>
</figure>
</section>

<section anchor="two_way"><name>Two Way Confirmation</name>
<t>These are the steps for two-way confirmation:</t>
<t><em>Step 1:</em> The REA sends a message containing a UH1 and AH1. As this is two-way confirmation,
the AH1 is associated with an AID believed to be known by the user identified by the UID.</t>
<t><em>Step 2:</em> If the CEA can associate the AH1 with the UH1
(that is, the AID is associated with the UID), it responds with a simple positive acknowledgement of the match.</t>
<t><em>Step 3:</em> Next, the REA sends the UH2 and AH2 corresponding to the UH1 and AH1 in the first message. UH1 and UH2
MUST be hashed with different algorithms, and AH1 and AH2 MUST be hashed with different algorithms.</t>
<t><em>Step 4:</em> If the CEA can associate the UH2 with the UH1 of the first message and the AH2 with
the AH1 of the first message, the CEA confirms by sending the UH3 and AH3, thus confirming the CEA's
knowledge of the association between the UID and the AID. This step includes a message for the user associated with the AID.</t>
<t><em>Step 5:</em> Finally, the REA confirms the AID by sending a message to the CEA containing the AH3.
This is similar to step 3 in <xref target="one_way"></xref>.</t>
<figure><name>Two Way Confirmation
</name>
<sourcecode type="ascii-art">SEQUENCE: Two-Way Confirmation

+-----+                                +-----+
| REA |                                | CEA |
+-----+                                +-----+
   |                                      |
   |~~~step 1: UH1 + AH1~~~~~~~~~~~~~~~~~&gt;|
   |                                      |
   |&lt;--step 2: OK-------------------------|
   |                                      |
   |~~~step 3: UH2 + AH2~~~~~~~~~~~~~~~~~&gt;|
   |                                      |
   |&lt;--step 4: UH3 + AH3 + msg_for_AID----|
   |                                      |
   |~~~step 5: AH3 + msg_for_UID~~~~~~~~~&gt;|
   |                                      |
   |&lt;--OK---------------------------------|
   |                                      |
+-----+                                +-----+
| REA |                                | CEA |
+-----+                                +-----+


SEQUENCE: Message to Users

+-----+            +-----+ +-----+            +-----+
| REA |            | RUA | | CEA |            | CUA |
+-----+            +-----+ +-----+            +-----+
   |                  |       |                  |
   |~~~msg_for_AID~~~&gt;|       |                  |
   |                  |       |                  |
   |&lt;-----------------|       |                  |
   |                  |       |                  |
   |                  |       |~~~msg_for_UID~~~&gt;|
   |                  |       |                  |
   |                  |       |&lt;-----------------|
   |                  |       |                  |
+-----+            +-----+ +-----+            +-----+
| REA |            | RUA | | CEA |            | CUA |
+-----+            +-----+ +-----+            +-----+
</sourcecode>
</figure>
</section>
</section>

<section anchor="identifier-canonicalization"><name>Identifier Canonicalization</name>
<t>Some identifiers have extraneous characters that are insignificant to the usage of those
identifiers. For such identifiers, User Agents MUST remove insignificant characters from those
identifiers before creating hashes.</t>
<t>For phone numbers, User Agents MUST remove all non-digit characters. For email addresses,
the display name and any characters used to distinguish the display name from the email address
must be removed (e.g. Bob &lt;bob@example.com&gt; should be bob@example.com).</t>
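<t>A minimal, non-normative Python sketch of these two rules follows. It assumes that "digit" means the ASCII digits 0 through 9 and that email input follows the usual "display name, then address in angle brackets" layout; the function names are illustrative and not defined by this document.</t>
<sourcecode type="python"><![CDATA[
```python
import re
from email.utils import parseaddr


def canonical_phone(raw: str) -> str:
    """Remove every character that is not an ASCII digit."""
    return re.sub(r"[^0-9]", "", raw)


def canonical_email(raw: str) -> str:
    """Drop the display name and angle brackets, keeping only the address."""
    _display_name, address = parseaddr(raw)
    return address
```
]]></sourcecode>
<t>Under these assumptions, both "Bob" followed by the bracketed address and the bare address reduce to bob@example.com, so either form yields the same hash values.</t>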
</section>

<section anchor="internet-http-binding"><name>Internet HTTP Binding</name>
<t>The following sections describe a binding of Howdy to Internet
HTTP (<xref target="RFC9110"></xref>) where Exchange Agents are accessible as publicly available
resources. <xref target="other_environments"></xref> discusses other environments and
deployment scenarios for Howdy.</t>
<t>Exchange Agents communicate by issuing HTTP requests using the paths and
query parameters defined in this document. They are configured to communicate
with each other using a &quot;base URL&quot; upon which the request componets defined herein
are then appended.</t>

<section anchor="use-of-http-signatures"><name>Use of HTTP Signatures</name>
<t>Exchange Agents use HTTPS to communicate, and use
HTTP Signatures <xref target="I-D.ietf-httpbis-message-signatures"></xref> to sign requests, with an exception being the HTTP GET request to
fetch the HTTP signing key of a requester.</t>
<t>The keyId used in the HTTP signature MUST be an HTTPS URL, and the URL MUST
resolve to a resource that is a PEM-encoded public key. Exchange Agents
MAY cache keys according to HTTP caching headers.</t>
<t>Both the <tt>host</tt> and <tt>date</tt> HTTP headers MUST be signed, and the value of the <tt>host</tt>
header SHOULD be equivalent to the host of the authority component of the URL that
is the value of the keyId (e.g. <tt>host: foo.example</tt> matches <tt>https://foo.example/key.pem</tt>).</t>
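<t>The relationship between the keyId and the signed <tt>host</tt> header could be checked as in the following non-normative Python sketch. It treats a host mismatch as a failure, which is stricter than the SHOULD above, and it compares host names only, ignoring any port; both are assumptions of this example, as is the function name.</t>
<sourcecode type="python"><![CDATA[
```python
from urllib.parse import urlsplit


def key_id_matches_host(key_id: str, host_header: str) -> bool:
    """Check that keyId is an HTTPS URL whose host equals the signed host header."""
    key_url = urlsplit(key_id)
    if key_url.scheme != "https" or not key_url.hostname:
        return False
    # Parse the header as a bare authority so that an optional port is ignored.
    signed_host = urlsplit("//" + host_header.strip()).hostname
    return signed_host == key_url.hostname
```
]]></sourcecode>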

<aside><t>TODO: There is probably a need to specify the key types and signature algorithms
to use.</t>
</aside>
</section>

<section anchor="json_vocabulary"><name>JSON Vocabulary</name>
<t>The messages described in the following section contain JSON. The meaning of these JSON values
are:</t>

<ul spacing="compact">
<li><tt>version</tt> - this is a simple integer and MUST be 1.</li>
<li><t>bit set values:</t>

<ul spacing="compact">
<li><tt>bitset_base64</tt> - a JSON string containing a bit set in Base 64 (<xref target="RFC4648"></xref>) format.</li>
<li><tt>bitset_length</tt> - the length of the bit set.</li>
<li><tt>bitset_alg</tt> - the algorithm of the UH1 values used in calculating the bit set.</li>
</ul></li>
<li><t>paging values:</t>

<ul spacing="compact">
<li><tt>next</tt> - a string value which may be used in subsequent requests as the <tt>next</tt> query parameter.</li>
<li><tt>prev</tt> - a string value which may be used in subsequent requests as the <tt>prev</tt> query parameter.</li>
</ul></li>
<li><tt>hash_values</tt> - an array of JSON objects, each containing hashes.</li>
<li><tt>two_way</tt> - indicates that confirmation of an identifier requires the two-way process.</li>
<li><tt>created</tt> - an RFC 3339 date and time in UTC with no more than seconds resolution, indicating when an associated item was placed in the Exchange Agent.</li>
<li><t>UID related values:</t>

<ul spacing="compact">
<li><tt>uh1_u32</tt> - an unsigned 32-bit integer that is a UH1.</li>
<li><tt>uh1_alg</tt> - the name of the hash algorithm used to generate UH1.</li>
<li><tt>uh2_u32</tt> - an unsigned 32-bit integer that is a UH2.</li>
<li><tt>uh2_alg</tt> - the name of the hash algorithm used to generate UH2.</li>
<li><tt>uh3_base64</tt> - a Base64 (<xref target="RFC4648"></xref>) string containing a UH3.</li>
<li><tt>uh3_alg</tt> - the name of the hash algorithm used to generate UH3.</li>
</ul></li>
<li><t>AID related values:</t>

<ul spacing="compact">
<li><tt>ah1_u32</tt> - an unsigned 32-bit integer that is an AH1.</li>
<li><tt>ah1_alg</tt> - the name of the hash algorithm used to generate AH1.</li>
<li><tt>ah2_u32</tt> - an unsigned 32-bit integer that is an AH2.</li>
<li><tt>ah2_alg</tt> - the name of the hash algorithm used to generate AH2.</li>
<li><tt>ah3_base64</tt> - a Base64 (<xref target="RFC4648"></xref>) string containing an AH3.</li>
<li><tt>ah3_alg</tt> - the name of the hash algorithm used to generate AH3.</li>
</ul></li>
<li><t>message values:</t>

<ul spacing="compact">
<li><tt>msgs</tt> - an array containing message objects.</li>
<li><tt>msg_type</tt> - a string signifying the type of the message content.</li>
<li><tt>msg_content</tt> - the JSON type of this value is dependent on the accompanying <tt>msg_type</tt> value.</li>
</ul></li>
<li><tt>exchange_agents</tt> - an array containing exchange agent location objects.</li>
<li><tt>agent_url</tt> - a string containing a URL of an agent.</li>
<li><tt>notifications_accepted</tt> - a boolean indicating if an Exchange Agent accepts notifications.</li>
</ul>

<aside><t>TODO: specify an IANA registry of algorithms to use.</t>
</aside>
</section>

<section anchor="values_request"><name>Publication</name>
<t>Publication occurs by first querying for a bit set, which is a space-efficient
set of flags indicating if an Exchange Agent has probable knowledge of specific hash values. This is
done by sending an HTTP GET to the <tt>/bitset</tt> path (i.e. &lt;base URL&gt;/bitset). This query returns JSON
of the form:</t>

<artwork>{
  &quot;version&quot;: 1,
  &quot;notifications_accepted&quot;: true,
  &quot;bitset_base64&quot;: &quot;AAECAwQFBgcICQoLDA0ODxAREhMUFRYXGBkaGxwdHh8gISIjJCUmJygpKjQ1Njc4OTo7PD0+Pg0&quot;,
  &quot;bitset_length&quot;: 1024,
  &quot;bitset_alg&quot;: &quot;FNV&quot;
}
</artwork>
<t>The <tt>bitset_base64</tt> value is a bit set encoded using <xref target="RFC4648"></xref>. The bit set is calculated by
taking each UH1 value modulo the bit set length and bitwise ORing the bit at the resulting index into the bit set. <tt>bitset_length</tt> indicates the
bit set length and MUST be a multiple of 8. <tt>bitset_alg</tt> signifies the algorithm used to calculate UH1.
All UH1 values MUST use the same algorithm.</t>
<t>An REA must then construct a bit set using its AH1 values in the same manner. For each bit that is set
in both bit sets, the REA then requests a list of UH1 matches and their corresponding confirmation type. This
is done using the <tt>/hashes</tt> path (i.e. &lt;base URL&gt;/hashes). This request must have the query parameters <tt>uh1_u32</tt>
and <tt>uh1_alg</tt>.</t>
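<t>The bit set construction and the comparison performed by the REA might look like the following non-normative Python sketch. It assumes that <tt>bitset_length</tt> counts bits, that bit index i is stored in byte i // 8 at bit position i % 8 (least significant bit first), and that the Base64 text is padded. This document does not define the bit order, and the function names are illustrative.</t>
<sourcecode type="python"><![CDATA[
```python
import base64


def build_bitset(h1_values, bitset_length: int) -> bytes:
    """Set the bit at (value mod bitset_length) for every UH1 or AH1 value."""
    if bitset_length <= 0 or bitset_length % 8 != 0:
        raise ValueError("bitset_length must be a positive multiple of 8")
    bits = bytearray(bitset_length // 8)
    for value in h1_values:
        index = value % bitset_length
        bits[index // 8] |= 1 << (index % 8)  # assumed bit order, see lead-in
    return bytes(bits)


def bit_is_set(bits: bytes, index: int) -> bool:
    """Report whether the bit at the given index is 1."""
    return bool((bits[index // 8] >> (index % 8)) & 1)


def candidate_ah1_values(ah1_values, published_base64: str,
                         bitset_length: int) -> list:
    """Return the REA's AH1 values whose bit is also set by the CEA."""
    published = base64.b64decode(published_base64)
    return [v for v in ah1_values if bit_is_set(published, v % bitset_length)]
```
]]></sourcecode>
<t>Because distinct values can map to the same bit, a candidate returned here only indicates probable knowledge; the REA would still query the <tt>/hashes</tt> path for each candidate before starting a confirmation.</t>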
<t>This request may optionally have the <tt>next</tt> and <tt>prev</tt>
query parameters, the value of each being a string. These parameters are used
to paginate the values in the request, where <tt>next</tt> is an indicator that only
hash values created before its value are to be in the returned result, and <tt>prev</tt>
is an indicator that only hash values created after its value are to be in the
returned result. The format for neither value is defined, and agents should omit
the use of both parameters to request the latest values.</t>
<t>The values are returned as a JSON object with the following form (see <xref target="json_vocabulary"></xref>):</t>

<artwork>{
  &quot;version&quot;: 1,
  &quot;next&quot; : &quot;abcdefg&quot;,
  &quot;prev&quot; : &quot;hijklmn&quot;,
  &quot;hash_values&quot;: [
    {
      &quot;uh1_u32&quot;: 11112,
      &quot;uh1_alg&quot;: &quot;FNV&quot;,
      &quot;two_way&quot;: false,
      &quot;created&quot;: &quot;20230102T12:59:00Z&quot;
    },
    {
      &quot;uh1_u32&quot;: 22221,
      &quot;uh1_alg&quot;: &quot;FNV&quot;,
      &quot;two_way&quot;: true,
      &quot;created&quot;: &quot;20230101T12:59:00Z&quot;
    }
  ]
}
</artwork>
<t>If the <tt>next</tt> string is not given, this indicates there are no more hash values available using the <tt>next</tt> query parameter.
If the <tt>prev</tt> string is not given, this indicates there are no more hash values available using the <tt>prev</tt> query parameter.
All other values MUST be given, but the <tt>hash_values</tt> array MAY be empty. However, if it is not empty each JSON object
MUST be in reverse chronological order according to the value in <tt>created</tt>.</t>
<t>If the REA finds a UH1 value matching one of its AH1 values, it may begin an identifier confirmation based on the
<tt>two_way</tt> value for that hash.</t>
</section>

<section anchor="http_one_way"><name>One-Way Confirmation</name>
<t>One-way confirmation begins with the REA sending an HTTP POST to the CEA at the path <tt>/one_way</tt> (i.e. &lt;base URL&gt;/one_way).
The data posted is a JSON object of the form (see <xref target="json_vocabulary"></xref>):</t>

<artwork>{
  &quot;version&quot;: 1,
  &quot;uh1_u32&quot;: 11112,
  &quot;uh1_alg&quot;: &quot;FNV&quot;,
  &quot;created&quot;: &quot;20230102T12:59:00Z&quot;,
  &quot;uh2_u32&quot;: 44443,
  &quot;uh2_alg&quot;: &quot;Murmur3&quot;
}
</artwork>
<t>The CEA uses the UH1 value and the <tt>created</tt> value to reference a set of user identifier hashes (UH1, UH2, and UH3).
If the UH2 value given matches the UH2 value in that set of hashes, the CEA responds with the UH3 in this form (see <xref target="json_vocabulary"></xref>):</t>

<artwork>{
  &quot;version&quot;: 1,
  &quot;uh3_base64&quot;: &quot;b6fb35773416f37e51eb893a0b0682e23b4f758d5004542450be61607253f899&quot;,
  &quot;uh3_alg&quot;: &quot;SHA-512&quot;,
  &quot;msgs&quot; : [
    {
      &quot;msg_type&quot;: &quot;json_text&quot;,
      &quot;msg_content&quot;: &quot;I moved my blog, it is at https://example.com/alices/diner&quot;
    }
  ]
}
</artwork>
<t>If the REA can match this UH3 value to its corresponding value, then the REA may send a
subsequent HTTP POST to the path <tt>/one_way_msg</tt> (i.e. &lt;base URL&gt;/one_way_msg) with data of
the form (see <xref target="json_vocabulary"></xref>):</t>

<artwork>{
  &quot;version&quot;: 1,
  &quot;uh1_u32&quot;: 11112,
  &quot;uh1_alg&quot;: &quot;FNV&quot;,
  &quot;created&quot;: &quot;20230102T12:59:00Z&quot;,
  &quot;uh3_base64&quot;: &quot;b6fb35773416f37e51eb893a0b0682e23b4f758d5004542450be61607253f899&quot;,
  &quot;uh3_alg&quot;: &quot;SHA-512&quot;,
  &quot;msgs&quot; : [
    {
      &quot;msg_type&quot;: &quot;json_text&quot;,
      &quot;msg_content&quot;: &quot;hi Alice, it's Bob.&quot;
    }
  ]
}
</artwork>
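<t>The decisions made on each side of this exchange might be sketched as follows. This is a non-normative Python illustration that works on already parsed JSON objects. The in-memory store keyed by UH1 and <tt>created</tt>, the function names, and the use of <tt>None</tt> to stand for "no confirmation" are assumptions of the example; this document does not define how a CEA stores hashes or how it signals a failed match.</t>
<sourcecode type="python"><![CDATA[
```python
import hmac
from typing import Optional


def cea_one_way(store: dict, request: dict) -> Optional[dict]:
    """CEA side: build the response carrying UH3, or None when nothing matches.

    `store` is a hypothetical mapping of (uh1_u32, created) to the hash
    values provisioned by the CUA.
    """
    record = store.get((request["uh1_u32"], request["created"]))
    if record is None:
        return None
    offered = (request["uh2_alg"], request["uh2_u32"])
    if offered != (record["uh2_alg"], record["uh2_u32"]):
        return None
    return {
        "version": 1,
        "uh3_base64": record["uh3_base64"],
        "uh3_alg": record["uh3_alg"],
        "msgs": record.get("msgs", []),
    }


def rea_confirms(response: dict, own_ah3_base64: str) -> bool:
    """REA side: the returned UH3 must equal the AH3 the REA already holds."""
    return hmac.compare_digest(
        response["uh3_base64"].encode("utf-8"),
        own_ah3_base64.encode("utf-8"),
    )
```
]]></sourcecode>
<t>Only when <tt>rea_confirms</tt> returns true would the REA go on to POST to <tt>/one_way_msg</tt>.</t>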
</section>

<section anchor="http_two_way"><name>Two-Way Confirmation</name>
<t>Two-way confirmation begins with the REA sending an HTTP POST to the CEA at the path <tt>/two_way_step1</tt>
(i.e. &lt;base URL&gt;/two_way_step1). The data posted is a JSON object of the form (see <xref target="json_vocabulary"></xref>):</t>

<artwork>{
  &quot;version&quot;: 1,
  &quot;uh1_u32&quot;: 22221,
  &quot;uh1_alg&quot;: &quot;FNV&quot;,
  &quot;created&quot;: &quot;20230102T12:59:00Z&quot;,
  &quot;ah1_u32&quot;: 33331,
  &quot;ah1_alg&quot;: &quot;FNV&quot;
}
</artwork>
<t>If the CEA allows two-way confirmation for the given UH1 and it has an AH1 associated with the UH1 value,
it replies with a simple HTTP OK.</t>
<t>If the REA receives an OK, it may continue by sending an HTTP POST to the CEA at the path
<tt>/two_way_step3</tt> (i.e. &lt;base URL&gt;/two_way_step3) (see step 3 in <xref target="two_way"></xref>). The data posted is a JSON object of the form (see <xref target="json_vocabulary"></xref>):</t>

<artwork>{
  &quot;version&quot;: 1,
  &quot;uh2_u32&quot;: 44443,
  &quot;uh2_alg&quot;: &quot;Murmur3&quot;,
  &quot;created&quot;: &quot;20230102T12:59:00Z&quot;,
  &quot;ah1_u32&quot;: 33331,
  &quot;ah1_alg&quot;: &quot;FNV&quot;,
  &quot;ah2_u32&quot;: 55556,
  &quot;ah2_alg&quot;: &quot;Murmur3&quot;
}
</artwork>
<t>If the CEA matches the given UH2 with its UH2 and <tt>created</tt> values, and the AH1 and AH2 values match
to a set of AH1 and AH2 values associated with the user of UH2, then the CEA responds
with a JSON object of the form (see <xref target="json_vocabulary"></xref>):</t>

<artwork>{
  &quot;version&quot;: 1,
  &quot;uh3_base64&quot;: &quot;b6fb35773416f37e51eb893a0b0682e23b4f758d5004542450be61607253f899&quot;,
  &quot;uh3_alg&quot;: &quot;SHA-512&quot;,
  &quot;ah3_base64&quot;: &quot;b6fb35773416f37e51eb893a0b0682e23b4f758d5004542450be61607253f899&quot;,
  &quot;ah3_alg&quot;: &quot;SHA-512&quot;,
  &quot;msgs&quot; : [
    {
      &quot;msg_type&quot;: &quot;json_text&quot;,
      &quot;msg_content&quot;: &quot;Hi Bob. I moved my blog, it is at https://example.com/alices_diner&quot;
    }
  ]
}
</artwork>
<t>If the REA can match this UH3 value to its corresponding value, then the REA may send a
subsequent HTTP POST to the path <tt>/two_way_msg</tt> (i.e. &lt;base URL&gt;/two_way_msg) with data of
the form (see <xref target="json_vocabulary"></xref>):</t>

<artwork>{
  &quot;version&quot;: 1,
  &quot;uh1_u32&quot;: 22221,
  &quot;uh1_alg&quot;: &quot;FNV&quot;,
  &quot;created&quot;: &quot;20230102T12:59:00Z&quot;,
  &quot;ah3_base64&quot;: &quot;b6fb35773416f37e51eb893a0b0682e23b4f758d5004542450be61607253f899&quot;,
  &quot;ah3_alg&quot;: &quot;SHA-512&quot;,
  &quot;msgs&quot; : [
    {
      &quot;msg_type&quot;: &quot;json_text&quot;,
      &quot;msg_content&quot;: &quot;hi Alice, it's Bob. My new email is bobthebob@example.net&quot;
    }
  ]
}
</artwork>
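<t>The two matching decisions made by the CEA in this flow might be sketched as follows. This is a non-normative Python illustration over parsed JSON. The record layout (one record per UID holding a list of acquaintance hashes), the function names, and the use of <tt>None</tt> for "no match" are assumptions of the example and are not defined by this document.</t>
<sourcecode type="python"><![CDATA[
```python
from typing import Optional


def cea_two_way_step1(records: list, request: dict) -> bool:
    """True means the CEA would reply with a simple HTTP OK."""
    for record in records:
        key = (record["uh1_u32"], record["created"])
        if key != (request["uh1_u32"], request["created"]):
            continue
        if not record["two_way"]:
            continue  # this UID does not use two-way confirmation
        if any(a["ah1_u32"] == request["ah1_u32"]
               for a in record["acquaintances"]):
            return True
    return False


def cea_two_way_step3(records: list, request: dict) -> Optional[dict]:
    """Return the response carrying UH3 and AH3, or None when nothing matches."""
    for record in records:
        key = (record["uh2_u32"], record["created"])
        if key != (request["uh2_u32"], request["created"]):
            continue
        for a in record["acquaintances"]:
            offered = (request["ah1_u32"], request["ah2_u32"])
            if (a["ah1_u32"], a["ah2_u32"]) == offered:
                return {
                    "version": 1,
                    "uh3_base64": record["uh3_base64"],
                    "uh3_alg": record["uh3_alg"],
                    "ah3_base64": a["ah3_base64"],
                    "ah3_alg": a["ah3_alg"],
                    "msgs": a.get("msgs", []),
                }
    return None
```
]]></sourcecode>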
</section>

<section anchor="message-types"><name>Message Types</name>
<t>In the previous sections, the examples used the message type <tt>json_text</tt> to convey simple, JSON compatible strings.
However, the protocol supports multiple messages of various types. Each type is to be listed in an IANA registry.</t>

<section anchor="simple_message_types"><name>Simple Message Types</name>
<t>This document defines the following &quot;simple&quot; message types.</t>

<ul spacing="compact">
<li><tt>json_text</tt> - the <tt>msg_content</tt> is a JSON compatible string.</li>
<li><tt>email</tt> - the <tt>msg_content</tt> is a JSON string containing an <xref target="RFC6530"></xref> compliant email address.</li>
<li><tt>web</tt> - the <tt>msg_content</tt> is a JSON string containing an <xref target="RFC3982"></xref> URL of a resource intended to be used with a web browser.</li>
<li><tt>profile</tt> - the <tt>msg_content</tt> is a JSON string containing an <xref target="RFC3982"></xref> URL that resolves to a message set (see <xref target="message_sets"></xref>).</li>
</ul>
</section>

<section anchor="cryptographic-message-types"><name>Cryptographic Message Types</name>
<t>If the message type is <tt>pem_pubkey</tt>, the <tt>msg_content</tt> is an array of JSON strings containing an <xref target="RFC1422"></xref> PEM encoded <xref target="RFC5280"></xref> public key.
Each string of the array represents a &quot;line&quot; of text of the PEM structure.</t>

<artwork>&quot;msg_type&quot;: &quot;pem_pubkey&quot;,
&quot;msg_content&quot;: [
  &quot;-----BEGIN RSA PUBLIC KEY-----&quot;,
  &quot;MIIBCgKCAQEA1e8xKwbqeUNyCKMjsiGhIAgQ8KG6dHBmt10QcDszctb64Fb5lju&quot;,
  &quot;rNwzOJ4ue4cbXRfD66ZvWzBHXsJmJCk5qkLEcdbZZ4zGz2N4wf7GxIiJXSfviH+&quot;,
  &quot;3tt2Fd+/YcqGsyTZtjYyvcE6b1eighG8JKl15c7tq9lSFxz0PvshNEWXXEhML8n&quot;,
  &quot;wDqIRnPqKIw4v3dDFd4rqzVNGKhMQ0DaKplmbRRLavdrsgBOZhhyanZEQKBL3/8&quot;,
  &quot;+3rQ7vMSc7/3FBUIncu5rvFgoT10Pv4KDDjv3UMXeC669RNmOjgJQQ0Y2o1k1h2&quot;,
  &quot;56meaojisqZ59Fr2YQYqg24F3Tzu5yKLgIDAQAB&quot;,
  &quot;-----END RSA PUBLIC KEY-----&quot;
]
</artwork>
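<t>Converting between PEM text and this line-per-string form could be done as in the following non-normative Python sketch. The function names are illustrative, and the sketch assumes that blank lines and surrounding whitespace in the PEM text carry no meaning.</t>
<sourcecode type="python"><![CDATA[
```python
def pem_to_msg(pem_text: str) -> dict:
    """Split PEM text into one JSON string per non-empty line."""
    lines = [line.strip() for line in pem_text.splitlines() if line.strip()]
    return {"msg_type": "pem_pubkey", "msg_content": lines}


def msg_to_pem(msg: dict) -> str:
    """Rebuild PEM text from the array of line strings."""
    return "\n".join(msg["msg_content"]) + "\n"
```
]]></sourcecode>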
<t>If the message type is <tt>pem_cms</tt>, the <tt>msg_content</tt> is an array of JSON strings containing an <xref target="RFC1422"></xref> PEM encoded CMS (<xref target="RFC5280"></xref>) object
in the form described above.</t>
<t>If the message type is <tt>self_pem_cms</tt>, the <tt>msg_content</tt> is an array of JSON strings of the same type as <tt>pem_cms</tt>, however the key material
used for the CMS is either the UID or AID, or known to be associated with the UID or AID (such as key id).</t>

<aside><t>TODO: This needs some more work, obviously.</t>
</aside>
</section>
</section>

<section anchor="message_sets"><name>Message Sets</name>
<t>A set of messages can be grouped together for access using the following form:</t>

<artwork>{
  &quot;version&quot;: 1,
  &quot;msgs&quot; : [
    {
      &quot;msg_type&quot;: &quot;email&quot;,
      &quot;msg_content&quot;: &quot;bobthebob@example.net&quot;
    },
    {
      &quot;msg_type&quot;: &quot;web&quot;,
      &quot;msg_content&quot;: &quot;https://example.com/blog/bob&quot;
    }
  ]
}
</artwork>
<t>Howdy defines no URL pattern for access to message sets, however Exchange Agents MAY
make them publicly accessible based on permissions from a User Agent (e.g. <eref target="https://example.com/profile/bob">https://example.com/profile/bob</eref>).
See the <tt>profile</tt> message type in <xref target="simple_message_types"></xref>.</t>
</section>

<section anchor="distribution"><name>Distribution</name>
<t>An Exchange Agent may request the locations of other agents from a known agent
using the <tt>/exchange_agents</tt> path (i.e. &lt;base URL&gt;/exchange_agents).
This request may optionally include the <tt>next</tt> and <tt>prev</tt>
query parameters, the value of each being a string. These parameters are used
to paginate the values in the response: <tt>next</tt> indicates that only
agents created before its value are to be in the returned result, and <tt>prev</tt>
indicates that only agents created after its value are to be in the
returned result. The format of these values is not defined, and agents should omit
both parameters to request the latest values.</t>
<t>The values are returned as a JSON object with the following form (see <xref target="json_vocabulary"></xref>):</t>

<artwork>{
  &quot;version&quot;: 1,
  &quot;next&quot; : &quot;abcdefg&quot;,
  &quot;prev&quot; : &quot;hijklmn&quot;,
  &quot;notifications_accepted&quot;: true,
  &quot;exchange_agents&quot; : [
    {
      &quot;agent_url&quot;: &quot;https://agent1.example&quot;,
      &quot;created&quot;: &quot;20230102T12:59:00Z&quot;
    },
    {
      &quot;agent_url&quot;: &quot;https://agent2.example&quot;,
      &quot;created&quot;: &quot;20230101T12:59:00Z&quot;
    }
  ]
}
</artwork>
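<t>As a non-normative illustration, the request and response described above might be handled as in the following Python sketch. The function names <tt>build_exchange_agents_url</tt> and <tt>parse_exchange_agents</tt> are hypothetical. Because Howdy does not define the format of the <tt>next</tt> and <tt>prev</tt> values, the sketch assumes they are opaque strings copied from an earlier response.</t>

<sourcecode type="python">import json
from urllib.parse import urlencode


def build_exchange_agents_url(base_url, next_token=None, prev_token=None):
    # Hypothetical helper: append the /exchange_agents path and any
    # pagination parameters to a known agent's base URL.
    url = base_url.rstrip("/") + "/exchange_agents"
    params = {}
    if next_token is not None:
        params["next"] = next_token
    if prev_token is not None:
        params["prev"] = prev_token
    if params:
        url += "?" + urlencode(params)
    return url


def parse_exchange_agents(body):
    # Return the agent URLs and the opaque paging values from a response.
    doc = json.loads(body)
    urls = [agent["agent_url"] for agent in doc.get("exchange_agents", [])]
    return urls, doc.get("next"), doc.get("prev")
</sourcecode>
<t>Omitting both tokens yields the bare <tt>/exchange_agents</tt> URL, which corresponds to requesting the latest values.</t>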
</section>

<section anchor="notifications"><name>Notifications</name>
<t>Exchange Agents may provide each other with unsolicited notifications. An Exchange Agent
indicates if it is willing to receive notifications using the <tt>notifications_accepted</tt> value
found in the JSON messages of <xref target="distribution"></xref> and <xref target="values_request"></xref>.</t>
<t>Notifications are received using the <tt>/notifications</tt> path (i.e. &lt;base URL&gt;/notifications) and
have the following form (see <xref target="json_vocabulary"></xref>):</t>

<artwork>{
  &quot;version&quot;: 1,
  &quot;hash_values&quot;: [
    {
      &quot;uh1_u32&quot;: 11112,
      &quot;alg&quot;: &quot;FNV&quot;,
      &quot;two_way&quot;: false,
      &quot;created&quot;: &quot;20230102T12:59:00Z&quot;
    },
    {
      &quot;uh1_u32&quot;: 22221,
      &quot;alg&quot;: &quot;FNV&quot;,
      &quot;two_way&quot;: true,
      &quot;created&quot;: &quot;20230101T12:59:00Z&quot;
    }
  ],
  &quot;exchange_agents&quot; : [
    {
      &quot;agent_url&quot;: &quot;https://agent1.example&quot;,
      &quot;created&quot;: &quot;20230102T12:59:00Z&quot;
    },
    {
      &quot;agent_url&quot;: &quot;https://agent2.example&quot;,
      &quot;created&quot;: &quot;20230101T12:59:00Z&quot;
    }
  ]
}
</artwork>
<t>The <tt>hash_values</tt> array is the same as the one used in <xref target="values_request"></xref>, and the <tt>exchange_agents</tt> array
is the same as the one used in <xref target="distribution"></xref>.</t>
</section>
</section>

<section anchor="security-considerations"><name>Security Considerations</name>

<section anchor="all-bits-set"><name>All Bits Set</name>
<t>The bit set defined in <xref target="values_request"></xref> can be manipulated to make an exchange agent
an attractive nuisance by simply setting all the bits. REAs SHOULD limit contact
with CEAs that have every bit set. Likewise, REAs SHOULD limit contact with CEAs that yield
a large number of false positive matches, where a false positive match is a failed
identifier confirmation.</t>
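<t>One way an REA might apply this guidance is to measure how densely a CEA's bit set is populated before contacting it. The following Python sketch is illustrative only. It assumes the bit set has already been decoded into bytes, and the function names and the 0.9 threshold are hypothetical; Howdy defines neither.</t>

<sourcecode type="python">def fill_ratio(bit_set):
    # Fraction of bits that are set in a bit set supplied as bytes.
    if not bit_set:
        return 0.0
    set_bits = sum(bin(byte).count("1") for byte in bit_set)
    return set_bits / (len(bit_set) * 8)


def should_limit_contact(bit_set, max_fill=0.9):
    # Assumed policy: a saturated or nearly saturated bit set is suspect.
    return fill_ratio(bit_set) >= max_fill
</sourcecode>
<t>A bit set with every bit set has a fill ratio of 1.0 and is always flagged. A lower threshold trades more refused contacts for fewer wasted confirmations.</t>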
</section>

<section anchor="shallow-bit-set"><name>Shallow Bit Set</name>
<t>The bit set discussed in <xref target="values_request"></xref> contains only one flag per UID, derived from
the UH1 value. Experimentation, implementation, and experience may show that additional bits
need to be set for each UID to prevent an abundance of false positives. It may be necessary to
incorporate the modulo of the UH2 value into the bit set, thus providing more than one
bit indicating the possible knowledge of a UID.</t>
</section>

<section anchor="well-known-identifiers"><name>Well Known Identifiers</name>
<t>Identifiers of users are not generally secrets and are sometimes very well known.
This invites a type of attack where an Exchange Agent may purposefully be populated
with hashes of well-known identifiers for the purposes of attracting victims. For
Howdy, this is especially easy to accomplish with one-way confirmation. When using one-way confirmation,
User Agents MUST inform users to take additional measures of confirmation using out-of-band
communications if possible.</t>
</section>

<section anchor="strengthened-acknowledgement"><name>Strengthened Acknowledgement</name>
<t>In the confirmation flows, the exchange of the UH2/AH2 values and then the UH3/AH3 values
makes both exchange agents express association with the UID/AID. Without this double exchange,
the CEA could falsely profess an association with a UID. However, the hash algorithm used
for UH2/AH2 may need to be strengthened. Experimentation, implementation, and experience
may determine this need.</t>
</section>
</section>

</middle>

<back>
<references><name>Normative References</name>
<reference anchor="Bloom" target="">
  <front>
    <title>Space/time trade-offs in hash coding with allowable errors.</title>
    <author fullname="Burton H. Bloom"></author>
    <date year="1970"></date>
  </front>
  <seriesInfo name="Communications of the ACM" value="ACM 13"></seriesInfo>
</reference>
<xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml-ids/reference.I-D.eastlake-fnv.xml"/>
<xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml-ids/reference.I-D.ietf-httpbis-message-signatures.xml"/>
<reference anchor="Murmur3" target="https://github.com/aappleby/smhasher/blob/master/src/MurmurHash3.cpp">
  <front>
    <title>Murmur Hash 3</title>
    <author fullname="Austin Appleby"></author>
  </front>
</reference>
<xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.1422.xml"/>
<xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.3982.xml"/>
<xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.4648.xml"/>
<xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.5280.xml"/>
<xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.6234.xml"/>
<xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.6530.xml"/>
<xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.9110.xml"/>
</references>

<section anchor="other_environments"><name>Other Environments</name>

<section anchor="fediverse"><name>Fediverse</name>
<t>This document describes how Howdy works over the Internet by layering it over
HTTP. However, there may be other environments where parts of Howdy can be used
to achieve similar purposes. For example, the ActivityPub protocol and the
conventions established around the Fediverse provide enough properties
to layer Howdy in ActivityPub itself, making the integration of Howdy more
seamless with that ecosystem.</t>
</section>

<section anchor="nostr"><name>NOSTR</name>
<t>NOSTR is another environment similar to the Fediverse but with different goals.
NOSTR uses cryptographic public keys as user identifiers. Howdy could be used
in NOSTR to map NOSTR public keys to email addresses and other user identifiers,
including other NOSTR public keys.
Additionally, a binding of Howdy to the NOSTR protocol could make exchange of
user identifiers more seamless in that environment.</t>
</section>

<section anchor="lans"><name>LANS</name>
<t>Other environments, such as local area networks, may take parts of Howdy and
use them with mDNS or Bluetooth to facilitate the discovery of Exchange Agents.
One scenario is that a user with a smartphone containing both a User Agent
and an Exchange Agent may be notified, using Howdy, that an acquaintance is physically nearby.</t>
</section>

<section anchor="mimi"><name>MIMI</name>
<t>The IETF's More Instant Messaging Interoperability (MIMI) working group is
defining protocols for the interchange of instant messages across protocol boundaries.
Howdy could be used for the discovery of identifiers and mapping of identifiers
between the various instant messaging systems.</t>
</section>
</section>

<section anchor="design-issues"><name>Design Issues</name>

<section anchor="murmur-and-known-salts"><name>Murmur and Known Salts</name>
<t>Notes on the Murmur3 algorithm suggest that it is fast but produces different results
depending on CPU architecture. If true, this algorithm would present significant interoperability
issues and could not be used.</t>
<t>Instead of using multiple algorithms, another approach might be to use
&quot;known salts&quot; (salts are typically random), which are appended to identifiers before hashing.
Under such a scheme, all agents might create hashes with a large, predefined set of known salts
but use only a small set during confirmation.</t>
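<t>The known-salts idea might be sketched as follows in Python. Everything specific here is an assumption made for illustration: the salt values, the use of SHA-256 as a stand-in hash, the truncation to an unsigned 32-bit value, and the function names are all hypothetical and are not defined by Howdy.</t>

<sourcecode type="python">import hashlib

# Hypothetical predefined salts; Howdy does not define any.
KNOWN_SALTS = ["salt-a", "salt-b", "salt-c", "salt-d"]


def salted_hash_u32(identifier, salt):
    # Append the salt to the identifier, hash, and keep the first 32 bits.
    digest = hashlib.sha256((identifier + salt).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def all_salted_hashes(identifier):
    # Every agent would compute one hash per known salt.
    return {salt: salted_hash_u32(identifier, salt) for salt in KNOWN_SALTS}


def confirmation_subset(hashes, chosen_salts):
    # Only a small subset of the salted hashes is used in confirmation.
    return {salt: hashes[salt] for salt in chosen_salts}
</sourcecode>
<t>Because the salts are fixed and public, two agents holding the same identifier compute identical hashes for every salt, so they can agree on any subset during confirmation.</t>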
</section>

<section anchor="jose"><name>JOSE</name>
<t>More effort needs to be given to the use of JOSE standards. The PEM-based approach was selected
because it is well known and because canonicalization is unnecessary when hashing public keys.
Being able to use a public key as an identifier that is then used to sign a message seems particularly
useful.</t>
</section>
</section>

<section anchor="acknowledgements"><name>Acknowledgements</name>
<t>A conversation had with libations and Eric Osterweil was the inspiration for Howdy.</t>
</section>

</back>

</rfc>
