<?xml version="1.0" encoding="utf-8"?>
<!-- name="GENERATOR" content="github.com/mmarkdown/mmark Mmark Markdown Processor - mmark.miek.nl" -->
<rfc version="3" ipr="trust200902" docName="draft-happel-mailmaint-pdparchive-01" submissionType="IETF" category="std" xml:lang="en" xmlns:xi="http://www.w3.org/2001/XInclude" indexInclude="true" consensus="true">

<front>
<title>Personal Data Portability Archive</title><seriesInfo value="draft-happel-mailmaint-pdparchive-01" stream="IETF" status="standard" name="Internet-Draft"></seriesInfo>
<author role="editor" initials="L." surname="Dusseault" fullname="Lisa Dusseault"><organization>tbd</organization><address><postal><street>tbd</street>
<city>tbd </city>
<code>tbd</code>
<country>USA</country>
</postal><email>lisa.dusseault@gmail.com</email>
<uri>tbd</uri>
</address></author><author role="editor" initials="H.J." surname="Happel" fullname="Hans-Joerg Happel"><organization>audriga</organization><address><postal><street>Alter Schlachthof 57</street>
<city>Karlsruhe </city>
<code>76137</code>
<country>Germany</country>
</postal><email>hans-joerg@audriga.com</email>
<uri>https://www.audriga.com</uri>
</address></author><author role="editor" initials="A." surname="Melnikov" fullname="Alexey Melnikov"><organization>Isode Ltd</organization><address><postal><street>14 Castle Mews</street>
<city>Hampton, Middlesex </city>
<code>TW12 2NP</code>
<country>United Kingdom</country>
</postal><email>Alexey.Melnikov@isode.com</email>
</address></author><date/>
<area>ART</area>
<workgroup>Network Working Group</workgroup>

<abstract>
<t>This document proposes the Personal Data Portability Archive format (PDPA), suitable for import/export, backup/restore, and data transfer scenarios for personal data.</t>
</abstract>

</front>

<middle>

<section anchor="introduction"><name>Introduction</name>
<t>As part of communication protocols, the IETF has standardized a number of data formats such as the Internet Message Format <xref target="RFC5322"></xref>, vCard <xref target="RFC6350"></xref>, iCalendar <xref target="RFC5545"></xref>, or, more recently, JSContact <xref target="RFC9553"></xref> and JSCalendar <xref target="RFC8984"></xref>.</t>
<t>While mainly designed for interoperability, many of these data formats have also become popular for data portability, i.e., the import/export of data across different services. The growing importance of data portability, however, calls for an open, standard archive format that can handle different types of personal data in a homogeneous fashion.</t>
<t>To this end, this document proposes the Personal Data Portability Archive format (PDPA), suitable for import/export, backup/restore, and data transfer scenarios for personal data. It is compatible with both IMAP and JMAP and should be suitable as an interchange format between related software and services, such as those for email, contacts, calendaring, tasks, or files.</t>
<t>The approach is to define JSON formats (using CDDL), a folder structure, and a common compression format.  Additional specifications will likely define a protocol for how these files can be requested from, imported into, or transferred between servers.</t>
</section>

<section anchor="conventions-used-in-this-document"><name>Conventions Used in This Document</name>
<t>The term &quot;personal data&quot; refers to persistent data which users create and manage within applications or services. Classic examples are emails, contacts, calendars, tasks, notes, or files. Other examples might be fitness tracking records, energy bills, or location history.</t>
<t>The term &quot;data portability&quot; refers to the right or technical procedure to use or transfer personal data across different applications or services.</t>
<t>The terms &quot;message&quot; and &quot;email message&quot; refer to &quot;electronic mail messages&quot; or &quot;emails&quot; as specified in <xref target="RFC5322"></xref>. The term &quot;Message User Agent&quot; (MUA) denotes an email client application as per <xref target="RFC5598"></xref>.</t>
<t>The key words &quot;MUST&quot;, &quot;MUST NOT&quot;, &quot;REQUIRED&quot;, &quot;SHALL&quot;, &quot;SHALL NOT&quot;, &quot;SHOULD&quot;, &quot;SHOULD NOT&quot;, &quot;RECOMMENDED&quot;, &quot;NOT RECOMMENDED&quot;, &quot;MAY&quot;, and &quot;OPTIONAL&quot; in this document are to be interpreted as described in BCP 14 <xref target="RFC2119"></xref> <xref target="RFC8174"></xref> when, and only when, they appear in all capitals, as shown here.</t>
</section>

<section anchor="goals"><name>Goals</name>
<t>The core goal is to provide an open and extensible archive and transfer format for personal data. This includes data types that are the subject of existing IETF protocols, such as email and groupware data (e.g., contacts, calendars, tasks), but may also include further types of personal data.</t>
<t>Goals will be refined by use cases and cross-cutting technical goals in the following sub-sections.</t>
<t>JSON is used as a widely interoperable base format that systems can easily translate their internal data representation into. Wherever possible, fields, values and data structures are derived from existing IMAP/JMAP specifications to remain compatible with both.</t>

<section anchor="use-cases"><name>Use cases</name>

<section anchor="use_cases"><name>Data portability</name>
<t>A main use case for the novel format is to allow exporting the full user data managed by services or software products into a simple file (or set of files) which is under full control of the user.</t>
<t>The user might use such export for backup, archiving, or for importing when switching to another service or software (i.e., migration).</t>
<t>Depending on the type of data, exporting and importing can be time-consuming. Particularly when switching services, PDPA should help minimize the period during which the user can no longer use the origin system while the destination system is not yet ready.</t>
</section>

<section anchor="incremental-backup"><name>Incremental backup</name>
<t>Beyond snapshot backups/exports, the format should optionally allow for incremental backups.</t>
<t>There are at least two scenarios for this:</t>

<ul spacing="compact">
<li>Software which wants to keep a permanent, incremental mirror of user data (e.g., for instant export or restore)</li>
<li>Users regularly exporting changes in data managed by services or software products</li>
</ul>
</section>

<section anchor="synchronization"><name>Synchronization</name>
<t>Data portability does not just allow users to switch from one service to another; it also lets users benefit from third-party services that receive a copy of their data (at the request of the user).  Simple synchronization features could make this much better.</t>
<t>For example, current online systems that allow importing contacts are not often suited to maintaining one's address book on two systems. Re-importing a contact into a system that already has that contact often results in duplicating the exact same contact, whether or not there have been edits, making repeated synchronization practically infeasible. It should be easy to do a significantly better job of this with some attention to object IDs and modification timestamps.</t>
<t>We do not, however, attempt to solve two-way synchronization via export files.  It would require significant additional work to allow two systems, neither of which is the agreed-upon source of truth, to reliably synchronize changes from both. In comparison, solving one-way synchronization only requires agreed-upon usage of existing fields and values.</t>
</section>

<section anchor="dataset-exchange"><name>Dataset exchange</name>
<t>PDPA should be usable to exchange and share larger data sets than just one user, or to share a single user's data outside the context where the user knows what it is and where it came from.</t>
<t>Potential applications of this are:</t>

<ul spacing="compact">
<li>The ability to exchange test data and known mailstore state, e.g. for conformance testing or internal functional tests</li>
<li>Legal discovery and forensics use cases may benefit from a standard export format, such that investigators can expect a great deal of consistency when collecting data from different systems.</li>
<li>Researchers may be able to collect archive files through data donations and use them as input to research.</li>
</ul>
<t>While these use cases may suggest small features that would help them succeed, supporting the more advanced features they require is not a priority. For example, we do not attempt to cryptographically solve provenance for use in legal and forensics use cases.</t>
</section>

<section anchor="data-persistence"><name>Data persistence</name>
<t>The format MAY be used as a development-time active persistence layer for user data in, e.g., email clients or applications. It is not intended as, or suitable for, a production-level persistence layer.</t>
</section>
</section>

<section anchor="technical-goals"><name>Technical Goals</name>
<t>Besides actual use cases, there are a number of side requirements and goals for PDPA.</t>

<section anchor="email-standards-compatibility"><name>Email standards compatibility</name>
<t>Data formats should aim for compatibility with JMAP data formats for the sake of interoperability and synergies in software libraries.</t>
<t>Dedicated JMAP API methods for exporting and importing the format described here, or for related server-to-server transfer protocols, are out of scope for this document.</t>
<t>Due to its specifics and ubiquitous usage, the Internet Message Format (<xref target="RFC5322"></xref>; the latest revision of <xref target="RFC2822"></xref>/<xref target="RFC822"></xref>) should be the core of representing individual email data.</t>
<t>This specification should ideally describe mappings between PDPA and existing mailbox persistence schemes such as Maildir or MBOX <xref target="RFC4155"></xref>.</t>
</section>

<section anchor="interoperability"><name>Interoperability</name>
<t>It should be mostly possible to use personal data exports from one system with different software or services.  When a source exports personal data it can include all the information it would need for a fully-functional import, <em>however</em> destination systems running different software may not be able to import all of that information (especially if it includes non-standard features) and use it exactly the same way.  This specification does not attempt to achieve perfect interoperability between diverse systems, but instead to make reasonable trade-offs.</t>
</section>

<section anchor="extensibility"><name>Extensibility</name>
<t>This format should be extensible to accommodate types of personal data not explicitly mentioned or foreseen at the time of writing this specification.</t>
</section>

<section anchor="flexible-granularity"><name>Flexible granularity</name>
<t>This format should allow flexible granularity in two ways:</t>

<ol type="%d)">
<li><t>It should enable easy access to separate types of data (e.g., emails vs. contacts), e.g. to allow for partial imports or exports</t>
</li>
<li><t>While ideally representable as a single file, archives may also span several files due to reasons such as file size restrictions or incremental generation logic.</t>
</li>
</ol>
<t>(The ability for a user to export and/or backup an entire email account requires some accommodation of large amounts of data and risks of interruptions in downloads.  Splitting exports into multiple files during export is one possible solution.)</t>
</section>

<section anchor="accessible-for-local-tooling"><name>Accessible for local tooling</name>
<t>PDPA should allow easy access for local tools (e.g., CLIs). While this may sound obvious, it is a key factor for the intended versatility of the format.</t>
</section>

<section anchor="efficiency"><name>Efficiency</name>
<t>Since certain kinds of personal data might involve large quantities of data, major use cases for PDPA should be realizable in an efficient manner.</t>
<t>For now, this is stated as an abstract guiding principle. Its actual dimensions and trade-offs need to be refined while evolving this specification.</t>
</section>
</section>

<section anchor="related-work"><name>Related work</name>
<t>Many email server implementors have found it desirable to have one or more file formats for storing email in a file system even when the primary active email storage is more commonly a database.  Examples include <xref target="PST"></xref> files (Outlook), NSF (Notes), <xref target="GoogleTakeout"></xref>, Maildir, MBOX.  File formats are already used for interoperability in many cases even when not standardized.</t>
<t>This specification follows that pattern in order to build on these partial successes.  By standardizing one format, we expect to be able to satisfy use cases that are harder to satisfy with a plurality of formats, such as use cases for server-to-server transfer of email account data during account migrations.  Specifications that explain how to create these archives in different situations can refer to this specification.</t>
</section>
</section>

<section anchor="approach"><name>Approach</name>

<section anchor="using-json-mostly"><name>Using JSON (mostly)</name>
<t>JSON is used in this specification for new metadata and for objects including contacts, tasks, events, and notes.  However, the Internet Message Format <xref target="RFC5322"></xref> is used for email message content. Individual items are stored in individual files, which are referenced in collection metadata. Finally, these JSON and other file formats are packaged and compressed together in a standard but flexible way.</t>
<t>Our rationale for using JSON as much as is reasonable:</t>

<ul spacing="compact">
<li>We envision an export format being used not just by developers of full IMAP servers but also by developers building task management systems, calendar systems that don't include email, etc.</li>
<li>We should minimize the need for multiple libraries to parse different formats. If the metadata is going to be in JSON, it would really help to have the item data in JSON as well.</li>
<li>Personal data formats must be extensible and extensibility for JSON is well-understood.</li>
</ul>
<t>Using JSON for all the data <em>except</em>  EML files was carefully considered.  EML files are rather specialized and more challenging to replace.  Much of email is not structured data but content and involves MIME.  EML is more likely to be a system's native data store, unlike VCARD and VTODO which are most commonly transformed for use in a relational data store for active use.  Finally, email signature implementations like S/MIME and OpenPGP would be less disrupted by keeping EML.</t>
<t>Information not covered by these existing file formats, but still necessary for migration or backup/restore of an email account, is packaged into JSON files. JSON structure and values are defined with CDDL <xref target="RFC8610"></xref>.  The JSON files for email folders and other collections contain references to individual resources by unique ID and filename.</t>
</section>

<section anchor="approach-to-partial-updates"><name>Approach to partial updates</name>
<t>Our use case requirements <eref target="#use_cases">above</eref> included some very common personal use cases that motivate exporting only a time-limited set of items, but retaining the ability to use that subset export to update a previously acquired or maintained snapshot.  These are partial updates - partial exports able to update a full copy.  Our approach is to define a version of the archive format that includes a subset of a repository's content, and additionally some change markers necessary to maintain consistency.</t>
<t>A partial-update export</t>

<ul spacing="compact">
<li>Contains new items just like a full export.</li>
<li>Omits unchanged items from before the time cutoff.</li>
<li>Does NOT list unchanged items in folder listings either.</li>
<li>Allows updated items to be identified so they can replace previous versions.</li>
<li>Allows deleted items to be identified so they can be removed from the updated snapshot.</li>
</ul>
<t>The existing definitions for UIDs and 'updated' in JMAP and IMAP should make this quite possible. Also IMAP MODSEQs <xref target="RFC7162"></xref> can be used for flag changes.</t>
</section>

<section anchor="approach-to-synchronization"><name>Approach to synchronization</name>
<t>Email and calendaring services appear to already do synchronization just fine, but this is via a client-server model.  The client drives the read and write requests until content is synchronized, using mostly UIDs and timestamps.</t>
<t>Archive and export formats aren't part of this client-server model, and the work to make server-to-server or peer-to-peer synchronization work perfectly is substantial (involving features such as version numbers or change logs -- features that aren't commonly standardized for the objects handled in this spec).  Still, there are some things possible with archive formats and the current object definitions that are sensible.  Our approach is to describe what is possible with the fields that exist, and mandate those fields be used, so that users aren't left with multiple locations for their data and no way to repeatedly synchronize them.</t>
<t>As an illustrative use case, let the user have contacts stored in one email service and also in one mobile device platform doing online backups.  The email platform creates contacts when the user emails new recipients, or receives contact information over email.  The mobile device platform synchronizes contact information from a phone.  The email service is not a client of the mobile device platform, nor is the mobile device platform a client of the email service, so client-to-server protocols cannot directly synchronize the data between these two services.  A user attempting to solve this by repeatedly importing contacts into one system from the other may find this works poorly - for example, it might create new contact objects over and over for the same contact data even if it is unchanged, and deleted objects may re-animate after being re-copied.</t>
<t>Both servers ought to allow exporting of contact data (along with any other data covered in this specification), including especially the UID and updated (timestamp) fields.  This would enable at least some sensible personal workflows, or allow third-party tools to make synchronization work better.</t>
<t>When importing contact data (along with any other data covered in this specification), a service needs to take note of the UID in the import and use it as the UID of the newly created object, so that the object can later be recognized instead of being recreated. If the importing service already has an object with that UID, it should compare the 'updated' timestamps and use the result to decide how to update the object with new values.  An object that hasn't changed since it was last imported should remain unchanged and not be updated.</t>
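<t>The import rule described above can be sketched in Python as follows. This is a minimal sketch, not a normative algorithm: the in-memory store keyed by UID, the function names parse_updated and import_object, and the &quot;newer timestamp wins&quot; policy are assumptions made for illustration, and timestamps are assumed to be UTC strings such as 2025-10-19T18:00:00Z.</t>

<sourcecode type="python">from datetime import datetime


def parse_updated(value):
    # Parse a UTC timestamp string such as '2025-10-19T18:00:00Z'.
    return datetime.fromisoformat(value.replace('Z', '+00:00'))


def import_object(store, incoming):
    # Import one object into `store`, a dict keyed by UID (hypothetical
    # storage model). Returns 'created', 'updated', or 'unchanged'.
    uid = incoming['uid']
    existing = store.get(uid)
    if existing is None:
        # Unknown UID: create the object under the imported UID so that
        # a later re-import is recognized instead of duplicated.
        store[uid] = dict(incoming)
        return 'created'
    if parse_updated(incoming['updated']) > parse_updated(existing['updated']):
        # Assumed policy: the copy with the newer timestamp wins.
        store[uid] = dict(incoming)
        return 'updated'
    # Same or older timestamp: leave the stored object untouched.
    return 'unchanged'
</sourcecode>
<t>With such a rule, importing the same export twice leaves a single object per UID rather than creating a duplicate.</t>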
<t>This approach to synchronization is imperfect.  Let the user have performed one synchronization by exporting from the email service and importing to the mobile device platform.  If the user updates a contact on the email service (e.g., adding a street address) and the mobile platform also updates the contact (e.g., adding an avatar link), and the user then attempts to repeat the synchronization, both services will have a new 'updated' timestamp on the same object UID, and the earlier of the two changes will get wiped out.  Nevertheless, a disciplined user can remember to only make changes on the email service and always copy them over to the mobile platform, and so avoid many such lost updates.</t>
<t>Based on the approach described here, this specification standardizes some behavior for both exporters and importers, to maximize potential success.</t>
</section>
</section>

<section anchor="solution-requirements"><name>Solution Requirements</name>
<t>These technical requirements on the solution are intended to meet the goals above, and to add more specifics about how those goals are intended to be met with the architecture chosen.   This section does not attempt to translate the solution requirements into implementor requirements.</t>
<t>Folder format requirements</t>

<ul spacing="compact">
<li>can include partial results, showing only a subset of objects within the folder</li>
<li>can include a human-readable representation of the meaning of that subset (e.g. a date filter, or recipient filter)</li>
</ul>
<t>Email object format requirements</t>

<ul spacing="compact">
<li>can maintain full fidelity including preserving character sets, content transfer encoding of body parts and exact MIME structure</li>
</ul>
<t>Compression and packaging of resources</t>

<ul spacing="compact">
<li>Servers can bundle resources together in different ways to be flexible in handling size and network limitations.  Servers must be able to choose optimal file size and organization of information within files.</li>
</ul>
<t>Synchronization requirements</t>

<ul spacing="compact">
<li>Can see which items are identical in two systems, e.g. one system previously imported an item exported by the other.</li>
<li>Can detect changes made since the last time an item was imported, in a way that supports replacing an older version previously synchronized with a newer version that has edits.</li>
</ul>
</section>

<section anchor="file-format"><name>File format</name>
<t>This section describes the internal &quot;raw&quot; file format of a personal data portability archive. For discussion about a surrounding container format, see section &quot;open issues&quot;.</t>
<t>PDPA in general consists of:</t>

<ul spacing="compact">
<li>A main metadata file (&quot;index.json&quot;)</li>
<li>Top-level folders for each data type (&quot;/mail&quot;)</li>
<li>Subfolders representing actual collections of individual data items plus additional metadata files</li>
</ul>

<section anchor="index-file-s"><name>Index file(s)</name>
<t>The index.json file consists of three main sections:</t>

<ul spacing="compact">
<li>archive: general information about the archive</li>
<li>dataset: characteristics of the dataset itself</li>
<li>datasource: meta-information about the dataset</li>
</ul>
<t>The following sections propose some initial properties which are still subject to discussion.</t>
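<t>As a rough illustration of how the three sections could fit together, the following Python sketch assembles an index.json document from example values listed in the tables below. The nesting under &quot;archive&quot;, &quot;dataset&quot;, and &quot;datasource&quot; keys, the helper name build_index, and the JSON value types are assumptions made for this sketch; this document does not yet fix them.</t>

<sourcecode type="python">import json


def build_index(extent='FULL', selector='NONE', datatypes=('MAIL',)):
    # Return the content of index.json as a plain dict (hypothetical helper).
    return {
        # General information about the archive.
        'archive': {
            'id': '123',
            'name': "Jane's data export (2025-10-19)",
            'timestamp': '2025-10-19-18-00',
            'version': 'PDPA v1.0',
            'generator': 'PDPA exporter v0.9',
        },
        # Characteristics of the dataset itself.
        'dataset': {
            'extent': extent,
            'selector': selector,
            'datatypes': list(datatypes),
            'timezone': 'America/Montreal',
        },
        # Service and account details are still TBD in this document,
        # so the section is left empty here.
        'datasource': {},
    }


index_text = json.dumps(build_index(), indent=2)
</sourcecode>
<t>The resulting text can be written to index.json at the top level of the archive and read back with any JSON parser.</t>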

<section anchor="archive-section"><name>Archive section</name>
<table>
<thead>
<tr>
<th>Key</th>
<th>Description</th>
<th>Example value</th>
</tr>
</thead>

<tbody>
<tr>
<td>id</td>
<td>Archive identifier</td>
<td>123</td>
</tr>

<tr>
<td>name</td>
<td>Human-readable label</td>
<td>Jane's data export (2025-10-19)</td>
</tr>

<tr>
<td>note</td>
<td>Note</td>
<td>Personal account export</td>
</tr>

<tr>
<td>legal</td>
<td>Legal disclaimer</td>
<td>Private data</td>
</tr>

<tr>
<td>timestamp</td>
<td>Archive timestamp</td>
<td>2025-10-19-18-00</td>
</tr>

<tr>
<td>version</td>
<td>PDPA spec version</td>
<td>PDPA v1.0</td>
</tr>

<tr>
<td>generator</td>
<td>Archive generator</td>
<td>PDPA exporter v0.9</td>
</tr>
</tbody>
</table></section>

<section anchor="dataset-section"><name>Dataset section</name>
<table>
<thead>
<tr>
<th>Key</th>
<th>Description</th>
<th>Example value</th>
</tr>
</thead>

<tbody>
<tr>
<td>extent</td>
<td>Extent of the archive (full, partial)</td>
<td>FULL</td>
</tr>

<tr>
<td>selector</td>
<td>Selection criteria for partial datasets (date, folder, size, custom)</td>
<td>NONE</td>
</tr>

<tr>
<td>datatypes</td>
<td>List of data types</td>
<td>MAIL</td>
</tr>

<tr>
<td>languagetag</td>
<td>BCP 47 language tag for the dominant language in the dataset</td>
<td>en-ca</td>
</tr>

<tr>
<td>timezone</td>
<td>IANA tz identifier for the dataset</td>
<td>&quot;America/Montreal&quot;</td>
</tr>
</tbody>
</table></section>

<section anchor="datasource-section"><name>Datasource section</name>
<table>
<thead>
<tr>
<th>Key</th>
<th>Description</th>
<th>Example value</th>
</tr>
</thead>

<tbody>
<tr>
<td>service</td>
<td>Information about the source service (id, url, ...)</td>
<td>TBD</td>
</tr>

<tr>
<td>account</td>
<td>Information about the source account (id, type, ...)</td>
<td>TBD</td>
</tr>
</tbody>
</table></section>
</section>

<section anchor="folder-structure"><name>Folder structure</name>

<ul spacing="compact">
<li>Folders can be nested.  Any kind of content here can be within nested folders -- this is a necessary feature for email, but extends to content like contacts that aren't always represented in nested folders.   (TODO: elsewhere, describe the requirements for an importing system to preserve folders or not.)</li>
<li>Names of files do not have to be globally unique.   Indexes and folder contents listings can name files relative to their location in the archive structure, which means that references may not be resolvable if that context is lost.</li>
<li>Individual content items are individual files.  This may not always be the easiest choice for exporters, who must generate a large number of files for individually small items (in contrast to a JSON stream including all objects), but for an archive format, individual files allow more clarity in individual handling, transactions, and errors.</li>
</ul>

<sourcecode type="asciidoc=">index.json
/mail/
    /Archive/
        folder.json
        m1.eml
        m2.eml
        m3.eml
        ...
    /Archive/2023/
        folder.json
        m1.eml
        m2.eml
        m3.eml
        ...
    /Archive/2024/
        folder.json
        m1.eml
        m2.eml
        m3.eml
        ...
    /INBOX/
        folder.json
        m1.eml
        m2.eml
        m3.eml
        ...
    /Sent Mail/
        folder.json
        m1.eml
        m2.eml
        ...
/contacts/
     contact1.json
     contact2.json
     ...
/calendars/
    /calendar2/
        event1.json
        event2.json
/sieve/
/blob/
    ...?
</sourcecode>
<blockquote><t>TODO: I18n? Special-use?</t>
</blockquote><t>Folder names are defined in RFC9051 (IMAP v4 rev2) with great freedom for servers.  Servers may or may not treat mailbox names as case sensitive.  Folder names may even include non-graphic characters, &quot;%&quot; and &quot;*&quot;. Hierarchy separators may even differ among IMAP servers although &quot;/&quot; is probably most common.</t>
<t>Since this specification is new, it is possible to be more constrained.  This specification only supports &quot;/&quot; as a folder separator.</t>
</section>

<section anchor="data-formats"><name>Data formats</name>

<section anchor="email"><name>Email</name>
<t>Each IMAP/JMAP folder is represented as a subdirectory under the &quot;mail&quot; directory. For example, the folder INBOX would be represented as &quot;mail/INBOX&quot;, and the folder &quot;Archive/2024/2024-12&quot; would be represented as &quot;mail/Archive/2024/2024-12&quot;.</t>
<t>(In the examples above, &quot;/&quot; is the IMAP hierarchy delimiter.)</t>
<t>Folder names are encoded in UTF-8.</t>
<t>TODO: how to signal removal of a folder in an incremental archive? Need to add some kind of tombstone mechanism.</t>
<t>Each folder's metadata is described by &quot;folder.json&quot;, which has the following format:</t>
<table>
<thead>
<tr>
<th>Attribute Name</th>
<th align="center">Type</th>
<th>Mandatory?</th>
<th>Comment</th>
</tr>
</thead>

<tbody>
<tr>
<td>allowed_keywords</td>
<td align="center">array of strings (IMAP keywords)</td>
<td>No</td>
<td>PERMANENTFLAGS minus &quot;\*&quot; <xref target="RFC3501"></xref></td>
</tr>

<tr>
<td>last_uid</td>
<td align="center">unsigned 32 bit integer</td>
<td>Yes</td>
<td>Last UID assigned in the folder. It is UIDNEXT value minus 1 <xref target="RFC3501"></xref></td>
</tr>

<tr>
<td>recent_uid</td>
<td align="center">unsigned 32 bit integer</td>
<td>No</td>
<td>Lowest UID of a message with the \Recent flag <xref target="RFC3501"></xref></td>
</tr>

<tr>
<td>uidvalidity</td>
<td align="center">unsigned 32 bit integer</td>
<td>Yes</td>
<td>UIDVALIDITY value <xref target="RFC3501"></xref></td>
</tr>

<tr>
<td>is_subscribed</td>
<td align="center">boolean</td>
<td>Yes</td>
<td>Is the folder returned by IMAP LSUB? <xref target="RFC3501"></xref></td>
</tr>

<tr>
<td>myrights</td>
<td align="center">string</td>
<td>No</td>
<td>See Section 3.5 of <xref target="RFC4314"></xref>. For example &quot;rwiptsldaex&quot;</td>
</tr>

<tr>
<td>highest_modseq</td>
<td align="center">unsigned 64 bit integer</td>
<td>No</td>
<td>HIGHESTMODSEQ value <xref target="RFC7162"></xref></td>
</tr>

<tr>
<td>role</td>
<td align="center">string</td>
<td>No</td>
<td><xref target="RFC6154"></xref> SPECIAL-USE value. E.g. &quot;inbox&quot;, &quot;sent&quot;, &quot;drafts&quot;, &quot;junk&quot;, etc.</td>
</tr>

<tr>
<td>sort_order</td>
<td align="center">unsigned 32 bit integer</td>
<td>No</td>
<td>See Section 2, <xref target="RFC8621"></xref></td>
</tr>

<tr>
<td>uids</td>
<td align="center">map from UIDs to strings (relative filenames of messages)</td>
<td>Yes</td>
<td>Mapping from UIDs to corresponding message files included in the archive</td>
</tr>

<tr>
<td>flags</td>
<td align="center">map from UIDs to array of strings</td>
<td>Yes</td>
<td>IMAP flags assigned to the message, excluding \Recent</td>
</tr>

<tr>
<td>modseqs</td>
<td align="center">map from UIDs to unsigned 64 bit integers (modsequences)</td>
<td>No</td>
<td>Per message MODSEQ values <xref target="RFC7162"></xref></td>
</tr>

<tr>
<td>original_name</td>
<td align="center">string</td>
<td>No</td>
<td>Original folder name (relative to its parent, if any) if the name can't be represented in the filesystem, e.g. if it includes special characters</td>
</tr>

<tr>
<td>comment</td>
<td align="center">string</td>
<td>No</td>
<td>Can include information about partial export or filter used in human readable UTF-8 text</td>
</tr>

<tr>
<td>removed</td>
<td align="center">array of unsigned 32 bit integers</td>
<td>No</td>
<td>List of messages (UIDs) removed since the last export</td>
</tr>
</tbody>
</table><t>CDDL description of &quot;folder.json&quot; is included below:</t>

<sourcecode type="asciidoc=">;; /// Or possibly use ranges for 2 types below?
u32 = uint .size 4
u64 = uint .size 8

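; uid =&gt; message filename map
; Note: JSON object member names are always text strings, so in a JSON
; encoding each uid key appears as a decimal string (e.g. &quot;15&quot;).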
uid_map = {* uid =&gt; filename}
uid = u32
filename = tstr

flags_map = {* uid =&gt; flags}
flags = [* tstr]

modseq_map = {* uid =&gt; modseqs}
modseqs = u64


folder_info = {
  ? allowed_keywords: [* tstr],
  last_uid: u32,
  ? highest_modseq: u64,
  ? recent_uid: u32,
  uidvalidity: u32,

  is_subscribed: bool,

  ; See RFC 6154 (SPECIAL-USE). E.g. &quot;inbox&quot;, &quot;sent&quot;, &quot;drafts&quot;, &quot;junk&quot;, etc.
  ? role: tstr,

  ; See Section 2, RFC 8621.
  ? sort_order: u32,



  ; See Section 3.5 of RFC 4314. For example &quot;rwiptsldaex&quot;
  ? myrights: tstr,

  ; Maps UIDs to the corresponding relative file names (names of .eml files)
  uids: uid_map,

  ; Maps UIDs to the corresponding array of flags/keywords (as strings)
  flags: flags_map,

  ; Maps UIDs to the corresponding modseqs (u64). See RFC 7162.
  ? modseqs: modseq_map,

  ; Original folder name if the name can't be represented in filesystem
  ? original_name: tstr,
  
  ; Can include information about partial export or filter used
  ; in human readable UTF-8 text
  ? comment: tstr,

  ;; The following attributes are only used for incremental exports:
  ? removed: [* u32],
}
</sourcecode>
<t>Each message is stored in its own EML (.eml) file, so that messages do not have to be reconstructed. The names of the EML files are referenced from the &quot;folder.json&quot; file.</t>
<t>Example of folder.json (full export):</t>

<sourcecode type="asciidoc=">{
  &quot;allowed_keywords&quot;: [&quot;$Forwarded&quot;, &quot;$MDNSent&quot;, &quot;$ismailinglist&quot;],
  &quot;last_uid&quot;: 16,
  &quot;highest_modseq&quot;: 6371729,
  &quot;recent_uid&quot;: 15,
  &quot;uidvalidity&quot;: 1107190787,
  &quot;is_subscribed&quot;: true,
  &quot;role&quot;: &quot;sent&quot;,
  &quot;sort_order&quot;: 1,
  &quot;myrights&quot;: &quot;rwiptsldaex&quot;,

  &quot;uids&quot;: {
     &quot;1&quot;: &quot;msg-1.eml&quot;,
     &quot;3&quot;: &quot;msg-3.eml&quot;,
     &quot;15&quot;: &quot;imported-ABC.eml&quot;
  },

  &quot;flags&quot;: {
     &quot;1&quot;: [&quot;$seen&quot;],
     &quot;3&quot;: [&quot;$seen&quot;, &quot;$flagged&quot;],
     &quot;15&quot;: [&quot;$answered&quot;, &quot;$forwarded&quot;]
  }
}
</sourcecode>
<t>Example of folder.json that shows incremental changes from the previous export shown above.
In this example, two messages with UIDs 3 and 15 were removed. The message with UID 1 has updated
flags. Several new messages were added, some of them with flags set.</t>

<sourcecode type="asciidoc=">{
  &quot;allowed_keywords&quot;: [&quot;$Forwarded&quot;, &quot;$MDNSent&quot;, &quot;$ismailinglist&quot;],
  &quot;last_uid&quot;: 21,
  &quot;highest_modseq&quot;: 6371845,
  &quot;recent_uid&quot;: 20,
  &quot;uidvalidity&quot;: 1107190787,
  &quot;is_subscribed&quot;: true,
  &quot;role&quot;: &quot;sent&quot;,
  &quot;sort_order&quot;: 1,
  &quot;myrights&quot;: &quot;rwiptsldaex&quot;,

  &quot;uids&quot;: {
     &quot;1&quot;: &quot;msg-1.eml&quot;,
     &quot;17&quot;: &quot;msg-17.eml&quot;,
     &quot;19&quot;: &quot;msg-19.eml&quot;,
     &quot;20&quot;: &quot;20.eml&quot;,
     &quot;21&quot;: &quot;21.eml&quot;
  },

  &quot;flags&quot;: {
     &quot;1&quot;: [&quot;$seen&quot;, &quot;$answered&quot;],
     &quot;17&quot;: [&quot;$seen&quot;, &quot;$answered&quot;, &quot;$forwarded&quot;],
     &quot;19&quot;: [&quot;$seen&quot;],
     &quot;20&quot;: [],
     &quot;21&quot;: []
  },

  &quot;removed&quot;: [3, 15]
}
</sourcecode>
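<t>The relationship between a full export and a later incremental export can be sketched in Python as follows. This is a non-normative illustration: the helper name <tt>apply_incremental</tt> is hypothetical, and the merge rules it uses (entries listed in <tt>removed</tt> are dropped, entries in <tt>uids</tt>, <tt>flags</tt> and <tt>modseqs</tt> are added or overwritten, other attributes are taken from the newer file, and a changed <tt>uidvalidity</tt> is rejected) are assumptions drawn from the examples above, not rules stated by this specification.</t>

<sourcecode type="python">import copy

MAPS = ('uids', 'flags', 'modseqs')

def apply_incremental(previous, incremental):
    # Hypothetical helper: merge an incremental folder.json into the
    # state loaded from an earlier export. Both arguments are dicts as
    # returned by json.load(), so the UID keys of the maps are strings.
    if previous.get('uidvalidity') != incremental.get('uidvalidity'):
        # Assumption: UIDs cannot be compared across uidvalidity values.
        raise ValueError('uidvalidity changed; a full export is needed')
    merged = copy.deepcopy(previous)
    # Drop messages listed in 'removed' (a list of integer UIDs).
    for uid in incremental.get('removed', []):
        for map_name in MAPS:
            merged.get(map_name, {}).pop(str(uid), None)
    # Add new messages and overwrite entries that changed.
    for map_name in MAPS:
        if map_name in incremental:
            merged.setdefault(map_name, {}).update(incremental[map_name])
    # Remaining folder attributes are taken from the newer export.
    for key, value in incremental.items():
        if key not in MAPS and key != 'removed':
            merged[key] = value
    return merged

previous = {
    'last_uid': 16,
    'uidvalidity': 1107190787,
    'uids': {'1': 'msg-1.eml', '3': 'msg-3.eml', '15': 'imported-ABC.eml'},
    'flags': {'1': ['$seen'], '3': ['$seen', '$flagged'], '15': ['$answered']},
}
incremental = {
    'last_uid': 21,
    'uidvalidity': 1107190787,
    'uids': {'1': 'msg-1.eml', '17': 'msg-17.eml'},
    'flags': {'1': ['$seen', '$answered'], '17': ['$seen']},
    'removed': [3, 15],
}
merged = apply_incremental(previous, incremental)
print(sorted(merged['uids'], key=int))
</sourcecode>
<t>In this sketch the merged state keeps only the messages with UIDs 1 and 17, and the <tt>removed</tt> list itself is not carried over into the merged state.</t>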
</section>

<section anchor="contacts"><name>Contacts</name>
<t>vCard <xref target="RFC6350"></xref> has long been the basis for representing address book and contact data in
structured data files.  The specifications for JSContact <xref target="RFC9553"></xref> and JMAP for Contacts <xref target="RFC9610"></xref>
do much of the work of explaining how to do this in JSON. In particular, RFC9610 explains
how to express references between objects (e.g., an address book and a contact in that address
book), which is useful for a full export whose references can be reconstructed on import.  This section
explains how to use the fields and structures of those specifications within a PDP Archive export.</t>

<section anchor="individual-contact-items"><name>Individual Contact Items</name>
<t>Individual contact items build on JSContact <xref target="RFC9553"></xref>, which builds on vCard <xref target="RFC6350"></xref>.</t>

<ul>
<li>The globally unique <tt>uid</tt> property is mandatory in JSContact and MUST be included in a PDP Archive.</li>
<li>The <tt>updated</tt> property is optional in JSContact but MUST be included in a PDP Archive.</li>
<li><t>The <tt>rev</tt> property defined in vCard (RFC6350), which is not included in JSContact (RFC9553),
may already be available in implementations.  It may also be included as a field on a contact,
in which case it is a simple value field holding a timestamp.</t>
</li>
<li><t>The <tt>@type</tt> property should be &quot;ContactCard&quot;.  Note that JSContact <xref target="RFC9553"></xref> uses a value
of &quot;Card&quot; for @type and registers that in <eref target="https://www.iana.org/assignments/jscontact/jscontact.xhtml">https://www.iana.org/assignments/jscontact/jscontact.xhtml</eref>,
but JMAP for Contacts uses &quot;ContactCard&quot; and registers that in <eref target="https://www.iana.org/assignments/jmap/jmap.xhtml">https://www.iana.org/assignments/jmap/jmap.xhtml</eref>.</t>
</li>
</ul>
<t>This document places specific requirements on the <tt>updated</tt> value so that it can be
useful for synchronization.  See <xref target="always-include-uid-and-updated"></xref> for the requirements
on <tt>updated</tt> and <tt>uid</tt> specifically.</t>
<t>When the structured data is prepared, a contact can be exported in a file with an arbitrary name
using a limited set of characters suitable for interoperability across filesystems. [TODO: we
need to define that elsewhere].</t>
<t>For example, a file called 'contact1.json' could contain:</t>

<artwork>{
   &quot;@type&quot;: &quot;ContactCard&quot;,
   &quot;version&quot;: &quot;1.0&quot;,
   &quot;uid&quot;: &quot;22B2C7DF-9120-4969-8460-05956FE6B065&quot;,
   &quot;id&quot;: 
   &quot;updated&quot;: &quot;2021-10-31T22:27:10Z&quot;,
   &quot;kind&quot;: &quot;individual&quot;,
   &quot;addressBookIds&quot;: [
      &quot;062adcfa-105d-455c-bc60-6db68b69c3f3&quot;
   ],
   &quot;name&quot;: {
       &quot;components&quot;: [
         { &quot;kind&quot;: &quot;given&quot;, &quot;value&quot;: &quot;John&quot; },
         { &quot;kind&quot;: &quot;surname&quot;, &quot;value&quot;: &quot;Doe&quot; }
       ],
       &quot;isOrdered&quot;: true
   },

   &quot;relatedTo&quot;: {
      &quot;urn:uuid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6&quot;: {
         &quot;relation&quot;: {
           &quot;friend&quot;: true
         }
       }
   },

   &quot;notes&quot;: {
     &quot;n1&quot;: {
       &quot;note&quot;: &quot;Open office hours are 1600 to 1715 EST, Mon-Fri&quot;,
       &quot;created&quot;: &quot;2022-11-23T15:01:32Z&quot;,
       &quot;author&quot;: {
         &quot;name&quot;: &quot;John&quot;
       }
     }
   }
}
</artwork>
<t>For clarity, this example shows:</t>

<ul>
<li>How a card can reference address books, which are exported as separate files in the overall export.</li>
<li>How a card can reference other cards using <tt>relatedTo</tt>.</li>
<li>That a card can contain arbitrary notes. These are not necessarily exported as separate files, even
though notes are also objects that can be included as individual files in a PDP Archive export.</li>
</ul>
<t>TODO:  figure out if these can have RFC9610 &quot;id&quot; field</t>
<t>Because a ContactCard item can reference an AddressBook item, if a system exports contacts
belonging to address books it SHOULD also export the referenced AddressBook objects.  Likewise,
it SHOULD export the other ContactCard objects that are referenced in the 'relatedTo' field.
A permission or scope inconsistency would be one reason why the exporting system would not do so.<br />
For example, if the user chose to export only a public address book containing the &quot;John Doe&quot;
contact, and not the private &quot;Wedding guests&quot; address book that John Doe also belonged to,
then the private address book would either appear as an unresolvable ID or be cleaned up
so that it didn't appear (implementor's choice).  Likewise, when &quot;John Doe&quot; is exported
as part of a single address book export, but the contact referenced by the <tt>friend</tt> relation in <tt>relatedTo</tt> is not
exported because it is not in the same address book, the <tt>relatedTo</tt> value may be included
in the export even if some consumers of the export file cannot resolve it.</t>
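<t>An importer or archive checker may want to know which references in a set of exported cards cannot be resolved within the same export. The following Python sketch shows one way to list them. It is an illustration under assumptions: the helper name <tt>unresolved_references</tt> is hypothetical, identifiers are compared as exact strings, and <tt>addressBookIds</tt> is accepted either as an array (as in the example above) or as a map keyed by identifier.</t>

<sourcecode type="python">def unresolved_references(cards, address_book_uids):
    # Hypothetical helper. 'cards' is a list of ContactCard dicts and
    # 'address_book_uids' is the set of AddressBook uid values present
    # in the same export. Returns a dict mapping a card uid to the
    # sorted list of identifiers it references that are not exported.
    card_uids = {card['uid'] for card in cards}
    missing = {}
    for card in cards:
        # Iterating a list yields its items, iterating a dict its keys,
        # so both shapes of addressBookIds are handled.
        book_refs = set(card.get('addressBookIds', []))
        card_refs = set(card.get('relatedTo', {}))
        card_refs |= set(card.get('members', {}))
        dangling = (book_refs - address_book_uids) | (card_refs - card_uids)
        if dangling:
            missing[card['uid']] = sorted(dangling)
    return missing

cards = [
    {
        'uid': 'urn:uuid:a',
        'addressBookIds': ['book-public', 'book-private'],
        'relatedTo': {'urn:uuid:b': {'relation': {'friend': True}}},
    },
    {'uid': 'urn:uuid:b', 'addressBookIds': ['book-public']},
]
report = unresolved_references(cards, {'book-public'})
print(report)
</sourcecode>
<t>Here only the private address book is reported as unresolvable for the first card, because the related card is part of the same export. Whether such a reference is kept or cleaned up remains the implementor's choice.</t>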
</section>

<section anchor="group-contact-items"><name>Group Contact Items</name>
<t>Group contact items also refer to other contact items.  A file with an arbitrary name like &quot;contact2.json&quot;
could include:</t>

<artwork>{
   &quot;@type&quot;: &quot;ContactCard&quot;,
   &quot;kind&quot;: &quot;group&quot;,
   &quot;name&quot;: {
     &quot;full&quot;: &quot;The Doe family&quot;
   },
   &quot;uid&quot;: &quot;urn:uuid:ab4310aa-fa43-11e9-8f0b-362b9e155667&quot;,
   &quot;updated&quot;: &quot;2021-10-31T22:27:10Z&quot;,
   &quot;members&quot;: {
     &quot;urn:uuid:03a0e51f-d1aa-4385-8a53-e29025acd8af&quot;: true,
     &quot;urn:uuid:b8767877-b4a1-4c70-9acc-505d3819e519&quot;: true
   }
}
</artwork>
<t>As with individual ContactCard items referencing objects that are not exported at the same time,
a group contact can contain references that are not resolvable within the export.  If the user
chooses to export all address books then presumably the &quot;The Doe family&quot; group members can
all be found somewhere in the export, but if they export only the address book containing
&quot;The Doe family&quot; group and not the address books containing individual members, those IDs
would not be found in the export.</t>
</section>
</section>

<section anchor="using-rfc9610-address-book-objects"><name>Using RFC9610 address book objects</name>
<t>Neither the vCard <xref target="RFC6350"></xref> specification nor
JSContact <xref target="RFC9553"></xref> defines a representation for address books; JMAP for Contacts <xref target="RFC9610"></xref> does.  Its model is clearly that of
non-exclusive collection membership: a Contact item may appear with the same UID in multiple
Address Books, and if the Contact item with that UID is updated in one, it is updated in the others
as well.</t>
<t>Individual address book objects are returned in JMAP protocol messages with protocol wrappers.
The items inside the &quot;list&quot; element of an &quot;AddressBook/get&quot; response are nearly ready to
be represented as individual files in a PDP Archive. However, some things are missing:</t>

<ul>
<li>The identifier is called <tt>id</tt> in JMAP for Contacts, but this specification REQUIRES <tt>uid</tt>.</li>
<li><tt>updated</tt> is required.</li>
<li>The <tt>@type</tt> of AddressBook should be included within the data.</li>
</ul>
<t>This example copies the examples in RFC9610 so that interoperability between this spec and that one
is clear.</t>
<t>A file with an arbitrary name like address-book1.json would contain:</t>

<artwork>{
   &quot;@type&quot;: &quot;AddressBook&quot;,
   &quot;uid&quot;: &quot;062adcfa-105d-455c-bc60-6db68b69c3f3&quot;,
   &quot;updated&quot;: &quot;2020-01-09T14:32:01Z&quot;,
   &quot;name&quot;: &quot;Personal&quot;,
   &quot;description&quot;: null,
   &quot;sortOrder&quot;: 0,
   &quot;isDefault&quot;: true,
   &quot;isSubscribed&quot;: true,
   &quot;shareWith&quot;: {
     &quot;3f1502e0-63fe-4335-9ff3-e739c188f5dd&quot;: {
       &quot;mayRead&quot;: true,
       &quot;mayWrite&quot;: false,
       &quot;mayShare&quot;: false,
       &quot;mayDelete&quot;: false
     }
   },
   &quot;myRights&quot;: {
     &quot;mayRead&quot;: true,
     &quot;mayWrite&quot;: true,
     &quot;mayShare&quot;: true,
     &quot;mayDelete&quot;: false
   }
} 
</artwork>
<t>address-book2.json would contain:</t>

<artwork>{
   &quot;@type&quot;: &quot;AddressBook&quot;,
   &quot;uid&quot;: &quot;cd40089d-35f9-4fd7-980b-ba3a9f1d74fe&quot;,
   &quot;updated&quot;: &quot;2020-01-09T14:32:01Z&quot;,
   &quot;name&quot;: &quot;Autosaved&quot;,
   &quot;description&quot;: null,
   &quot;sortOrder&quot;: 1,
   &quot;isDefault&quot;: false,
   &quot;isSubscribed&quot;: true,
   &quot;shareWith&quot;: null,
   &quot;myRights&quot;: {
     &quot;mayRead&quot;: true,
     &quot;mayWrite&quot;: true,
     &quot;mayShare&quot;: true,
     &quot;mayDelete&quot;: false
   }
}
</artwork>
<t>Note that the first example includes a <tt>shareWith</tt> value, showing that the user's AddressBook has been
shared with, in this case, one other principal with the id &quot;3f1502e0-63fe-4335-9ff3-e739c188f5dd&quot;.<br />
This information can be exported and may be quite useful for backup/restore use cases.
However, it may not be useful in other administrative domains, where the same concept of principals
does not allow the Principal ID to be resolved against the correct account.  In any case, the
object referred to by this Principal ID is not itself given representation in the PDP Archive export.</t>
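<t>The adjustments listed earlier in this section (using <tt>uid</tt> instead of <tt>id</tt>, adding <tt>updated</tt> and adding <tt>@type</tt>) can be sketched in Python as follows. The helper name <tt>address_book_to_pdpa</tt> and its <tt>last_modified</tt> parameter are hypothetical; in particular, the sketch assumes that the exporter tracks a last-modified time itself, since that value is not part of the JMAP object, and that the JMAP <tt>id</tt> is reused as the <tt>uid</tt>, as the examples above do.</t>

<sourcecode type="python">from datetime import datetime, timezone

def address_book_to_pdpa(jmap_item, last_modified):
    # jmap_item: one entry of the 'list' array of an AddressBook/get
    # response, already parsed into a dict.
    # last_modified: timezone-aware datetime supplied by the exporter
    # (an assumption; JMAP does not carry this value in the object).
    exported = {k: v for k, v in jmap_item.items() if k != 'id'}
    exported['@type'] = 'AddressBook'
    exported['uid'] = jmap_item['id']
    utc_time = last_modified.astimezone(timezone.utc)
    exported['updated'] = utc_time.strftime('%Y-%m-%dT%H:%M:%SZ')
    return exported

item = {
    'id': '062adcfa-105d-455c-bc60-6db68b69c3f3',
    'name': 'Personal',
    'sortOrder': 0,
    'isDefault': True,
}
stamp = datetime(2020, 1, 9, 14, 32, 1, tzinfo=timezone.utc)
result = address_book_to_pdpa(item, stamp)
print(result['uid'], result['updated'])
</sourcecode>
<t>The timestamp is converted to UTC before formatting, in line with the requirement that <tt>updated</tt> be exported in UTC.</t>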
</section>

<section anchor="calendar-events-tasks-and-groups"><name>Calendar events, tasks and groups</name>
<t>JSCalendar <xref target="RFC8984"></xref> is the basis for representing events, tasks and groups in JSON.
This section explains how to export individual events and tasks within an archive.
JMAP for Calendars (<eref target="https://datatracker.ietf.org/doc/draft-ietf-jmap-calendars/">https://datatracker.ietf.org/doc/draft-ietf-jmap-calendars/</eref>)
does provide some additional considerations when producing calendar data from a JMAP
system or to be consumed by a JMAP system, so it is also a normative reference.</t>
<t>Note on CalDAV compatibility: Although CalDAV servers are fairly common, they support the
older VEVENT and VTODO syntax.  This specification requires the JSCalendar syntax instead.
Either way, a server building a personal data archive is likely transforming an internal
implementation-specific relational data format to an export format.</t>
<t>Note on ETag: CalDAV servers use the event's UID to identify the same object, and use ETags
to identify changed events, so that a CalDAV client may make sure it has the version a
server has before it updates an item, solving the lost-update problem.  Since this
specification doesn't attempt to solve the lost-update problem as well as client-server
protocols can, and since JSCalendar does not include the ETag of a calendar event in any way,
this specification does not include any requirements for ETags.</t>
<t>Notes on specific fields:</t>

<ul spacing="compact">
<li>The globally unique <tt>uid</tt> property is mandatory in JSCalendar and MUST be included.
See JMAP Calendars [ref todo] section 1.4.1 for when the <tt>uid</tt> property can appear the same for
multiple recurrences of the same underlying event.</li>
<li>The <tt>updated</tt> property is mandatory in JSCalendar and MUST be included.</li>
<li>The <tt>sequence</tt> value is optional in JSCalendar but SHOULD be included if available.</li>
<li>The <tt>@type</tt> property for one of these items MUST be &quot;Event&quot;, &quot;Task&quot; or &quot;Group&quot;.</li>
<li>Recurrence rules SHOULD be fully exported, unless it's clear from the use case
or user request that the
destination for the data wants expanded recurrences within a specific time period.</li>
<li>The <tt>calendarIds</tt> field defined in JMAP Calendars is REQUIRED in order to match up
events to the calendar they are supposed to appear in.</li>
</ul>
<t>For example, a file called event1.json could contain:</t>

<sourcecode type="json">{
  &quot;@type&quot;: &quot;Event&quot;,
  &quot;uid&quot;: &quot;2a358cee-6489-4f14-a57f-c104db4dc2f2&quot;,
  &quot;updated&quot;: &quot;2020-01-09T14:32:01Z&quot;,
  &quot;title&quot;: &quot;Board Meeting&quot;,
  &quot;start&quot;: &quot;2024-10-25T09:00:00&quot;,
  &quot;timeZone&quot;: &quot;Europe/London&quot;,
  &quot;duration&quot;: &quot;PT1H30M&quot;,
  &quot;participants&quot;: {
    &quot;1&quot;: {
      &quot;@type&quot;: &quot;Participant&quot;,
      &quot;name&quot;: &quot;Jane Doe&quot;,
      &quot;sendTo&quot;: {
        &quot;imip&quot;: &quot;mailto:jane@example.com&quot;
      },
      &quot;roles&quot;: {
        &quot;attendee&quot;: true
      }
    }
  },
  &quot;calendarIds&quot;: {
    &quot;062adcfa-105d-455c-bc60-6db68b69c3f3&quot;: true
  }
}
</sourcecode>
<t>The event object includes a calendarIds property, which links it to the calendar collection it belongs to.</t>
</section>

<section anchor="calendar-collection-items"><name>Calendar Collection Items</name>
<t>Calendar collection items are built using JMAP for Calendars [REF TODO - still I-D].</t>
<t>If a system exports events belonging to calendars, it SHOULD also export the referenced Calendar objects.</t>
<t>A file with an arbitrary name, such as calendar1.json, in a directory (e.g., \calendars\calendar2) would contain the calendar's metadata:</t>

<sourcecode type="json">{
  &quot;@type&quot;: &quot;Calendar&quot;,
  &quot;uid&quot;: &quot;062adcfa-105d-455c-bc60-6db68b69c3f3&quot;,
  &quot;updated&quot;: &quot;2020-01-09T14:32:01Z&quot;,
  &quot;name&quot;: &quot;Work Calendar&quot;,
  &quot;color&quot;: &quot;#123456&quot;,
  &quot;sortOrder&quot;: 0,
  &quot;isDefault&quot;: true,
  &quot;isSubscribed&quot;: true,
  &quot;myRights&quot;: {
    &quot;mayRead&quot;: true,
    &quot;mayWrite&quot;: true,
    &quot;mayShare&quot;: true,
    &quot;mayDelete&quot;: false
  }
}
</sourcecode>
<t>The <tt>uid</tt> value here corresponds to the ID used in the calendarIds property of the individual event item.</t>

<section anchor="tasks"><name>Tasks</name>
<t>Tasks are also defined by the JSCalendar specification <xref target="RFC8984"></xref> using the &quot;Task&quot; object type.
As with events, tasks MUST include the uid and updated fields to support synchronization.</t>
<t>For example, a file called task1.json could contain:</t>

<sourcecode type="json">{
  &quot;@type&quot;: &quot;Task&quot;,
  &quot;uid&quot;: &quot;7b0f69a6-6e3e-4f1b-85d8-c89b43d2f2a1&quot;,
  &quot;updated&quot;: &quot;2022-11-23T15:01:32Z&quot;,
  &quot;title&quot;: &quot;Submit Quarterly Report&quot;,
  &quot;progress&quot;: &quot;in-process&quot;,
  &quot;priority&quot;: 1,
  &quot;due&quot;: &quot;2024-12-31T23:59:59&quot;
}
</sourcecode>
</section>
</section>

<section anchor="notes"><name>Notes</name>

<ul spacing="compact">
<li>VJournal is first defined in iCalendar (now <xref target="RFC5545"></xref>).</li>
<li>VJournal is also used in <eref target="https://datatracker.ietf.org/doc/html/rfc4791">CalDAV</eref></li>
<li>However, they are NOT used in <eref target="https://jmap.io/spec-calendars.html">https://jmap.io/spec-calendars.html</eref></li>
</ul>
<t>Do we even have a JSON format for notes defined?</t>
</section>

<section anchor="files"><name>Files</name>
</section>

<section anchor="other"><name>Other</name>
<t>(LMD Note: I think this might better fit in an out of scope section - I think out of scope sections are useful for statements that explain why scope is limited.  That's assuming we all agree that groups, freebusy and timezones are left out.)</t>
<t>Groups as defined in JSCalendar are NOT part of this archive format.  Groups in JSCalendar can combine events and tasks in a container.  This specification, for consistency and simplicity, uses folders and requires individual objects to be in separate files.</t>
<t>VFREEBUSY objects are not part of this archive format.  Calendar software can calculate freebusy time from event data.  Use cases that this limitation leaves unsatisfied could extend this archive format, but what it means to back up, restore, export or import freebusy data would first need to be fleshed out.</t>
<t>VTIMEZONE objects are not part of this archive format.  Timezones are more likely to be system objects referred to by calendar objects in modern calendar systems, than objects which are exchanged across systems.</t>
</section>
</section>

<section anchor="synchronization-requirements"><name>Synchronization requirements</name>
<t>This section describes the requirements to achieve repeated one-way synchronization via export/import operations between software by different vendors. While limited, this still provides better functionality than what many end-users experience with their groupware software and services today.</t>
<t>Supporting <em>repeated</em> synchronization means that the export from system A and import to system B can happen over and over again without needlessly duplicating items.  Supporting <em>one-way</em> synchronization means that changes in the system with the exporter role propagate reliably to the system with the importer role, but not in the reverse direction.   Some of the constraints here arise from the fact that the two systems may not directly connect, and the import may be time-delayed from the export.</t>
<t>This limited solution for export/import sync may also be used for more direct system-to-system transfers such as service-to-service data transfers, repeated data access requests or data migrations, although some of those use cases could be solved much better with direct negotiation of features.</t>

<section anchor="always-include-uid-and-updated"><name>Always include 'uid' and 'updated'</name>
<t>These requirements apply to JSContact <xref target="RFC9553"></xref> objects and to JSCalendar <xref target="RFC8984"></xref> objects (events and tasks) when exported or imported using the formats in this specification, because these all have 'uid' and 'updated' values.</t>
<t>Requirements:</t>

<ul spacing="compact">
<li>Exporters MUST include the 'uid' and 'updated' fields.</li>
<li>The 'updated' field MUST be exported in UTC and interpreted in UTC.  Accurate system time is important.</li>
<li>Importers MUST use the UID in an imported object, rather than invent a new UID, when creating a new object.</li>
<li>Importers MUST search for existing objects with the same UID, and if the object in storage is <em>similar enough</em> (see Note below) to the import data, the importer SHOULD NOT change the object and MUST NOT update its 'updated' timestamp.</li>
</ul>
<t>Recommendations:</t>

<ul spacing="compact">
<li>Importers SHOULD use caution with fields that are system-updated, especially frequently updated.  Such fields SHOULD NOT change the value of 'updated' that is exported or used to decide whether to update an object during an import operation. See note below on 'updated'.</li>
<li>Importers SHOULD apply common sense in updating internal or implementation-specific fields.  This specification does not require the importer to include, omit, handle or disregard values for fields that it believes are internally-generated or implementation-specific.  For example, a system in the role of exporter might export an event object with a video-conference room ID in a custom field.  It can decide that it is sensible to export that value as a URL for external use.  Later, the same system or one with code written for compatibility could import that event with the video-conference URL, and it would be sensible to avoid overwriting its own knowledge of the room ID with the URL.</li>
<li>When importing a <em>changed</em> or <em>new</em> object with a UID and 'updated' value, the importer SHOULD set the 'updated' value to the one imported. Thus, if a Contact is updated on Jan 1, exported on Jan 2 and imported on Jan 3, the new or updated imported contact would show an 'updated' value of Jan 1.</li>
</ul>
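<t>The requirements and recommendations above can be sketched as a small import routine in Python. This is a non-normative illustration: the names <tt>import_object</tt> and <tt>fields_match</tt> are hypothetical, the in-memory dictionary stands in for whatever storage an implementation uses, and the comparison shown (all fields except 'updated' must be equal) is only one assumed interpretation of <em>similar enough</em>.</t>

<sourcecode type="python">def fields_match(stored, incoming):
    # One possible 'similar enough' test (an assumption): compare all
    # fields except 'updated'.
    def strip(obj):
        return {k: v for k, v in obj.items() if k != 'updated'}
    return strip(stored) == strip(incoming)

def import_object(store, incoming, similar_enough=fields_match):
    # 'store' maps uid to the stored object. The imported uid is always
    # reused, never replaced by a newly invented one.
    uid = incoming['uid']
    existing = store.get(uid)
    if existing is None:
        store[uid] = dict(incoming)
        return 'created'
    if similar_enough(existing, incoming):
        # Leave the stored object and its 'updated' value untouched.
        return 'unchanged'
    # Changed object: take over the imported 'updated' value as well.
    store[uid] = dict(incoming)
    return 'updated'

store = {}
card = {'uid': 'urn:uuid:1', 'updated': '2024-01-01T00:00:00Z',
        'name': 'John Doe'}
first = import_object(store, card)
second = import_object(store, dict(card, updated='2024-01-02T00:00:00Z'))
third = import_object(store, dict(card, name='John Q. Doe',
                                  updated='2024-01-03T00:00:00Z'))
print(first, second, third)
</sourcecode>
<t>In this sketch a repeated import of an unchanged card leaves the stored 'updated' value alone, while an import with a changed name replaces the object and adopts the imported 'updated' value.</t>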
<t>Note on <em>similar enough</em>: This specification requires nuance in order to allow both reasonably consistent synchronization and reasonable behavior in a wide variety of use cases and implementations.  The language above is intended to give implementors both guidance and wiggle room.  For example, the importer could convert a DTSTART time from UTC to the user's local time and save it as the displayed start time. Later, re-importing the same object with the same UID, the importing code could be smart enough to realize that the time hasn't <em>actually</em> changed, and avoid changing the 'updated' timestamp or creating a conflicting event.  This logic could be implemented by saving separate fields (imported time vs display time), by keeping a log of updates (log entry stating that the system auto-converted start time from X to Y), or by other clever algorithms. Thus, the clever implementation can avoid the appearance of an object that changes every time the calendar is synchronized.</t>
<t>Note on <em>updated</em>: The definition of 'updated' in JSContact <xref target="RFC9553"></xref> is not rigorous or nuanced. &quot;when the data in the Card was last modified&quot; could refer to several instances of the card -- its internal implementation, its representation in an email share, its representation in an HTTP GET response <xref target="RFC6352"></xref>.   It's not specified whether 'updated' is the same as REV in VCard <xref target="RFC6350"></xref>, which is defined differently.  Neither definition explicitly covers vendor-specific fields.  Thus, this specification makes additional recommendations for handling 'updated':</t>

<ul spacing="compact">
<li>The value of 'updated' SHOULD only change when two conditions hold: the end-user makes a decision to change a value of a user-visible field, AND the export of the JSContact shows a different value.</li>
<li>Thus, non-user-visible fields like 'version' could be changed without causing the 'updated' value to change.  A value such as 'language' could be set without changing 'updated' (if an implementation infers the language tag and begins to include 'fr-CA' as the language value in exports instead of no language, nevertheless this doesn't change the user-visible content).</li>
<li>If implementations need to manage the synchronization of vendor-specific fields, a vendor-specific field like 'example.com:updated' can be used rather than affecting the user-visible synchronization made possible by 'updated'. Implementations could possibly also handle 'updated' differently when used for export/import using the formats in this specification, than when the same field is handled in other code paths.</li>
</ul>
<t>We recognize that this understanding of 'updated' is highly judgement-dependent. The same field can change in one way and cause a change to 'updated' and in another way may not (the example of server inferring the language is 'fr-CA' vs the user explicitly setting it).  It is likely to be frustrating to protocol designers and implementors (as it is to the authors of this specification) that the definition is so wobbly.  We'd love to know of better solutions that work with the status quo.</t>

<section anchor="synchronizing-from-and-to-caldav-servers"><name>Synchronizing from and to CalDAV servers</name>
<t>CalDAV <xref target="RFC4791"></xref> uses URLs, ETags and UIDs for synchronizing changes between two systems reliably, but it relies upon client-server architecture, where the server is the &quot;source of truth&quot; and the client must manage its local history and decide which things to update from the server and which things to tell the server to update.  If a user is setting up synchronization or an implementor is building a system that involves synchronization, it may be best to use CalDAV if that is a feasible solution.</t>
<t>Nevertheless, we believe some of the use cases in our <eref target="#use_cases">use case section</eref> motivate not only including calendar data in these archives for backup purposes, but also for partial updates.  This works the same way it does for JSContact Card objects.</t>
</section>
</section>

<section anchor="synchronizing-address-books"><name>Synchronizing address books</name>
<t>Build on <xref target="RFC9610"></xref></t>
</section>

<section anchor="synchronizing-mailbox-folders"><name>Synchronizing mailbox folders</name>
<t>Because servers may differ in which characters they support in folder names, how many levels deep folders may be created, and even in what separator character is used to indicate folder hierarchy, difficulties in synchronizing folder names will definitely arise.  Folder names that are not likely to be widely supported in other systems should be translated for export, because if the exporting system has a consistent translation algorithm, then even if the mailbox name looks different in the importing system it will still be imported consistently.</t>
<t>Systems that support mailbox IDs MUST include them in exports.  Systems that do not (though it's strongly encouraged) SHOULD use the full mailbox name as the unique identifier value.</t>
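<t>A consistent translation of folder names can be sketched in Python as follows. The escape scheme shown here is purely an assumption made for illustration and is not defined by this specification: the helper name <tt>translate_mailbox_name</tt>, the set of characters treated as safe, and the <tt>_uXXXX_</tt> escape form are all hypothetical. The underscore itself is escaped so that the result stays unambiguous.</t>

<sourcecode type="python">import string

# Assumed set of characters that are safe on common filesystems.
SAFE = set(string.ascii_letters + string.digits + '-.')

def translate_mailbox_name(name):
    # Replace every character outside SAFE by _uXXXX_, where XXXX is
    # the hexadecimal code point. The same input always yields the
    # same output, so repeated exports stay consistent.
    parts = []
    for ch in name:
        if ch in SAFE:
            parts.append(ch)
        else:
            parts.append('_u{:04X}_'.format(ord(ch)))
    return ''.join(parts)

translated = translate_mailbox_name('%L33T%')
print(translated)
</sourcecode>
<t>With such a scheme the untranslated name can still be preserved, for example in the <tt>original_name</tt> attribute of folder.json.</t>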
<blockquote><t>TODO: Also it would be good to include a &quot;display name&quot; in case the server has had to translate the mailbox name for compatibility.  E.g. a server that has a mailbox named &quot;%L33T%&quot;, but knows the &quot;%&quot; should not be exported because many servers forbid the &quot;%&quot;, would translate the name consistently to _pc_L33T_pc_ or another set of safe characters and include a display name of &quot;%L33T%&quot; for reference and debugging.</t>
</blockquote></section>

<section anchor="blobs-and-files"><name>Blobs and files?</name>
<t>Reference <xref target="RFC9404"></xref>?</t>
</section>
</section>
</section>

<section anchor="open-issues"><name>Open issues</name>

<section anchor="container-format"><name>Container format</name>
<t>This document leverages existing data formats and adds certain files for representing metadata. While one may work with this raw data, most import/export scenarios will rather require the bundling of individual data items into one or few container files.</t>
<t>This document does not strive to invent its own container format, but may refer to existing ones.</t>
<t>High level options would be:</t>

<ul spacing="compact">
<li>Recommend using a container format without preferring a particular one</li>
<li>Mandate a specific format</li>
<li>...?</li>
</ul>
<t>Actual container formats likely differ in various dimensions:</t>

<ul spacing="compact">
<li>Ease of adding incremental data</li>
<li>Ease of updating existing data</li>
<li>Ease of accessing files</li>
<li>Support for compression</li>
<li>Support for data streaming</li>
<li>Availability of library/tool support across platforms</li>
<li>Internal file references</li>
<li>Open standard</li>
<li>...?</li>
</ul>
<t>Candidates</t>

<ul spacing="compact">
<li>tar/gz</li>
<li>zip</li>
<li>7z</li>
<li>zpaq</li>
<li>...?</li>
</ul>
<t>See <eref target="https://github.com/hhappel/draft-happel-mailmaint-pdparchive/issues/13">https://github.com/hhappel/draft-happel-mailmaint-pdparchive/issues/13</eref></t>
</section>

<section anchor="encryption"><name>Encryption</name>
<t>Support for encryption of any kind is so far not a requirement in this draft. However, an increasing number of services offer forms of data encryption, and the implications for this draft may need to be considered.</t>
<t>&quot;Encryption&quot; might refer to various aspects:</t>

<ul spacing="compact">
<li>Existing encryption of individual files in the export</li>
<li>Encrypting the complete export (incl. metadata?)</li>
<li>...?</li>
</ul>
<t>See <eref target="https://github.com/hhappel/draft-happel-mailmaint-pdparchive/issues/14">https://github.com/hhappel/draft-happel-mailmaint-pdparchive/issues/14</eref></t>
</section>
</section>

<section anchor="implementation-status"><name>Implementation status</name>
<t>&lt; RFC Editor: before publication please remove this section and the reference to <xref target="RFC7942"></xref> &gt;</t>
<t>This section records the status of known implementations of the protocol defined by this specification at the time of posting of this Internet-Draft, and is based on a proposal described in <xref target="RFC7942"></xref>. The description of implementations in this section is intended to assist the IETF in its decision processes in progressing drafts to RFCs. Please note that the listing of any individual implementation here does not imply endorsement by the IETF. Furthermore, no effort has been spent to verify the information presented here that was supplied by IETF contributors. This is not intended as, and must not be construed to be, a catalog of available implementations or their features. Readers are advised to note that other implementations may exist.</t>
<t>According to <xref target="RFC7942"></xref>, &quot;this will allow reviewers and working groups to assign due consideration to documents that have the benefit of running code, which may serve as evidence of valuable experimentation and feedback that have made the implemented protocols more mature. It is up to the individual working groups to use this information as they see fit&quot;.</t>
</section>

<section anchor="security-considerations"><name>Security considerations</name>
<t>Archive files can be maliciously edited before being uploaded to servers. There is no guarantee that the content of an archive file accurately represents the email that a mailbox received or sent, or any other data the archive contains.</t>
</section>

<section anchor="privacy-considerations"><name>Privacy considerations</name>
<t>tbd.</t>
</section>

<section anchor="iana-considerations"><name>IANA Considerations</name>

<section anchor="file-extension"><name>File extension</name>
<t>register .pdpa?</t>
</section>
</section>

</middle>

<back>
<references><name>Informative References</name>
<reference anchor="GoogleTakeout" target="https://takeout.google.com/settings/takeout">
  <front>
    <title>Google Takeout</title>
    <author>
      <organization>Google</organization>
    </author>
    <date></date>
  </front>
</reference>
<reference anchor="PST" target="https://learn.microsoft.com/en-us/openspecs/office_file_formats/ms-pst/141923d5-15ab-4ef1-a524-6dce75aae546">
  <front>
    <title>[MS-PST]: Outlook Personal Folders (.pst) File Format</title>
    <author>
      <organization>Microsoft</organization>
    </author>
    <date></date>
  </front>
</reference>
<xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.2119.xml"/>
<xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.2822.xml"/>
<xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.3501.xml"/>
<xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.4155.xml"/>
<xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.4314.xml"/>
<xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.4791.xml"/>
<xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.5322.xml"/>
<xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.5545.xml"/>
<xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.5598.xml"/>
<xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.6154.xml"/>
<xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.6350.xml"/>
<xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.6352.xml"/>
<xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.7162.xml"/>
<xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.7942.xml"/>
<xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.8174.xml"/>
<xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.822.xml"/>
<xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.8610.xml"/>
<xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.8621.xml"/>
<xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.8984.xml"/>
<xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.9404.xml"/>
<xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.9553.xml"/>
<xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.9610.xml"/>
</references>

</back>

</rfc>
