<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<rfc category="info" docName="draft-canel-robots-ai-control-00" ipr="trust200902" submissionType="IETF" updates="9309" xmlns:xi="http://www.w3.org/2001/XInclude">

  <front>
    <title>Robots Exclusion Protocol Extension to manage AI content use</title>
    <author initials="F." surname="Canel" fullname="Fabrice Canel" role="editor">
      <organization>Microsoft Corporation</organization>
      <address>
        <postal>
          <street>One Microsoft Way</street>
          <city>Redmond</city>
          <region>WA</region>
          <code>98052</code>
          <country>United States</country>
        </postal>
        <email>facan@microsoft.com</email>
      </address>
    </author>
    <author initials="K." surname="Madhavan" fullname="Krishna Madhavan">
      <organization>Microsoft Corporation</organization>
      <address>
        <postal>
          <street>One Microsoft Way</street>
          <city>Redmond</city>
          <region>WA</region>
          <code>98052</code>
          <country>United States</country>
        </postal>
        <email>krmadhav@microsoft.com</email>
      </address>
    </author>
    <date day="21" month="October" year="2024"/>
    <area>Applications and Real-Time Area</area>
    <workgroup>Internet Engineering Task Force</workgroup>
    <abstract>
      <t>This document extends RFC 9309 by specifying additional rules for controlling the use of content in the field of Artificial Intelligence (AI).</t>
    </abstract>
  </front>

  <middle>
    <section title="Introduction">
      <t>While the Robots Exclusion Protocol [RFC9309] enables service owners to control how, if at all, automated clients known as crawlers may access the URIs [RFC3986] on their services, the protocol does not provide controls on how the data returned by those services may be used in training generative AI foundation models.</t>
      <t>Application developers are requested to honor these rules. The rules are not, however, a form of access authorization.</t>
    </section>

    <section title="Requirements Language">
      <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.</t>
    </section>

    <section title="Specification">
      <section title="Robots Control Rules">
        <t>The possible values of the new rules, which complement the existing Allow and Disallow rules, are:</t>
        <list>
          <t>DisallowAITraining - instructs the parser not to use the data for training AI language models.</t>
          <t>AllowAITraining - instructs the parser that the data can be used for training AI language models.</t>
        </list>
        <t>The values are case-insensitive and follow the same matching logic as the Allow and Disallow rules. Whereas Allow and Disallow rules define whether content can be downloaded, the AllowAITraining and DisallowAITraining rules apply only to the use of that content for AI training.</t>
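        <t>The following robots.txt fragment is a non-normative sketch of how the new rules might sit alongside existing ones. It assumes that AllowAITraining and DisallowAITraining take a path pattern in the same way as Allow and Disallow; the paths "/private/" and "/articles/" and the product token "examplebot" are hypothetical and used for illustration only.</t>
        <figure>
          <artwork><![CDATA[
User-Agent: *
Allow: /
Disallow: /private/
DisallowAITraining: /articles/

User-Agent: examplebot
AllowAITraining: /
]]></artwork>
        </figure>
        <t>Under these assumptions, crawlers matching the first group may download everything except URIs under "/private/" and are asked not to use content under "/articles/" for AI training. A crawler identifying as "examplebot" matches the second group instead and is told that all content may be used for AI training.</t>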
      </section>

      <section title="Application Layer Response Header">
        <t>The same rules can also be set in the Application Layer Response Header:</t>
        <list>
          <t>DisallowAITraining - instructs the parser not to use the data for training AI language models.</t>
          <t>AllowAITraining - instructs the parser that the data can be used for training AI language models.</t>
        </list>
        <t>The values are case-insensitive and follow the same matching logic as the Allow and Disallow rules.</t>
      </section>

      <section title="HTML Meta Element">
        <t>The same rules can also be set via an HTML meta element:</t>
        <list>
          <t>&lt;meta name="robots" content="DisallowAITraining"&gt;</t>
          <t>&lt;meta name="examplebot" content="AllowAITraining"&gt;</t>
        </list>
      </section>
    </section>

    <section title="IANA Considerations">
      <t>TODO: https://www.rfc-editor.org/rfc/rfc9110.html#name-field-name-registry</t>
    </section>

  </middle>
</rfc>