<?xml version='1.0' encoding='UTF-8'?>
<rfc xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" category="info" docName="draft-grimminck-safe-ioc-sharing-03" ipr="trust200902" xml:lang="en" version="3">
  <front>
    <title>A Standard for Safe and Reversible Sharing of Malicious URLs and Indicators</title>
    <author fullname="Stefan Grimminck" role="editor">
      <address>
        <email>ietf@stefangrimminck.nl</email>
      </address>
    </author>
    <date year="2025" month="April" day="9"/>
    <abstract>
      <t>This document defines a consistent and reversible method for sharing potentially malicious indicators of compromise (IOCs), such as URLs, IP addresses, email addresses, and domain names. It introduces a safe obfuscation format to prevent accidental execution or activation when IOCs are displayed or transmitted. These techniques aim to standardize the safe dissemination of threat intelligence data. This specification uses the URI syntax defined in RFC 3986 and follows the key word conventions from RFC 2119.</t>
    </abstract>
  </front>
  <middle>
    <section title="Introduction">
      <t>The secure sharing of malicious artifacts is vital to threat intelligence, open-source intelligence (OSINT), and incident response efforts. However, sharing raw URLs, IP addresses, and email addresses associated with malware or threat actors poses a risk of accidental activation.</t>

      <t>This document defines a clear and reversible method for obfuscating and de-obfuscating IOCs to support safe sharing across platforms, formats, and use cases. URI syntax follows <xref target="RFC3986"/>.</t>

      <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 <xref target="RFC2119"/> <xref target="RFC8174"/> when, and only when, they appear in all capitals, as shown here.</t>
    </section>

    <section title="Terminology">
      <t>Obfuscating: The process of altering an indicator so that it cannot be accidentally activated or clicked. This was previously referred to as "defanging".</t>
      <t>De-obfuscating: The process of restoring an obfuscated indicator to its original, actionable form. This was previously referred to as "refanging".</t>
      <t>IOC: Indicator of Compromise - data such as a URL, IP address, domain name, email address, or hash associated with malicious activity.</t>
    </section>

    <section title="Problem Statement">
      <t>Inconsistent obfuscation practices hinder the reliable and automated exchange of threat intelligence. For example:</t>
      <ul>
        <li>A URL obfuscated as "h**p://example[.]com" cannot be reliably parsed by tools expecting "hxxp://example[.]com".</li>
        <li>An IP address obfuscated with parentheses (e.g., "192.0.2(.)1") may fail to de-obfuscate in systems expecting "[.]".</li>
      </ul>
      <t>Such inconsistencies reduce the effectiveness of threat detection and response.</t>
    </section>

    <section title="Obfuscation Techniques">
      <t>The following transformations MUST be consistently applied:</t>
      <ul>
        <li>Replace "http" and "https" schemes with "hxxp" and "hxxps" respectively.</li>
        <li>Replace every period (".") in domain names and IP addresses with "[.]".</li>
        <li>Replace the "@" character in email addresses or credentials with "[@]".</li>
      </ul>
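      <t>Percent-encoded characters (such as "%2e" for ".") SHOULD NOT be used as an obfuscation mechanism. A consumer cannot tell whether the encoding was part of the original indicator or was added during obfuscation, which makes the transformation ambiguous.</t>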
      <t>Examples:</t>
      <ul>
        <li>Original: https://evil.example.com/path<br/>Obfuscated: hxxps://evil[.]example[.]com/path</li>
        <li>Original: http://username:password@attacker.com<br/>Obfuscated: hxxp://username:password[@]attacker[.]com</li>
        <li>Original: user@phishing.example.com<br/>Obfuscated: user[@]phishing[.]example[.]com</li>
        <li>Original: http://192.0.2.1<br/>Obfuscated: hxxp://192[.]0[.]2[.]1</li> 
        <li>Original: http://[2001:db8::1]:8080<br/>Obfuscated: hxxp://[2001:db8::1]:8080</li>
      </ul>
      <t>Note: Credentials in URIs (e.g., <em>username:password</em>) are included here for illustrative purposes only. Sharing credentials, even in obfuscated form, is strongly discouraged in operational contexts.</t>
      <t>Note: IPv6 addresses enclosed in square brackets MUST retain their colon-based syntax (e.g., "::") and brackets. These characters are essential to URI parsing and MUST NOT be altered. Obfuscation should apply only to components such as the scheme ("http") or domains, not to the IPv6 address syntax.</t>
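      <t>The transformations above can be sketched in Python as follows. This is a minimal, non-normative illustration: the function names "obfuscate" and "_obfuscate_host_part" and the regular expressions are assumptions made for this example and are not defined by this document. The sketch treats any bracketed run in the authority as an IPv6 literal and leaves it untouched, and it does not modify the path, query, or fragment.</t>
      <sourcecode type="python"><![CDATA[
```python
import re

# Hypothetical helper names; nothing here is mandated by the specification.
# Splits an http/https URI into scheme, authority, and the remainder.
_HTTP_URI = re.compile(
    r"^(?P<scheme>https?)://(?P<authority>[^/?#]*)(?P<rest>.*)$",
    re.IGNORECASE | re.DOTALL,
)
# Matches a bracketed IPv6 literal such as [2001:db8::1].
_IPV6_LITERAL = re.compile(r"(\[[^\]]*\])")


def _obfuscate_host_part(text: str) -> str:
    """Replace "." and "@" outside of bracketed IPv6 literals."""
    pieces = _IPV6_LITERAL.split(text)
    # With one capture group, odd indexes hold the IPv6 literals.
    return "".join(
        piece if index % 2 else piece.replace(".", "[.]").replace("@", "[@]")
        for index, piece in enumerate(pieces)
    )


def obfuscate(indicator: str) -> str:
    """Obfuscate an http(s) URI, or a bare domain, IP address, or email."""
    match = _HTTP_URI.match(indicator)
    if match is None:
        # No http(s) scheme: treat the whole string as a host or email.
        return _obfuscate_host_part(indicator)
    scheme = match.group("scheme").lower().replace("http", "hxxp")
    authority = _obfuscate_host_part(match.group("authority"))
    return f"{scheme}://{authority}{match.group('rest')}"
```
]]></sourcecode>
      <t>Applied to the examples in this section, a call such as obfuscate("http://[2001:db8::1]:8080") changes only the scheme, while obfuscate("user@phishing.example.com") rewrites both the "@" and every period.</t>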
    </section>

    <section title="De-obfuscation Techniques">
      <t>Tools designed to ingest obfuscated data SHOULD automatically reverse these transformations in a deterministic manner:</t>
      <ul>
        <li>Convert "hxxp" and "hxxps" back to "http" and "https" respectively.</li>
        <li>Convert "[.]" back to ".".</li>
        <li>Convert "[@]" back to "@".</li>
      </ul>
      <t>De-obfuscation MUST restore the indicator exactly as it was before obfuscation and MUST NOT alter any other characters. Examples:</t>
      <ul>
        <li>Obfuscated: hxxps://evil[.]example[.]com/path<br/>De-obfuscated: https://evil.example.com/path</li>
        <li>Obfuscated: user[@]phishing[.]example[.]com<br/>De-obfuscated: user@phishing.example.com</li>
      </ul>
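      <t>The reverse transformations can be sketched in Python as follows. This is a non-normative illustration: the function name "deobfuscate" is an assumption made for this example. The sketch restores the scheme only at the start of the string, so an occurrence of "hxxp" inside a path is left alone, and it assumes that the literal sequences "[.]" and "[@]" never occur in an original indicator.</t>
      <sourcecode type="python"><![CDATA[
```python
import re

# Matches an obfuscated scheme only at the very start of the indicator.
_OBFUSCATED_SCHEME = re.compile(r"^hxxp(s?)://", re.IGNORECASE)


def deobfuscate(indicator: str) -> str:
    """Reverse the hxxp/hxxps, "[.]", and "[@]" substitutions."""
    restored = _OBFUSCATED_SCHEME.sub(
        lambda match: "http" + match.group(1).lower() + "://",
        indicator,
        count=1,
    )
    # IPv6 literals such as [2001:db8::1] contain neither "[.]" nor "[@]",
    # so they pass through unchanged.
    return restored.replace("[.]", ".").replace("[@]", "@")
```
]]></sourcecode>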
    </section>

    <section title="Example Use Cases">
      <t>Common scenarios include:</t>
      <ul>
        <li><strong>OSINT Sharing</strong>: A report lists obfuscated URLs (e.g., "hxxp://malware[.]com/payload") to prevent accidental clicks.</li>
        <li><strong>Email Communication</strong>: Security teams share obfuscated IOCs like "attacker[@]example[.]com" in email threads.</li>
        <li><strong>Threat Intelligence Platforms</strong>: Automated ingestion of obfuscated IPs (e.g., "192[.]0[.]2[.]1") for blocklist updates.</li>
      </ul>
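      <t>As an illustration of the blocklist scenario, the following Python sketch reads obfuscated feed lines and keeps only those that de-obfuscate to a valid IP address. It is non-normative: the function name "parse_blocklist" and the one-indicator-per-line feed layout are assumptions made for this example. Validation relies on the standard library "ipaddress" module.</t>
      <sourcecode type="python"><![CDATA[
```python
import ipaddress


def parse_blocklist(lines):
    """Return de-obfuscated IP addresses, in input order, without duplicates.

    Assumes one obfuscated indicator per line (a hypothetical feed layout).
    Lines that are not valid IPv4 or IPv6 addresses are skipped.
    """
    addresses = []
    for line in lines:
        candidate = line.strip().replace("[.]", ".")
        try:
            address = str(ipaddress.ip_address(candidate))
        except ValueError:
            continue  # empty line, domain name, or malformed entry
        if address not in addresses:
            addresses.append(address)
    return addresses
```
]]></sourcecode>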
    </section>

    <section title="Security Considerations">
      <t>While these obfuscation techniques reduce the risk of accidental activation of malicious indicators, obfuscated data SHOULD always be handled with caution.</t>
      <t>Applying obfuscation repeatedly, without first normalizing the input, can produce ambiguous or non-reversible results (for example, "[[.]]"). Implementations SHOULD canonicalize an indicator before obfuscating it and SHOULD NOT apply multiple layers of obfuscation.</t>
      <ul>
        <li>Obfuscated URLs in PDFs may still be rendered as hyperlinks; use plain-text formatting.</li>
        <li>Systems processing obfuscated indicators MUST treat them as potentially harmful data, applying sandboxing or isolated environments for analysis.</li>
        <li>Credentials (e.g., <em>username:password</em>) SHOULD NOT be shared, even in obfuscated form, due to inherent security risks.</li>
      </ul>
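      <t>One way to avoid layered obfuscation is to normalize an indicator to its plain form before obfuscating it, which makes the operation idempotent. The Python sketch below is non-normative and deliberately limited to bare domain names, IPv4 addresses, and email addresses (no scheme, path, or IPv6 literal). The function names "normalize" and "obfuscate_once" are assumptions made for this example.</t>
      <sourcecode type="python"><![CDATA[
```python
def normalize(indicator: str) -> str:
    """Strip every layer of "[.]" and "[@]" obfuscation."""
    previous = None
    current = indicator
    # Each pass peels one layer; the loop stops when nothing changes.
    while current != previous:
        previous = current
        current = current.replace("[.]", ".").replace("[@]", "@")
    return current


def obfuscate_once(indicator: str) -> str:
    """Obfuscate a bare host or email so that repeated calls are stable."""
    plain = normalize(indicator)
    return plain.replace(".", "[.]").replace("@", "[@]")
```
]]></sourcecode>
      <t>With this approach, passing an already obfuscated value such as "evil[.]example[.]com" back into obfuscate_once returns it unchanged instead of producing "evil[[.]]example[[.]]com".</t>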
    </section>

    <section title="Implementation Guidance">
      <t>Software that parses threat intelligence feeds SHOULD support the obfuscation and de-obfuscation rules defined in this document. Implementations SHOULD verify correct de-obfuscation through unit tests and validation scripts. Example test case:</t>
      <artwork>
        Test Input: "hxxp://192[.]0[.]2[.]1"
        Expected Output: "http://192.0.2.1"
      </artwork>
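      <t>A test case of this kind could be expressed with Python's standard "unittest" module, as sketched below. The example is non-normative: "deobfuscate" is a hypothetical stand-in for the implementation under test, and the additional cases are taken from the examples elsewhere in this document.</t>
      <sourcecode type="python"><![CDATA[
```python
import unittest


def deobfuscate(indicator: str) -> str:
    """Hypothetical stand-in for the implementation under test."""
    lowered = indicator.lower()
    if lowered.startswith("hxxps://"):
        indicator = "https://" + indicator[len("hxxps://"):]
    elif lowered.startswith("hxxp://"):
        indicator = "http://" + indicator[len("hxxp://"):]
    return indicator.replace("[.]", ".").replace("[@]", "@")


class DeobfuscationTests(unittest.TestCase):
    # (obfuscated input, expected output) pairs from this document.
    CASES = [
        ("hxxp://192[.]0[.]2[.]1", "http://192.0.2.1"),
        ("hxxps://evil[.]example[.]com/path", "https://evil.example.com/path"),
        ("user[@]phishing[.]example[.]com", "user@phishing.example.com"),
        ("hxxp://[2001:db8::1]:8080", "http://[2001:db8::1]:8080"),
    ]

    def test_documented_examples(self):
        for obfuscated, expected in self.CASES:
            with self.subTest(obfuscated=obfuscated):
                self.assertEqual(deobfuscate(obfuscated), expected)
```
]]></sourcecode>
      <t>Saved in a module, the tests can be run with "python -m unittest". The "subTest" context reports each failing pair separately instead of stopping at the first mismatch.</t>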
    </section>

    <section title="Edge Cases and Special Handling">
      <t><strong>Internationalized Domain Names (IDNs)</strong>: Obfuscate Punycode-encoded domains in the same way as other domain names (e.g., "xn--example[.]com").</t>
      <t><strong>Other URI Schemes</strong>: For schemes other than "http" and "https", such as "ftp", apply analogous obfuscation (e.g., "fxp://example[.]com"). De-obfuscating tools need the corresponding reverse mapping to restore the original scheme.</t>
      <t><strong>IPv6 Literals in URIs</strong>: Do not alter colon characters (":") or brackets ("[", "]") in IPv6 addresses. For example, "[2001:db8::1]" MUST remain unchanged. Only scheme names or domain elements surrounding them should be obfuscated.</t>
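      <t>The edge cases above can be handled together with a scheme lookup table and a split around bracketed IPv6 literals, as in the following non-normative Python sketch. The names "SCHEME_MAP" and "obfuscate_uri" are assumptions made for this example, and the table contains only the mappings mentioned in this document. Unknown schemes are rejected, since guessing a replacement would not be reversible.</t>
      <sourcecode type="python"><![CDATA[
```python
import re

# Only the scheme mappings named in this document; others are rejected.
SCHEME_MAP = {"http": "hxxp", "https": "hxxps", "ftp": "fxp"}

_URI = re.compile(r"^([A-Za-z][A-Za-z0-9+.-]*)://([^/?#]*)(.*)$", re.DOTALL)
_IPV6_LITERAL = re.compile(r"(\[[^\]]*\])")


def obfuscate_uri(uri: str) -> str:
    """Obfuscate scheme and host; keep IPv6 literals and the path intact."""
    match = _URI.match(uri)
    if match is None:
        raise ValueError("not a URI with an authority: " + repr(uri))
    scheme, authority, rest = match.groups()
    safe_scheme = SCHEME_MAP.get(scheme.lower())
    if safe_scheme is None:
        raise ValueError("no obfuscated form known for scheme: " + scheme)
    # Odd indexes of the split result are the bracketed IPv6 literals.
    pieces = _IPV6_LITERAL.split(authority)
    safe_authority = "".join(
        piece if index % 2 else piece.replace(".", "[.]").replace("@", "[@]")
        for index, piece in enumerate(pieces)
    )
    return safe_scheme + "://" + safe_authority + rest
```
]]></sourcecode>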
    </section>

    <section title="IANA Considerations">
      <t>This document has no IANA actions.</t>
    </section>
  </middle>

  <back>
    <references>
      <name>Normative References</name>
      <reference anchor="RFC3986">
        <front>
          <title>Uniform Resource Identifier (URI): Generic Syntax</title>
          <author initials="T." surname="Berners-Lee" fullname="T. Berners-Lee"/>
          <author initials="R." surname="Fielding" fullname="R. Fielding"/>
          <author initials="L." surname="Masinter" fullname="L. Masinter"/>
          <date year="2005" month="January"/>
        </front>
        <seriesInfo name="STD" value="66"/>
        <seriesInfo name="RFC" value="3986"/>
      </reference>

      <reference anchor="RFC2119">
        <front>
          <title>Key words for use in RFCs to Indicate Requirement Levels</title>
          <author initials="S." surname="Bradner" fullname="S. Bradner"/>
          <date year="1997" month="March"/>
        </front>
        <seriesInfo name="BCP" value="14"/>
        <seriesInfo name="RFC" value="2119"/>
      </reference>
    
      <reference anchor="RFC8174">
        <front>
          <title>Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words</title>
          <author initials="B." surname="Leiba" fullname="B. Leiba"/>
          <date year="2017" month="May"/>
        </front>
        <seriesInfo name="BCP" value="14"/>
        <seriesInfo name="RFC" value="8174"/>
      </reference>
    </references>
  </back>
</rfc>
