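An HTML Standard for Disclosing AI Use

Hacker News · 3 months ago

A proposal for a new HTML attribute and meta tag that enable element-level disclosure of AI involvement in web content, addressing the lack of fine-grained transparency.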

Explainer: AI Content Disclosure for HTML — element-level markup for AI authorship transparency

dweekly/ai-content-disclosure

AI Content Disclosure for HTML

Authors

Participate

Introduction

Web pages increasingly contain text produced with varying degrees of AI
involvement — from light AI-assisted editing to fully autonomous generation.
There is currently no standard HTML mechanism for authors to disclose AI
involvement at element-level granularity within a page.

This explainer proposes an aidisclosure HTML attribute and a companion
<meta name="ai-disclosure"> tag, enabling authors to declare the degree
of AI involvement in any section of a web page.

Problem / Motivation

A modern news article page might contain a human-written investigation
alongside an AI-generated summary sidebar and AI-moderated user comments.
Today, there is no standard way to label these sections differently.

Existing approaches operate at coarser granularity:

WHATWG HTML #9479 proposes
a page-level <meta> tag with four values. It does not support marking
individual elements. Across the 42+ comments on that issue, element-level
granularity was identified as the critical missing capability.

IETF draft-abaris-aicdh-00
defines an AI-Disclosure HTTP response header. It applies to entire
HTTP responses and cannot distinguish mixed content within a page.

C2PA 2.2 provides cryptographic provenance
for media files (images, video, audio). It does not support HTML text
content and is designed for file-level, not element-level, assertions.

Regulatory Context

The EU AI Act Article 50
(effective August 2026) requires that AI-generated text content be "marked in
a machine-readable format and detectable as artificially generated or
manipulated." Major platforms (YouTube, Meta, TikTok) already require AI
disclosure in their policies. A standard mechanism would serve both regulatory
compliance and voluntary transparency.

Goals

Non-Goals

Proposed Solution

Page-Level Declaration (Meta Tag)

For pages with uniform AI involvement:

The value mixed signals that different sections have different levels;
inspect element-level attributes for detail.
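As a rough illustration of how a consumer might read the page-level declaration, here is a minimal Python sketch. It assumes the value is carried in the standard `content` attribute of the meta tag and that unrecognized values are treated as undeclared; both are assumptions of this sketch, and `page_disclosure` is a hypothetical helper name.

```python
from html.parser import HTMLParser

# Values named in this explainer; "mixed" is only meaningful at page level.
PAGE_LEVELS = {"none", "ai-assisted", "ai-generated", "autonomous", "mixed"}


class MetaDisclosureParser(HTMLParser):
    """Records the first recognized <meta name="ai-disclosure"> value."""

    def __init__(self):
        super().__init__()
        self.page_level = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.page_level is not None:
            return
        attr_map = dict(attrs)
        if (attr_map.get("name") or "").lower() == "ai-disclosure":
            # Assumption: the level lives in the meta tag's content attribute.
            value = (attr_map.get("content") or "").strip().lower()
            # Assumption: unrecognized values count as undeclared.
            self.page_level = value if value in PAGE_LEVELS else None


def page_disclosure(html):
    """Return the declared page-level value, or None if there is none."""
    parser = MetaDisclosureParser()
    parser.feed(html)
    return parser.page_level


sample = (
    '<html><head><meta name="ai-disclosure" content="mixed"></head>'
    "<body></body></html>"
)
print(page_disclosure(sample))  # mixed
```

A result of `mixed` tells the consumer to look at individual elements rather than rely on the page-level value.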

Element-Level Declaration (HTML Attribute)

An aidisclosure global attribute on any HTML element:
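To make the element-level idea concrete, the following Python sketch scans a document for elements that carry an aidisclosure attribute. The sample markup and the helper name `element_disclosures` are illustrative assumptions, not part of the proposal.

```python
from html.parser import HTMLParser


class ElementDisclosureParser(HTMLParser):
    """Collects (tag, value) for every element carrying aidisclosure."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases attribute names before calling this hook.
        for name, value in attrs:
            if name == "aidisclosure":
                self.found.append((tag, (value or "").strip().lower()))


def element_disclosures(html):
    parser = ElementDisclosureParser()
    parser.feed(html)
    return parser.found


# Hypothetical markup: only the sidebar carries a declaration.
sample = """
<article>
  <p>Reporting written by staff.</p>
  <aside aidisclosure="ai-generated">Machine-written summary.</aside>
</article>
"""
print(element_disclosures(sample))  # [('aside', 'ai-generated')]
```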

Disclosure Values

Four values, aligned with the IETF AI-Disclosure header and IPTC Digital
Source Type vocabulary:
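The four value names used throughout this explainer are none, ai-assisted, ai-generated, and autonomous. A consumer could model them as a small enumeration, as in the Python sketch below. The parsing behaviour (trimming, lowercasing, returning None for anything unrecognized) is an assumption of this sketch rather than something the proposal specifies.

```python
from enum import Enum


class Disclosure(Enum):
    NONE = "none"
    AI_ASSISTED = "ai-assisted"
    AI_GENERATED = "ai-generated"
    AUTONOMOUS = "autonomous"


def parse_disclosure(raw):
    """Return the matching Disclosure, or None if missing or unrecognized."""
    if raw is None:
        return None
    try:
        # Assumption: values are compared case-insensitively after trimming.
        return Disclosure(raw.strip().lower())
    except ValueError:
        return None


print(parse_disclosure("ai-assisted"))
print(parse_disclosure("robot"))  # None
```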

Optional Metadata Attributes

Schema.org Integration (JSON-LD)

For search engine discoverability, the same information can be expressed as
structured data. The simplest form is just the level string:

An expanded form supports optional metadata. All fields except level are
strictly optional — publishers may have legitimate reasons not to disclose
specific tools or providers:

(Proposed as a comment on
schemaorg/schemaorg#3391.)
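The two JSON-LD shapes described above (a bare level string, or an object whose only required field is level) can be handled uniformly by a consumer. The Python sketch below shows one way to do that. The property name `aiDisclosure` and the optional `tool` field are placeholders invented for this sketch; the actual names are whatever the Schema.org proposal settles on.

```python
import json

# Hypothetical property name; the real one depends on the Schema.org proposal.
PROPERTY = "aiDisclosure"


def disclosure_level(node):
    """Return the level from either the simple or the expanded form."""
    value = node.get(PROPERTY)
    if isinstance(value, str):
        return value  # simple form: just the level string
    if isinstance(value, dict):
        return value.get("level")  # expanded form: level plus optional fields
    return None


simple = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "Example headline",
    PROPERTY: "ai-assisted",
}

expanded = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "Example headline",
    # "tool" is a made-up optional field; a publisher may omit it entirely.
    PROPERTY: {"level": "ai-generated", "tool": "example-model"},
}

print(json.dumps(simple, indent=2))
print(disclosure_level(simple), disclosure_level(expanded))
```

Because only level is required, a consumer written this way keeps working when a publisher chooses not to name tools or providers.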

Key Scenarios

Scenario 1: News Article with AI Summary Sidebar

A newsroom publishes an investigative piece with a human-written article
and an AI-generated summary:
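A page like this could be checked with a small lint, sketched below in Python. The markup is a guess at what such a page might look like, and the rule that a page whose sections disagree should declare mixed is this sketch's own reading of the page-level value, not a normative requirement.

```python
from html.parser import HTMLParser


class DisclosureScan(HTMLParser):
    """Gathers the page-level meta value and all element-level values."""

    def __init__(self):
        super().__init__()
        self.page_level = None
        self.element_levels = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "meta" and attr_map.get("name") == "ai-disclosure":
            self.page_level = attr_map.get("content")
        if "aidisclosure" in attr_map:
            self.element_levels.append((tag, attr_map["aidisclosure"]))


def scan(html):
    parser = DisclosureScan()
    parser.feed(html)
    return parser.page_level, parser.element_levels


def looks_consistent(page_level, element_levels):
    """Hypothetical lint: sections that disagree imply a 'mixed' page."""
    distinct = {level for _, level in element_levels}
    if len(distinct) > 1:
        return page_level == "mixed"
    return True


# Hypothetical newsroom page: human-written article, AI-generated sidebar.
news_page = """
<html>
  <head><meta name="ai-disclosure" content="mixed"></head>
  <body>
    <article aidisclosure="none"><h1>Investigation</h1><p>Body text.</p></article>
    <aside aidisclosure="ai-generated"><p>Summary of the article.</p></aside>
  </body>
</html>
"""
page_level, element_levels = scan(news_page)
print(page_level, element_levels)
print(looks_consistent(page_level, element_levels))  # True
```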

Scenario 2: Blog Post Written with AI Editing

A blogger writes a post and uses an LLM for grammar, style, and clarity
improvements:

Scenario 3: Fully Automated Content Feed

An automated system generates weather reports without per-instance human
oversight:

Scenario 4: Human-Only Publication Asserting Provenance

A literary journal positively asserts that no AI was used:

Note: aidisclosure="none" is a positive assertion. The absence of the
attribute means "unknown," not "none."
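The difference between an absent attribute and an explicit none is easy to lose in consumer code. The Python sketch below keeps the two apart for a single element. It deliberately ignores any inheritance from ancestors, and `declared_level` and the `"unknown"` label are names chosen for this sketch only; authors never write `unknown` in markup.

```python
def declared_level(attributes):
    """Map one element's attribute dict to a consumer-side label.

    Assumption: inheritance from ancestor elements is out of scope here.
    """
    if "aidisclosure" not in attributes:
        return "unknown"  # nothing asserted either way
    return attributes["aidisclosure"]  # includes the positive assertion "none"


print(declared_level({"class": "poem"}))         # unknown
print(declared_level({"aidisclosure": "none"}))  # none
```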

Detailed Design

Inheritance

Relationship to HTTP AI-Disclosure Header

These are complementary. A CDN or reverse proxy can set the HTTP header;
a CMS can set the meta tag; an author or AI tool can set element-level
attributes. None supersedes the others.
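Since no layer overrides another, a consumer might simply record all three side by side instead of merging them. The Python sketch below does that. `DisclosureSignals` and `collect_signals` are hypothetical names, and the header value is kept as an opaque string because its exact syntax is defined by the IETF draft, not here.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class DisclosureSignals:
    """Holds each layer separately; no layer overrides another."""

    http_header: Optional[str] = None  # raw AI-Disclosure header, kept opaque
    meta_tag: Optional[str] = None     # page-level meta value
    elements: List[Tuple[str, str]] = field(default_factory=list)


def collect_signals(headers, meta_value, element_values):
    # HTTP header names are case-insensitive, so compare in lowercase.
    header = next(
        (v for k, v in headers.items() if k.lower() == "ai-disclosure"), None
    )
    return DisclosureSignals(header, meta_value, list(element_values))


signals = collect_signals(
    {"Content-Type": "text/html", "AI-Disclosure": "example-value"},
    "mixed",
    [("aside", "ai-generated")],
)
print(signals)
```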

Vocabulary Alignment with IPTC

What Counts as AI?

To address a frequently raised concern ("where do you draw the line with
grammar checkers?"), here is boundary guidance:

Not AI (no disclosure needed):

ai-assisted:

ai-generated:

autonomous:

The boundary is generative/inferential AI — systems trained on data
that produce novel outputs. Deterministic tools that apply fixed rules are
not covered.

Alternatives Considered

1. Page-Level Meta Tag Only (WHATWG #9479)

Does not handle mixed-content pages — the single most requested feature in
that issue's 42+ comments.

2. HTTP Header Only (IETF AI-Disclosure)

No element-level granularity. Not accessible to client-side tools processing
the DOM. Cannot distinguish mixed content within a page.

3. C2PA Manifest for HTML

C2PA is file-based cryptographic provenance. HTML pages are dynamically
assembled from templates, databases, and user input — they are not single
files with stable hashes. C2PA and this proposal are complementary, not
competing.

4. data-* Attributes Only

Using data-ai-disclosure instead of a dedicated attribute was considered.
Trade-off: data-* attributes have no semantic meaning to browsers or
assistive technology. A dedicated attribute signals intent for future browser
integration (e.g., address bar indicators, accessibility announcements) and
is consistent with how other proposals (like containertiming) have
proceeded.

5. RDFa / Microdata Only

Too verbose for common cases. RDFa requires namespace declarations and
multi-attribute markup for simple assertions. The aidisclosure attribute
provides a lightweight default; RDFa or JSON-LD can supplement it for richer
structured data needs.

Privacy and Security Considerations

Responses to the W3C Security and Privacy Self-Review Questionnaire:

Accessibility Considerations

Internationalization Considerations

FAQ / Common Objections

"This is the evil bit — bad actors won't comply."

This standard serves responsible publishers, regulated industries, and AI
tool vendors who want to be transparent. It is not a detection mechanism.
The EU AI Act makes compliance mandatory for covered entities, and major
platforms already require disclosure in their terms of service.

The analogy is rel=nofollow: voluntary, widely adopted because it aligns
incentives, and useful despite being ignorable by bad actors.

"Where do you draw the line with grammar checkers?"

See What Counts as AI? above. The boundary is
generative/inferential AI — systems trained on data that produce novel
outputs. Deterministic spell-check and thesaurus tools are excluded.

"Metadata will be gamed like SEO dates."

True for any self-declared metadata. The standard enables honest disclosure;
verification requires pairing with C2PA or regulatory auditing. The value is
in the signal for those who choose to use it honestly — same as Schema.org
structured data, which search engines use despite its spoofability.

"This will stigmatize AI-assisted content."

The granular levels (four values, not binary) allow publishers to distinguish
"AI helped me edit" from "AI wrote everything." The ai-assisted level
should carry no more stigma than acknowledging the use of a human copy
editor.

"Everything will be AI-touched soon, making this meaningless."

That is exactly why granularity matters. Binary "AI/not-AI" is already
inadequate. The spectrum from none to autonomous reflects the reality of
modern content workflows and remains meaningful as AI tools become ubiquitous.

Stakeholder Feedback

References
