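An HTML Standard for AI Usage Disclosure
A proposal for a new HTML attribute and meta tag that enable element-level disclosure of AI involvement in web content, addressing the lack of fine-grained transparency.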
Explainer: AI Content Disclosure for HTML — element-level markup for AI authorship transparency
dweekly/ai-content-disclosure
AI Content Disclosure for HTML
Authors
Participate
Introduction
Web pages increasingly contain text produced with varying degrees of AI
involvement — from light AI-assisted editing to fully autonomous generation.
There is currently no standard HTML mechanism for authors to disclose AI
involvement at element-level granularity within a page.
This explainer proposes an aidisclosure HTML attribute and a companion
<meta name="ai-disclosure"> tag, enabling authors to declare the degree
of AI involvement in any section of a web page.
Problem / Motivation
A modern news article page might contain a human-written investigation
alongside an AI-generated summary sidebar and AI-moderated user comments.
Today, there is no standard way to label these sections differently.
Existing approaches operate at coarser granularity:
- WHATWG HTML #9479 proposes a page-level <meta> tag with four values. It
  does not support marking individual elements. Commenters on that issue (42+)
  identified element-level granularity as the critical missing capability.
- IETF draft-abaris-aicdh-00 defines an AI-Disclosure HTTP response header.
  It applies to entire HTTP responses and cannot distinguish mixed content
  within a page.
- C2PA 2.2 provides cryptographic provenance for media files (images, video,
  audio). It does not support HTML text content and is designed for
  file-level, not element-level, assertions.
Regulatory Context
The EU AI Act Article 50
(effective August 2026) requires that AI-generated text content be "marked in
a machine-readable format and detectable as artificially generated or
manipulated." Major platforms (YouTube, Meta, TikTok) already require AI
disclosure in their policies. A standard mechanism would serve both regulatory
compliance and voluntary transparency.
Goals
Non-Goals
Proposed Solution
Page-Level Declaration (Meta Tag)
For pages with uniform AI involvement:
The value mixed signals that different sections have different levels;
inspect element-level attributes for detail.
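As a rough illustration of how a consumer might read the page-level declaration, the sketch below uses Python's standard `html.parser`. It assumes the level is carried in the meta tag's `content` attribute, as with other `<meta name=...>` tags; the helper names `page_disclosure` and `needs_element_inspection` are hypothetical and not part of the proposal.

```python
from html.parser import HTMLParser


class MetaDisclosureParser(HTMLParser):
    """Records the content of the first <meta name="ai-disclosure"> tag seen."""

    def __init__(self):
        super().__init__()
        self.page_level = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.page_level is not None:
            return
        attr_map = dict(attrs)
        # Assumption: the level is stored in the `content` attribute.
        if (attr_map.get("name") or "").lower() == "ai-disclosure":
            self.page_level = attr_map.get("content")


def page_disclosure(html):
    """Return the page-level value, or None if the page declares nothing."""
    parser = MetaDisclosureParser()
    parser.feed(html)
    return parser.page_level


def needs_element_inspection(html):
    """True when the page says `mixed`, so element attributes carry the detail."""
    return page_disclosure(html) == "mixed"
```

A crawler could call `needs_element_inspection` first and only walk the DOM for per-element values when it returns true.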
Element-Level Declaration (HTML Attribute)
An aidisclosure global attribute on any HTML element:
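To make the element-level idea concrete, here is a minimal sketch of a tool that takes an inventory of the `aidisclosure` values declared in a document. The class and function names are hypothetical, and lowercasing the values is an assumption rather than something the proposal states here.

```python
from collections import Counter
from html.parser import HTMLParser


class DisclosureInventory(HTMLParser):
    """Counts aidisclosure values declared anywhere in a document."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            # html.parser lowercases attribute names; a bare attribute has value None.
            if name == "aidisclosure" and value:
                self.counts[value.strip().lower()] += 1


def disclosure_inventory(html):
    """Return a dict mapping each declared value to how often it appears."""
    parser = DisclosureInventory()
    parser.feed(html)
    return dict(parser.counts)
```

Because the attribute can appear on any element, the inventory does not care which tag carries it.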
Disclosure Values
Four values, aligned with the IETF AI-Disclosure header and IPTC Digital
Source Type vocabulary:
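One way a consumer might model the vocabulary is as a small enumeration. The four element-level values and the page-level `mixed` value are taken from this explainer; the function names and the case-insensitive, whitespace-tolerant matching are assumptions made for the sketch.

```python
from enum import Enum


class Disclosure(Enum):
    """The four element-level values named in this explainer."""
    NONE = "none"
    AI_ASSISTED = "ai-assisted"
    AI_GENERATED = "ai-generated"
    AUTONOMOUS = "autonomous"


def parse_disclosure(raw):
    """Return the matching Disclosure member, or None if unrecognized."""
    if raw is None:
        return None
    token = raw.strip().lower()  # assumption: matching ignores case and padding
    for member in Disclosure:
        if member.value == token:
            return member
    return None


def is_valid_meta_value(raw):
    """The meta tag accepts the four values plus the page-level `mixed`."""
    if raw is None:
        return False
    return raw.strip().lower() == "mixed" or parse_disclosure(raw) is not None
```

Returning `None` for unrecognized input keeps an unknown token distinct from an explicit `none`.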
Optional Metadata Attributes
Schema.org Integration (JSON-LD)
For search engine discoverability, the same information can be expressed as
structured data. The simplest form is just the level string:
An expanded form supports optional metadata. All fields except level are
strictly optional — publishers may have legitimate reasons not to disclose
specific tools or providers:
(Proposed as a comment on
schemaorg/schemaorg#3391.)
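The two forms described above could be generated along these lines. This is only a sketch: the property name `aiDisclosure` and the optional keys `tool` and `provider` are hypothetical placeholders, since the exact Schema.org names are not given in this text. Only the `level` field comes from the description.

```python
import json


def disclosure_jsonld(headline, level, **optional):
    """Build JSON-LD for an article, using the simple or expanded form.

    With no optional metadata the value is just the level string;
    otherwise it is an object whose only required field is `level`.
    """
    details = {key: value for key, value in optional.items() if value is not None}
    value = {"level": level, **details} if details else level
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "aiDisclosure": value,  # hypothetical property name
    }


simple = disclosure_jsonld("Storm update", "autonomous")
expanded = disclosure_jsonld("Budget analysis", "ai-assisted", tool="ExampleEditor")
print(json.dumps(expanded, indent=2))
```

Dropping keys whose value is `None` mirrors the point that publishers may omit any optional field.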
Key Scenarios
Scenario 1: News Article with AI Summary Sidebar
A newsroom publishes an investigative piece with a human-written article
and an AI-generated summary:
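A reader-side tool could separate the two parts of such a page as sketched below. The sample markup is hypothetical, not the explainer's own example. The sketch also assumes that text takes the value of its nearest ancestor carrying `aidisclosure`, and that undeclared text is treated as "unknown", in line with the note under Scenario 4.

```python
from html.parser import HTMLParser

VOID_TAGS = {"br", "hr", "img", "input", "link", "meta", "source", "wbr"}


class TextByDisclosure(HTMLParser):
    """Groups text by the nearest ancestor's aidisclosure value (assumed rule)."""

    def __init__(self):
        super().__init__()
        self.stack = []  # (tag, effective level) for each open element
        self.text = {}   # level -> list of text chunks

    def _current(self):
        return self.stack[-1][1] if self.stack else "unknown"

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return  # void elements never contain text
        declared = dict(attrs).get("aidisclosure")
        self.stack.append((tag, declared or self._current()))

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed children.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        chunk = data.strip()
        if chunk:
            self.text.setdefault(self._current(), []).append(chunk)


def text_by_disclosure(html):
    parser = TextByDisclosure()
    parser.feed(html)
    parser.close()
    return parser.text


page = """
<article aidisclosure="none">
  <h1>Investigation</h1>
  <p>Reported by staff.</p>
</article>
<aside aidisclosure="ai-generated"><p>Key points.</p></aside>
<footer>Contact us</footer>
"""
groups = text_by_disclosure(page)
```

Here `groups` keeps the human-written article, the AI-generated sidebar, and the undeclared footer in separate buckets.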
Scenario 2: Blog Post Written with AI Editing
A blogger writes a post and uses an LLM for grammar, style, and clarity
improvements:
Scenario 3: Fully Automated Content Feed
An automated system generates weather reports without per-instance human
oversight:
Scenario 4: Human-Only Publication Asserting Provenance
A literary journal positively asserts that no AI was used:
Note: aidisclosure="none" is a positive assertion. The absence of the
attribute means "unknown," not "none."
Detailed Design
Inheritance
Relationship to HTTP AI-Disclosure Header
These are complementary. A CDN or reverse proxy can set the HTTP header;
a CMS can set the meta tag; an author or AI tool can set element-level
attributes. None supersedes the others.
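Since none of the three layers overrides the others, a consumer might simply record them side by side instead of collapsing them into one value. The sketch below assumes the header value is treated as an opaque string; `DisclosureSignals` and `gather_signals` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DisclosureSignals:
    """The three complementary signals, kept separate on purpose."""
    http_header: Optional[str] = None  # e.g. set by a CDN or reverse proxy
    meta_tag: Optional[str] = None     # e.g. set by a CMS
    elements: List[str] = field(default_factory=list)  # set by authors or tools


def gather_signals(response_headers, meta_content, element_values):
    """Collect all three layers without letting any one replace another."""
    header = None
    for name, value in response_headers.items():
        if name.lower() == "ai-disclosure":  # HTTP header names are case-insensitive
            header = value
    return DisclosureSignals(header, meta_content, list(element_values))
```

Any policy for reconciling conflicting layers would sit on top of this record and is outside what the text above specifies.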
Vocabulary Alignment with IPTC
What Counts as AI?
To address a frequently raised concern ("where do you draw the line with
grammar checkers?"), here is boundary guidance:
Not AI (no disclosure needed):
ai-assisted:
ai-generated:
autonomous:
The boundary is generative/inferential AI — systems trained on data
that produce novel outputs. Deterministic tools that apply fixed rules are
not covered.
Alternatives Considered
1. Page-Level Meta Tag Only (WHATWG #9479)
Does not handle mixed-content pages — the single most requested feature in
that issue's 42+ comments.
2. HTTP Header Only (IETF AI-Disclosure)
No element-level granularity. Not accessible to client-side tools processing
the DOM. Cannot distinguish mixed content within a page.
3. C2PA Manifest for HTML
C2PA is file-based cryptographic provenance. HTML pages are dynamically
assembled from templates, databases, and user input — they are not single
files with stable hashes. C2PA and this proposal are complementary, not
competing.
4. data-* Attributes Only
Using data-ai-disclosure instead of a dedicated attribute was considered.
Trade-off: data-* attributes have no semantic meaning to browsers or
assistive technology. A dedicated attribute signals intent for future browser
integration (e.g., address bar indicators, accessibility announcements) and
is consistent with how other proposals (like containertiming) have
proceeded.
5. RDFa / Microdata Only
Too verbose for common cases. RDFa requires namespace declarations and
multi-attribute markup for simple assertions. The aidisclosure attribute
provides a lightweight default; RDFa or JSON-LD can supplement it for richer
structured data needs.
Privacy and Security Considerations
Responses to the W3C Security and Privacy Self-Review Questionnaire:
Accessibility Considerations
Internationalization Considerations
FAQ / Common Objections
"This is the evil bit — bad actors won't comply."
This standard serves responsible publishers, regulated industries, and AI
tool vendors who want to be transparent. It is not a detection mechanism.
The EU AI Act makes compliance mandatory for covered entities, and major
platforms already require disclosure in their terms of service.
The analogy is rel=nofollow: voluntary, widely adopted because it aligns
incentives, and useful despite being ignorable by bad actors.
"Where do you draw the line with grammar checkers?"
See What Counts as AI? above. The boundary is
generative/inferential AI — systems trained on data that produce novel
outputs. Deterministic spell-check and thesaurus tools are excluded.
"Metadata will be gamed like SEO dates."
True for any self-declared metadata. The standard enables honest disclosure;
verification requires pairing with C2PA or regulatory auditing. The value is
in the signal for those who choose to use it honestly — same as Schema.org
structured data, which search engines use despite its spoofability.
"This will stigmatize AI-assisted content."
The granular levels (four values, not binary) allow publishers to distinguish
"AI helped me edit" from "AI wrote everything." The ai-assisted level
should carry no more stigma than acknowledging the use of a human copy
editor.
"Everything will be AI-touched soon, making this meaningless."
That is exactly why granularity matters. Binary "AI/not-AI" is already
inadequate. The spectrum from none to autonomous reflects the reality of
modern content workflows and remains meaningful as AI tools become ubiquitous.
Stakeholder Feedback
References