A2UI: Google's Declarative User Interface Protocol for AI Agents


Google has introduced the A2UI protocol, an open-source specification that standardizes how AI agents communicate their intent to generate user interfaces. Its declarative JSON format lets agents describe UI components and data while the client application handles native rendering, addressing the limitations of pure chat interfaces.

The A2UI Protocol: A 2026 Complete Guide to Agent-Driven Interfaces



What is A2UI and Why Does it Matter?

Defining the Agent-to-User Interface (A2UI)

The A2UI (Agent-to-User Interface) Protocol is an open-source specification introduced by Google to standardize how AI agents communicate their intent to generate a user interface [1]. It is designed to be a "universal UI language" that agents can "speak" to any client application [3].

Unlike traditional methods where an agent might output raw HTML or rely on a pre-built, static UI, A2UI uses a declarative JSON format to describe the required UI components (e.g., a Card, a TextField, a Button) and the data to populate them [4] [5]. The client application, which hosts the agent, then uses its own native widget set (e.g., React components, Flutter widgets) to render this JSON description [1].

This approach is a critical architectural shift, moving the responsibility of what to display to the agent, while keeping the control over how it is displayed firmly with the client application.

Why Pure Chat Fails: The "Chat Wall" Problem

Agentic systems often excel at generating text, but they quickly hit a limitation known as the "Chat Wall" when the task requires structured input or complex output [2].

For example, an agent tasked with booking a restaurant reservation would typically result in a slow, error-prone text exchange: "What day?" -> "What time?" -> "No availability, try another time?" [1]. This back-and-forth is inefficient and frustrating for the user.

A2UI solves this by allowing the agent to dynamically generate an Action Surface—a simple form with a date picker, time selector, and a submit button—at the exact moment it is needed [2]. This transformation from a purely conversational surface to a structured interaction surface is what defines the next generation of agent-driven applications.
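As a sketch of what such an Action Surface description might look like — the field and component names below are hypothetical, not the official A2UI schema — the agent emits data describing the surface, never code:

```typescript
// Hypothetical A2UI-style description of the booking form. The shape
// is illustrative only: an abstract component name plus data, which
// the client maps onto its own native widget set.
type UISpec = {
  component: string;               // must exist in the client's catalog
  props?: Record<string, unknown>;
  children?: UISpec[];
};

const bookingSurface: UISpec = {
  component: "Card",
  children: [
    { component: "DatePicker", props: { id: "date", label: "Day" } },
    { component: "TimePicker", props: { id: "time", label: "Time" } },
    { component: "Button", props: { label: "Book table", action: "submit" } },
  ],
};

// The client walks this tree and draws each node with native widgets,
// so the form matches the host application's look and feel.
```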

The Core Philosophy: Security, Decoupling, and Portability

The design of A2UI is rooted in three core principles that address the challenges of building multi-agent systems that operate across different platforms and trust boundaries [5].

How A2UI Achieves Security (Data vs. Code)

The most significant innovation of A2UI is its security-first approach, which is essential in a "multi-agent mesh" where agents from different, potentially untrusted, organizations collaborate [1].

Historically, allowing a remote source to render UI meant sending executable code (HTML/JavaScript) and sandboxing it in an iframe, which is visually disjointed and introduces security complexity [1]. A2UI avoids this by being a declarative data format, not executable code [5].

The client application maintains a "catalog" of trusted, pre-approved UI components (e.g., Card, Button, TextField). The agent can only request to render components from this catalog. This mechanism helps to reduce the risk of UI injection and other vulnerabilities, making the agent's output "safe like data, but expressive like code" [1] [4].
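A minimal sketch of the catalog mechanism, assuming an invented registration API (the real client-side API will differ):

```typescript
// Sketch of a client-side trusted catalog. Names are invented for
// illustration: the agent may only reference components registered
// here, and anything else is rejected before it can reach the screen.
type Props = Record<string, unknown>;

const trustedCatalog = new Map<string, (props: Props) => string>([
  ["Card",      (_p) => "[card]"],
  ["TextField", (p) => `[input:${String(p.label)}]`],
  ["Button",    (p) => `[button:${String(p.label)}]`],
]);

function renderComponent(name: string, props: Props = {}): string {
  const renderer = trustedCatalog.get(name);
  if (!renderer) {
    // An unknown name is plain data the client declines to draw --
    // there is no path for the agent to smuggle in executable code.
    throw new Error(`"${name}" is not in the trusted catalog`);
  }
  return renderer(props);
}
```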

The A2UI Interaction Loop: Emit, Render, Signal, Reason

A2UI introduces a dynamic, living interaction loop that moves away from the static request-response model of traditional chat [2]. This loop defines the contract between the agent and the client:

📊 Implementation Flow
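The loop's four steps — emit, render, signal, reason — can be sketched end to end; the message shapes and function names below are invented for illustration:

```typescript
// Sketch of the emit -> render -> signal -> reason loop. Shapes are
// illustrative, not the A2UI spec.
type AgentMessage = { kind: "ui"; spec: object } | { kind: "text"; text: string };
type UserSignal = { action: string; values: Record<string, string> };

function agentEmit(): AgentMessage {
  // 1. Emit: the agent describes the surface it needs, as data.
  return { kind: "ui", spec: { component: "Card" } };
}

function clientRender(msg: AgentMessage): string {
  // 2. Render: the client maps the description to native widgets.
  return msg.kind === "ui" ? "rendered form" : msg.text;
}

function userSignal(): UserSignal {
  // 3. Signal: the user's interaction comes back as structured data,
  //    not free text the agent has to re-parse.
  return { action: "submit", values: { date: "2026-03-01", time: "19:00" } };
}

function agentReason(signal: UserSignal): string {
  // 4. Reason: the agent continues with unambiguous inputs.
  return `Booking for ${signal.values.date} at ${signal.values.time}`;
}

const ui = clientRender(agentEmit());
const reply = agentReason(userSignal());
```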

A2UI vs. HTML/iFrames: A Trust Boundary Comparison

The fundamental difference between A2UI and older methods of remote UI rendering lies in the trust boundary and control [3]:

| | A2UI | HTML/iFrames |
| --- | --- | --- |
| Payload | Declarative JSON data | Executable code (HTML/JavaScript) |
| Rendering | Client's own native, trusted components | Sandboxed iframe, visually disjointed |
| Control | Client decides what may render via its catalog | Remote source controls the rendered code |

A2UI in the Agent Ecosystem

A2UI is not intended to replace existing agent frameworks but to serve as a specialized protocol for interoperable, generative UI responses [1]. It fits into a broader ecosystem of agent communication standards.

A2UI and AG-UI: Complementary Layers

The Agent-User Interaction Protocol (AG-UI), developed by CopilotKit, is often discussed alongside A2UI. They are complementary layers, not competing technologies [4].

In practice, an agent might use A2UI to describe the UI intent, and AG-UI to stream that intent to the client and receive the user's structured response back [4].
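As a rough sketch of that division of labor — both shapes below are invented, so consult the AG-UI specification for its real event types:

```typescript
// Illustrative only: an A2UI-style payload (the *what*) wrapped in an
// AG-UI-style streaming event (the *how*). Neither object is the real
// schema of either protocol.
const a2uiPayload = {
  component: "Card",
  children: [{ component: "Button", props: { label: "Confirm" } }],
};

const streamEvent = {
  type: "agent_ui_message",   // invented event name
  payload: a2uiPayload,       // A2UI describes the interface intent
};

// The client's AG-UI handler would route `payload` to its A2UI
// renderer, and later stream the user's structured response back.
```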

Real-World Experience: Building Custom Renderers

Developers are already adopting A2UI's core pattern. Even before official renderers exist for every framework, the concept of a declarative, agent-generated UI is proving valuable.

💡 Pro Tip

A2UI's Extensibility: The protocol's design allows for custom components. If a client needs a specialized widget (e.g., a custom chart or a Google Maps component), the agent can request it, provided the client has registered that component in its trusted catalog [1]. This is crucial for enterprise applications.

One developer, for instance, implemented a custom React renderer based on the A2UI pattern to fix their AI Shopping Agent, LogiCart [6]. The agent dynamically switches the UI based on user intent: a "Product Search" intent renders a Comparison Table, while a "DIY Project" intent renders an interactive Timeline/Checklist [6]. This real-world application demonstrates the power of the agent deciding the interface structure on the fly.
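The intent-switching pattern described above can be sketched as a simple dispatch; the intent labels and component names here are invented for illustration, not taken from LogiCart's code:

```typescript
// Sketch of intent-driven surface selection, as in the LogiCart
// anecdote: the agent picks a UI structure based on what the user
// is trying to do, rather than always replying in text.
type Intent = "product_search" | "diy_project";

function surfaceFor(intent: Intent): { component: string } {
  switch (intent) {
    case "product_search":
      // Comparing candidate products calls for a side-by-side table.
      return { component: "ComparisonTable" };
    case "diy_project":
      // A multi-step project calls for an interactive checklist.
      return { component: "TimelineChecklist" };
    default:
      throw new Error("unknown intent");
  }
}
```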

⚠️ Note

UX Consistency Challenge: A key concern raised by the developer community is maintaining a consistent user experience (UX) when the UI is dynamically generated [4]. If the interface changes drastically with every interaction, users may struggle to learn the application, potentially hindering the development of "expert" users [4]. Developers must ensure their client-side renderers enforce strong design system rules to maintain brand and functional consistency.

🤔 Common Questions (FAQ)

Q: Is A2UI Just Reinventing HTML?

A: No. While A2UI aims to achieve cross-platform UI, its core goal is to solve the problems of trust boundaries and LLM generation [3]. HTML pages typically ship with executable code (JavaScript), which poses significant security risks in multi-agent environments. A2UI is a declarative data format (JSON) that only describes the intent of the UI, not the implementation. The client uses its native, trusted components to render it, ensuring security, performance, and native appearance [1] [3].

Q: Does A2UI Replace Existing UI Frameworks Like React or Flutter?

A: No. A2UI is a protocol, not a framework [5]. It is complementary to existing UI frameworks. React, Flutter, Angular, and other frameworks remain the foundation for building client applications and A2UI renderers [1] [5]. A2UI JSON messages are sent to renderers running within these frameworks, and the renderers are responsible for mapping abstract A2UI components to concrete, native or web UI components [5].

Conclusion and Action Advice

The A2UI Protocol represents a significant step forward in the evolution of agentic applications, providing the missing link between powerful LLM reasoning and structured, interactive user experiences. By prioritizing security, decoupling the agent's intent from the client's implementation, and embracing a streamable, LLM-friendly format, A2UI enables developers to move beyond the "Chat Wall" and build truly dynamic, cross-platform agent interfaces.

Action Advice:

References

