How do you secure AI coding agents?

Hacker News

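This post discusses the security risks of AI coding agents, such as prompt injection leading to arbitrary command execution, and explores potential solutions such as policy-as-code for enforcing security measures.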


These tools don’t just suggest code; they can read local files and run shell commands. That’s very powerful, but it also means a prompt injection (or poisoned context) can turn a “helpful assistant” into something that looks a lot like an attacker’s shell.

I noticed that Cursor has publicly patched prompt-injection issues, including ones that opened paths to arbitrary command execution. Security research is also increasingly focused on “zero-click” prompt injection against AI agents.

The architectural problem I keep running into is that most guardrails today are opt-in (“use my tools”) rather than enforced (“you can’t do this operation”). If the agent decides to use a native tool directly, policy checks often don’t exist or don’t fire. (There are also bugs across Claude, GitHub Copilot, and others that make enforcement a pain right now.)

So I’m experimenting with a small proof-of-concept around policy-as-code for agent actions that can, for example:

  • block reads of sensitive files (.env, ~/.ssh/*, tokens)

  • require approval before risky shell commands run

  • keep an audit log of what the agent attempted

  • where supported, enforce decisions before execution rather than relying on the model’s cooperation
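
To make the list above concrete, here is a minimal sketch of what such a policy check could look like. It is not the actual proof-of-concept: the names `PolicyEngine`, `SENSITIVE_PATTERNS`, and `RISKY_COMMANDS`, along with the specific patterns and commands, are hypothetical and chosen only for illustration. It also assumes the host tool routes every file read and shell command through these checks before execution, which is exactly the part that is hard to guarantee today.

```python
import fnmatch
import os
import shlex
from dataclasses import dataclass, field

# Hypothetical policy data: illustrative patterns and command names only.
# Note that "*token*" is deliberately broad and will also match harmless
# files such as "tokenizer.py".
SENSITIVE_PATTERNS = [".env", "*/.env", "~/.ssh/*", "*token*"]
RISKY_COMMANDS = {"rm", "curl", "wget", "sudo", "chmod"}


@dataclass
class PolicyEngine:
    # Every attempted action is recorded here, whatever the decision.
    audit_log: list = field(default_factory=list)

    def check_read(self, path: str) -> str:
        """Return "deny" if the path matches a sensitive pattern, else "allow"."""
        expanded = os.path.expanduser(path)
        decision = "allow"
        for pattern in SENSITIVE_PATTERNS:
            if fnmatch.fnmatch(expanded, os.path.expanduser(pattern)):
                decision = "deny"
                break
        self._record("read", path, decision)
        return decision

    def check_shell(self, command: str) -> str:
        """Return "ask" if the command needs human approval, else "allow".

        Only the first token is inspected, so wrappers like `sh -c "rm ..."`
        slip through; a real policy would need much deeper analysis.
        """
        try:
            tokens = shlex.split(command)
        except ValueError:
            tokens = None  # unparseable input (e.g. unbalanced quotes)
        if tokens is None:
            decision = "ask"
        elif tokens and os.path.basename(tokens[0]) in RISKY_COMMANDS:
            decision = "ask"
        else:
            decision = "allow"
        self._record("shell", command, decision)
        return decision

    def _record(self, action: str, target: str, decision: str) -> None:
        self.audit_log.append(
            {"action": action, "target": target, "decision": decision}
        )


engine = PolicyEngine()
print(engine.check_read("~/.ssh/id_rsa"))
print(engine.check_shell("rm -rf build"))
print(engine.audit_log)
```

The checks return a decision rather than performing the action, so the caller (a hook, wrapper, or sandbox layer) stays responsible for actually blocking or prompting. If the agent can reach a native tool that bypasses that caller, this code provides logging at best, not enforcement.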

I’d really value input from people using these tools in real teams:

Would you install something that blocks or asks approval before an agent reads secrets or runs risky commands?

Would your company pay for centrally managed policies and audit logs?

What’s the least annoying UX that still counts as “real security”?

If you’ve seen real incidents, or if you think this whole thing is dumb, inevitable, or already solved by containers, I’d love your genuine take.
