Microsoft Copilot 副驾驶泄漏文件

Microsoft Copilot 副驾驶泄漏文件
Microsoft Copilot Cowork Exfiltrates Files

原始链接: https://www.promptarmor.com/resources/microsoft-copilot-cowork-exfiltrates-files

研究人员发现 Microsoft 365 的“Copilot Cowork”功能存在严重的安全性漏洞，攻击者可通过间接提示注入（indirect prompt injection）窃取敏感文件。该攻击利用了设计上的缺陷：智能体（agent）可在无需人工授权的情况下，向活跃用户发送电子邮件或 Teams 消息。攻击者只需将恶意指令嵌入“技能”文件（用户常从外部来源下载此类文件），即可诱导智能体生成针对用户私有 SharePoint 或 OneDrive 文件的预授权下载链接。当用户打开这些看似无害的消息时，链接会自动将文件外发至攻击者控制的服务器。由于该智能体在用户的 Microsoft Graph 权限下运行，此漏洞有效地绕过了传统的安全边界。测试表明，该攻击在 Claude Opus 4.7 等先进模型中具有极高的成功率。此外，“计划任务”功能允许恶意指令在无人监管的情况下反复执行，进一步加剧了风险。为降低风险，建议管理员强制执行严格的访问控制，并考虑在 SharePoint 中实施 `BlockDownloadPolicy`，以防止针对敏感数据生成可下载的文件链接。用户在导入来自不可信来源的“技能”或数据时，应保持高度警惕。

一份最新报告指出 Microsoft Copilot Cowork 存在安全漏洞，恶意的“技能”（即用户安装的小型指令集）可被用于窃取敏感数据。通过提示词注入，攻击者可以操纵该智能体发送包含预认证文件下载链接的 Microsoft Teams 消息。当用户查看该消息时，文件便会被访问并窃取。 Hacker News 上的讨论显示舆论存在分歧。许多评论者认为这是“代理型”AI 的根本性风险，并指出“技能”本质上是用户在不知情的情况下以高权限运行的不受信任脚本。另一些人则批评微软操之过急，未能执行足够的安全边界，且在执行敏感操作时绕过了人工审核。尽管有些人认为这种漏洞是可以预见的“无足轻重的小事”，将其比作安装恶意软件或不受信任的插件，但批评者强调，大型语言模型（LLM）缺乏数据与代码之间的明确区分，这使得提示词注入成为一个难以解决的固有难题。关注安全的用户普遍认为，在实现更好的沙箱隔离和管理监督之前，企业应限制 LLM 的访问权限，并在采用自主 AI 代理时保持极度谨慎。

原文

This attack achieved a high success rate against state-of-the-art models, including Claude Opus 4.7.

Microsoft Copilot Cowork exfiltrates financials and PII

Overview

Copilot Cowork is a Frontier feature available now in Microsoft 365. It operates with the users’ Microsoft permissions and can use Microsoft Graph to read and operate on data in one’s Microsoft tenant.

In this article, we demonstrate that through an indirect prompt injection in a poisoned skill, attackers can exfiltrate files from M365. This is done by exploiting the fact that, unlike other sensitive actions, sending emails and Teams messages to the active user does not require human approval, and opening the compromised messages in Teams or Outlook can trigger attacker-controlled network requests.

This risk reflects that giving agents access to multiple systems expands the prompt-injection attack surface. In isolation, the agent’s intended capabilities are benign; however, due to the properties of the integrated systems, users are at risk. This is reminiscent of our previous work on how URL previews in communications apps have become an egress surface for agents. As this risk pertains to the design of a system in which agents act with delegated authority across an entire enterprise ecosystem, rather than to a specific bug, we are publicizing this work to inform users of the risks they are accepting by using an agentic product of this nature.

Separate from this risk, we have disclosed a vulnerability to Microsoft that directly allows data egress from Copilot Cowork’s sandbox environment.

The Attack Chain

Microsoft’s documentation on action approvals states, “[Copilot] Cowork asks for your permission before taking sensitive actions, like sending an email or posting a message in Teams.” However, in practice, when the recipient is the active user, these actions execute immediately without requiring human approval (users do not have a setting to modify this behavior). Because these messages can contain external images that trigger network requests to external websites, data can be exfiltrated when a user opens a compromised message sent by the agent. Copilot Cowork can retrieve ‘pre-authenticated download links’ for files the user has access to, which allow anyone who opens the link to download that file. So, a manipulated agent can exfiltrate files by exfiltrating pre-authenticated download links.

The victim has access to files stored in SharePoint or OneDrive containing PII & Financial data

The victim uploads a skill file to Copilot Cowork that contains a prompt injection
For general use cases, this is quite common; a user finds a file online that they upload as a skill. This attack is not dependent on the injection source - other injection sources include, but are not limited to: web data from Claude for Chrome, connected MCP servers, etc.
Note: Admins have limited oversight of ‘Skills’, as Skills in Copilot Cowork are automatically loaded from a specific path in a user’s OneDrive.
The victim asks Microsoft Copilot Cowork to review what they worked on that week, triggering the skill
The injection manipulates Microsoft Copilot Cowork to post a Teams message that will exfiltrate pre-authenticated file download links when it is viewed
The injection tells Copilot Cowork that a service exists to create document previews for the recap message; to do this, the agent retrieves pre-authenticated file download links for each file and passes those URLs as query parameters to an attacker-controlled site via malicious HTML image tags.
At no point in this process is human approval required.
If we expand the ‘Task complete’ block, we can see the agent’s actions play out – but the malicious message content is never visible, even when the Teams action is clicked on.

When the user opens their Teams messages, the pre-authenticated download links are exfiltrated, and the attacker can download the files by visiting the link

Mitigating Risks for Your Organization

Microsoft Copilot Cowork has read access to essentially any resource a user does through Microsoft Graph. As such, the primary mechanism to reduce the blast radius of attacks like this is to restrict excessive permissioning across one’s Microsoft ecosystem.

To restrict users’ ability to retrieve pre-authenticated download links for files, administrators can restrict file downloads from SharePoint by running commands in the SharePoint Online Management Shell:

Set-SPOSite -Identity <SiteURL> -BlockDownloadPolicy $true

Or, to block based on a sensitivity label:

Set-Label -Identity <label> -AdvancedSettings @{BlockDownloadPolicy="true"}

Note: This configuration affects functionality; documentation states that for files under the policy 'BlockDownloadPolicy', "Users have browser-only access with no ability to download, print, or sync files. They also can't access content through apps, including the Microsoft 365 Apps (like Word, Excel, PowerPoint, and so on)."

Model Agnostic Exploitation

The attack chain was initially conducted with the model selection set to ‘auto’, which dynamically routes between Claude Opus 4.7 and Claude Sonnet 4.6. However, we validated explicitly that this attack succeeds with the exact same injection on the more advanced Opus 4.7 model by setting the model directly.

Copilot Cowork with Opus 4.7 exfiltrates more documents than 'Auto' mode

Opus 4.7 was more comprehensive in its search for recently edited documents; it expanded exfiltration to include every document used in previous Cowork Copilot sessions that week, as well as the files stored in more typical document locations that were found when the model was set to ‘Auto’.

Prompt Injection Efficacy

This prompt injection exhibited a very high efficacy, and we noted that Copilot Cowork completed the entire attack chain on every trial (5 for 5). Furthermore, the attack was not contingent on the specific wording of the user query – whenever the model invoked the skill, the injection succeeded.

The injection consisted of 5 lines in an 81-line skill file, all of comparable length to the other lines.

This demonstrates that even with the latest models and only a small excerpt of malicious text, an indirect prompt injection can hijack agent behavior.

As such, we urge readers to exercise caution when working with untrusted data, such as skills shared online – especially when the untrusted data is placed into a trusted context, such as a skill file.

Scheduled Tasks Exacerbate Risks

In Copilot Cowork, users can create scheduled tasks. A scheduled task is a prompt that executes on a recurring basis without user oversight. The 'weekly review' behavior described in this article is the exact kind of task a user would be likely to automate with a scheduled task.

Scheduled tasks increase risks by executing unattended and on a repeated basis.

Scheduled tasks increase the risk surface for attacks like this significantly, as the user is not present to stop malicious workflows, and the prompt injections can take effect on a recurring basis.