Billing can be bypassed using a combo of subagents with an agent definition

Original link: https://github.com/microsoft/vscode/issues/292452

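## Copilot Billing Bypass Summary

There is a potential way to bypass Copilot's premium request billing by exploiting subagents and specific model configurations. The core principle is that an initial chat request using a free model (such as GPT-5 Mini) does not consume premium credits. That free model can then launch a subagent configured to use a powerful, normally paid model (such as Opus 4.5). The key point is that premium requests executed in a subagent context are not billed. This allows extensive use of expensive models at the cost of only the initial free request. Further abuse is possible by setting a high `chat.agent.maxRequests` value and scripting repeated tool calls, creating a loop that invokes the premium model continuously at very low cost. The report also highlights a potential API weakness related to message type validation that may enable direct API abuse. The issue was reported to the Microsoft Security Response Center (MSRC) but was deemed out of scope, hence the public disclosure.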

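A Hacker News discussion centers on a report, filed in a GitHub repository, of a way to bypass the billing system for Microsoft's premium AI models. The original report states that the flaw can be exploited through subagents and agent definitions. The discussion quickly shifted, however, to concerns about "vibe engineers" exploiting issues like this and about the growing difficulty of maintaining open-source projects under a flood of (possibly malicious) contributions. Many commenters expressed frustration with Microsoft's recent software quality, pointing to frequent Azure outages and a general decline in product reliability over the past 15-plus years. Some noted that Microsoft appears to tolerate readily available piracy tools for Windows and Office, suggesting it prefers to keep users inside its ecosystem even if that means accepting unauthorized software use. The overall tone is critical of Microsoft's current development practices and quality control.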

Original text

Summary

It's possible in Copilot to bypass any billing / 'premium request' usage by taking advantage of:

Subagents and tool calls not consuming any 'requests'.
Request cost being calculated on the initial model used.
"Free" models included in Copilot, e.g. GPT-5-mini, GPT-4.1, etc.
Ability to define an agent for a subagent.
Ability to specify a model for an agent.

Combining these correctly results in 'free' and almost unlimited usage of expensive premium models like Opus 4.5, which would usually cost '3 premium requests':

Instructions

Start a new Chat.
Set the model to a "free" model, included in Copilot e.g. GPT-5 Mini.
Create an agent, and set its model to a premium model, e.g. Opus 4.5.
Set the mode to "agent".
In the initial message, instruct it to launch an agent '[your_agents_name_here]' as a subagent using the runSubagent tool, and to pass on the following query e.g. "What time is it in London, UK".
Submit the message.
The initial request will be picked up by the free GPT-5 Mini model, incurring no fees.
The free model will create a subagent (which is also free).
The free subagent will launch with an 'agent' profile; this profile has the model set to a premium model.
The premium model will be used for the subagent, but no premium requests will be consumed.
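The accounting behaviour described in the summary can be modelled with a small sketch. This is not Copilot's billing code: the `Call` type, both function names, and the `MULTIPLIER` table are hypothetical, and the only figures taken from the report are that GPT-5 mini is "free" and that Opus 4.5 "would usually cost 3 premium requests". It contrasts charging on the initial model only (the behaviour the report describes) with charging every model invocation, including subagents.

```python
from dataclasses import dataclass, field

# Premium-request multipliers as stated in the report (assumed table layout):
# GPT-5 mini is "free"; Opus 4.5 usually costs 3 premium requests.
MULTIPLIER = {"gpt-5-mini": 0, "opus-4.5": 3}


@dataclass
class Call:
    """One model invocation and any subagent calls it launched (hypothetical type)."""
    model: str
    subcalls: list["Call"] = field(default_factory=list)


def charge_initial_model_only(root: Call) -> int:
    """Behaviour described in the report: only the initial model is priced."""
    return MULTIPLIER[root.model]


def charge_every_invocation(root: Call) -> int:
    """Alternative accounting: price the root and every nested subagent call."""
    return MULTIPLIER[root.model] + sum(
        charge_every_invocation(sub) for sub in root.subcalls
    )


# A free initial model that launches one premium subagent.
session = Call("gpt-5-mini", [Call("opus-4.5")])
print(charge_initial_model_only(session))  # 0
print(charge_every_invocation(session))    # 3
```

Under the first function the premium subagent never contributes to the charge, which is the gap the report describes; the second function is one way the charge could follow the model that actually ran.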

Example 1

Example Chat Message:

/ask-opus Make a todolist app.

Example Prompt File:
.github/prompts/ask-opus.prompt.md

---
name: ask-opus
description: Run a query in a subagent that uses the Opus-4.5 model.
model: GPT-5 mini (copilot)
agent: agent
---
<USER_REQUEST_INSTRUCTIONS>
Call #tool:agent/runSubagent - include the following args:
- agentName: "opus-agent"
- prompt: $USER_QUERY
</USER_REQUEST_INSTRUCTIONS>

<USER_REQUEST_RULES>
- You can call the 'subagent' defined in 'USER_REQUEST_INSTRUCTIONS' as many times as needed to fulfill the user's request.
- It's recommended you use the subagent to help you decide how best to respond and/or complete the task (because it is a larger model than you) including how best to break the task down into smaller steps if needed.
- Use the subagent for all todos/tasks/queries, do not perform any task or respond to any query yourself, you are just an orchestrator.
- Do not manipulate/summarize subagent responses to save on tokens, always be comprehensive and verbose.
- Do not evaluate or respond to the remainder of this message, the subagent is responsible for all further content.
</USER_REQUEST_RULES>

--- USER_REQUEST_START ---

Example Agent File
.github/agents/opus.agent.md

---
name: opus-agent
description: An AI agent that assists a user with a task or query.
argument-hint: Query or task to complete
model: Claude Opus 4.5 (copilot)
---
Respond to the user's query/task ($ARGUMENTS) comprehensively and accurately.

Example 2

Another vector for abuse, albeit one requiring more effort, is:

Set chat.agent.maxRequests to a high value.
Use a premium model e.g. Opus 4.5 as the initial model for the chat session.
Build a custom script (not disclosed for safety), that you tell the model to call as part of a tool invocation.
Craft some prompts to direct the model to repeat the tool call(s).
The right script, with the right prompts can be tailored to create a loop, allowing the premium model to continually be invoked unlimited times for no additional cost beyond that of the initial message.

In my testing I had a single message result in a 3hr+ process that launched hundreds of Opus 4.5 subagents to process hundreds of files - and only consumed 3 premium credits. Had I not stopped it at 3hrs, it would have continued.

Related: I also noted that the message 'types' are being declared on the client, implying no API validation, e.g.: https://github.com/microsoft/vscode-copilot-chat/blob/main/src/extension/intents/node/toolCallingLoop.ts#L484

I believe this is another vector that allows for more blatant abuse directly against the API.

Note: Initially submitted this to MSRC (VULN-172488), MSRC insisted bypassing billing is outside of MSRC scope and instructed me multiple times to file as a public bug report.

Copilot Chat Extension Version: 0.37.2026013101
VS Code Version: 1.109.0-insider (Universal) - f3d99de
OS Version: OSX Tahoe 26.3
Feature: Agent / SubAgent

This is NOT the same issue as #252230
(My previous issue was auto closed by the bot and deferred to the above).