Toggle fast mode on with /fast when you need speed for interactive work like rapid iteration or live debugging, and toggle it off when cost matters more than latency.
Fast mode is not a different model. It uses the same Opus 4.6 with a different API configuration that prioritizes speed over cost efficiency. You get identical quality and capabilities, just faster responses.
What to know:
- Use /fast to toggle fast mode on in the Claude Code CLI. It is also available via /fast in the Claude Code VS Code extension.
- Fast mode pricing for Opus 4.6 starts at $30 input / $150 output per MTok. Fast mode is available at a 50% discount for all plans until 11:59pm PT on February 16.
- Available to all Claude Code users on subscription plans (Pro/Max/Team/Enterprise) and Claude Console.
- For Claude Code users on subscription plans (Pro/Max/Team/Enterprise), fast mode is available via extra usage only and not included in the subscription rate limits.
Toggle fast mode
Toggle fast mode in either of these ways:
- Type /fast and press Tab to toggle on or off
- Set "fastMode": true in your user settings file
When you enable fast mode:
- If you’re on a different model, Claude Code automatically switches to Opus 4.6
- You’ll see a confirmation message: “Fast mode ON”
- A small ↯ icon appears next to the prompt while fast mode is active
- Run /fast again at any time to check whether fast mode is on or off
When you turn fast mode off with /fast again, you remain on Opus 4.6. The model does not revert to your previous model. To switch to a different model, use /model.
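
The settings-file option described above can also be applied from a script. The following is a minimal sketch, not an official tool: the function name `set_fast_mode` is hypothetical, the location of the user settings file is not given here and must be supplied by the caller, and the sketch assumes that file is plain JSON. The only key it touches is the documented "fastMode" key.

```python
import json
from pathlib import Path


def set_fast_mode(settings_path: Path, enabled: bool = True) -> dict:
    """Set the "fastMode" key in a JSON settings file, keeping all other keys.

    Assumption: the settings file is a single JSON object. The path is
    supplied by the caller because this page does not state where it lives.
    """
    settings = {}
    if settings_path.exists():
        text = settings_path.read_text(encoding="utf-8")
        if text.strip():
            settings = json.loads(text)

    settings["fastMode"] = enabled

    # Create the parent directory if needed, then write the merged settings back.
    settings_path.parent.mkdir(parents=True, exist_ok=True)
    settings_path.write_text(json.dumps(settings, indent=2) + "\n", encoding="utf-8")
    return settings
```

Calling `set_fast_mode(Path("<your user settings file>"))` merges the key into whatever settings already exist instead of overwriting the file.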
Understand the cost tradeoff
Fast mode has higher per-token pricing than standard Opus 4.6:

| Mode | Input (per MTok) | Output (per MTok) |
|---|---|---|
| Fast mode on Opus 4.6 (<200K) | $30 | $150 |
| Fast mode on Opus 4.6 (>200K) | $60 | $225 |
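
To make the table concrete, here is a rough cost estimate in Python. The rates are copied from the table; everything else is an assumption of this sketch: the function name `fast_mode_cost` is hypothetical, the tier is assumed to be selected by the request's input token count, and a request of exactly 200K tokens is assumed to fall in the lower tier because the table does not specify the boundary. It ignores any temporary discount.

```python
from decimal import Decimal

# USD per million tokens (MTok), copied from the pricing table.
FAST_MODE_RATES = {
    "under_200k": {"input": Decimal("30"), "output": Decimal("150")},
    "over_200k": {"input": Decimal("60"), "output": Decimal("225")},
}
MTOK = Decimal(1_000_000)

# Assumption: the tier is chosen by input token count, with exactly
# 200,000 tokens treated as the lower tier.
LONG_CONTEXT_THRESHOLD = 200_000


def fast_mode_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the fast mode cost in USD for a single request."""
    tier = "over_200k" if input_tokens > LONG_CONTEXT_THRESHOLD else "under_200k"
    rates = FAST_MODE_RATES[tier]
    total = Decimal(input_tokens) * rates["input"] + Decimal(output_tokens) * rates["output"]
    return total / MTOK


print(fast_mode_cost(50_000, 4_000))    # 2.1
print(fast_mode_cost(250_000, 10_000))  # 17.25
```

`Decimal` is used instead of floats so that the dollar amounts stay exact.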
Decide when to use fast mode
Fast mode is best for interactive work where response latency matters more than cost:
- Rapid iteration on code changes
- Live debugging sessions
- Time-sensitive work with tight deadlines
Standard mode is better for:
- Long autonomous tasks where speed matters less
- Batch processing or CI/CD pipelines
- Cost-sensitive workloads
Fast mode vs effort level
Fast mode and effort level both affect response speed, but differently:

| Setting | Effect |
|---|---|
| Fast mode | Same model quality, lower latency, higher cost |
| Lower effort level | Less thinking time, faster responses, potentially lower quality on complex tasks |
Requirements
Fast mode requires all of the following:
- Not available on third-party cloud providers: fast mode is not available on Amazon Bedrock, Google Vertex AI, or Microsoft Azure Foundry. Fast mode is available through the Anthropic Console API and for Claude subscription plans using extra usage.
- Extra usage enabled: your account must have extra usage enabled, which allows billing beyond your plan’s included usage. For individual accounts, enable this in your Console billing settings. For Teams and Enterprise, an admin must enable extra usage for the organization.
Fast mode usage is billed directly to extra usage, even if you have remaining usage on your plan. This means fast mode tokens do not count against your plan’s included usage and are charged at the fast mode rate from the first token.
- Admin enablement for Teams and Enterprise: fast mode is disabled by default for Teams and Enterprise organizations. An admin must explicitly enable fast mode before users can access it.
If your admin has not enabled fast mode for your organization, the /fast command will show “Fast mode has been disabled by your organization.”
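
The requirements above amount to a short checklist, which can be sketched as a function that reports whatever is still missing. This is illustrative only: the `Account` type, its field names, the provider and plan identifiers, and `fast_mode_blockers` are all hypothetical and do not correspond to any real API or configuration format.

```python
from dataclasses import dataclass

# Hypothetical identifiers for this sketch; not real configuration values.
THIRD_PARTY_PROVIDERS = {"bedrock", "vertex", "azure-foundry"}
ADMIN_GATED_PLANS = {"team", "enterprise"}


@dataclass
class Account:
    provider: str                  # e.g. "anthropic" or "bedrock"
    plan: str                      # e.g. "pro", "max", "team", "enterprise"
    extra_usage_enabled: bool
    admin_enabled_fast_mode: bool = False


def fast_mode_blockers(account: Account) -> list[str]:
    """Return the unmet requirements; an empty list means all are satisfied."""
    blockers = []
    if account.provider in THIRD_PARTY_PROVIDERS:
        blockers.append("not available on third-party cloud providers")
    if not account.extra_usage_enabled:
        blockers.append("extra usage is not enabled")
    # Only Teams and Enterprise organizations need explicit admin enablement.
    if account.plan in ADMIN_GATED_PLANS and not account.admin_enabled_fast_mode:
        blockers.append("an admin has not enabled fast mode for the organization")
    return blockers
```

For example, `fast_mode_blockers(Account("anthropic", "team", True))` returns a single entry, because the organization's admin has not yet enabled fast mode.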
Enable fast mode for your organization
Admins can enable fast mode in:
Handle rate limits
Fast mode has separate rate limits from standard Opus 4.6. When you hit the fast mode rate limit or run out of extra usage credits:
- Fast mode automatically falls back to standard Opus 4.6
- The ↯ icon turns gray to indicate cooldown
- You continue working at standard speed and pricing
- When the cooldown expires, fast mode automatically re-enables
To turn fast mode off manually instead of waiting for the cooldown to expire, run /fast again.
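
The fallback and cooldown behavior described above can be pictured as a small state machine. The sketch below is a simplified model for illustration, not how Claude Code is implemented: the class, state, and method names are hypothetical, and the cooldown length is a caller-supplied parameter because no duration is stated here.

```python
from dataclasses import dataclass
from enum import Enum


class FastState(Enum):
    ACTIVE = "active"      # fast mode on, fast mode pricing
    COOLDOWN = "cooldown"  # fell back to standard Opus 4.6, icon shown gray


@dataclass
class FastModeSession:
    state: FastState = FastState.ACTIVE
    cooldown_until: float = 0.0

    def on_limit_hit(self, now: float, cooldown_seconds: float) -> None:
        """Rate limit reached or extra usage exhausted: fall back to standard."""
        if self.state is FastState.ACTIVE:
            self.state = FastState.COOLDOWN
            self.cooldown_until = now + cooldown_seconds

    def tick(self, now: float) -> None:
        """Re-enable fast mode automatically once the cooldown has expired."""
        if self.state is FastState.COOLDOWN and now >= self.cooldown_until:
            self.state = FastState.ACTIVE

    def pricing(self) -> str:
        """Speed and pricing currently in effect."""
        return "fast" if self.state is FastState.ACTIVE else "standard"
```

In this model, a session that hits the limit at time 100 with a 60-second cooldown reports `"standard"` pricing until `tick` is called with a time of 160 or later, after which it reports `"fast"` again.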
Research preview
Fast mode is a research preview feature. This means:
- The feature may change based on feedback
- Availability and pricing are subject to change
- The underlying API configuration may evolve