AI Catchup

Claude Opus 4.8 Fast Mode: 2.5x Faster Output Tokens in Research Preview

Anthropic launched Fast mode for Claude Opus 4.8 in research preview. The pitch on the official page is "the same Opus-level model intelligence with significantly faster speeds," quantified for this generation as "2.5x faster output token speeds" compared to standard Opus 4.8. It is available now in Claude Code for developers with extra usage enabled, and on the Claude Platform API through an account manager or a waitlist.

This extends the Fast mode story we covered when Claude Code 2.1.142 moved Fast mode's default to Opus 4.7; Opus 4.8 is now the model the feature targets. For context on the model generation, see the Opus 4.7 launch. For how Claude Code compares with the alternative most readers weigh it against, see our Cursor vs Claude Code comparison.

Key Takeaways

  • 2.5x faster output tokens. Anthropic's Fast mode page cites 2.5x faster output token speeds for Opus 4.8 versus standard Opus 4.8.
  • Same intelligence, more speed. Fast mode is positioned as "the same Opus-level model intelligence with significantly faster speeds," not a smaller or distilled model.
  • Research preview, not GA. Anthropic describes Fast mode as a research preview with restricted availability.
  • Claude Code: enable with extra usage. It is available now for all developers who have extra usage enabled; toggle it in Claude Code with /fast.
  • API: account manager or waitlist. Claude Platform API access is a limited research preview; contact your account manager or join the waitlist form linked from the page.
  • No cost claims on the page. The Fast mode page lists no pricing or premium fee for Fast mode, so treat any cost difference as undocumented.

What Fast Mode Does for Opus 4.8

Fast mode is Anthropic's option to run Opus 4.8 with substantially faster output without dropping to a smaller model. The Fast mode page describes it as delivering "the same Opus-level model intelligence with significantly faster speeds," and for Opus 4.8 it puts a number on the speed: "2.5x faster output token speeds" compared to standard Opus 4.8. The headline benefit is latency on long generations -- the part of an agent loop where you wait on tokens streaming back.

That framing matters because speed gains often come from switching to a smaller, weaker model. Anthropic's claim here is the opposite: the intelligence stays at Opus level, and the lever is output speed. For interactive coding, faster output tokens translate directly into shorter waits between a prompt and a usable diff.
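To make the 2.5x figure concrete, here is a minimal Python sketch of the arithmetic. The baseline rate of 40 output tokens per second and the 4,000-token response are hypothetical numbers chosen for illustration, not figures from Anthropic; only the 2.5x multiplier comes from the Fast mode page.

```python
def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Time to stream `output_tokens` at a steady output rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


# Hypothetical values for illustration only; not published by Anthropic.
BASELINE_TOKENS_PER_SECOND = 40.0
RESPONSE_TOKENS = 4000

# The multiplier cited on the Fast mode page for Opus 4.8.
FAST_MODE_SPEEDUP = 2.5

standard = generation_seconds(RESPONSE_TOKENS, BASELINE_TOKENS_PER_SECOND)
fast = generation_seconds(
    RESPONSE_TOKENS, BASELINE_TOKENS_PER_SECOND * FAST_MODE_SPEEDUP
)

print(f"standard: {standard:.0f}s, fast: {fast:.0f}s")
# prints "standard: 100s, fast: 40s"
```

Whatever the real baseline rate is, the ratio holds: a generation that streams for T seconds in standard mode would stream for T / 2.5 seconds if the cited speedup applies uniformly.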

How to Enable It in Claude Code

Per Anthropic, Fast mode is "available now in research preview for all developers with extra usage enabled" in Claude Code. There is no separate activation flow described beyond having extra usage enabled on the account. In Claude Code you switch it on with the /fast command, which toggles the faster Opus variant for the session.

If you already use Claude Code's Fast mode, the change is which model it targets: the feature now centers on Opus 4.8 rather than the earlier Opus 4.7 default.

How to Get It on the API

API access is more gated. Anthropic describes the Claude Platform path as a "limited research preview." Access "requires contacting your account manager," and teams without an account manager can "join the waitlist for fast mode by completing this form" linked from the Fast mode page. In other words, the Claude Code path is self-serve for accounts with extra usage, while the raw API path is request-based.

Availability

The table below summarizes what Anthropic documents on the Fast mode page.

Surface              Status                            How to access
Claude Code          Research preview, available now   Extra usage enabled; toggle with /fast
Claude Platform API  Limited research preview          Contact account manager, or join the waitlist form

When to Use It

Reach for Fast mode when latency is the bottleneck and you are already on Opus 4.8: long code generations, agentic loops with many turns, or any workflow where you sit watching tokens stream. Because Anthropic positions it as the same intelligence at higher speed, there is no quality reason to avoid it for those cases. The gating factors are practical -- extra usage enabled in Claude Code, or an account-manager relationship or waitlist slot for the API. The page does not state a cost difference, so confirm billing details with Anthropic before standardizing a team on it.
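One way to judge whether latency really is the bottleneck is to estimate how much of each turn is spent waiting on output tokens. The Python sketch below applies an Amdahl's-law style estimate, under the assumption that Fast mode speeds up only the output-streaming share of a turn and leaves everything else (tool calls, test runs, network overhead) unchanged. The function name and the example fractions are hypothetical.

```python
def turn_speedup(output_fraction: float, output_speedup: float = 2.5) -> float:
    """Estimate the end-to-end speedup of a turn.

    output_fraction: share of the turn's wall-clock time spent streaming
        output tokens in standard mode (0.0 to 1.0).
    output_speedup: speedup applied to that share only (assumed 2.5x,
        the figure cited on the Fast mode page).
    """
    if not 0.0 <= output_fraction <= 1.0:
        raise ValueError("output_fraction must be between 0 and 1")
    if output_speedup <= 0:
        raise ValueError("output_speedup must be positive")
    remaining = (1.0 - output_fraction) + output_fraction / output_speedup
    return 1.0 / remaining


# Hypothetical workloads: generation-heavy versus tool-heavy turns.
for fraction in (0.8, 0.2):
    print(f"{fraction:.0%} streaming -> {turn_speedup(fraction):.2f}x overall")
```

Under these assumptions, a turn that is 80% token streaming gets close to 1.9x faster overall, while a turn dominated by tool execution gains far less, which is why the long-generation cases above are the natural fit.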

Frequently Asked Questions

What is Fast mode for Claude Opus 4.8?

Anthropic's Fast mode page describes Fast mode as delivering the same Opus-level model intelligence with significantly faster speeds. For Opus 4.8 specifically, it cites 2.5x faster output token speeds compared to standard Opus 4.8. It is in research preview.

How do you enable Fast mode in Claude Code?

Per Anthropic, Fast mode is available now in research preview for all developers with extra usage enabled in Claude Code. In Claude Code you toggle it with the /fast command. No special activation process beyond having extra usage enabled is mentioned.

How do you get Fast mode on the Claude Platform API?

Anthropic says API access is a limited research preview. Access requires contacting your account manager, and teams without an account manager can join the Fast mode waitlist by completing the form linked from the Fast mode page.

Does Fast mode change Opus 4.8's quality or cost?

Anthropic says Fast mode delivers the same Opus-level model intelligence with significantly faster speeds, so it is positioned as a speed change rather than a quality change. The Fast mode page makes no pricing or cost claims, so any cost difference is not documented there.
