05 Deep Dive March 2026
How AI Companies Are Charging You More Without You Even Realizing It
You pay for what you use. That's the deal. Except it's not.
When you use an AI model — GPT-4, Claude, Gemini — you do not pay per word. You pay per token. And that tiny technical detail can quietly cost you up to 60% more for the exact same request, depending on which company you choose.
60% Extra cost for non-English speakers
420× Price gap between cheapest & priciest model
0 Standardization across providers
What Is a Token, Really?
Before we get to the money, a crash course. Tokens are not words. They are subword units produced by a compression algorithm called BPE (Byte Pair Encoding) — originally a data-compression technique, repurposed for NLP in the 2010s. The algorithm learns frequent character sequences in a corpus and groups them into single vocabulary entries.
The catch: every AI company trains its own tokenizer on its own corpus with its own vocabulary size. The result is that the same word gets sliced differently depending on who's counting:
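The core BPE loop is simple enough to sketch in a few lines of standard-library Python: count the most frequent adjacent symbol pair in a toy corpus, merge it into one vocabulary entry, repeat. Real tokenizers like tiktoken and SentencePiece add byte-level handling and much larger corpora, but this toy version shows why the learned merges — and therefore the token count for any given word — depend entirely on the training corpus:

```python
from collections import Counter

def learn_merges(words, num_merges):
    """Learn BPE merges from a toy corpus.
    `words` maps a word (as a tuple of symbols) to its frequency."""
    vocab = dict(words)
    merges = []
    for _ in range(num_merges):
        # Count every adjacent symbol pair, weighted by word frequency.
        pairs = Counter()
        for word, freq in vocab.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)  # most frequent pair wins
        merges.append(best)
        # Replace every occurrence of the winning pair with one merged symbol.
        new_vocab = {}
        for word, freq in vocab.items():
            out, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                    out.append(word[i] + word[i + 1])
                    i += 2
                else:
                    out.append(word[i])
                    i += 1
            new_vocab[tuple(out)] = freq
        vocab = new_vocab
    return merges, vocab

# A corpus where "-able" words are frequent learns those merges early.
# A different corpus would learn different merges -- and therefore
# split the same word into a different number of tokens.
corpus = {tuple("unable"): 10, tuple("unbelievable"): 2, tuple("notable"): 5}
merges, vocab = learn_merges(corpus, 4)
print(merges)
```

Run on this corpus, the learner merges `a+b`, then `ab+l`, then `abl+e`, then `u+n` — so "unable" collapses to just two tokens, while rarer words stay fragmented.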
OpenAI · tiktoken
"unbelievable"
un believ able
Total tokens 3
Google · SentencePiece
"unbelievable"
▁un believable
Total tokens 2
Anthropic · Proprietary
"unbelievable"
un be liev able
Total tokens 4
Same word. Three different prices. The bill you receive depends not on what you said — but on which tokenizer counted it.
The Dirty Secret — Tokens Are Not Standardized
There is no ISO standard for AI tokens. No regulatory body. No published audit. Each major provider uses a different system:
| Provider | Tokenizer | Vocab size |
|---|---|---|
| OpenAI | tiktoken (cl100k_base / o200k_base) | ~100k |
| Google | SentencePiece (older) + custom (Gemini) | ~256k |
| Anthropic | Proprietary — barely documented | ~?? |
| Meta LLaMA | BPE | ~32k |
| Mistral | Custom BPE | ~32k |
Anthropic's tokenizer is particularly opaque. There is no public specification, no open-source release, and the documentation amounts to a single paragraph in their pricing FAQ. You are billed by a black box.
The Language Tax
The most damaging consequence of non-standardized tokenization is what we call the Language Tax. English — specifically American English — was the dominant language in most training corpora. As a result, English tokenizes efficiently. Every other language pays a premium.
| Language | Overhead vs English | Relative Cost |
|---|---|---|
| English | baseline | 1.0× |
| Spanish | +62% | 1.6× |
| French | +54% | 1.5× |
| German | +62% | 1.6× |
| Russian | +154% | 2.5× |
| Arabic | +208% | 3.1× |
| Hindi | +392% | 4.9× |
A Spanish speaker pays 60% more tokens for the same content. A Hindi speaker pays nearly 5× more. The pricing page lists the same dollar rate per million tokens — but the number of tokens you consume is quietly different depending on your language.
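The arithmetic behind this is straightforward to sketch. The overhead figures below are the approximate averages from the table, and the 1.3 tokens-per-English-word constant is a common rule of thumb, not a guarantee for any specific input:

```python
# Approximate token overhead vs English, from the table above.
OVERHEAD = {
    "english": 1.00,
    "spanish": 1.62,
    "french": 1.54,
    "german": 1.62,
    "russian": 2.54,
    "arabic": 3.08,
    "hindi": 4.92,
}

TOKENS_PER_ENGLISH_WORD = 1.3  # rough rule of thumb

def estimated_cost(words: int, language: str, usd_per_million_tokens: float) -> float:
    """Estimated dollar cost of `words` words of text in `language`."""
    tokens = words * TOKENS_PER_ENGLISH_WORD * OVERHEAD[language]
    return tokens * usd_per_million_tokens / 1_000_000

# The same 1,000-word document at $5/M input tokens:
for lang in ("english", "spanish", "hindi"):
    print(f"{lang}: ${estimated_cost(1000, lang, 5.0):.4f}")
```

Same document, same dollar rate per million tokens — and the Hindi version still costs nearly five times as much, purely because the tokenizer slices it finer.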
The Pricing War
On top of tokenization differences, the pricing gap between providers has exploded. As of March 2026:
| Provider / Model | Input $/M | Output $/M | Note |
|---|---|---|---|
| Google Gemini Flash-Lite | $0.10 | $0.40 | Cheapest viable |
| Google Gemini 2.5 Pro | $1.25 | $10 | Strong value |
| OpenAI GPT-4o | $3 | $10 | Mainstream |
| Anthropic Claude Opus 4.6 | $5 | $25 | Standard |
| Anthropic Claude Opus 4.6 (Fast) | $30 | $150 | Speed premium |
| OpenAI GPT-5.2 Pro (projected) | $21 | $168 | Most expensive |
💸 420× Price Gap
Between GPT-5.2 Pro output ($168/M) and Gemini Flash-Lite ($0.40/M), there is a 420× price difference — for models both marketed as "AI assistants." The gap is real, and growing.
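The 420× figure is just the ratio of the cheapest and priciest output rates in the table above — easy to verify:

```python
# Output prices in $/M tokens, from the table above (March 2026 rates).
prices_out = {
    "gemini_flash_lite": 0.40,
    "gemini_2_5_pro": 10.0,
    "gpt_4o": 10.0,
    "claude_opus_4_6": 25.0,
    "claude_opus_4_6_fast": 150.0,
    "gpt_5_2_pro": 168.0,
}
cheapest = min(prices_out.values())
priciest = max(prices_out.values())
print(f"Output price gap: {priciest / cheapest:.0f}x")  # prints "Output price gap: 420x"
```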
Same Prompt, Different Bill
Let's make this concrete. Take a real-world agent task: 100-word user message + 500-word system prompt + 200-word response. English vs Spanish, same content:
| | English | Spanish | Difference |
|---|---|---|---|
| User message (100w) | ~130 tok | ~210 tok | |
| System prompt (500w) | ~650 tok | ~1,050 tok | |
| Response (200w) | ~260 tok | ~404 tok | |
| **Total** | ~1,040 tok | ~1,664 tok | +60% |
At Claude Opus 4.6 rates ($5/M input, $25/M output), with the user message and system prompt billed as input and the response as output:
English: ~780 input tok + ~260 output tok → ~$0.0039 + ~$0.0065 = ~$0.0104 per call
Spanish: ~1,260 input tok + ~404 output tok → ~$0.0063 + ~$0.0101 = ~$0.0164 per call
That is roughly $0.006 extra per call — about $6,000 per month for a Spanish-language app making one million calls.
This is not a rounding error. At scale — millions of agent calls per month — the language tax becomes a serious cost factor, and most teams discover it only after they've already committed to a provider and a language.
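The per-call math scales linearly, so the monthly impact is a one-liner to check. This sketch uses the token counts from the example above and bills the user message plus system prompt as input and the response as output:

```python
PRICES = {"claude_opus_4_6": {"input": 5.0, "output": 25.0}}  # $/M tokens

def call_cost(input_tokens: int, output_tokens: int, price: dict) -> float:
    """Dollar cost of one API call at the given $/M-token rates."""
    return (input_tokens * price["input"] + output_tokens * price["output"]) / 1e6

# Token counts from the worked example: user msg + system prompt are input.
english = call_cost(130 + 650, 260, PRICES["claude_opus_4_6"])
spanish = call_cost(210 + 1050, 404, PRICES["claude_opus_4_6"])

calls_per_month = 1_000_000
print(f"English: ${english:.4f}/call  Spanish: ${spanish:.4f}/call")
print(f"Language tax at 1M calls/month: ${(spanish - english) * calls_per_month:,.0f}")
```

At a million calls a month, the gap between the two languages alone is on the order of thousands of dollars — invisible on the pricing page, very visible on the invoice.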
When Tokens Became a Fake Currency
This pattern has happened before. When cloud computing emerged in the 2000s, every major provider invented their own unit of compute: AWS had EC2 hours, Azure had Credits, Google had Compute Units. Each defined differently. Each deliberately opaque. Comparison required a spreadsheet — and that friction always benefited the seller.
AI has recreated the same opacity with tokens. A "token" from OpenAI is not the same as a "token" from Anthropic, which is not the same as a "token" from Google. They share a name and nothing else.
The uncomfortable truth: Tokens are a brilliant business model. Abstract enough that most users don't think deeply about them. Defined differently by every player. Non-comparable by design. And confusion, in markets with asymmetric information, always benefits the seller.
The Solution: TokensTree
We built TokensTree precisely because this problem is structural — it won't be fixed by any single provider, because it's in their interest to maintain the fog. The answer has to be infrastructural.
Two mechanisms address this directly:
SafePaths with Remote Cache: Verified command paths are stored once and reused across agents. The first agent that solves a problem pays the full token cost. Every subsequent agent retrieves the cached result for a fraction of the tokens. Like Bazel build caching for AI knowledge — repeated computations are cached, shared, and reused. Token consumption drops. Latency drops. The language of the requesting agent becomes irrelevant to the token cost of the stored answer.
Cross-provider token accounting: TokensTree normalizes token counts across providers, so you can see what a task actually costs — not what each provider's tokenizer claims it costs. One dashboard. Real comparisons. No fog.
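One way to make raw token counts comparable is to calibrate each provider's tokenizer against a fixed reference text, then express usage in provider-neutral units. The sketch below illustrates that idea only — the two `count_*` functions are hypothetical stand-ins (in practice they would call tiktoken, SentencePiece, and so on), and TokensTree's actual accounting is not published in this form:

```python
from typing import Callable

# Hypothetical per-provider token counters, for illustration only.
# Real counters would invoke each provider's actual tokenizer.
def count_provider_a(text: str) -> int:
    return max(1, len(text) // 4)  # ~4 chars per token

def count_provider_b(text: str) -> int:
    return max(1, len(text) // 5)  # ~5 chars per token

REFERENCE_TEXT = "The quick brown fox jumps over the lazy dog. " * 50

def normalization_factor(counter: Callable[[str], int]) -> float:
    """Tokens this provider charges per character of the reference text."""
    return counter(REFERENCE_TEXT) / len(REFERENCE_TEXT)

def normalized_tokens(raw_tokens: int, counter: Callable[[str], int]) -> float:
    """Convert a provider's raw token count into provider-neutral units
    (here: characters-worth of reference text)."""
    return raw_tokens / normalization_factor(counter)

# 1,000 raw tokens are NOT the same amount of text on both providers:
print(normalized_tokens(1000, count_provider_a))
print(normalized_tokens(1000, count_provider_b))
```

The design choice is the same one behind currency conversion: pick a common basket (the reference text), measure each vendor against it, and only then compare prices.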
Every 1B tokens saved = 1 tree planted. When token efficiency is the mission, not just a talking point, the incentives align differently. We save tokens because it matters — for cost, for access equity, and for the planet.
If the language tax is the toll you pay at every call, tokenstree.eu is the route optimizer that finds the cheapest crossing before your prompt even reaches the tokenizer. It intercepts requests automatically — translating them into the most BPE-efficient encoding, sending them to the model, then returning the response in your language. Your French stays French. Your Spanish stays Spanish. The token count drops in the middle. That is what fighting the fog looks like in practice.
TokensTree is building the infrastructure for a more efficient AI economy. Token pricing data reflects publicly available rates as of March 2026 and is subject to change. Language tax ratios are approximate averages across common use cases, not guarantees for specific inputs. tokenstree.com