中文 Literacy Speedrun II: Character Cyclotron

Original link: https://blog.kevinzwu.com/character-cyclotron/

## Literacy Speedrun II: Summary

This post details an extreme approach the author took to accelerate learning Chinese characters, rejecting the common advice to "learn to read by reading." The author argues that 90% vocabulary coverage is not enough for real comprehension, so the target is 99%, which demands heavy, focused memorization. Because looking up character information across multiple platforms (flashcards, a dictionary, an LLM) was inefficient, the author built a custom JavaScript extension driven by an LLM (Claude Code). This "feeder" integrates etymology, calligraphy videos, stroke order, and morphological analysis directly into the flashcard interface, accessed through an elaborate keyboard layer. The emphasis shifts from reading to *rapid information access*, cutting lookups from 30 seconds to under one second. The author describes being "firehosed with information," pushing the limits of cognitive processing in pursuit of near-total character mastery. The end goal is to bypass the natural learning curve and force the symbols directly into memory.



Part I: Cyborg Learning

A great lie of learning Chinese is that reading in a meaningful sense is possible before an advanced level of study. By December of 2025 I had reached a vocabulary of about 1000 characters. On a strict token decoding basis this meant I had about 90% token coverage of typical text. On a semantic understanding basis - well - what does it mean to miss 10% of tokens?
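For a rough sense of where that 90% figure comes from: coverage is just the share of corpus tokens whose characters you already know. A minimal sketch in Node.js, assuming a hypothetical tab-separated frequency file and a plain-text list of known characters; neither file comes from the post:

```js
// coverage.js - estimate token coverage of a known-character set.
// "char_freq.tsv" (character<TAB>corpus count per line) and "known.txt"
// (one known character per line) are hypothetical stand-ins, not data
// from the original post.
const fs = require("fs");

const known = new Set(
  fs.readFileSync("known.txt", "utf8")
    .split("\n")
    .map((s) => s.trim())
    .filter(Boolean)
);

let totalTokens = 0;
let coveredTokens = 0;

for (const line of fs.readFileSync("char_freq.tsv", "utf8").split("\n")) {
  if (!line.trim()) continue;
  const [ch, countStr] = line.split("\t");
  const count = Number(countStr);
  if (!Number.isFinite(count)) continue;
  totalTokens += count;
  if (known.has(ch)) coveredTokens += count;
}

console.log(`Coverage: ${((100 * coveredTokens) / totalTokens).toFixed(1)}%`);
```

Because character frequency is so top-heavy, a relatively small known set can cover most tokens while still leaving the long tail, which is exactly the trap described next.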

The main issue is that missing one out of every ten words doesn't mean a dropped word every now and then. Typically, the most important words are the ones that get missed, and key passages can concentrate new words above the base rate. The secondary issue is a feedback loop that never closes. Vocabulary acquisition is done via repetitive exposure, but low-frequency words need to be learned in advance to stick. Looking up new words inline for the first time breaks the flow of reading. Both issues can be resolved by sticking to "n+1" texts of graded difficulty, but I'd rather read nothing at all than educational pablum.

With this in mind, I decided to go against the grain of the near-universal advice to "learn to read by reading". The goal was 99% coverage, or die trying. I was going to stop reading, and shove the symbols into my head by force, like cornmeal funneled down a foie gras duck's throat to fatten its diseased liver.

First, I would need a better feeder.


[Image: the flashcard interface, before]

The flashcard site I was using was called Hack Chinese. The front side of each flashcard was the word, the back its pronunciation and definition. In a separate interface there was a dictionary listing character etymology and stroke order.

I would end up copy-pasting interesting words into the dictionary window to pull up the word entry. SLOW!

I would then click on the component characters to open their nested dictionary entries. SLOW!

If I needed to remember the stroke order, I would scroll down for the static display. SLOW!

When I was well and truly stuck, I would open an LLM chat window and paste the word into an extended "Explain This Word" chat. SLOW!

How to FAST?

I glanced at the huge expanse of empty white space in the back of the flashcard and had an idea: what if I never had to leave this window?

I opened Claude Code and started rambling into my mic. It wrote thousands of lines of questionably efficient JavaScript. I didn't read a single one.

I passed the beast whatever it needed. Character graphs from the Unicode Consortium were traversed into one hundred thousand lines of phonetic mapping data. If I still couldn't understand at the end of the day, a morphology breakdown would be one keystroke and API call away. A guy on a forum had hired a calligrapher to write three thousand characters in ballpoint pen; I bet that calligrapher never expected an LLM pipeline to eat the videos for regurgitation and redisplay.
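The post doesn't show how those character graphs were traversed, but the general shape, flattening decomposition data into a lookup table the flashcard panel can read, is roughly this. The input format and field names are invented for illustration, loosely modeled on Unihan-derived decomposition datasets rather than whatever the extension actually consumed:

```js
// build_phonetics.js - flatten character decomposition data into a
// phonetic-component lookup table. The input is hypothetical: one JSON
// object per line with { char, reading, components }, not the post's
// actual data source.
const fs = require("fs");

const entries = fs.readFileSync("decompositions.jsonl", "utf8")
  .split("\n")
  .filter(Boolean)
  .map((line) => JSON.parse(line));

// Index every character's reading so component readings can be looked up.
const readingByChar = new Map(entries.map((e) => [e.char, e.reading]));

// For each character, record its components and their readings so the UI
// can show "this component usually sounds like X" on the card back.
const phoneticMap = {};
for (const { char, components = [] } of entries) {
  phoneticMap[char] = components.map((c) => ({
    component: c,
    reading: readingByChar.get(c) ?? null,
  }));
}

fs.writeFileSync("phonetic_map.json", JSON.stringify(phoneticMap, null, 2));
console.log(`Wrote ${Object.keys(phoneticMap).length} entries.`);
```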

The extension injection was fragile, proprietary, and platform-specific. The beauty of the dawn of agentic coding was that I didn't care; it was so easy.
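For what "extension injection" amounts to in practice, here is a minimal userscript-style sketch: watch the page for the flashcard back and append a panel to it. The selectors and class names are placeholders, not taken from the actual site:

```js
// inject.js - append a custom info panel to the flashcard back whenever one
// appears. ".card-back" and ".cyclotron-panel" are placeholder selectors.
function injectPanel(cardBack) {
  if (cardBack.querySelector(".cyclotron-panel")) return; // already injected
  const panel = document.createElement("div");
  panel.className = "cyclotron-panel";
  panel.textContent = "etymology / calligraphy / tones render here";
  cardBack.appendChild(panel);
}

// Flashcard sites re-render the card on every flip, so watch the whole
// document for new card backs instead of injecting once at load.
const observer = new MutationObserver(() => {
  document.querySelectorAll(".card-back").forEach(injectPanel);
});
observer.observe(document.body, { childList: true, subtree: true });
```

Anything this shape breaks the moment the site changes its markup, which is the fragility mentioned above.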

The existence of information is useless without clean, rapid access to that information.

I matched border radii and background colors until the injected panels were pixel-flush with the original interface. Hues, alignment, font weights, everything needed to be in the right place. The keyboard layer grew until I needed both hands to operate it. E for etymology, C for calligraphy, T for tones, numbers to select characters...
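A minimal version of that keyboard layer is a single keydown listener dispatching to the injected panels. Only the E/C/T/digit bindings are described in the post; the handler functions below are hypothetical hooks, stubbed so the sketch runs on its own:

```js
// keys.js - keyboard layer for the injected panels. The show* functions and
// selectCharacter are hypothetical hooks into the rest of the extension.
const showEtymology = () => console.log("show etymology panel");
const showCalligraphy = () => console.log("show calligraphy video");
const showTones = () => console.log("show tone overlay");
const selectCharacter = (i) => console.log(`select character #${i + 1}`);

document.addEventListener("keydown", (event) => {
  // Don't steal keystrokes while the user is typing in the site's own inputs.
  if (event.target instanceof HTMLInputElement) return;

  if (event.key === "e") showEtymology();
  else if (event.key === "c") showCalligraphy();
  else if (event.key === "t") showTones();
  else if (/^[1-9]$/.test(event.key)) selectCharacter(Number(event.key) - 1);
});
```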

Eventually selecting which information to display wasn't the bottleneck either - loading it was. I couldn't figure out how to speed up the morphology LLM call; if the calligraphy video ran at more than double speed, I'd lose track; opening the dictionary required a new browser tab.

It was already in the same place. Why not the same time? While the API call was happening, I could kick off the dictionary lookup in a permanent dummy window, then flip to the video, then the morphology breakdown would be back from the API to close the loop.
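In code, that overlap is just starting every lookup at once instead of awaiting them in sequence. A sketch, with all three fetchers stubbed as hypothetical stand-ins for the extension's real calls:

```js
// prefetch.js - start every lookup for a character at once rather than one
// after another. The fetchers below are hypothetical stand-ins, stubbed with
// timeouts so the sketch runs as-is.
const sleep = (ms, value) => new Promise((res) => setTimeout(() => res(value), ms));
const fetchMorphology = (ch) => sleep(800, `${ch}: morphology from LLM`); // slowest: API call
const openDictionary = (ch) => sleep(100, `${ch}: dictionary entry`);
const loadCalligraphyVideo = (ch) => sleep(300, `${ch}: calligraphy video`);

async function loadEverything(ch) {
  // Kick everything off immediately; the dictionary and video are ready
  // while the API call is still in flight, and the morphology panel fills
  // in last to close the loop.
  const [dict, video, morph] = await Promise.all([
    openDictionary(ch),
    loadCalligraphyVideo(ch),
    fetchMorphology(ch),
  ]);
  console.log(dict, video, morph);
}

loadEverything("说");
```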

What used to take upwards of 30 seconds now took less than one. At this point keystrokes were no longer the limiting factor, as my brain was being firehosed with information. In one sleepy haze, I considered optimizing my brain out of the loop.

[Image: the flashcard interface, after]


I had my feeder. I was about to be fed.

Part III: Symbolhead Syndrome

