The text mode lie: why modern TUIs are a nightmare for accessibility

Original link: https://xogium.me/the-text-mode-lie-why-modern-tuis-are-a-nightmare-for-accessibility

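## Terminal accessibility: a broken promise

Despite the common belief that text-based interfaces are inherently accessible, modern terminal user interfaces (TUIs) often make things *worse* for blind users. A command-line interface (CLI) runs as a simple text stream and works well with screen readers, whereas most TUIs treat the terminal as a two-dimensional grid and prioritize visual layout over the sequential flow of text. Frameworks such as Ink and Bubble Tea, which aim to simplify TUI development, end up creating accessibility barriers. They constantly redraw the screen for updates such as timers, so the screen reader "jumps" between elements and produces fragmented, unusable output. `gemini-cli` is one example: it can crash the screen reader or cause severe lag when handling conversation history. Older tools such as `nano` and `menuconfig` succeed because they allow the cursor to be hidden or keep a single-column focus, which minimizes interference. Truly accessible solutions such as Irssi use the terminal's hardware features for efficient scrolling. Yet many modern projects neglect accessibility, as shown by unresolved bug reports that are closed by automated bots. The core problem: prioritizing developer convenience over efficient text rendering ends up creating an inaccessible experience for blind users. A simple, linear text stream remains easier to use than a TUI that is visually sophisticated but functionally flawed.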


Original text

The myth: “it's text, so it's accessible”

There is a persistent misconception among sighted developers: if an application runs in a terminal, it is inherently accessible. The logic assumes that because there are no graphics, no complex DOM, and no WebGL canvases, the content is just raw ASCII text that a screen reader can easily parse.

The reality is different. Modern Text User Interfaces (TUIs) are often more hostile to accessibility than poorly coded graphical interfaces. The very tools designed to improve the Developer Experience (DX) in the terminal, frameworks like Ink (JS/React), Bubble Tea (Go), or tcell, are actively destroying the experience for blind users.

The Architectural Flaw: Stream vs. Grid

To understand the failure, we must distinguish between two distinct concepts often conflated under “terminal apps”: the CLI (Command Line Interface) and the TUI.

  1. The CLI (The Stream): This operates on a standard input/output model (stdin/stdout). You type a command, the system appends the result below, and the cursor moves down. This is linear and chronological. For a screen reader, specifically kernel-level readers like Speakup, this is ideal.

  2. The TUI (The Grid): This treats the terminal window not as a stream of text, but as a 2D grid of pixels, where every character cell is a pixel. It abandons the temporal flow for a spatial layout.
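To make the distinction concrete, here is a minimal sketch of the bytes each model sends to the terminal. The function names `stream_update` and `grid_update` are hypothetical and exist only for illustration; the escape sequences themselves (`ESC 7` / `ESC 8` to save and restore the cursor, `CSI row;col H` to position it) are standard VT100-style controls.

```python
ESC = "\x1b"


def stream_update(line: str) -> str:
    """CLI style: append one line; the cursor only ever moves forward."""
    return line + "\n"


def grid_update(row: int, col: int, text: str) -> str:
    """TUI style: save the cursor, jump to a 1-based (row, col) cell,
    overwrite what is there, then restore the cursor."""
    return f"{ESC}7{ESC}[{row};{col}H{text}{ESC}8"


if __name__ == "__main__":
    # repr() shows the raw control bytes instead of executing them.
    print(repr(stream_update("build finished")))
    print(repr(grid_update(1, 70, "2s")))
```

The stream version carries no position information at all, which is why a reader that follows the cursor can simply speak what was appended. The grid version moves the cursor twice for every small update.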

Case Study: The gemini-cli Madness

Let's look at a concrete example: gemini-cli, a tool written in Node.js using the Ink framework. On the surface, it looks like a simple chat interface. But underneath, Ink is trying to reconcile a React component tree into a terminal grid.

When you use this tool with Speakup (Linux) or NVDA (Windows), the application doesn't just fail; it actively spams you.

Because the framework treats the screen as a reactive canvas, every update triggers a redraw. When the AI is “thinking,” the tool updates a timer or a spinner. To do this, it moves the hardware cursor to the timer location, writes the new time, and moves it back.

For a sighted user, this happens instantly. For a screen reader user, this is what you hear: “Responding... Time elapsed 1s... Responding... Time elapsed 2s... [Fragment of chat history]... Responding...”

It drives the screen reader mad. The cursor is teleporting all over the screen to update status indicators, spinners, and history. Speakup tries to read whatever is under the cursor at that exact millisecond. You end up hearing random bits of conversation mixed with timer updates, making it impossible to focus on what you are actually typing.
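The effect can be imitated with a toy model. The sketch below is not how Speakup is implemented; it only assumes, as described above, that the reader speaks whatever row the cursor lands on. `ToyScreen`, `grid_style`, and `stream_style` are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyScreen:
    """Toy model: a reader that speaks whichever row the cursor lands on."""
    rows: list
    spoken: list = field(default_factory=list)

    def write_at(self, row: int, text: str) -> None:
        # Overwrite one row in place; the cursor lands there, so it is spoken.
        self.rows[row] = text
        self.spoken.append(text)

    def append(self, text: str) -> None:
        # Stream-style output: add a row at the bottom and speak it once.
        self.rows.append(text)
        self.spoken.append(text)


def grid_style(ticks: int) -> list:
    """Redraw the status row and the timer row on every tick."""
    screen = ToyScreen(rows=["You: hello", "Responding...", "Time elapsed 0s"])
    for t in range(1, ticks + 1):
        screen.write_at(1, "Responding...")
        screen.write_at(2, f"Time elapsed {t}s")
    return screen.spoken


def stream_style(ticks: int) -> list:
    """Announce the start once and the result once; no in-place updates."""
    screen = ToyScreen(rows=["You: hello"])
    screen.append("Responding...")
    screen.append(f"Done after {ticks}s")
    return screen.spoken


if __name__ == "__main__":
    print(grid_style(3))
    print(stream_style(3))
```

In this model the grid version produces two utterances per tick for as long as the request runs, while the stream version produces two utterances in total.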

Worse, let's pretend that you've somehow managed well with Speakup so far, but that you want to do some work with NVDA. Maybe you need to paste an error you're getting on Windows. So you open your terminal, SSH into your Linux box, attach to your screen session, and paste your text.

The result is an immediate crash of the screen reader (NVDA) or massive system instability. Why? Every time you type a character or paste text, the application triggers a state change. The framework decides it needs to re-render the interface. Because the conversation history is part of that state, the application attempts to redraw or recalculate the layout for thousands of lines of text instantly. The more messages you have in a conversation, the more this will happen. And no, you can't just avoid this by using Insert+5, the key combination that is supposed to stop dynamic content changes from being announced.

The Lag Loop

Furthermore, frameworks like Ink running on single-threaded environments (like Node.js) suffer from massive performance degradation when the history grows. If you paste a large block of text, the system has to calculate the diff for thousands of lines.

This causes input lag. You press a key, and you wait. You can wait up to 10 seconds for a single character to echo back. The system is too busy calculating how to redraw the screen to actually process your input.
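A rough cost model shows why the lag grows with the conversation. This is a sketch under an assumption taken from the description above, namely that every keystroke lays out the whole history again; it is not Ink's actual algorithm. The helper names are hypothetical, and `textwrap.wrap` stands in for whatever layout step a real framework performs.

```python
import textwrap


def layout(messages: list, width: int = 80) -> list:
    """Wrap every message to the terminal width, as a full re-render would."""
    lines = []
    for msg in messages:
        lines.extend(textwrap.wrap(msg, width) or [""])
    return lines


def full_rerender_work(history: list, typed: str, width: int = 80) -> int:
    """Lines laid out if each keystroke re-renders history plus input."""
    work, buf = 0, ""
    for ch in typed:
        buf += ch
        work += len(layout(history + [buf], width))
    return work


def input_only_work(typed: str, width: int = 80) -> int:
    """Lines laid out if each keystroke only redraws the input line."""
    work, buf = 0, ""
    for ch in typed:
        buf += ch
        work += len(layout([buf], width))
    return work


if __name__ == "__main__":
    history = [f"message {i}" for i in range(1000)]
    print(full_rerender_work(history, "hello"))
    print(input_only_work("hello"))
```

Under this model the work per keystroke is proportional to the length of the history, so a pasted block of many characters multiplies an already large number.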

Sighted developers often ask: “If TUIs are bad, why do you use nano, vim, or menuconfig?”

The answer is not that these tools handle the cursor perfectly by default. The answer is that they allow you to hide the cursor entirely.

1. Hiding the Cursor (nano, vim)

In tools like nano or vim, usability depends on turning off features that track cursor position. If you run nano with options that show the cursor position (like --constantshow), or if you use vim without specific configuration, the experience is broken.

When the cursor is visible and tracking is active, Speakup prioritizes the cursor's location update over the character echo. Instead of hearing the letter “a” when you type it, you hear “Column 2”. You type “b”, and you hear “Column 3”.

These older tools succeed because they allow you to disable this noise. You can configure them to suppress the visual cursor or status bar updates, forcing the screen reader to rely on the character input stream rather than the noisy coordinate updates. Modern frameworks rarely offer a “no-cursor” or “headless” mode; they assume the visual cursor is essential.
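For an application author, hiding the visual cursor usually means emitting the DECTCEM control sequences (`CSI ?25l` to hide, `CSI ?25h` to show). The sketch below wraps them in a hypothetical `hidden_cursor` helper. It only shows the terminal side; how a particular screen reader reacts to a hidden cursor is an assumption based on the behavior described above, not something this code can guarantee.

```python
import sys
from contextlib import contextmanager

HIDE_CURSOR = "\x1b[?25l"  # DECTCEM: make the text cursor invisible
SHOW_CURSOR = "\x1b[?25h"  # DECTCEM: make the text cursor visible again


@contextmanager
def hidden_cursor(out=sys.stdout):
    """Hide the cursor for the duration of the block, then always restore it."""
    out.write(HIDE_CURSOR)
    out.flush()
    try:
        yield out
    finally:
        # Restore even if the block raises, so the user's shell is not
        # left without a cursor.
        out.write(SHOW_CURSOR)
        out.flush()


if __name__ == "__main__":
    with hidden_cursor() as out:
        out.write("working without a visible cursor\n")
```

Exposing this as a user-facing option, rather than deciding it inside the framework, is the point the older tools get right.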

2. Single Column Focus (menuconfig)

Tools like the Linux kernel's menuconfig work because they enforce a strict, single-column focus. Even though there are borders and titles, the active area is a vertical list. The cursor stays pinned to that list. It doesn't jump to the bottom right to update a clock, then to the top left to update a title. The spatial complexity is kept low enough that the screen reader never gets “lost.”

Irssi is the gold standard for accessible chat, but not because of luck. Irssi was built over 20 years with a custom rendering engine that utilizes VT100 Scrolling Regions.

When a new message arrives in Irssi:

  1. It tells the terminal driver: “Define a scrolling region from line 1 to 23.”

  2. It sends a command: “Scroll up.” The terminal moves the bits up.

  3. It draws the new text at the bottom of that region.

Crucially, it handles this in a way that minimizes interference with the input line. It relies on the terminal's hardware capabilities rather than rewriting every character on the screen manually. Modern frameworks ignore these hardware features in favor of “diffing” the screen state and rewriting characters, which is computationally heavier and hostile to accessibility.
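The three steps above can be sketched as a single escape-sequence string. This is not Irssi's code; it is a minimal illustration using the standard DECSTBM control (`CSI top;bottom r`), and the function name `append_in_scroll_region` is hypothetical. It assumes the message fits on one line; a longer message would wrap and scroll the region again.

```python
ESC = "\x1b"


def append_in_scroll_region(message: str, top: int = 1, bottom: int = 23) -> str:
    """Build the bytes that scroll rows top..bottom up by one line and
    draw `message` on the last row of that region."""
    return (
        f"{ESC}7"                 # save the cursor (e.g. on the input line)
        f"{ESC}[{top};{bottom}r"  # DECSTBM: define the scrolling region
        f"{ESC}[{bottom};1H"      # move to the last row of the region
        "\n"                      # line feed at the bottom margin scrolls the region
        f"{message}"              # draw the new text on the freed row
        f"{ESC}[r"                # reset the margins to the full screen
        f"{ESC}8"                 # restore the cursor to where it was
    )


if __name__ == "__main__":
    # repr() shows the control bytes instead of sending them to the terminal.
    print(repr(append_in_scroll_region("<alice> hello")))
```

The terminal itself moves the existing rows, so the application sends one short sequence plus the new text instead of rewriting every visible character.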

The “Stale Bot” excuse: A Case Study in Neglect

Google and the maintainers of gemini-cli pretend to care about accessibility. “Pretend” is the operative word here. If you look at the repository, critical accessibility regressions like Issue #3435 and Issue #11305 have been left to rot. There is no discussion, no roadmap, and no fix. Even worse is the fate of Issue #1553, which was supposed to track these accessibility failures. It didn't get solved; it got silenced. It was closed automatically by a bot with this generic dismissal:

“Hello! As part of our effort to keep our backlog manageable and focus on the most active issues, we are tidying up older reports. It looks like this issue hasn't been active for a while, so we are closing it for now.”

This is unacceptable. Closing an accessibility report because the maintainers haven't touched it in months is not “tidying up”; it is hiding evidence. It effectively says that if a bug is ignored long enough, it ceases to exist. It boosts the project's “Closed Issues” metric while leaving the actual software unusable for blind users.

Conclusion

If you are building for the terminal and care about accessibility, stop using declarative UI frameworks that treat the terminal like a canvas.

The “modern” TUI stack has optimized for the developer's ability to write React-like code at the expense of the machine's ability to render text efficiently.

If you cannot guarantee that your application allows the user to hide the cursor, or if you rely on aggressive redrawing to show spinners and timers, you are building an inaccessible tool.

For the blind user, a dumb, linear CLI stream is infinitely superior to a “smart” TUI that lags, spams, and scatters the cursor across the screen.
