Wayland breaks the tools I use to make a living

Original link: https://rykarn.se/2025/01/26/wayland

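## Wayland and accessibility problems for voice control users

Major Linux distributions are about to switch from X11 to Wayland as the default display system, which poses a significant threat to accessibility, particularly for users of voice control software such as Talon Voice.

Talon Voice lets users control their computer through voice commands, which is very valuable for people with RSI or other disabilities. However, Wayland's design intentionally lacks some key features, such as querying window information and emulating input, that are readily available on X11, macOS, and Windows.

Currently, accessing these features on Wayland requires a specific implementation in each "compositor" (the Wayland equivalent of a window manager), which leads to fragmentation and a lack of standardization. This puts the burden on Talon's developer, who would need to create workarounds for every compositor or possibly even write a custom compositor, which is a daunting task.

The author admits to limited technical expertise, but the core problem is clear: Wayland's progress may leave accessibility tools, and the people who depend on them, behind. Efforts are under way to work with distribution and desktop environment developers on these issues, but a solution is still a long way off. The author is currently protected by their choice of systems, but worries about one day having to choose between Linux and the voice control software they need.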

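The Hacker News discussion highlights the challenges that remain in the transition from X11 to Wayland, even as Wayland matures. While Wayland is already usable for many people (an estimated 95%), a large number of specialized tools and specific use cases still depend on X11 because no viable Wayland alternative exists.

Users point out that Wayland has fragmentation of its own, with different compositors implementing features inconsistently, unlike the uniform experience of X11. Accessibility is a major concern: support outside GNOME is limited, and the new protocols are incompatible with existing Linux accessibility libraries.

Commenters raised concerns about forced adoption and the "tribalism" around Wayland. Some users expressed frustration that design priorities have displaced user needs, especially around accessibility, which has led some long-time Linux users to switch to macOS.

The core problem remains: because dependencies and special needs persist for a long time, a complete migration away from X11 may take decades, or may never happen.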

Original text

This was originally posted on cohost on 2023-12-21.

If you are not familiar with the things discussed in this post - Wayland and X11 are display systems on Linux, where Wayland is a modern replacement for X11. Their purpose is to provide ways to draw windows and graphical UIs. Without them, you are basically confined to operating your computer from the command line.

Please note the disclaimer at the end: my technical understanding of how Wayland works and of the challenges it presents to Talon Voice was limited at the time of writing. I understand it a bit better now, but I’m still by no means confident in explaining the technical details of how the accessibility problem on Wayland is structured. But I get the gist of it - the common Wayland protocols do not provide a complete interface to all the things an accessibility tool like Talon Voice needs in order to assist the user in using their computer’s UI.


Major Linux distros are soon switching to versions of desktop environments that use Wayland instead of X11. This is a bad state of affairs for accessibility.

I use Talon Voice. It is a program that runs in the background, listens for spoken commands through your microphone, and performs certain actions on your computer based on what you say. I started using it earlier this year when I got RSI in my wrist, which made it hard for me to type on a keyboard and use a mouse. My wrist has healed up again and I am once more capable of typing on a computer (this text for instance). But most of my typing happens when I am programming, something I do for a living, and for that I still use Talon and the Cursorless extension. I could probably move back to using emacs with vim keybindings, the way I used to before my wrist issues, but I would prefer not to. Both because I would be worried about injuring my wrist once more, and also because I have grown really fond of programming with Cursorless. I am as fast, if not faster, with voice coding compared to programming with a keyboard (with a typing speed of about 85 WPM). Still, I am lucky that for me, Talon Voice is an option and not a necessity for interacting with a computer. Many others in the Talon Voice community do not have that privilege.

The Wayland protocol lacks (by design) a whole bunch of features needed for Talon and similar accessibility programs to function, things that are fully possible on Linux with X11, as well as on OSX and Windows: querying what windows are available, their titles, and which window is focused, emulating keyboard input, and querying the mouse position. At the moment, these APIs would need to be provided by whatever Wayland compositor happens to be in use on a user’s system. Since no common accessibility interface for these tasks exists between the currently available compositors, you then run into the problem of getting the development teams behind the different compositors to agree on a standard. That, or require the single developer behind Talon to write his own compositor in order to support Linux.
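To make the fragmentation concrete, here is a minimal Python sketch of the dispatch a tool would face. It is not how Talon works. The environment variables (`XDG_SESSION_TYPE`, `WAYLAND_DISPLAY`, `DISPLAY`, `XDG_CURRENT_DESKTOP`) are real, but the backend names and the `WAYLAND_BACKENDS` table are hypothetical placeholders, and nothing here claims that a working backend exists for any given desktop.

```python
import os
from typing import Mapping, Optional


def detect_session(env: Mapping[str, str]) -> str:
    """Classify the display session as 'x11', 'wayland', or 'unknown'."""
    session_type = env.get("XDG_SESSION_TYPE", "").lower()
    if session_type in ("x11", "wayland"):
        return session_type
    # Check WAYLAND_DISPLAY first: under XWayland, DISPLAY is set as well.
    if env.get("WAYLAND_DISPLAY"):
        return "wayland"
    if env.get("DISPLAY"):
        return "x11"
    return "unknown"


# Hypothetical table: one separately written backend per desktop/compositor.
# The names are placeholders, not real modules of any tool.
WAYLAND_BACKENDS = {
    "gnome": "gnome_backend",
    "kde": "kde_backend",
    "sway": "sway_backend",
}


def pick_backend(env: Mapping[str, str]) -> Optional[str]:
    """Return the name of a backend for this session, or None if unsupported."""
    session = detect_session(env)
    if session == "x11":
        # On X11 a single backend covers every window manager.
        return "x11_backend"
    if session == "wayland":
        # XDG_CURRENT_DESKTOP is a colon-separated list, e.g. "ubuntu:GNOME".
        for desktop in env.get("XDG_CURRENT_DESKTOP", "").lower().split(":"):
            if desktop in WAYLAND_BACKENDS:
                return WAYLAND_BACKENDS[desktop]
    return None


if __name__ == "__main__":
    print(pick_backend(os.environ))
```

The X11 branch needs one entry, while the Wayland branch needs a table that grows with every compositor family, and any compositor missing from the table simply yields `None`.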

This really fucking bums me out. For my home computer I’m good for a while because I use a distro where I can decide whether to use X11 or Wayland. At my current work we use an LTS release of a distro still on X11 with the intention of staying on it for as long as it is supported. So I’m good for now, but I do not look forward to potentially having to choose between ditching Talon Voice or ditching Linux altogether.

I’m sure that Wayland is a necessary step forward to get rid of the technical debt of X11, but moving forward now will leave the state of accessibility on Linux, and thus the people depending on it, behind. There are ongoing efforts to compile the issues faced by Talon Voice and contact the teams behind the big distros and desktop environments, but any change that might hopefully come from that is still a long way out.

(Note that I have no real insight into the nature of Wayland/X11 and how compositors work and so on, so my description of the technical challenges should be taken with a grain of salt. This is mostly based on what I have pieced together from a high-level view of the Wayland architecture and the things mentioned by the Talon developer in the Talon community Slack channel.)


2025-01-26: As far as I can tell, the situation has not really changed much. There are a few experimental or non-core protocol extensions that provide subsets of the features that Talon Voice needs to function, but actual implementations of those protocols are fragmented across the multitude of compositor families that exist today. To me, it really looks like an impossible task to create an application like Talon Voice that can work in a Wayland compositor-agnostic way for the foreseeable future.

