What I learned designing a barebones UI engine

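## HelloUI: A Custom PyGame UI Framework

To support rapid experimentation while avoiding the overhead of existing UI solutions, the author developed HelloUI, a custom UI framework written in Python on top of PyGame's software rendering. The initial design prioritised simplicity, using a flat list of components with manual hit-testing and rendering. After recognising the limits of that approach, the author moved the framework to a tree architecture inspired by modern UI engines such as Flutter and Jetpack Compose. This made an intrinsic-size layout system built on `measure()` and `distribute()` methods much easier to manage. Later work added asynchronous support through threads, a global event listener system, and "dirty" flags that minimise unnecessary redraws. A UI stage management system handles navigation between "pages". The framework currently suits basic experimental scripts; future goals include a declarative API for state management, a more composable component structure, and a customisable styling system, aiming to balance rapid iteration with maintainability and advanced features. The project ultimately proved a valuable learning experience that showed how much complexity is inherent in building a robust UI system.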

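## Barebones UI Engine: A Hacker News Summary

A developer shared their experience designing a barebones UI engine, emphasising the lessons gained from building from scratch rather than relying on existing solutions. This sparked a debate about immediate-mode versus retained-mode GUIs. Many commenters favoured immediate-mode GUIs, in which the UI is redrawn every frame, for their simplicity, performance, and lower potential for bugs, especially in real-time applications such as games or audio plugins. Others raised concerns about state management and performance in complex scenarios or when embedding into existing systems. One commenter clarified that immediate mode does *not* eliminate state; it shifts ownership of that state to user code. Others noted that retained mode is more efficient for static UIs or for saving power, while immediate mode does better in dynamic scenarios. The original author explained that their motivation was wanting a minimal system that allows low-level optimisation, as well as the learning experience. Another user agreed, describing a similar predicament when building a WebXR UI, where limitations in web standards made even simple UI elements frustratingly complex. Ultimately, the conversation stressed that the best approach depends heavily on the specific use case and developer preference.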

Original article

The Usecase

I wrote a custom UI framework in PyGame, a library used for software rendering (graphics on the CPU), to support my experiments while giving me a standard interactive layer using event-driven paradigms similar to other UI frameworks.

The requirements were specific:

It needed to be transparent - I didn't want my UI layer to add extra cost over standard software rendering, which means no workarounds to get it to display custom canvases
It needed to be in Python - The main goal is to have an interactive layer ready to spin up for rapid experimentation. Python has a vast ecosystem of libraries and is fast to write - the UI layer needs to match that iteration speed.

Starting From Nothing

UI at its simplest.

The initial architecture focused on brutal simplicity. I kept a flat list of components that I placed manually, after first sketching the layout in Photoshop, and every frame the engine ran a minimal loop:

Hit-test: Compare the mouse coordinates and click state with the coordinates of every single component in the flat hierarchy, triggering any click/hover handlers on any components that passed the hit-test.
Update: Run a global update() loop for every component if they need to update private state consistently every frame.
Render: Call the render() method on each component, relying on my Photoshop math to make sure they render at the right size and in the right position.
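
The three steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not the engine's actual code: the `Component` class, its fields, and `run_frame()` are hypothetical names, and rendering is replaced by recording draw calls so the sketch runs without PyGame.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Component:
    """A hand-placed rectangle with an optional click handler (hypothetical)."""
    name: str
    x: int
    y: int
    w: int
    h: int
    on_click: Optional[Callable[[], None]] = None
    frames_alive: int = 0

    def contains(self, px: int, py: int) -> bool:
        # Half-open bounds, so adjacent components never both claim a pixel.
        return self.x <= px < self.x + self.w and self.y <= py < self.y + self.h

    def update(self) -> None:
        self.frames_alive += 1  # stand-in for per-frame private state

    def render(self, draw_calls: list) -> None:
        # A real component would blit to a PyGame surface here.
        draw_calls.append((self.name, self.x, self.y, self.w, self.h))


def run_frame(components, mouse_pos, mouse_clicked):
    """One pass of the loop: hit-test, update, render."""
    for c in components:
        if mouse_clicked and c.on_click and c.contains(*mouse_pos):
            c.on_click()
    for c in components:
        c.update()
    draw_calls = []
    for c in components:
        c.render(draw_calls)
    return draw_calls


clicked = []
components = [
    Component("ok", 10, 10, 80, 30, on_click=lambda: clicked.append("ok")),
    Component("cancel", 100, 10, 80, 30, on_click=lambda: clicked.append("cancel")),
]
draw_calls = run_frame(components, mouse_pos=(20, 20), mouse_clicked=True)
```

Every coordinate in `components` is typed in by hand, which is exactly the pixel math the next section tries to get rid of.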

This is very simple to write, but it's impractical for all but the most stylised or minimal UI layers. For a general purpose tool, it would be ideal to offload some of the math to the engine and focus on describing my UI through higher layer layout semantics, as opposed to manual pixel math.

The Family Tree

A reunion.

To achieve this, we can draw inspiration from actual UI engines and model our UI as a tree of nodes instead of a flat list. Every node except the root has a parent, and a node can have any number of children, which can each have their own children, and so on. I implemented an architecture where nodes are exclusively either layout-only or content-only, as opposed to something like HTML, where nodes can hold content and have children of their own. This is less flexible, but simpler to implement.

Instead of a simple list iteration, this approach requires depth-first traversal of the tree, which recurses through all the nodes. This recursive nature is essential to how the layout engine works. Each layout node implements two key methods, a measure() method to measure and return its rectangle size, and a distribute() method where a child node can be issued its final size and position.

This seems simple, but combined with the recursive nature of the tree traversal, it results in a layout engine that calls measure() on a child, which calls measure() on its own child, and so on, until intrinsic sizes bubble up and final positions can be distributed back down the tree.
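
As a rough illustration of that two-pass idea, here is a small sketch of an intrinsic-size box layout. The `Leaf` and `Box` classes are hypothetical stand-ins, not the engine's real classes, and the sketch re-measures children instead of caching sizes to keep it short.

```python
class Leaf:
    """Content-only node with a fixed intrinsic size."""

    def __init__(self, width, height):
        self.size = (width, height)
        self.rect = None  # (x, y, w, h), filled in by distribute()

    def measure(self):
        return self.size

    def distribute(self, x, y):
        self.rect = (x, y, *self.size)


class Box:
    """Layout-only node that stacks its children along one axis."""

    def __init__(self, direction, children):
        self.direction = direction  # "vertical" or "horizontal"
        self.children = children
        self.rect = None

    def measure(self):
        # Recurse down; sizes bubble back up through the return values.
        sizes = [child.measure() for child in self.children]
        widths = [w for w, _ in sizes]
        heights = [h for _, h in sizes]
        if self.direction == "vertical":
            return (max(widths, default=0), sum(heights))
        return (sum(widths), max(heights, default=0))

    def distribute(self, x, y):
        # Positions flow down: each child starts where the previous one ended.
        self.rect = (x, y, *self.measure())
        cursor_x, cursor_y = x, y
        for child in self.children:
            w, h = child.measure()
            child.distribute(cursor_x, cursor_y)
            if self.direction == "vertical":
                cursor_y += h
            else:
                cursor_x += w


label = Leaf(60, 20)
entry = Leaf(120, 24)
row = Box("horizontal", [label, entry])
button = Leaf(80, 30)
root = Box("vertical", [row, button])
root.distribute(0, 0)
```

Here the row measures 180 by 24 (widths summed, tallest child wins), so the root measures 180 by 54 and the button lands at y = 24, directly under the row. No node is ever told how much space it is allowed to take, which is the "intrinsic only" limitation discussed next.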

This is an incredibly powerful paradigm, inspired by how layout engines such as the ones in Flutter and Jetpack Compose function. A crucial difference is that my layout engine only works with intrinsic sizing and does not support constraints. Practically, this means that a parent cannot grow or shrink its children, which is a key requirement if you want responsive design or fluid layouts. While these weren't the main requirements for the initial version of this engine, they are things I'd like to revisit, especially after watching this excellent video of how Clay (a layout engine for C) works.

Refining the engine

```python
class Offset(ui.core.Stage):
    def start(self):
        # Back button anchored to the top-left corner.
        self.back = Button(UII, "<- Back", self.clickon_back)\
            .place(Alignment.TOP_LEFT, offset=[Style.PADDING.LAYOUT_PADDING]*2)
        # Vertical stack of horizontal rows: three labelled inputs and a button row.
        self.root = UIContainer(UII, BoxLayout("vertical")).add_elements({
            "start": UIContainer(UII, BoxLayout("horizontal")).add_elements({
                "label": TextLabel(UII, "Start: "),
                "ebox": EntryBox(UII, "YYYY-MM-DD"),
            }),
            "end": UIContainer(UII, BoxLayout("horizontal")).add_elements({
                "label": TextLabel(UII, "End: "),
                "ebox": EntryBox(UII, "YYYY-MM-DD"),
            }),
            "amount": UIContainer(UII, BoxLayout("horizontal")).add_elements({
                "label": TextLabel(UII, "GB: "),
                "ebox": EntryBox(UII),
            }),
            "buttons": UIContainer(UII, BoxLayout("horizontal")).add_elements({
                "date_ez": Button(UII, "Smart fill", self.clickon_fill),
                "go": Button(UII, "Add!", self.clickon_go),
            }),
        })
        UIEngine.add({"back": self.back, "main": self.root})
```

Code snippet of what a simple form looks like, showcasing the nested box layouts with anchoring support.

With the core component API and layout abstraction nailed down, I finally reached a point where I could start designing components and simple test programs for me to use. I quickly discovered some features that I had taken for granted in other UI engines.

Asynchronous support: One of the first GUIs I wrote involved a script that had to talk to an API, which would freeze the entire window. My solution was an abstraction for the base threading library where threads are tracked by the engine and callbacks are called on the main thread upon completion. This helps reduce the surface area for race conditions while keeping the program responsive.
Event listeners: Sometimes components need access to I/O events that involve more than just the mouse. I added a system to globally emit events that can be subscribed to, similar to JavaScript APIs in the browser (... and running into the same memory leaking problems).
Performance optimisations: Software rendered UIs can quickly slow down if not optimised correctly. I used flags to mark if a component or a layout was dirty, and made use of Python's context manager API to provide a Pythonic way of updating components while handling the flags behind the scenes. Components are only redrawn and layouts are only recalculated when the respective flag is set, allowing the program to minimise CPU usage to only when it's needed.
UI Stages: Most UIs don't consist of a single "stage" of UI elements. Ideally, we want to navigate to various "stages" (or "pages" as they're called in a browser) depending on UI state. I implemented a state machine similar to how mobile applications work, where you can push a stage to a stack and return from it, or clear the entire stack and start fresh for destructive navigation.
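
Two of these ideas, main-thread callbacks for background work and dirty flags hidden behind a context manager, can be sketched with only the standard library. This is one possible shape under my own assumptions: `TaskRunner`, `Label`, `editing()`, and `render_if_dirty()` are hypothetical names, and the real engine may track threads and flags quite differently.

```python
import queue
import threading
from contextlib import contextmanager


class TaskRunner:
    """Runs work on background threads; callbacks fire on whichever thread calls poll()."""

    def __init__(self):
        self._done = queue.Queue()
        self._threads = []

    def submit(self, work, callback):
        def target():
            # Hand the result back through the queue instead of calling
            # the callback from the worker thread.
            self._done.put((callback, work()))

        thread = threading.Thread(target=target, daemon=True)
        self._threads.append(thread)
        thread.start()

    def poll(self):
        """Call once per frame from the main loop."""
        while True:
            try:
                callback, result = self._done.get_nowait()
            except queue.Empty:
                break
            callback(result)
        self._threads = [t for t in self._threads if t.is_alive()]

    def join(self):
        """Block until all workers finish (used here only to make the demo deterministic)."""
        for thread in self._threads:
            thread.join()


class Label:
    def __init__(self, text):
        self.text = text
        self.dirty = True  # needs an initial draw
        self.draw_count = 0

    @contextmanager
    def editing(self):
        """Mutate the component inside the block; the dirty flag is set on exit."""
        try:
            yield self
        finally:
            self.dirty = True

    def render_if_dirty(self):
        if self.dirty:
            self.draw_count += 1  # a real component would redraw its surface here
            self.dirty = False


label = Label("loading...")
label.render_if_dirty()  # first draw
label.render_if_dirty()  # skipped: nothing changed

runner = TaskRunner()


def on_result(value):
    with label.editing():
        label.text = value


runner.submit(lambda: "done", on_result)
runner.join()            # a real main loop would keep rendering instead of blocking
runner.poll()            # callback runs here, on the main thread
label.render_if_dirty()  # redrawn once because the callback marked it dirty
```

Because `on_result` only ever runs inside `poll()`, component state is touched from a single thread, which is what keeps the surface area for race conditions small.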

Beyond the basics

An actual screenshot - featuring the minimal hardcoded stylesheet that ended up inspiring the style of this website.

What I have now works fine for basic / experimental scripts where raw iteration speed is more important than maintenance, but ideally, we'd want to bridge that gap and add more functionality. Here are a couple of more advanced ideas I'd like to explore in the future, inspired by real systems:

Declarative API: Can we take the huge improvement in developer experience from moving from manual pixel -> automatic layout, and apply that to UI state? The program becomes a description of what you'd want to see for any given state, instead of a set of instructions to poke at the UI every single time a variable changes. This requires either a fine-tuned reactivity primitive (similar to SolidJS) or an optimised reconciler for diffing our UI tree with an ephemeral one created when state changes (like React.js).
Composability: With the current API, my programs consist of big components that do whole tasks at once, render directly to surfaces, and store and manage their state opaquely. This is simple for the engine, but gets hard to manage for the developer. Modern paradigms are adopting a more functional, compositional API where programs consist of many tiny UI primitives that compose to make something larger. Supporting this requires an overhaul of the event-handling system to support event bubbling, and optimisation of almost all aspects of the engine to handle moving the complexity to the UI tree.
Custom styling: Right now, the engine relies on a hardcoded stylesheet full of global style declarations that are referenced in the render method for each component. Ideally, we would combine this with a user-configurable styling API. Something similar to TailwindCSS utility classes would fit perfectly with the "minimal" target we're aiming for - but applying directly to the renderer instead of compiling to a file.
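
To make the reactivity idea a little more concrete, here is a deliberately tiny sketch of a signal-style primitive. It is only an illustration: the `Signal` class is hypothetical, and it uses explicit subscription, whereas SolidJS tracks dependencies automatically.

```python
class Signal:
    """A value that re-runs subscribed effects when it changes (hypothetical)."""

    def __init__(self, value):
        self._value = value
        self._effects = []

    def get(self):
        return self._value

    def set(self, value):
        if value == self._value:
            return  # no change, so nothing needs to re-render
        self._value = value
        for effect in list(self._effects):
            effect(value)

    def subscribe(self, effect):
        self._effects.append(effect)
        effect(self._value)  # run once so the UI reflects the initial state


rendered = []
count = Signal(0)
# The "UI" is described once, as a function of state.
count.subscribe(lambda n: rendered.append(f"Clicked {n} times"))
count.set(1)
count.set(1)  # ignored: same value
count.set(2)
```

The program never pokes at the label directly. It changes `count`, and the description of what the label should show is re-evaluated only when the value actually changes.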

Conclusion

Ironically, this project started because I didn't want a UI. Existing solutions were opaque and required lots of boilerplate that often exceeded the actual scale of my projects. I just wanted clickable surfaces and a way to hack at the layers underneath. As the project grew, I ended up organically discovering how to construct simple abstractions through trying (and sometimes failing) to write my own, and why it's paradoxically anything but simple to do right.

While it’s far from perfect, writing it taught me more about UI systems than I ever would have learned by sticking to established solutions alone.

Read more about the high-performance video mosaic rendering and streaming engine I originally designed this UI library for.