QRV Operating System: QNX on RISC-V

Original link: https://r-tty.blogspot.com/2026/03/qrv-operating-system-first-publication.html

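## QRV-OS: a QNX-compatible microkernel for RISC-V

After decades of effort that began with early operating system experiments in the late 1990s ("RadiOS"), a solo developer has reached a significant milestone: booting to a working shell prompt on QEMU with QRV-OS, a QNX-compatible microkernel ported to RISC-V. The work, detailed on [GitHub](https://github.com/r-tty/qrv), went through a five-year pause after initial progress in 2020, followed by a concentrated three-week sprint.

QRV-OS builds on the QNX architecture (message passing, user-space resource managers), but it is a ground-up rework of the 32-bit codebase for 64-bit RISC-V. It successfully spawns and runs dynamically linked user-space programs through the full QNX IPC stack. The project is based on the QNX 6.4 public source code released in 2008–2009, and benefited from debugging and code-review help from AI tools such as Claude. It is currently developed by one person, but the developer hopes to build a community around it and advocates a more permissive license (Apache 2.0) to encourage broader collaboration and to develop the QNX architecture beyond its current commercial focus. Future work includes SMP stabilization, device driver support, and bug fixes.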

Hacker News: QRV Operating System: QNX on RISC-V (r-tty.blogspot.com), 8 points, submitted by chrsw, 1 comment.

ymz5: Thanks, chrsw! I encourage everyone to join the petition to relicense the old QNX source code under Apache 2.0. If BlackBerry does that, we will have another good, free microkernel-based operating system.

v0.16 boots to a working shell prompt on QEMU. pwd prints the working directory, echo works, ls lists the root filesystem, and non-existent commands report "No such file or directory". The first dynamically-linked user-space program spawns, runs, and exits through the full QNX-style IPC stack.

The patches (and the build script) for v0.16 are available at https://github.com/r-tty/qrv .

Getting there took about five to six days of intensive debugging in the final stretch — log analysis, trace output, one fix at a time. The shell prompt # was not some sudden revelation; it appeared early in that process, embedded in pages of rtld: and procmgr: trace lines, surrounded by the next batch of things not yet working. It is satisfying work, but "satisfying" is the right word, not "dramatic". This kind of thing is mostly meticulous and sometimes tedious. I am at peace with that.

But this post is also a story about a rather long road to get here. Bear with me.

RadiOS: 1998–2005 (and a bit beyond)

I first tried to write an operating system in 1998. The initial attempt was called "The Hawk Operating System" and ran inside MS-DOS. It died quickly and deservedly.

A second attempt followed almost immediately: "Radionix", a collaboration with a friend, Serhiy. That one also died, this time because the architecture was simply wrong: real-mode x86, no memory protection. By October 1998 I had thrown it out and started again from scratch, this time in 386 protected mode. The new project was called RadiOS. Its SourceForge page is still there.

The entire system was written in x86 assembly — NASM, later RDOFF2, and eventually a partial transition toward FreePascal. There was no C in the kernel to speak of. This sounds masochistic, and at times it was, but there was a clear logic to it at the time: I wanted to understand the hardware at the lowest possible level, and assembly left nothing hidden.

RadiOS grew slowly but steadily through the early 2000s. By late 2002 I had read Jochen Liedtke's papers on L4 and gotten hold of a copy of QNX 6.1, and the combination was decisive. The message-passing microkernel architecture — the send–receive–reply model, resource managers as ordinary user processes — clicked immediately as the right way to build a system. So RadiOS pivoted to become a microkernel, with explicit QNX 6 compatibility as the goal: same syscall numbers, same parameters, eventual binary compatibility with QNX programs.

Version mk4 (November 2002) had taskman running in user mode for the first time. Version mk5 (December 2002) had basic message passing. Version mk6 (January 2003) had working message passing primitives — channels, connections, send and receive — all in assembly, all on real 486-class hardware. The architecture worked. The implementation ground forward.

By 2004–2005 the system had mutex synchronization, proper TLS, process pools initialized in user space, a resource manager library. It was doing most of what a real QNX-compatible microkernel needs to do. But it was doing all of it in assembly, and that weight accumulated. Every new feature required building infrastructure that C gives you for free. Every abstraction had to be constructed by hand. The code was correct in the places I had finished it and getting harder to extend everywhere else.

Eventually, life also had something to say about it. A move of 2000 km north, a "real" career as a software engineer, the ordinary pressures of adult life. RadiOS did not have a dramatic end. It just slowed, and slowed, and eventually stopped.

Version 0.0.1.10 came out in February 2010 — a cleanup, a transition toward FreePascal that never really completed. The last tagged release is 0.0.2 from January 2015, a bugfix. And that was that.

The QNX Sources: SVN, Then SourceForge

In 2008–2009, QNX Software Systems did something that was not widely publicized but was quietly significant: they allowed SVN access to the QNX Neutrino source code under the QNX Community License 2.0. You could check out the kernel, the process manager, the C library, the runtime linker. The license was non-commercial and academic, but the code was there, readable, buildable.

This was the original QNX 6.4 codebase — the real thing, not a simplified version. For anyone who had spent years implementing QNX-compatible primitives in assembly and wondering whether their mental model of the internals was correct, having the actual source available was quite something.

Around the same time, a group of researchers at HEIG-VD — the Haute École d'Ingénierie et de Gestion du Canton de Vaud in Yverdon-les-Bains, Switzerland — were doing research on kernel tracing. Pietro Descombes, Jérôme Stadelmann, and Daniel Rossier were instrumenting QNX mutexes, watching thread scheduling, studying the microkernel internals for their academic work. They published their work on SourceForge under the project name openqnx, repository monartis. Their commits come from machines with names like A05BPC14 and A05bLi17 — the unmistakable hostnames of a university computer lab.

The last commit from Jérôme Stadelmann is dated December 2, 2009. Pietro Descombes's final entry is September 18, 2009. And with that, the project went silent.

What remained on SourceForge — apparently without anyone thinking very hard about the long-term implications — was the full QNX Neutrino 6.4 kernel source, the process manager, the C library, the runtime linker, all the headers. Freely downloadable. The QNX Community License 2.0 still applied, but the HEIG-VD project had made the sources far more accessible than they had been through the official SVN channel.

I found this in December 2020, during the COVID lockdowns, and the thought that immediately followed was: what if I ported this to RISC-V?

Christmas 2020, and Five Years of Pause

The timing made sense in a particular way. RISC-V had matured. The toolchains were stable. The original QNX sources were 32-bit ILP32, targeting x86, ARM, MIPS, SH, and PPC — no 64-bit port existed, let alone RISC-V. Doing the LP64 transition and the architecture port in a single effort seemed like exactly the kind of large, difficult, satisfying project that a long holiday lockdown invites.

I started on December 17, 2020. First strings to console on December 19. By December 24 I was committing at 23:59, the message reading "last change before v0.0.1" — SBI support, SMP startup, syspage populated. The kernel printed things to the screen. A small but real beginning.

Then the holiday ended. Work resumed. The project sat.

It sat for just over five years. Not abandoned — I thought about it periodically, kept track of RISC-V toolchain improvements, occasionally reread the QNX architecture documentation. But it was not active. Then on February 27, 2026 — my father's birthday — I sat down and started again.

February 27 to March 20, 2026

The restart began with a cleanup session: archiving the dead architecture ports (ARM, MIPS, PPC, SH — none of which had been meaningfully ported), clearing out old cruft, renaming things properly. The project had accumulated five years of drift. An hour of housekeeping, and then back to work.

The sprint from restart to v0.16 covered about three weeks.

v0.2 (Feb 28): Kernel boots to idle — "QRV-OS kernel alive!". All four taskman subsystems compile with zero errors. LP64 porting pass across the codebase.

v0.3 (Mar 1): First ecall from U-mode dispatched and answered. A minimal test binary calls ChannelCreate(0), the kernel allocates the channel, returns chid=1. Sv39 MMU off at this point — bare-mode trap handler, no page table switch needed.

v0.4 (Mar 2): Sv39 virtual memory. ELF loading into user-space page tables. ChannelCreate working under real Sv39 mapping.
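
For readers unfamiliar with Sv39, here is a minimal sketch of how a virtual address is split under that scheme. The bit layout (three 9-bit VPN fields above a 12-bit page offset, with bits 63..39 required to equal bit 38) comes from the RISC-V privileged specification. The helper names are hypothetical and are not taken from the QRV sources.

```c
#include <stdbool.h>
#include <stdint.h>

/* Sv39 layout constants from the RISC-V privileged specification. */
#define SV39_PAGE_SHIFT 12u /* 4 KiB pages */
#define SV39_VPN_BITS    9u /* 512 entries per page-table level */

/* Index into the page table at `level` (2 = root table, 0 = leaf table). */
static inline unsigned sv39_vpn(uint64_t va, unsigned level)
{
    return (unsigned)((va >> (SV39_PAGE_SHIFT + SV39_VPN_BITS * level)) & 0x1ffu);
}

/* Byte offset inside the 4 KiB page. */
static inline unsigned sv39_page_offset(uint64_t va)
{
    return (unsigned)(va & 0xfffu);
}

/* A usable Sv39 address has bits 63..39 equal to bit 38, so bits 63..38
 * are either all zero (lower half) or all one (upper half). */
static inline bool sv39_is_canonical(uint64_t va)
{
    uint64_t top = va >> 38; /* the 26 bits 63..38 */
    return top == 0 || top == 0x3ffffffu;
}
```

As an arbitrary example, the address 0x80201234 splits into VPN[2] = 2, VPN[1] = 1, VPN[0] = 1 and page offset 0x234. The lowest upper-half address, 0xFFFFFFC000000000, has VPN[2] = 256.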

v0.6–v0.7 (Mar 4–5): IPC data-transfer machinery: RISC-V _setjmp/_longjmp for xfer fault recovery, all six xfer modules, core message passing linked into the kernel. The in-kernel ELF linker loads libc.qrl as a shared library, resolves all 5,746 symbols, and taskman_main() runs for the first time. Real pthread mutexes and TLS via the RISC-V tp register — not stubs.
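
The setjmp/longjmp fault-recovery idea can be illustrated in ordinary user-space C. This is only a sketch of the pattern, assuming a copy routine that arms a recovery point before touching possibly bad memory. It uses the standard setjmp/longjmp rather than the kernel's _setjmp/_longjmp, and every name in it (demo_xfer, demo_xfer_fault, the simulate_fault switch) is hypothetical.

```c
#include <errno.h>
#include <setjmp.h>
#include <stddef.h>
#include <string.h>

static jmp_buf xfer_env; /* recovery point for the transfer in progress */
static int xfer_armed;   /* non-zero while a transfer may fault */

/* Stand-in for what a trap handler would do when a copy touches a bad
 * address: abandon the copy and jump back to the recovery point. */
void demo_xfer_fault(void)
{
    if (xfer_armed)
        longjmp(xfer_env, 1);
}

/* Copy len bytes. Returns 0 on success or -EFAULT if the copy "faulted".
 * simulate_fault is a test hook that replaces a real page fault. */
int demo_xfer(void *dst, const void *src, size_t len, int simulate_fault)
{
    if (setjmp(xfer_env) != 0) {
        /* We arrive here via longjmp from the fault path. */
        xfer_armed = 0;
        return -EFAULT;
    }
    xfer_armed = 1;
    if (simulate_fault)
        demo_xfer_fault();
    memcpy(dst, src, len);
    xfer_armed = 0;
    return 0;
}
```

The point of the pattern is that the copy loop itself needs no per-access checks: an invalid user address turns into an error return for the caller instead of a kernel crash.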

v0.10 (Mar 11): SMP bring-up (six CPUs). FreeBSD stdio ported to libc. esh built as a real dynamically-linked executable and placed in the CPIO image.

v0.13–v0.15 (Mar 13–15): Kernel moves to Sv39 upper-half. Cooperative context switching. Ecalls routing through the full ker_call_table[] (~100 calls). The runtime linker self-relocates with zero PLT imports. First complete IPC round-trip: ConnectAttach + MsgSendv to the path manager, fstat via _IO_STAT to cpiofs returning a real file size, write to /dev/console producing visible output in the terminal.
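
Routing ecalls through a call table amounts to indexing an array of function pointers with the call number. The sketch below shows that shape with a two-entry table. The call numbers, the handler signatures, and the choice of -ENOSYS for an unknown number are assumptions made for illustration and do not reflect the real ker_call_table[] layout.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

typedef long (*ker_call_fn)(uint64_t a0, uint64_t a1);

/* Hypothetical call numbers, not the real QNX or QRV values. */
enum { DEMO_KER_NOP = 0, DEMO_KER_CHANNEL_CREATE = 1, DEMO_KER_CALL_COUNT };

static long demo_next_chid = 1;

static long demo_ker_nop(uint64_t a0, uint64_t a1)
{
    (void)a0; (void)a1;
    return 0;
}

/* Hands out channel ids starting at 1, as in the v0.3 test binary. */
static long demo_ker_channel_create(uint64_t flags, uint64_t unused)
{
    (void)flags; (void)unused;
    return demo_next_chid++;
}

static const ker_call_fn demo_call_table[DEMO_KER_CALL_COUNT] = {
    [DEMO_KER_NOP]            = demo_ker_nop,
    [DEMO_KER_CHANNEL_CREATE] = demo_ker_channel_create,
};

/* What a trap handler would do after decoding the call number and
 * arguments from the saved user registers. */
long demo_ker_dispatch(uint64_t num, uint64_t a0, uint64_t a1)
{
    if (num >= DEMO_KER_CALL_COUNT || demo_call_table[num] == NULL)
        return -ENOSYS;
    return demo_call_table[num](a0, a1);
}
```

The bounds check before the table lookup matters because the call number comes straight from user space.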

v0.16 (Mar 20): Shell prompt. pwd → getcwd → _connect_ctrl → path manager → cpiofs → reply with /. ls /rd/bin spawns as pid 4099, rtld loads libc.qrl, opendir/readdir/stat through message passing, directory listing printed, process exits cleanly.

What v0.16 Looks Like

The terminal at this stage still has trace noise around everything. Here is a relevant excerpt:

rtld: transferring control to program entry
rtld: _rtld done
# echo "Greetings from the first ever boot of QRV with the userspace shell!"
Greetings from the first ever boot of QRV with the userspace shell!
# pwd
/
# ls
No such file or directory
# /rd/bin/ls
procmgr_spawn: SPAWN_DONE from pid=4099
rtld: entry point
rtld: init_rtld entered
rtld: before relocate_objects
...

ls without a path fails correctly: the shell does not yet add /rd/bin to the search path, spawnp returns -1 with errno=ENOENT, and the shell prints "No such file or directory". /rd/bin/ls works: spawn, ELF load, Sv39 page tables, dynamic linking, IPC — the full chain.

What the Codebase Is

For clarity: QRV is not a patch on the original QNX sources. It is a ground-up reworking of the 32-bit ILP32 codebase into a 64-bit LP64 system for RISC-V, with deliberate simplifications:

Adaptive Partition Scheduling removed
Callouts and mini-drivers removed; direct hardware abstraction instead
QNX IFS image format replaced with a CPIO module package
"procnto" renamed to taskman
Startup and kernel linked into a single binary
Build system replaced with plain GNU Make + Kconfig

What is preserved is the core architecture: the nanokernel handles only message passing, scheduling, synchronization, and interrupt dispatch; taskman contains the process, memory, and path managers as threads in a single privileged process; user space communicates through the full QNX send–receive–reply IPC protocol. The codebase is currently about 90,000 non-comment lines of C and assembly.
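
To make the send–receive–reply protocol concrete, here is a toy single-threaded model of the client-side blocking states. It is not the QNX API: the real calls are MsgSend, MsgReceive and MsgReply, and the kernel blocks and wakes threads. In this sketch the state names and demo_* functions are hypothetical, and the client always sends before the server receives.

```c
/* Client states during one synchronous message exchange. */
enum demo_client_state {
    CL_READY,         /* not involved in a message exchange */
    CL_SEND_BLOCKED,  /* has sent, server has not received yet */
    CL_REPLY_BLOCKED  /* server has the message, reply still pending */
};

struct demo_channel {
    enum demo_client_state client;
    int request; /* message from client to server */
    int reply;   /* message from server to client */
};

/* Client side: hand over a request and block. */
void demo_msg_send(struct demo_channel *ch, int request)
{
    ch->request = request;
    ch->client = CL_SEND_BLOCKED;
}

/* Server side: pick up a pending request. Returns -1 if none is waiting. */
int demo_msg_receive(struct demo_channel *ch, int *request)
{
    if (ch->client != CL_SEND_BLOCKED)
        return -1;
    *request = ch->request;
    ch->client = CL_REPLY_BLOCKED;
    return 0;
}

/* Server side: answer the request, which unblocks the client. */
int demo_msg_reply(struct demo_channel *ch, int reply)
{
    if (ch->client != CL_REPLY_BLOCKED)
        return -1;
    ch->reply = reply;
    ch->client = CL_READY;
    return 0;
}
```

Because the client stays blocked until the reply arrives, a resource manager can treat each request as a plain function call made by another process.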

The Bugs That Took Time

The LP64 transition produces a category of bugs that are invisible on 32-bit and crash reliably on 64-bit. A few that required real digging:

Struct layout mismatch. tChannel and tConnect are cast between each other in the terminator cleanup loop. On ILP32 both had the type discriminator at offset 12. After the LP64 rework, tChannel had acquired a uint32_t flags field before type, shifting it to offset 16. The code was still reading offset 12 — which is now a pointer — and treating it as a type discriminator. Restoring the original QNX field order fixed it.
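
A reduced illustration of this class of bug, using hypothetical structs rather than the real tChannel and tConnect definitions: two types share a leading layout so that code can read the discriminator through either view, and an inserted field silently moves it. In this demo the stale offset lands on the flags field rather than on a pointer, but the effect is the same. A compile-time check on the shared offset catches the drift.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layouts, chosen so the offsets do not depend on pointer size. */
struct demo_connect     { uint64_t link; uint32_t id; uint32_t type; };
struct demo_channel_bad { uint64_t link; uint32_t id; uint32_t flags; uint32_t type; };
struct demo_channel_ok  { uint64_t link; uint32_t id; uint32_t type; uint32_t flags; };

/* Guard: both views must agree on where the discriminator lives. The same
 * assertion against demo_channel_bad would fail to compile. */
_Static_assert(offsetof(struct demo_connect, type) ==
               offsetof(struct demo_channel_ok, type),
               "type discriminator must sit at the same offset in both structs");

/* Read the discriminator the way cleanup code that only knows the
 * connect layout would. */
uint32_t type_via_connect_view(const void *obj)
{
    uint32_t t;
    memcpy(&t, (const char *)obj + offsetof(struct demo_connect, type), sizeof t);
    return t;
}
```

With the bad layout, type sits at offset 16 and the connect view at offset 12 returns whatever flags happens to hold.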

Unsigned underflow. In _connect_ctrl, the expression save[path_skip + pad_len - 1] with pad_len=0 and unsigned path_skip=0: 0U - 1 is 0xFFFFFFFF. On 32-bit this wraps within the address space and silently reads adjacent stack memory. On LP64 it zero-extends to 0x00000000FFFFFFFF and produces an address four gigabytes past where you intended to be. This bug was present in the original QNX sources; LP64 just made it crash instead of silently misbehave.
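
The arithmetic can be reproduced in isolation by modelling the address computation with explicit widths. This is a sketch with hypothetical function names, and the guarded variant at the end is one possible fix, not necessarily the one applied in QRV.

```c
#include <stddef.h>
#include <stdint.h>

/* With 32-bit pointers, base + 0xFFFFFFFF wraps around to base - 1. */
uint32_t effective_addr_ilp32(uint32_t base, uint32_t path_skip, uint32_t pad_len)
{
    return base + (path_skip + pad_len - 1u);
}

/* With 64-bit pointers, the 32-bit index is zero-extended before being
 * added, so the result lands almost 4 GiB above base. */
uint64_t effective_addr_lp64(uint64_t base, uint32_t path_skip, uint32_t pad_len)
{
    return base + (uint64_t)(path_skip + pad_len - 1u);
}

/* Guarded variant: refuse to form "length - 1" when the length is zero.
 * Returns 0 and stores the index, or -1 if there is no last element. */
int last_index(uint32_t path_skip, uint32_t pad_len, size_t *out)
{
    size_t n = (size_t)path_skip + pad_len;
    if (n == 0)
        return -1;
    *out = n - 1;
    return 0;
}
```

With a base of 0x1000 and both lengths zero, the 32-bit model yields 0x0FFF (one byte before the buffer) while the 64-bit model yields 0x100000FFF.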

Error sign convention. QNX on MIPS used the a3 register as an error flag and kept the error code positive in the return value. RISC-V uses negative return values to signal errors. kererr() was ported from MIPS verbatim, so MsgError(rcvid, EACCES) made the client's MsgSendvnc return +13 — which looks like a valid PID. Took a while to find.
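
The mismatch is easy to model outside the kernel. In the sketch below, one helper hands back an error in the flag-plus-positive-code style the post attributes to the MIPS port, another uses the negative-return style, and a client stub written for the negative style decodes both. All function names are hypothetical.

```c
#include <errno.h>

/* Flag style: positive error code in the return value, separate error flag. */
long kererr_flag_style(int err, int *err_flag)
{
    *err_flag = 1;
    return err;
}

/* Negative style: the return value itself carries -errno. */
long kererr_negative_style(int err)
{
    return -(long)err;
}

/* Client stub for the negative style: converts a negative raw result into
 * the usual "-1 and errno" contract, passes anything else through. */
long stub_decode(long raw)
{
    if (raw < 0) {
        errno = (int)-raw;
        return -1;
    }
    return raw;
}
```

Feeding the flag-style result into this stub returns the positive error code unchanged, so the caller sees what looks like a successful result. That is the symptom described above.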

Terminator deadlock. The pool thread held the process lock while calling termer_start(). The terminator then called proc_lock_pid() on the same process. With a spin+yield mutex on uniprocessor this is a clean deadlock: the terminator spins and yields, the pool thread never runs again to release the lock. Fix: one line, release the lock before calling termer_start(). Simple in retrospect, invisible until you hit it.
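
The lock ordering problem can be shown without threads by using a try-lock in place of the spin+yield loop, so the demo reports failure instead of hanging. This is a sketch built on C11 atomics; the lock type and function names are hypothetical, and terminator_can_proceed merely stands in for the terminator's attempt to take the process lock.

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef struct { atomic_flag held; } demo_lock;

static bool demo_trylock(demo_lock *l)
{
    return !atomic_flag_test_and_set(&l->held);
}

static void demo_unlock(demo_lock *l)
{
    atomic_flag_clear(&l->held);
}

/* Stand-in for the terminator: it needs the process lock. A real spin+yield
 * mutex would loop here forever; the demo just reports that it could not. */
static bool terminator_can_proceed(demo_lock *proc_lock)
{
    if (!demo_trylock(proc_lock))
        return false;
    demo_unlock(proc_lock);
    return true;
}

/* Buggy order: start the terminator while still holding the lock. */
bool start_terminator_buggy(demo_lock *proc_lock)
{
    demo_trylock(proc_lock);
    bool ok = terminator_can_proceed(proc_lock);
    demo_unlock(proc_lock);
    return ok;
}

/* Fixed order: release the lock first, then start the terminator. */
bool start_terminator_fixed(demo_lock *proc_lock)
{
    demo_trylock(proc_lock);
    demo_unlock(proc_lock);
    return terminator_can_proceed(proc_lock);
}
```

Starting from an unlocked `demo_lock` (initialized with ATOMIC_FLAG_INIT), the buggy variant returns false and the fixed variant returns true.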

A Note on Tooling

Several commits carry:

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

Claude was useful as a reasoning aid — working through LP64 alignment issues, reviewing diffs, generating boilerplate for repetitive fixes. For a solo project where you are simultaneously holding the RISC-V spec, QNX architecture documentation, and the current call stack in your head, having something that can follow a long technical explanation without losing track is genuinely useful. It does not replace understanding the code. It helps with the tedious parts.

The Bigger Picture

QRV is one person's port right now. That is a limitation worth being honest about.

The QNX architecture — synchronous message passing, resource managers as ordinary user-space processes, a nanokernel that does only what a kernel strictly must — is worth preserving and developing further. Modern QNX 8 has moved in a specific direction: safety certification, automotive, aerospace. That is a legitimate business direction and it is what BlackBerry does with QNX today. But it leaves room for a different kind of effort: more hardware targets, more experimentation, an open community developing the architecture in directions that commercial QNX does not pursue. The goals would not overlap. BlackBerry would have no reason to object.

For that to work properly, the licensing needs to be resolved.

The QNX Community License 2.0 permits study and non-commercial use, but it does not permit the kind of open collaboration that lets a contributor community actually form. Anyone who wants to build on QRV today has to obtain the original QNX sources separately, apply patches, and navigate a license that was designed for academic use rather than community development.

There is now an open petition asking QNX Software Systems and BlackBerry to relicense the historical 2007–2009 QNX Neutrino source code under the Apache License 2.0. The argument is straightforward: the code is seventeen years old, modern QNX products have moved far beyond it, and open-sourcing this historical snapshot would not affect BlackBerry's commercial business. What it would allow is a community of OS developers who care about microkernel architecture to build on solid foundations — rather than recreating everything from scratch or navigating a license that was never designed for this purpose.

If you work on operating systems, embedded systems, or real-time software and think this architecture is worth preserving: please consider signing the petition.

What Comes Next

Next on the list: SMP stabilization (four CPUs reliable under the full IPC stack), more user-space utilities, and a device driver framework. Known issues: an occasional instruction page fault on process exit, a timer crash under heavy load, and SMP that is not yet reliable with the full IPC stack. All are documented in the README.