![]() |
| In this case it is portable, because the Zig compiler source tree includes an interpreter for the blob (WASM) in portable C.
It's not objectionable to have non-portable source code anyway. I think it's fine having architecture-specific assembly code, just as long as it's hand-written. The problems arise when you're storing generated content in the source repository, because it becomes unclear how you're meant to understand and fix the generated content. In this case it seems like the way to fix it is by rerunning the compiler, but if running the compiler involves running this incorrect blob, it's not clear that running the compiler again will produce a correct blob.
I wonder if anyone is monitoring these commits in Zig to ensure that the blobs are actually generated genuinely, since if not it seems like an easy way for someone to inject a KTH (Ken Thompson Hack): https://github.com/ziglang/zig/commits/master/stage1/zig1.wa... |
![]() |
| This is 100% the case. All of the honest-to-god Rust experts I know work on the compiler in some way. Same goes for Lean, which bootstraps from C as well. |
![]() |
| You often write two compilers when trying to bootstrap a C compiler, as GCC used to do. Often, it's a very simple version of the language implemented in the architecture's assembly. |
![]() |
| That was my first introduction to C and I hacked a lot on that code. A very enjoyable time was had.
My only regret with Rust is that a “Small Rust Compiler” will be an order of magnitude larger. |
![]() |
| But we used simpler forms like oil to bootstrap renewables. For instance, making solar panels takes lots of energy. Would it be possible to go straight to renewables? |
![]() |
| We all know why the Lovelace myth still persists
http://projects.exeter.ac.uk/babbage/ada.html
"It is often suggested that Ada was the world's first programmer. This is nonsense: Babbage was, if programmer is the right term. After Babbage came a mathematical assistant of his, Babbage's eldest son, Herschel, and possibly Babbage's two younger sons. Ada was probably the fourth, fifth or six person to write the programmes. Moreover all she did was rework some calculations Babbage had carried out years earlier. Ada's calculations were student exercises. Ada Lovelace figures in the history of the Calculating Engines as Babbage's interpretress"
|
![]() |
| Or to take it another direction - how do they gestate? At what point can we call it a chicken and when does the shell (assuming that's what would make us call it an egg) develop? |
![]() |
| It can be difficult to explain why bootstrapping is important. I put a "Why?" section in the README of my own bootstrapping compiler [0] for this reason.
Security is a big reason and it's one the bootstrappable team tend to focus on. In order to avoid the trusting trust problem and other attacks (like the recent xz backdoor), we need to be able to bootstrap everything from pure source code. They go as far as deleting all pre-generated files to ensure that they only rely on things that are hand-written and auditable. So bootstrapping Python for example is pretty complicated because the source contains code generated by Python scripts.
I'm much more interested in the cultural preservation aspect of it. We want to preserve contemporary media for future archaeologists, for example in the Arctic World Archive [1]. Unfortunately it's pointless if they have no way to decode it.
So what do we do? We can preserve the specs, but we can't really expect them to implement x265 and everything else they would need from scratch. We can preserve binaries, but then they'd need to either get thousand-year-old hardware running or virtualize a thousand-year-old CPU. We can give them, say, a definition of a simple Lisp, and then give them code that runs on that, but then who's going to implement x265 in a basic Lisp? None of this is really practical.
That's why in my project I made a simple virtual machine, then bootstrapped C on top of it. It's trivially portable, not just to present-day architectures but to future and alien architectures as well. Any future archaeologist or alien civilization could implement the VM in a day, then run the C bootstrap on it, then compile ffmpeg or whatever and decode our media. There are no black boxes here: it's all debuggable, auditable, open, handwritten source code.
[0]: https://github.com/ludocode/onramp?tab=readme-ov-file#why-bo... |
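To give a feel for how small a "simple virtual machine" can be, here is a minimal sketch of a toy register machine. The opcode names (`imm`, `add`, `sub`, `jnz`, `halt`) and the tuple encoding are hypothetical and invented for illustration; this is not the instruction set of the project linked above.

```python
# Toy register VM. The instruction set is an assumption made up for this
# sketch, not taken from any real bootstrapping project.
def run(program, registers=None):
    """Execute a list of (opcode, a, b, c) tuples and return the registers."""
    regs = [0] * 4 if registers is None else list(registers)
    pc = 0  # index of the next instruction to execute
    while pc < len(program):
        op, a, b, c = program[pc]
        pc += 1
        if op == "imm":        # regs[a] = literal value b
            regs[a] = b
        elif op == "add":      # regs[a] = regs[b] + regs[c]
            regs[a] = regs[b] + regs[c]
        elif op == "sub":      # regs[a] = regs[b] - regs[c]
            regs[a] = regs[b] - regs[c]
        elif op == "jnz":      # jump to instruction index b if regs[a] != 0
            if regs[a] != 0:
                pc = b
        elif op == "halt":     # stop execution
            break
        else:
            raise ValueError(f"unknown opcode: {op}")
    return regs


# Sum 5 + 4 + 3 + 2 + 1 using a countdown loop.
program = [
    ("imm", 0, 0, 0),   # r0 = 0 (accumulator)
    ("imm", 1, 5, 0),   # r1 = 5 (counter)
    ("imm", 2, 1, 0),   # r2 = 1 (constant one)
    ("add", 0, 0, 1),   # r0 = r0 + r1
    ("sub", 1, 1, 2),   # r1 = r1 - 1
    ("jnz", 1, 3, 0),   # loop back to index 3 while r1 != 0
    ("halt", 0, 0, 0),
]

result = run(program)
print(result[0])  # accumulator after the loop
```

A real bootstrap VM would also need memory, byte-level I/O, and a fixed binary encoding, but the interpreter loop keeps this same fetch, decode, execute shape.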
![]() |
| I'm curious as to why you need to bootstrap at all? Why not start with adding the OS/kernel as a target for cross-compilation and then cross-compile the compiler? |
![]() |
| The article mentions that the Bootstrappable Builds folks don't allow pre-generated code in their processes, they always have to build or bootstrap it from the real source. |
![]() |
| that's interesting! what kind of os did you write? it sounds like you didn't think supporting the linux system call interface was a good idea, or perhaps even feasible? |
![]() |
| I'm not sure I see the point. To generate functional new binaries on the target machine, rustc will need to support the target. If you add that support to rustc, you can just have it build itself. |
![]() |
| > perhaps a half a dozen times until you get to today's rust.
Perhaps? It was already more than that in 2018: https://guix.gnu.org/blog/2018/bootstrapping-rust/
That was back in 2018. Today mrustc can bootstrap rustc 1.54.0, but the current rustc version is 1.80.1. So if the number of steps still scales similarly, then today we're probably looking at ~26 rustc compilations to get to the current version. And please read that while keeping in mind how long Rust compilation takes. |
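The ~26 figure above can be reproduced with a quick count. This sketch assumes that each 1.x release has to be built by the release immediately before it, so the chain needs one compilation per minor version; the function name is made up for illustration.

```python
def rustc_chain_length(start_minor: int, target_minor: int) -> int:
    """Count the rustc builds needed to get from 1.<start_minor> to
    1.<target_minor>, assuming one build per minor release."""
    if target_minor < start_minor:
        raise ValueError("target release must not be older than the start")
    return target_minor - start_minor


# mrustc gets you rustc 1.54; the comment cites 1.80 as the current release.
print(rustc_chain_length(54, 80))  # 26
```

If some releases in the chain can be skipped, the real count would be lower; the subtraction is only an upper-bound estimate under the stated assumption.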
![]() |
| > It's about having a shorter auditable bootstrap process
Yeah, in 2018 the chain looked like this[1]:
Though for me it's less the auditable part, and more that I would be able to build the compiler myself if I wanted, without jumping through so many unnecessary hoops. For the same reason I like having the source code of programs I use, even if most of the time I just use my package manager's signed executable.
And if someone open sources their program, but then makes the build process deliberately convoluted, then to me that starts to smell like malicious compliance ("it's technically open source"). It's still a gift since I'd get the code either way, so I appreciate that, but my opinion would obviously be different between someone who gives freedoms to users in a seemingly-reluctant way vs someone who gives freedoms to users in an encouraging way. |
![]() |
| It’s a huge project, I wonder if it wouldn’t be simpler to try to compile cranelift or mrustc to wasm (that’s still quite difficult) then use wasm2c to get a bootstrap compiler. |
![]() |
| The article mentions that the Bootstrappable Builds folks don't allow pre-generated code in their processes, they always have to build or bootstrap it from the real source. |
![]() |
| The ultimate answer given later in the above-linked comment is that bootstrapping with FORTH is a great idea but programming in FORTH isn't fun enough to follow up on the notion. |
![]() |
| > It’s basically code alchemy.
More like archaeology. Alchemy was essentially magic, but there's nothing magic about bootstrapping from hex-punched assembly. |
![]() |
| if it is smaller, doesn't it mean that it has less code to execute, and hence it should be faster? Trying to understand better -- this is something completely new for me |
![]() |
| Why not write the compiler in Rust, then compile it to assembly, and then use some disassembler/decompiler to compile that back to portable C? |
![]() |
| The article mentions that the Bootstrappable Builds folks don't allow pre-generated code in their processes, they always have to build or bootstrap it from the real source. |
![]() |
| This is why the aforementioned ABI (of the latter language in the title of this post) won't die for a long time. The name of the game is compatibility, not performance/security. Bell Labs was first. |
![]() |
| TL;DR his goal is rust, but for bootstrapping a first rust compiler for a new environment, the work is already done for C
the article is interesting, and links to some interesting things, but that's what the article is about
his project is https://codeberg.org/notgull/dozer
he references bootstrappable builds https://bootstrappable.org/, a systematized approach that starts from the ground up with a very simple 512 byte "machine coder" (more basic than an assembler) and builds up from there: rudimentary tools, a "small C subset compiler" which compiles a better C compiler, etc, turtles all the way up. |
![]() |
| 5kloc is pretty light for a `rustc`; where are the tests showing what aspects of the grammar are supported so far in @notgull's crowning achievement? The article might be longer than the source code, which would be extremely impressive if the thing actually worked :)
I was not able to compile tokio with dozer.
For comparison, turn towards the other major lang HN submission today: a Golang compiler written in PHP. It comes with extensive tests showing what works and what does not. Somehow even the goroutines are working... in PHP.
Golang interpreter written in PHP - https://github.com/tuqqu/go-php - https://news.ycombinator.com/item?id=41339818
Godspeed. |
![]() |
| > Remembered this article... https://drewdevault.com/2019/03/25/Rust-is-not-a-good-C-repl...
Remembering Drew Devault is the Fox News of programming bloggers. He exhibits the same sort of bad faith obtuseness, and knee-jerk neck beard tech conservatism, that makes me/many want to scream.
First, his thesis is risible. "Rust is not a good C replacement". Note, Drew does not mean replace C code with Rust code, but Rust, the language, literally replacing C, the language. Ignoring, perhaps, Rust doesn't want to "replace" C, because we have C!
Next, see the bulleted text. Upon each topic something interesting might be said re: Rust, but instead they all serve a garbage thesis that Rust can never be the 50 year old language that the tech world is currently built upon. Well, duh.
My least favorite, though, is the final bullet:
> Safety. Yes, Rust is more safe. I don’t really care. In light of all of these problems, I’ll take my segfaults and buffer overflows.
And everyone wants to be a cowboy and watch things blow up when they are 8 years old. |
![]() |
| What does that article have to do with this article? The author of the latter article even says that they don’t enjoy writing C, which is kind of the opposite of what your article says |
‘proto-rust’ might, for example, not have a borrow checker, may have limited or no macro support, may never free memory (freeing memory isn’t strictly needed in a compiler whose only goal in life is to compile a better compiler), and definitely need not create good code.
That proto-rust would basically be C with rust syntax, but for rust aficionados, I think that’s better than writing a rust compiler in “C with C syntax” that this project aims for.
Anybody know why this path wasn’t taken?