7/2/2025 by Kadhir
So far, I've only used LLMs as an assistant: I'm doing something, and an LLM helps me along the way. Code autocomplete is a great example of how useful that can be when the model gets it right. I don't doubt this will improve over time, but I'm excited to see a more significant transition from this assistant mode to a compiler mode, at least for coding.
It will be exciting when we focus solely on the context we feed the LLM, then test the features it generates rather than the code. And importantly, we let the LLM handle integrating new features into the existing codebase. That means we no longer examine the code; our time as engineers will be spent handling context, testing features, and iterating on them.
The consequences of that seem to be:
- Democratized access to engineering
  - You won't need as specialized a skillset to build complex apps; you'll just need to know how to put context together and iterate
- Increased velocity of feature development
  - My gut says dealing with context will give a better ratio of engineering time to features shipped than dealing with code directly
The obvious pushback here is that, well, compilers are provable. There's a straightforward mapping between inputs and outputs, and we can prove the outputs are the same each time. We can also write tests to ensure the outputs are optimized.
But if we squint, a compiler just transforms an input into an output. If we treat the code as an intermediate layer, viewing the input as context and the output as features, then we can demonstrate that the compiler is reliable through evaluations and testing. And importantly, we don't have to get the output right on the first go; we can let it iterate over and over until it gets it right. A new kind of compiler.
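Here's a minimal sketch of that loop, nothing more. Every name in it is hypothetical: `generate` stands in for whatever LLM or agent system does the work, and the evals are plain predicates over the generated code.

```python
from typing import Callable

# A classic compiler maps source -> machine code deterministically. This
# sketch maps context -> code (the intermediate layer) -> feature behavior,
# where "correct" means the evals pass, not that anything is proven.
def llm_compile(
    context: str,
    evals: list[Callable[[str], bool]],   # behavioral checks on the output
    generate: Callable[[str], str],       # hypothetical LLM/agent call
    max_iters: int = 10,
) -> str:
    """Iterate on the generated code until every eval passes."""
    prompt = context
    for _ in range(max_iters):
        code = generate(prompt)
        failing = [e.__name__ for e in evals if not e(code)]
        if not failing:
            return code                   # a cacheable intermediate artifact
        # Feed the failures back so the next attempt can improve.
        prompt = f"{context}\n\nFailing evals: {failing}"
    raise RuntimeError("eval budget exhausted")
```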
So I propose that if we get an LLM-as-a-compiler, then as a software engineer I'll go through this cycle (sketched in code after the list):
- Put together the context
  - Which includes a series of tests for the final behavior (perhaps I use an LLM for this)
- Put it through the LLM compiler
  - Which is probably a system composed of several pieces
  - Which continually iterates on the output until all the tests pass
  - Ideally, as the LLM compiler gets better, the latency gets lower and lower
  - We cache the output (code) for performance improvements
- Decide how I need to edit the context, and go back to step 1
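A hedged sketch of that outer loop, reusing the hypothetical `llm_compile` from above: the engineer only ever edits context and tests, and the generated code is a cached artifact keyed on the context, so an unchanged context never pays generation latency twice.

```python
import hashlib

# All names here are hypothetical stand-ins, not a real system.
def compile_cycle(context: str, evals, generate, cache: dict) -> str:
    """One trip through the cycle: recompile if the context changed, else reuse the cache."""
    key = hashlib.sha256(context.encode()).hexdigest()
    if key not in cache:                  # cache the output (code) for performance
        cache[key] = llm_compile(context, evals, generate)
    return cache[key]

# Usage: editing the context yields a new key, which triggers a recompile.
# cache = {}
# code = compile_cycle(spec_v1, behavior_tests, my_llm, cache)
# code = compile_cycle(spec_v2, behavior_tests, my_llm, cache)  # re-runs only on change
```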
SWE agents feel like the right abstraction on this path; they convert context into features, iterating in the background. They'll likely be an integral part of the LLM compiler system, which I think will have the following pieces (sketched as a rough interface after the list):
- A way to specify the context of my app
  - And a way to specify which part of my context to focus on
- A mechanism for specifying my reward signal (my tests)
- A system for monitoring the changes happening
  - And a way to redirect parts of the compiler if it's not doing what I expect
  - Over time, I'd expect this part to evolve and the need to see the code to shrink
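Purely speculative, but those pieces might reduce to an interface like this. Every field name below is an assumption, not an existing API:

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

@dataclass
class CompileJob:
    context: str                                      # the full context of my app
    focus: Optional[str] = None                       # which part of the context to work on
    reward: list = field(default_factory=list)        # my reward signal: the tests
    monitor: Callable[[str], None] = print            # watch the changes happening
    redirect: Optional[Callable[[str], str]] = None   # nudge the compiler if it drifts
```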