为什么要打造Lua口译员

原文

I recently needed to build a custom Lua interpreter in Go. The exact reasons aren’t important for this blog post, but neither the reference implementation — which I will be referring to as “C Lua” throughout this article — nor the other open source Go Lua intepreters I could find were a good fit for my needs. Building a Lua interpreter ended up being a rather enjoyable months-long side quest. I’ve had a number of folks ask me to write about the experience since these sorts of projects usually highlight interesting aspects of both the implementation language and the interpreted language. So here we are!

What is Lua?

First, let’s talk a little bit about the Lua language from the perspective of an implementer. Lua is a dynamically typed language, so any variable can hold any value. Values in Lua can be one of a handful of types:

nil, the singleton null value.
Booleans: either true or false.
Numbers, which are generally 64-bit IEEE 754 floating-point numbers. Lua 5.2 introduced transparent representation of 64-bit signed integers, so if a numeric literal does not contain a fractional part nor an exponent and fits in a 64-bit signed integer (like -345 or 0x100000000), then it will be stored as an integer.
Strings, which are immutable sequences of bytes. By convention, strings are UTF-8 encoded.
Userdata, which are implementation-defined values.
Tables, which are maps of values to other values. Lua raises an error if nil or NaN (the IEEE 754 sentinel value for undefined numeric results) are used as keys. Furthermore, table entries cannot use nil as a value: assigning a nil value to an entry removes it from the table.
Functions. Functions in Lua can take any number of arguments and can return any number of values. They may be closures, which means they have references to variables from their containing function. Lua calls such references “upvalues”.

The reference manual describes the syntax of the Lua language in depth, but most of these details don’t matter at a high level: the syntax performs various operations on these data types. What’s relevant for this blog post is that Lua files get parsed into functions. Global variables accesses and assignments are transformed into operations on an implicit upvalue table called _ENV. This allows us to understand all of the Lua language solely in terms of these data types. (Neat!)

Code Structure

My Lua interpreter is split up into three Go packages: lualex, luacode, and lua. The packages form a pipeline that translate Lua source code into a running program.

The first I package wrote is lualex. It’s very small and I wrote it from scratch. The main type in the package is lualex.Scanner, which takes in a io.ByteScanner and splits the stream into tokens (words and symbols). The scanner is a fairly direct implementation of the Lexical Conventions section of the Lua Reference Manual.

After lualex was done, I ported C Lua’s parser code into a package called luacode. I wanted to port C Lua’s parser directly for performance and compatibility reasons. C Lua’s parser is notable for being geared for immediate execution. A common approach for a programming language parser is to create an abstract syntax tree that represents the source code. C Lua’s parser does not take this approach. Instead, it generates a list of instructions. These instructions can be efficiently stored (they are encoded as 32-bit integers) and executed. This format also enables the parser to perform peephole optimizations. The full data structure that the parser produces is luacode.Prototype: a tree of functions. luacode.Prototype can be serialized into a binary format, called a binary chunk, that can be stored for later use. I intentionally preserved compatibility with C Lua’s binary format in luacode. This ended up working out really well to spot correctness issues; I’ll touch on the benefits in the “What Went Well” section. After the Lua source code leaves luacode, we have a luacode.Prototype that’s ready to run.

The lua package brings everything together with the lua.State structure that represents an intepreter. lua.State can create Lua values and executes the instructions produced by luacode. Each instruction is handled by a giant switch statement run in a loop. Because the expressions have already been broken down into fine-grained instructions, the interpreter does not have to be aware of operator precedence or lexical scopes. Thus, the interpreter is largely concerned with performing operations on Lua data. The lua package provides the same general API and stack-based execution model that C Lua has. However, as we’ll see in the next section, the data representation diverges quite a bit from the C Lua implementation and thus is an independent implementation.

Data Representation

Now that we have a high-level structure of the interpreter in the lua package, let’s dive into the internal data representation. As it turns out, a Go interface type is perfect for representing Lua values:

package lua

// value is the internal representation of a Lua value.
type value interface {
  valueType() Type
}

// Type is an enumeration of Lua data types.
type Type int

// Value types.
const (
	TypeNil           Type = 0
	TypeBoolean       Type = 1
	TypeNumber        Type = 3
	TypeString        Type = 4
	TypeTable         Type = 5
	TypeFunction      Type = 6
	TypeUserdata      Type = 7
)

I use a value(nil) as Lua nil. I use extension interfaces to implement operations that have per-type implementations. A prime example is the numericValue interface:

package lua

// numericValue is an optional interface for types that implement value
// and can be coerced to a number.
type numericValue interface {
  value
  toNumber() (_ floatValue, ok bool)
  toInteger() (_ integerValue, ok bool)
}

Then, I was able to create Go data types that map directly to Lua data types. For example, here are the numeric types:

package lua

type floatValue float64

func (v floatValue) valueType() Type              { return TypeNumber }
func (v floatValue) toNumber() (floatValue, bool) { return v, true }

func (v floatValue) toInteger() (integerValue, bool) {
  f := math.Floor(float64(v))
  if f != float64(v) || f < math.MinInt64 || f >= -math.MinInt64 {
    return 0, false
  }
  return f, true
}

type integerValue int64

func (v integerValue) valueType() Type                 { return TypeNumber }
func (v integerValue) toNumber() (floatValue, bool)    { return floatValue(v), true }
func (v integerValue) toInteger() (integerValue, bool) { return v, true }

The table and userdata types are more involved, but not interesting enough to cover here. The function type is more significant. Functions written in Lua are a *luacode.Prototype plus their upvalues. Built-in functions are implemented in Go and follow a signature of:

package lua

type Function func(ctx context.Context, l *State) (int, error)

The lua.State argument is used to access function arguments, push return values, and access Lua upvalues. C Lua passes around a pointer to implement stateful C functions, but Go has first-class closures, so we can use function values directly. lua.State maintains a stack of Lua values ([]value) used as temporary function storage.

My Lua data types have a notable difference from C Lua: an ability to be “frozen”. Freezing is a concept I borrowed from Starlark where the interpreter prevents mutations on a value. To implement this, I added flags to the table and userdata value types as well as the internal upvalue representation. When the flag is set, mutations raise an error. Freezing prevents unintentional global state and permits sharing Lua data values among concurrent lua.State interpreters without copying. Data sharing is only possible because of Go’s process-wide garbage collector: C Lua’s garbage collector is limited to individual lua_State* interpreters.

Bringing it Together

Now that we’ve talked about all the pieces, let’s look at how some Lua code gets run.

We’ll start with a variable assignment:

lualex and luacode parse the source, into a LOADI (load immediate) instruction. In luac’s listing/disassembly output, this looks like:

The 0 means to store the result into “register” 0. When a Lua function starts, a number of nils are pushed onto the lua.State stack. (The parser determines the exact number by counting the maximum number of local variables that need to be available at once.) These elements in the stack are called “registers” as a nod to CPU registers. The case for LOADI in the big interpreter switch looks like this:

i := currFunction.proto.Code[frame.pc]
// ...
switch opCode := i.OpCode(); opCode {
// ...
case luacode.OpLoadI:
  ra, err := register(registers(), i.ArgA())
  if err != nil {
    return err
  }
  *ra = integerValue(i.ArgBx())
// ...
}

registers is a local helper closure that returns a slice of the lua.State stack for the current function’s registers. register is a helper function that translates the register number to a *value referencing the slice’s underlying array after performing a bounds check.

Operators have a little bit more fanfare, but end up performing operations on registers.

local x = 42
local y = x + 3

is parsed into:

LOADI  0 42
ADDI   1 0 3
MMBINI 0 3 6 0

A LOADI instruction like before.
An ADDI (add immediate) instruction. If the value in register 0 is numeric, ADDI adds the integer 3 to the value in register 0, stores the result in register 1, then skips the next instruction. Otherwise, ADDI is a no-op and the interpreter proceeds with the next instruction.
The MMBINI (binary metamethod with immediate) instruction with the __add metamethod (value 6). This is a quirk to handle operator overloading metamethods in the case register 0 has a non-numeric value. The first two arguments specify the arguments to the metamethod, in this case the value in register zero and the integer 3. The destination register is used from the previous instruction.

The switch case that handles the ADDI instruction is:

i := currFunction.proto.Code[frame.pc]
// ...
switch opCode := i.OpCode(); opCode {
// ...
case luacode.OpAddI, luacode.OpSHRI:
  r := registers()
  ra, err := register(r, i.ArgA())
  if err != nil {
    return err
  }
  rb, err := register(r, i.ArgB())
  if err != nil {
    return err
  }
  c := luacode.IntegerValue(int64(luacode.SignedArg(i.ArgC())))
  if kb, isNumber := exportNumericConstant(*rb); isNumber {
    op, ok := opCode.ArithmeticOperator()
    if !ok {
      panic("operator should always be defined")
    }
    result, err := luacode.Arithmetic(op, kb, c)
    if err != nil {
      return err
    }
    *ra = importConstant(result)
    // The next instruction is a fallback metamethod invocation.
    l.frame().pc++
  }
// ...
}

exportNumericConstant is a helper function defined as:

// exportNumericConstant converts a floatValue or an integerValue
// to a luacode.Value.
func exportNumericConstant(v value) (_ luacode.Value, ok bool) {
	switch v := v.(type) {
	case floatValue:
		return luacode.FloatValue(float64(v)), true
	case integerValue:
		return luacode.IntegerValue(int64(v)), true
	default:
		return luacode.Value{}, false
	}
}

You’ll notice that the arithmetic function is defined in the luacode package, not in the lua package. This is because the parser performs constant folding, and such arithmetic needs to be consistent with the interpreter. Constant folding allows a script like:

to be parsed as:

instead of performing the arithmetic at runtime. Such arithmetic needs to be consistent with the interpreter for correctness, and thus the function is defined in the luacode package. The parser does not attempt constant folding on variables. However, <const> variables will be folded, so the following Lua code will produce the same single LOADI instruction:

local x <const> = 42
local y = x + 3

As we can see, the bytecode approach results in fine-grained instructions that the interpreter can process one-at-a-time with short snippets of Go that operate on our value data types. This approach also enables the parser to perform some on-the-fly optimizations to reduce pressure on the interpreter. The rest of the interpreter code follows the same general pattern of reading values from registers, doing something with them, and/or storing values into other registers.

What Went Well

This custom Lua interpreter was exactly what I needed for zb. The highlights:

Go’s built-in types, garbage collector, and standard library made my interpreter simpler than the C Lua implementation. One example I noted above was being able to separate the parser from the interpreter. C Lua’s memory allocation is tied to the lua_State* type, so it was tightly coupled, but my interpreter could use Go’s built-in garbage-collected types to structure the code differently. There were similar small wins in other places like deduping the constant table in the parser and implementing table.sort.
As I mentioned before, my decision to port the parser as directly as I could made it easy for me to spot correctness issues. I created a test suite with Go’s testing package and testdata directory convention where I could create a Lua file and compare it to a golden luac output. The go-cmp package surfaces the difference in instructions, so I could quickly see failures. The C Lua parser also has some nice optimization tricks like we saw above.
As we saw above, Go’s powerful interface types map nicely to Lua value types. C Lua uses tagged C unions and other memory-unsafe tricks to make this work.
Go’s built-in testing tooling helped me spot-check lots of small parts along the way. In C Lua, there are comments like “if you change this, change this other file” to ensure that invariants hold. In my interpreter, I ensured that the invariants hold by writing unit tests, like for ceilLog2, the unary operator instruction translation and the binary operator instruction translation.
Go’s built-in benchmarking and support for pprof helped me compare the performance of different approaches while I was developing. This Lua interpreter is not highly optimized yet, but I’m confident that between Go’s tooling and control over memory layout, I can solve most any performance issue.

Challenges

This interpreter project, like any project of this scale, was not without its challenges. Here were the most notable ones:

I had to completely rethink how my interpreter handles errors compared to C Lua. C Lua uses longjmp to pop the call stack and raise errors. I could have used Go’s panic mechanism, but this runs counter to the common Go error handling convention of returning an error as the final return value, and I didn’t want to mask runtime panics using this mechanism. I ended up coming up with an approach where Lua message handlers are stored in the Lua call stack. If an error reaches the lua package, the message handler is called as appropriate, then the interpreter unwinds the stack. This has the disadvantage that Go-implemented functions can drop errors, but it also gives Go-implemented functions the flexibility to handle errors how they want. In practice, being able to see at a glance which Lua API calls can produce errors is convenient for maintenance, much like in any Go application, so the benefits outweighed the implementation complexity.
I found several Lua standard libraries to be annoying to port over. I ended up rewriting the pattern matching logic entirely because C Lua has the exponential time complexity problem described in Russ Cox’s essay “Regular Expression Matching Can Be Simple And Fast”. It’s one of the few places where I break compatibility with C Lua. I was unable to reuse the Go regexp package because it operates on UTF-8 codepoints whereas Lua depends on its patterns operating on bytes. (Lua’s test suite has several cases which assert for byte-matching behavior.) A much lesser (but vexingly pervasive) problem is the availability of functions that surface a Lua value’s raw pointer address (e.g. string.format("%p")). Go provides no guarantees about the stability of its addresses, so I ended up giving each object a unique 64-bit integer identifier on creation. This solves the issue except for strings, but I think it is a mistake to expose the pointer identity of strings anyway.
Lua’s standard library exposes access to its garbage collector. I dropped this access entirely, because it would give scripts the ability to stall the host program. The other unfortunate part is that Lua has finalizers in the form of the __gc metamethod. Finalizers in Go are not guaranteed to run and technically neither are Lua finalizers, so I “implement” finalizers by ignoring them entirely. Similarly, I don’t implement weak tables because of how they would interact with Go’s garbage collector.

EDIT(2025-06-26): Folks have pointed out that Go 1.24 introduced weak pointers in the form of the weak package. Go 1.24 had not been released when I made the decision to not support weak tables. I haven’t studied Lua’s weak tables in depth yet, but I’m not exactly sure how the semantics of table entry removal would work with Go’s weak pointers. I am also unsure how or if this should work with value freezing. For now, weak tables are simpler to omit until there’s a strong use case.
A nit-pick: Lua’s test suite was hard to set up. It is lovely to have it available and it covers many excellent edge cases, but in many cases it depends on all the libraries being available. I had to implement the math and string libraries simultaneously to test behaviors.

Although these challenges added some development time, I think the end result is better for the changes I made along the way.

Conclusion

The features that make Go productive in general — a powerful standard library, garbage collection, and ubiquitous testing — make writing an interpreter fun. This interpreter is an internal package of my zb project if you want to look at it for yourself. I’m not planning on supporting it as a standalone package right now: zb is already large enough. However, the package is fairly self-contained and available under an MIT license, so you can fork it if you want scripting facilities in your Go application.

If this blog post seems interesting, check out zb! I’ll probably write another blog post in the future that dives deeper into my decision to use Lua for zb.

(Discussion on Reddit, Bluesky, and Lobsters)