![]() |
| All of those formats are designed to be translated to machine code when maximum performance is desired, whereas Lua bytecode is designed and optimized to be interpreted directly.
One step in Lua's evolution was to change from a stack machine to a register machine (https://www.lua.org/doc/jucs05.pdf). This made the interpreter faster, but also (I suspect) more difficult to verify. I believe both Java and Wasm are stack machines (don't know about BPF). |
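To make the stack-versus-register distinction concrete, here is a minimal toy sketch in Python. The opcode names (`push`, `loadk`, `add`, `mul`) and both tiny VMs are invented for illustration and do not mirror real Lua, JVM, or Wasm instruction sets. The thing to notice is that the register program names its operands as explicit indices, while the stack program leaves them implicit.

```python
# Toy illustration only: these opcodes are hypothetical, not real Lua/JVM/Wasm.

def run_stack(program):
    """Stack machine: operands are implicit (whatever is on top of the stack)."""
    stack = []
    for op, *args in program:
        if op == "push":
            stack.append(args[0])
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack.pop()


def run_register(program, num_registers):
    """Register machine: every instruction names its operands by index."""
    regs = [0] * num_registers
    for op, dst, a, b in program:
        if op == "loadk":
            regs[dst] = a  # 'a' is a constant here; 'b' is unused
        elif op == "add":
            regs[dst] = regs[a] + regs[b]
        elif op == "mul":
            regs[dst] = regs[a] * regs[b]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return regs[0]


# (2 + 3) * 4 expressed for each machine.
STACK_PROGRAM = [("push", 2), ("push", 3), ("add",), ("push", 4), ("mul",)]
REGISTER_PROGRAM = [
    ("loadk", 0, 2, None),
    ("loadk", 1, 3, None),
    ("add", 0, 0, 1),
    ("loadk", 1, 4, None),
    ("mul", 0, 0, 1),
]

print(run_stack(STACK_PROGRAM))
print(run_register(REGISTER_PROGRAM, 2))
```

Both programs compute the same value. In the register form, each `dst`, `a`, and `b` is a number that untrusted bytecode could set to anything, which is one plausible (assumed, not sourced) reason a register design gives a verifier more to check.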
![]() |
| Luau (Roblox's variant of Lua) seems to have disabled loading bytecode from Lua completely. Per https://luau-lang.org/sandbox:
> To achieve memory safety, access to function bytecode has been removed. Bytecode is hard to validate and using untrusted bytecode may lead to exploits. Thus, loadstring doesn’t work with bytecode inputs, and string.dump/load have been removed as they aren’t necessary anymore. When embedding Luau, bytecode should be encrypted/signed to prevent MITM attacks as well, as the VM assumes that the bytecode was generated by the Luau compiler (which never produces invalid/unsafe bytecode). |
![]() |
| >> the VM assumes that the bytecode was generated by the Luau compiler (which never produces invalid/unsafe bytecode)
Yep, to that end they also have a basic bytecode verifier (only used in debug mode / when asserts are enabled) that validates that the compiler only outputs valid bytecode, and I believe they continuously fuzz the compiler to make sure those asserts can't be triggered. See https://github.com/luau-lang/luau/blob/0d2688844ab285af1ef52... It's fairly robust (and Luau bytecode isn't _that_ complex), but they made the right decision disallowing direct bytecode execution. |
![]() |
| You should never assume that any method of executing attacker-controlled code is safe, unless the project explicitly claims it is and has also put Google-level amounts of effort into supporting that claim. |
![]() |
| My interpreter only accepts print and addition to a predefined variable. Let the attackers print and count all they want.
The problem isn’t the execution, it’s the scope of what it means to “execute”. |
![]() |
| > sent me screenshots of my desktop
Damn. That's the scariest thing I've read all week. This thread is really making me consider buying another computer for all gaming related things... |
![]() |
| The first thing to look for is if the solution states clearly that it is a speculation-safe sandbox. I do think that not many will do that, but there are some. And go from there. |
![]() |
| In general, verifying programs is extremely hard, not just because of Rice's theorem but because it's so easy to miss a spot, especially for non-trivial bytecode languages like Lua's. Wasm, for example, has no concept of for loops.
It's strange that after upstream gave up on the problem as too hard, the Factorio devs chose to try to fix the verifier or write their own (not sure which of the two they did). Minetest's loadstring function forbids bytecode entirely: https://github.com/minetest/minetest/blob/9a1501ae89ffe79c38...
I wonder why Factorio mods need the ability to execute raw Lua bytecode. If they don't have it, there would be no need for a verifier.
It's quite dangerous in the first place to execute Lua code downloaded over the network. JS execution environments have gone through decades of cycles of exploit discoveries and fixes. Lua gets those as well, but on a smaller scale and with less staffing to improve security. The main protection, I guess, is that there are fewer people running malicious game servers. |
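For anyone wondering what "forbids bytecode entirely" can look like in practice, here is a rough Python sketch of the general idea, not Minetest's actual code. It relies on the fact that precompiled Lua chunks start with the signature `"\x1bLua"` (an ESC byte first), which source text should never begin with. The function names `is_precompiled_chunk` and `safe_loadstring` are made up for this example.

```python
# Hypothetical sketch: reject anything that looks like precompiled Lua bytecode.
# Precompiled chunks begin with the signature b"\x1bLua"; checking the first
# byte (ESC) is enough to tell them apart from source text.

LUA_BYTECODE_FIRST_BYTE = 0x1B  # ESC


def is_precompiled_chunk(chunk: bytes) -> bool:
    """Return True if the chunk starts like a binary (precompiled) Lua chunk."""
    return len(chunk) > 0 and chunk[0] == LUA_BYTECODE_FIRST_BYTE


def safe_loadstring(chunk: bytes) -> str:
    """Refuse bytecode; hand source text onward.

    Returning the decoded source is a stand-in for passing it to a real
    Lua compiler, which this sketch does not include.
    """
    if is_precompiled_chunk(chunk):
        raise ValueError("bytecode chunks are not allowed")
    return chunk.decode("utf-8")


print(safe_loadstring(b"print('hello')"))
```

Lua 5.2 and later also let the host pass a mode string to `load` (`"t"` for text only), which gets the same effect without a hand-rolled check.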
![]() |
| Eventually every game developer learns the hard way that they must remove the bytecode ability from Lua's loadstring() function.
E.g. here's a 12-year-old blog post on the topic from the ROBLOX developers: https://archive.is/oXPyM To be honest, it would probably be better if it were disabled by default. Its legitimate uses are pretty niche. |
![]() |
| So... this demonstrates an exploit that relies on a feature that is advertised as exploitable: loading byte code. What am I missing? |
![]() |
| > why can’t we just run it on their machine and propagate any game state changes (if the script adds an inserter, for example,)
Because that's an unbounded amount of traffic. You can reliably write data into RAM at many gigabits per second, whereas network connections are variable and many of them won't carry more than a few kilobits at the 99th percentile. (Note that you roll that 100-sided die about 30 times per second, so "1% situation" lag spikes are something you'd run into constantly.)
I sometimes use Lua commands in single player to clear biters from a certain region, for example, which removes many entities. Propagating those sorts of changes in multiplayer (or, to take a more plausible example, a wave defense that eventually spawns loads of entities at once) would cause a big lag spike if you have a few players who all need to receive this data, whereas simulating it locally on each machine is no problem.
Factorio multiplayer bandwidth is something like a dozen kilobytes per second from what I remember, and this post agrees: https://forums.factorio.com/viewtopic.php?p=125328#p125328 (I couldn't quickly find an exact number, though it must surely be out there). If you make it O(n) for every Lua-touched entity in the game, it would quickly balloon into constant megabits, and many mods would just not be viable in multiplayer for most people. |
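The "100-sided die" remark and the O(n) bandwidth worry can both be put into rough numbers. The Python sketch below is back-of-the-envelope only: the 30 rolls per second come from the comment above, while the entity count and bytes-per-entity figures are invented placeholders, not measurements of Factorio.

```python
def p_at_least_one(p_event: float, trials: int) -> float:
    """Chance that an event with per-trial probability p_event occurs at
    least once in `trials` independent trials."""
    return 1.0 - (1.0 - p_event) ** trials


def sync_bytes_per_second(entities_per_tick: int,
                          bytes_per_entity: int,
                          ticks_per_second: int) -> int:
    """Traffic needed if every script-touched entity is sent to a client."""
    return entities_per_tick * bytes_per_entity * ticks_per_second


ROLLS_PER_SECOND = 30  # figure taken from the comment above

# Probability of hitting the "1% situation" at least once per second / minute.
per_second = p_at_least_one(0.01, ROLLS_PER_SECOND)
per_minute = p_at_least_one(0.01, ROLLS_PER_SECOND * 60)
print(f"at least one 1% event per second: {per_second:.3f}")
print(f"at least one 1% event per minute: {per_minute:.6f}")

# ASSUMED numbers, purely illustrative: 500 touched entities per tick,
# 20 bytes each, 30 ticks per second.
traffic = sync_bytes_per_second(500, 20, ROLLS_PER_SECOND)
print(f"hypothetical sync traffic: {traffic * 8 / 1e6:.1f} Mbit/s per client")
```

With independent rolls, a 1% event shows up in roughly a quarter of all seconds, which is why it feels constant. Under the made-up entity numbers the sync traffic lands in the megabits per client, far above the roughly dozen kilobytes per second mentioned above.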
![]() |
| Unless I missed it (I admit I skimmed towards the end), the author does not discuss the actual remediation that was taken at all. I would love to hear more about that. |
![]() |
| Yes, but only because you might lose your job from playing too much Factorio. :) The exploit was not a risk for vanilla unmodded single players, and has been patched in any event. |
Since Lua interprets bytecode, it could check that the arguments to each instruction are meaningful: that they point to memory Lua allocated, things like that.
Turns out it doesn't do that. Feed it bytecode with invalid arguments to the instructions and it executes them anyway. The rest of the compromise follows.
Further, instead of fixing their interpreter, the game plan is to statically analyse bytecode, which turns out to only work in simple cases.
For a sandbox-friendly interpreted language this is pretty disappointing. I wonder if they'd take patches to stop the interpreter trusting its input. Presumably a performance regression is the fear there, which seems dubious when LuaJIT is the fast option anyway.
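As a rough illustration of what "stop the interpreter trusting its input" could mean, here is a toy Python register VM that validates every operand at dispatch time. It is a hypothetical sketch with invented opcodes (`loadk`, `move`, `add`), not a patch for the real Lua VM, and it says nothing about what the real performance cost would be.

```python
class BadOperand(Exception):
    """Raised when an instruction refers to a register that does not exist."""


def run_checked(program, num_registers):
    """Toy register VM that checks each register index before using it."""
    regs = [0] * num_registers

    def check(index):
        # Without this check, a negative index would silently wrap around in
        # Python, loosely analogous to an unchecked out-of-range access in C.
        if not isinstance(index, int) or not 0 <= index < num_registers:
            raise BadOperand(f"register index out of range: {index!r}")
        return index

    for op, dst, a, b in program:
        if op == "loadk":
            regs[check(dst)] = a  # 'a' is a constant; 'b' is unused
        elif op == "move":
            regs[check(dst)] = regs[check(a)]
        elif op == "add":
            regs[check(dst)] = regs[check(a)] + regs[check(b)]
        else:
            raise BadOperand(f"unknown opcode: {op!r}")
    return regs


good = [("loadk", 0, 7, None), ("move", 1, 0, None), ("add", 0, 0, 1)]
print(run_checked(good, 2))

try:
    run_checked([("move", 0, -1, None)], 2)
except BadOperand as exc:
    print("rejected:", exc)
```

The trade-off is one range check per operand on every instruction executed, versus a static verifier that has to prove the same property once, up front, for every possible path.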