Decompiling and rewriting a 2003 game from its binary in two weeks.
Resurrecting Crimsonland – Decompiling and preserving a cult 2003 classic game

Original link: https://banteg.xyz/posts/crimsonland/

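## Crimsonland: a labor of love and reverse engineering

Crimsonland, a top-down shooter released in 2003 (remastered in 2014), has had an unusual revival. Instead of a typical remake, one developer set out to reproduce the original Windows binary *exactly* in source code. Driven by nostalgia and a fascination with the game's architecture, the project started in early 2026 and used tools such as Ghidra, Codex (with GPT-5.2), and Frida for disassembly, function renaming, and runtime analysis.

The goal is not modernization but *fidelity*: replicating every bug, texture flaw, and hardcoded quirk of the original. That meant painstakingly deciphering the game's custom formats and its dependence on a DirectX 8 engine (through the Grim2D library), and building a deep understanding of code that is complex and often unconventional.

After about two weeks of work and more than 46,000 lines of code, the project is nearing completion. Core gameplay is fully working, and the developer hopes to add features such as online leaderboards and even network multiplayer. The work shows what modern AI tools can do for reverse engineering and preservation: in a few weeks it achieved what took 10tons a year, with access to the original source, for the remaster. The project is open source and invites community contributions to polish the result and preserve this beloved classic.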

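A developer known as banteg decompiled and fully rewrote the 2003 top-down shooter Crimsonland in just two weeks. The game originally shipped as a stripped DirectX 8 binary with no debug symbols, which made reverse engineering a major challenge.

banteg used Ghidra for decompilation and WinDbg and Frida for behavioral verification, then rebuilt the game in Python/Raylib, faithfully reproducing the original's behavior in about 46,800 lines of code. A detailed write-up documents the whole process, including static and runtime analysis and the reverse engineering of custom asset formats.

The project is available on GitHub. It highlights the power of modern reverse engineering tools and has sparked discussion about using LLMs in "agentic loops" to push the process further, possibly beyond the capabilities of proprietary tools used by large companies. The rebuilt Crimsonland is playable now.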
Original text

some games die quietly. they get delisted, lose their multiplayer servers, fade into the digital void. others get remastered by the original authors with slightly better graphics and a battle pass.

and then there’s the third way: you open the binary in ghidra and start naming functions.

the history

crimsonland (2003, remastered 2014, resurrected 2026) is a top-down shooter from the era when indie games were still called “shareware” and steam was something that came out of radiators. you play as a small man with a gun. things try to touch you. you shoot them. eventually there are too many things. you die. it’s perfect.

i vividly remember downloading it on a 56k modem and playing with a friend. the tiny 7.5mb game had us entertained for months. it was the first game by the finnish studio 10tons. they initially released it as a free game in 2002. i preserved a few of those early freeware versions:

v1.0.2 from may 2002 is an early prototype that establishes core mechanics.

v1.3.0 from july 2002 adds 3 music tracks not heard later in the shareware version. you can listen to them here as tracks 10-12.

v1.4.0 from september 2002 is the final free version before 10tons heard the big news: the game got picked up by reflexive arcade, a major publisher at the time. the shareware version was released in april 2003 and spread like wildfire, including on the cover CDs of various game magazines.

the shareware version is known as the v1.8.x-1.9.x series, with v1.9.8 from september 2003 gaining a cult following for having some of the most overpowered combos of any version. after reflexive shut down in 2010, there was another update, v1.9.93, that added widescreen support (960x800). that very same version later became a free bonus on gog.com when the studio made a remaster in 2014.

but let’s not get ahead of ourselves. the game got a cult following, the studio was teasing crimsonland 2 with features like network multiplayer and a fully rewritten engine. this archived blog page is the most representative of the sequel hopium. the forum was swarming with theories and excitement. here are some of the concepts that were posted (source):

by 2010 it was clear that crimsonland 2 was not going to happen. the studio had long since shifted its focus to casual mobile games (infinite money glitch) that look more like zuma deluxe than their first game.

in 2013 the studio floated a remaster idea via steam greenlight, and the game saw a steam release on june 11, 2014. a gog.com release came a month later, and osx (now macos), linux, ps4, and xbox releases followed.

crimsonland 2003

crimsonland 2014

the game has its fans, but my heart lies there, in 2003, with the original mechanics.

the project

not gonna lie, i was interested in understanding what makes this game tick at a deeper level for a long time. i tried decompiling it, i came back to making clones decades apart. something about it was alluring.

by the time i started this project on january 16, 2026, i had a pretty good understanding of things like the custom formats the game uses. i could unpack and repack assets.

in my previous attempt around the start of 2025 i loaded it in binary ninja and went back and forth with llms to gradually rename functions. this works for about three hours, then you start questioning your career choices.

now i could partially automate the loop with coding agents that never get tired, and if i set it up well enough, hopefully the errors won’t snowball. for this project i used codex with gpt-5.2 exclusively, as i found it to be the most rigorous agentic model.

about 4 days into the decompile and 653 commits in i finally understood my goal.

not a spiritual successor. not a modernization. not “inspired by.”

the goal is a complete rewrite that matches the original windows binary behavior exactly. if the original has a bug, the rewrite has the same bug. if there’s a texture that’s one pixel too small (there is), i replicate that too. the executable is the spec, and we’re writing the spec back into source code.

three rules:

  1. full fidelity. all behavior must match the gog classic build (v1.9.93, built february 2011). this is our specimen.

  2. no guessing. every reimplemented function must trace back to decompiled code or runtime evidence. when the decompiler lies, i instrument the running binary.

  3. no dependencies on the original runtime. assets load from the original archives, but all code is written from scratch.

the patient

the version of crimsonland.exe we are working with is a directx 8.1 game built in visual studio 2003 (vc++ 7.1 sp1). the binary has zero information that is helpful for reverse engineering it. it also comes with grim.dll, which is the game engine (grim2d). the remaster uses a different engine. i found NX symbols in the linux remaster (unstripped!), so i think it’s called nexus.

the binary is fascinatingly naked. 378kb of uninitialized data in the .data section. no names or types preserved for us to rely on. for the first ~800 commits i was just shooting in the dark.

the more i looked at the game, this time capsule of early-2000s game architecture, the more i understood that i wouldn’t get any help. there were no lua scripts and everything was hardcoded in the exe.

so our starting point is missing names + missing types + missing calling conventions + c++/com indirection courtesy of directx 8 and grim.dll. everything is “object pointer → vtable → function pointer call”. the usual decompiler output looked somewhat like this:

third-party libraries. grim.dll statically linked libpng, libjpeg, and zlib. it was often possible to identify the exact versions from embedded strings like png_create_read_struct("1.0.5", ...) and deflate 1.1.3, then provide the matching headers so ghidra could recognize their structs.

on day 1 i started a name_map.json, where i documented all the function renames and types identified so far. since each rename was only a guess, it was important to document the logic behind it, from the behavior observed to the name inferred. string literals, call patterns, struct sizes, and relationships to already-named functions all serve as evidence.

for example, our agent observes that a function searches for an existing entry by name, allocates a 0x24-byte entry when it is missing, strdup’s the name and value, parses a float via crt_atof_l, and is used by register_core_cvars with “cv_*” strings. it renames FUN_00402350 to console_register_cvar and adds an entry to our map.
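a rough sketch of what an evidence-backed entry could look like. the schema here (field names, keying by address, the two validation rules) is my assumption for illustration, not the actual layout of name_map.json; only the function names and the observations come from the example above.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class RenameEntry:
    """One rename plus the evidence that justifies it (hypothetical schema)."""
    address: str    # where the function lives in the binary
    old_name: str   # ghidra's auto-generated name
    new_name: str   # the name we inferred
    evidence: list[str] = field(default_factory=list)


def add_entry(name_map: dict, entry: RenameEntry) -> None:
    """Insert an entry, refusing undocumented or conflicting renames."""
    if not entry.evidence:
        raise ValueError(f"{entry.new_name}: a rename needs evidence")
    existing = name_map.get(entry.address)
    if existing and existing["new_name"] != entry.new_name:
        raise ValueError(f"{entry.address} is already named {existing['new_name']}")
    name_map[entry.address] = asdict(entry)


name_map: dict = {}
add_entry(name_map, RenameEntry(
    address="0x00402350",
    old_name="FUN_00402350",
    new_name="console_register_cvar",
    evidence=[
        "searches for an existing entry by name",
        "allocates a 0x24 entry when missing",
        "strdup's name/value and parses a float via crt_atof_l",
        "called by register_core_cvars with cv_* strings",
    ],
))
print(json.dumps(name_map, indent=2))
```

rejecting a rename with no evidence is the point of the exercise: a guess that can't cite anything is the kind of error that snowballs later.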

detangling notes.

the joy of renaming

if you look at the decompile today, you may notice it has grown by 13,000 lines to 114,473 lines. this is because ghidra initially missed quite a large chunk of functions, most notably game initialization.

there were also some deliberately obfuscated functions, like the credit secret sequence, where i needed to capture the right entrypoint at runtime and manually create a function at that address.

by day 5 i had a pretty good idea of the game structs and the engine vtable layout, so i added the header files IGrim2D.h and crimsonland_types.h. later i found that some versions of the game shipped with cl_mod_sdk_v1, which had some headers, but it was not much help because i found it when the project was already in an advanced phase.

seeing steady progress was motivating. i set up a knowledge base (with zensical) on day one and was mapping whatever patterns codex had high confidence in. here is an example of what i started with and what it looks like now. and it gets more readable with each iteration.

binary_ninja_mcp and connect the agents to it. honestly they love it, and it allows for far more fluid automatic exploration than grepping through a 100k line ghidra decompile. they can easily ask the mcp questions like “what calls this?”, “what references this?”, “show me decompile of this function”, “find functions matching a pattern”.

it can also be used for retyping and renaming, but in my case the source of truth comes from my ghidra maps, which i apply to regen the binja outputs, so i ended up not using this functionality.

runtime analysis

static analysis is at best a hypothesis. i needed a way to validate it at runtime. the game runs in wine (poorly), but the translation chain could look as ridiculous as the one below. we wouldn’t capture useful information after going through so many layers.

D3D8 → dgVoodoo → D3D11 → DXVK → Vulkan → MoltenVK → Metal

the truth is, it’s easiest to debug a windows game on windows. first i set up a vm using utm. it was a bit too slow for me to enjoy the project, so i bit the bullet and installed windows on my old macbook using bootcamp.

for runtime analysis i tried a bunch of things, and many of them were dead ends. im looking at you, x64dbg, literally the most useless and frustrating program i ever interacted with, and i got nowhere with it.

after a lot of trial and error i got two complementary tools working:

windbg

this is microsoft’s own debugger. it comes with a cli tool called cdb that shares the same engine. a cli tool is always nice, because it promises headless analysis and working agentic loops.

cdb can connect to a running process and set breakpoints, read memory, inspect callstacks, all the usual stuff. combined with our data map, it becomes a very powerful extension to speed up the ghidra mapping process.

i tried setting it up with codex with cdb -pn crimsonland.exe. it should be noted that codex runs subprocesses in a pty, so it can talk to them both ways interactively. the only problem is it’s hardwired in a way that it kills the process when it ends the turn and gives you an answer. so you can’t really talk to codex while it sets breakpoints and watchers. of course, codex is open source and you can fork it and patch this behavior, but i found a different solution that works about as reliably.

cdb supports a server/client mode, so you can set this up with a long-running server process that attaches to the game process you start, plus a couple of commands for your agent. i use a justfile so the agent just needs to know the shortcuts. the client persists no logs, so we need the server to log to an external file. finally, there is a tiny helper script that remembers the position where we last stopped reading the file and outputs the tail, so the agent can inspect the logs as if it were receiving them live.
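the tail helper is simple enough to sketch. this is a minimal version under my own assumptions: the function name, the json state file and the truncation handling are all hypothetical, and the real script may work differently.

```python
import json
from pathlib import Path


def read_new_log(log_path: Path, state_path: Path) -> str:
    """Return whatever was appended to log_path since the previous call."""
    offset = 0
    if state_path.exists():
        offset = json.loads(state_path.read_text()).get("offset", 0)

    if log_path.stat().st_size < offset:
        # the log shrank, so it was recreated (new debug session): start over
        offset = 0

    # read raw bytes so the saved offset always matches the file position
    with log_path.open("rb") as f:
        f.seek(offset)
        chunk = f.read()

    state_path.write_text(json.dumps({"offset": offset + len(chunk)}))
    return chunk.decode("utf-8", errors="replace")
```

each agent turn calls it once and only sees the lines the debugger wrote since the previous turn, which keeps the context small even when the log grows large.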

raylib for my rewrite. so far im happy with the choice.

the engine handles things like creating a game window, drawing textures, playing sounds, streaming music, and handling input. so basically it takes care of some of the stuff grim.dll does. the only thing left to do is to fill in the entire game.

the rewrite

up to this point i was greedily documenting every behavior we could infer from the executable, but we had no code of our own besides the format loaders.

i think of it like getting a noisy picture in path tracing rendering. some details start to come through but the picture is not fully clear yet. we have random things documented but we can’t know for sure if this knowledge is sufficient to reimplement the entire game. there is only one way to know.

we need to change our render settings to scanline. just kidding, but that’s how i thought about it. when my version could boot into the menu, we would have uncovered all the missing pieces that lead up to that point. when we get to the gameplay, we’d have to implement most of the systems to get there. our documentation helps, but this forward path leaves no system untouched and eventually we have our working game.

on day 6, when i got to the main menu, i found a funny path that would allow me to cover a lot of ground. i noticed the game still had demo teaser code intact, but game_is_full_version() was hardcoded to 1 in this gog version. naturally, i could write a frida script to uncrack the game and make it behave like shareware, even though this version was never intended to work that way.

the rendering in this game is pretty simple. there is a ground framebuffer (in opengl, i believe, the correct term is render target). first the game generates a terrain, and i went to great lengths to get it right. we literally have test fixtures that assert we generate the exact same picture from the same seed. then this framebuffer is used to render all sorts of decals, like bodies, blood, bullet casings, scorch marks. this simple technique allows the game to visually transform the battlefield during gameplay.
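the seed-to-picture fixture idea can be sketched roughly like this. it is a toy, not the game's terrain code: the stamping pass, the buffer size and the function names are made up, and using the classic msvc rand() constants is an assumption about where the game's randomness comes from.

```python
import hashlib


class CrtRand:
    """LCG with the constants of the classic msvc rand().

    assumption: the terrain pass draws from a generator like this one.
    """

    def __init__(self, seed: int) -> None:
        self.state = seed & 0xFFFFFFFF

    def next(self) -> int:
        self.state = (self.state * 214013 + 2531011) & 0xFFFFFFFF
        return (self.state >> 16) & 0x7FFF


def generate_terrain(seed: int, width: int = 64, height: int = 64,
                     stamps: int = 200) -> bytes:
    """Stamp random shades into a grayscale buffer (stand-in for the real pass)."""
    rng = CrtRand(seed)
    pixels = bytearray(width * height)
    for _ in range(stamps):
        # the order of rng calls matters: swapping two draws changes the picture
        x = rng.next() % width
        y = rng.next() % height
        shade = rng.next() % 256
        pixels[y * width + x] = shade
    return bytes(pixels)


def terrain_digest(seed: int) -> str:
    """Hash the generated picture so a fixture can store one short string."""
    return hashlib.sha256(generate_terrain(seed)).hexdigest()
```

in a real fixture the expected digest would come from a frame captured from the original binary rather than from our own code, otherwise the test only proves that we agree with ourselves.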

the creatures come from sprite atlases you’ve seen above. the projectiles render with either beautiful traces that stay in the air, or at most with a simple texture and additive glow.

so rendering was not a huge obstacle, the hard part was that everything in this game is hardcoded in the exe. for example, all quest spawn scenarios are just one massive 3950 line switch statement. the indirection i mentioned has caused me a bunch of headaches, and i hit off-by-one errors for some things a few times, before they got pinned down with runtime analysis.

obviously, i needed our version to be more testable, so some of it was destined to be refactored into composable and testable bits. testing is important, because you can turn runtime captures into fixtures, and prove that your behavior is identical. for example, this is how quest builders look in my version (and spawn templates are another work of art).
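to make the refactoring idea concrete, here is one way a giant switch could be split into small registered builders. everything in it is hypothetical: the Spawn fields, the registry, the quest id and all the numbers are invented for illustration and are not taken from the actual repo.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Spawn:
    time_ms: int    # trigger time relative to quest start
    creature: str   # made-up creature id
    count: int
    x: float
    y: float


QUEST_BUILDERS: dict[str, Callable[[], list[Spawn]]] = {}


def quest(quest_id: str):
    """Register a builder function under a quest id (replaces one switch case)."""
    def register(fn):
        QUEST_BUILDERS[quest_id] = fn
        return fn
    return register


@quest("1.1")
def build_example_quest() -> list[Spawn]:
    # invented numbers: three waves alternating between left and right edges
    return [
        Spawn(time_ms=500 + wave * 2000, creature="zombie", count=2 + wave,
              x=-64.0 if wave % 2 == 0 else 1088.0, y=512.0)
        for wave in range(3)
    ]


def build_quest(quest_id: str) -> list[Spawn]:
    """Look up and run the builder for a quest."""
    return QUEST_BUILDERS[quest_id]()
```

because a builder returns plain frozen dataclasses, a spawn list captured at runtime can be compared to it with a single equality assert.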

soft shadows in raymarched signed distance fields, so we can have long soft shadows with realistic penumbras. it works really well too, i might integrate it into the game as an optional night mode.

an early prototype of signed distance field raymarched lighting
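for the curious, the general soft shadow trick for signed distance fields fits in a few lines. this is the widely used "track the closest miss along the ray" estimate in 2d, not the prototype's code: the circle-only scene, the parameter names and the defaults are my assumptions.

```python
import math

Circle = tuple[float, float, float]  # center x, center y, radius


def scene_sdf(x: float, y: float, circles: list[Circle]) -> float:
    """Signed distance from a point to the nearest circle (negative inside)."""
    return min(math.hypot(x - cx, y - cy) - r for cx, cy, r in circles)


def soft_shadow(px: float, py: float, lx: float, ly: float,
                circles: list[Circle], k: float = 8.0,
                eps: float = 1e-3, max_steps: int = 64) -> float:
    """Fraction of light reaching (px, py) from a light at (lx, ly), in [0, 1]."""
    dx, dy = lx - px, ly - py
    dist = math.hypot(dx, dy)
    if dist < eps:
        return 1.0
    dx, dy = dx / dist, dy / dist

    light = 1.0
    t = 10 * eps  # start slightly off the shaded point
    for _ in range(max_steps):
        if t >= dist:
            break  # reached the light without hitting anything
        h = scene_sdf(px + dx * t, py + dy * t, circles)
        if h < eps:
            return 0.0  # the ray hit an occluder: full shadow
        # a near miss darkens the point; smaller k gives a wider penumbra
        light = min(light, k * h / t)
        t += h  # sphere tracing: safe to advance by the distance to the scene
    return max(0.0, min(1.0, light))
```

a point directly behind an occluder gets 0, a point with a clear line to the light gets 1, and rays that graze an occluder land in between, which is where the penumbra comes from.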

the implications

in 2014, it took a year to port the game from directx 8 to directx 11, while having the sources. and it wasn’t even complete until later, some modes like typ-o-shooter landed as updates later on.

it took us a bit more than a year from the old windows version from 2003 to the current multiplatform version of crimsonland. – 10tons

in 2026, it took me just two weeks and 1666 commits to rewrite the game, starting from nothing but the worst case scenario binaries. the whole time i saw steady progress every day, i didn’t get stuck, the errors didn’t snowball, and the game is actually playable and faithful to the original.

i specifically picked a task that is on the harder end for the current models, and yes, you can chalk this up to me being a good engineer, knowing my tools well, etc etc. but it surely felt to me that something new was unlocked with gpt-5.2 xhigh and codex. i found it to be an extremely rigorous model that follows instructions to the letter, and doesn’t invent stuff by itself. paired with gpt-5.2 pro for planning, it works extremely well.

i shipped two large and complex projects with it in a month. and honestly im amazed with the capabilities already. in the right hands these tools can give you amazing results. people focus on one shot wonders, but the real test is what you can achieve when you use these models for determined work. and with that im satisfied.

hope you learned something useful and will go and try to preserve a bright memory from your childhood. it will be a lot of work, but it will be worth it in the end.

the invitation

if the original crimsonland is still in your muscle memory and you can call out subtle inconsistencies like “hmm i think this spider had friendly fire”, you can help speed up squashing the remaining bugs. our binary files are static, we will find all inconsistencies eventually anyway.

you can also join the telegram group to follow the project.

if you want to study the code, check out the github repo.

then you can look up how different mechanics work up to the implementation detail in the knowledge base.

if you just want to enjoy some action, you can play my version right now (you need uv package manager).

p.s. the screenshot labeled crimsonland 2003 is actually from my version
