Writing a Minecraft server from scratch in Bash (2022)

Original link: https://sdomi.pl/weblog/15-witchcraft-minecraft-server-in-bash/



Original text

My thoughts on writing a Minecraft server from scratch (in Bash)

For the past year or so, I've been thinking about writing a Minecraft server in Bash as a thought exercise. I once tried that before with the Classic protocol (the one from 2009), but I quickly realized there wasn't really a way to properly parse binary data in bash. Take the following code sample:

    function a() {
        read -n 2 uwu
        echo "$uwu" | xxd
    }

This would read two bytes into a variable, and then pass them to `xxd`, which should show the hexdump of the data.

Picture 1 - bash's lack of support for nullbytes

Everything's great, until we pass a nullbyte (0x00). Not only does Bash ignore nullbytes in strings, it also doesn't present any way to detect that a nullbyte has occurred. Considering that the protocol I'm trying to implement is strictly binary, this can severely mangle the data.



One rainy evening in late January, I had a revelation. What if I reversed the order of that function? If the binary data never reaches a variable (or, more precisely, a substitution), and just stays inside a pipe, can it pass nullbytes around?

Picture 2 - reading nullbytes with xxd

The answer is yes! After some iterations, I decided to use `dd` piped to `xxd` instead of just `xxd`, because this way I can fine-tune how many bytes to read.

    # the $len variable is assigned earlier, based on a similar read function
    a=$(dd count=$len bs=1 status=none | xxd -p)

This gave me a hex string, on which I could do pattern matching, pattern replace, data extraction... and more. Sending out responses could be done analogously, using xxd's reverse switch (`-r`).
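As a rough sketch of that round trip (not the project's actual code): `read_hex` and `write_hex` are names made up for this example, and the block assumes that `xxd` and a `dd` that understands `status=none` are installed.

```bash
#!/usr/bin/env bash
# read_hex N: read N bytes from stdin and print them as one hex string.
# The raw bytes only ever travel through the pipe, so 0x00 survives.
read_hex() {
    dd bs=1 count="$1" status=none | xxd -p | tr -d '\n'
}

# write_hex HEX: turn a hex string back into raw bytes on stdout.
write_hex() {
    printf '%s' "$1" | xxd -r -p
}

# Round trip of four bytes, two of which are nullbytes.
write_hex "00ff0041" | read_hex 4
echo
```

The `tr -d '\n'` is there because `xxd -p` wraps long output over several lines, which would get in the way of later pattern matching on the string.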

`ncat` is used for listening on Minecraft's default TCP port. It launches the main shell script (`mc.sh`) after it receives an incoming connection.

The Protocol Is Not Really Good, Actually

Note: the following section contains mostly my ramblings about implementing number conversion routines in Bash; If this does not interest you, feel free to skip it.

The first thing one should implement for a Minecraft server to function would be the Server List Ping packet - not because it's required (heck, your server can just not reply to it properly, and you'd still be able to join the game), but because it's the easiest to tackle first. It helps to familiarize yourself with core protocol concepts, such as data types:

VarInts and VarLongs

Most data types were trivial to implement, but some gave me more of a fight than others - notably the IEEE754 floating point numbers (more on them later), and so-called VarInt/VarLong numbers. Those may be familiar to those acquainted with the MQTT protocol, as they're just a modified version of the LEB128 encoding.

LEB128 is a compression scheme for integers. By splitting a byte into 1 signalling bit and 7 data bits, the scheme stores the number length. If the 1st (most significant) bit is 0, then this byte is the last one; otherwise, there's another byte after this one. It's a great scheme if most of your numbers are either between 0 and 127 or between 256 and 16383; otherwise it's a `buy one byte, get one free` situation, because numbers that would otherwise fit in a byte get pushed out to the next one by a single bit.

Picture 3 - explanation of basic LEB128 in a drawing form; red bits are signalling bits, green bits are data bits. Input value is 0xFF (255), output value is 0xFF01

    # from src/int.sh
    # int2varint(int)
    function int2varint() {
        local a
        local b
        local c
        local out
        out=$(printf '%02x' "$1")
        if [[ $1 -lt 128 ]]; then
            :
        elif [[ $1 -lt 16384 ]]; then
            a=$(($1%128))
            b=$(($1/128))
            out=$(printf "%02x" $((a+128)))$(printf "%02x" $b)
        elif [[ $1 -lt $((128*128*128)) ]]; then
            a=$(($1%128))
            c=$((($1/128)%128))
            b=$(($1/16384))
            out=$(printf "%02x" $((a+128)))$(printf "%02x" $((c+128)))$(printf "%02x" $b)
        fi
        echo -n "$out"
    }

I had problems translating the reference implementation to Bash, so instead I played with the protocol enough to write my own from scratch. I figured out that it was basically a modulo and a division in a trenchcoat, which I used to my advantage in the code snippet above.
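The same modulo-and-division idea can be written as a loop, so it isn't capped at three bytes. This is only a sketch: `int2varint_loop` is a made-up name rather than something from the Witchcraft source, and negative numbers are deliberately left out.

```bash
#!/usr/bin/env bash
# int2varint_loop N: encode a non-negative integer as a hex VarInt string.
int2varint_loop() {
    local n=$1 out="" byte
    while :; do
        byte=$(( n % 128 ))                      # low 7 data bits
        n=$(( n / 128 ))                         # what is left for later bytes
        (( n > 0 )) && byte=$(( byte + 128 ))    # signalling bit: more bytes follow
        printf -v out '%s%02x' "$out" "$byte"
        (( n > 0 )) || break
    done
    printf '%s' "$out"
}

int2varint_loop 255     # prints ff01
echo
int2varint_loop 25565   # prints ddc701
echo
```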

I took a more contemporary approach on the decoder, using an AND, and then multiplying the result - similarly to how the reference did it.
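A decoder along those lines might look roughly like the following. It is a sketch under assumptions, not the code from the project: `varint2int` is a hypothetical name, the input is a hex string such as the ones produced by `xxd -p`, and negative values are not handled.

```bash
#!/usr/bin/env bash
# varint2int HEX: decode a hex-encoded VarInt (e.g. "ddc701") to an integer.
varint2int() {
    local hex=$1 result=0 mult=1 byte
    while [[ -n $hex ]]; do
        byte=$(( 16#${hex:0:2} ))                    # next byte of the string
        hex=${hex:2}
        result=$(( result + (byte & 127) * mult ))   # AND away the signalling bit
        mult=$(( mult * 128 ))                       # next byte is worth 128x more
        (( byte & 128 )) || break                    # top bit clear: last byte
    done
    printf '%d' "$result"
}

varint2int ddc701   # prints 25565
echo
```

Because the loop stops at the first byte with a clear top bit, anything that follows the VarInt in the hex string is simply ignored.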

LEB128 definitely wasn't the hardest or the most annoying to implement (that one goes to IEEE754 floating point); I still don't like how it is sprinkled in random places inside the protocol, interleaved with regular ints (and longs), and in some cases even signed shorts.

IEEE 754 Floating Point numbers

I'm not a math person. When I see the exponential notation spewed out by Python, I scream and run. This may be the main cause of why I hated implementing these floating point converters. I won't be going too deep into specifics of how this format works - instead, I recommend you check out this Wikipedia page.

The basic implementation requires a loop, inside of which there's a negative power applied to the result; Bash doesn't natively support negative powers, which sent me on a trip to find a utility that does.

A suggestion I found while duckduckgoing was to use perl, but I consider that cheating. Alternatively, I tried using `bc`, but it seems that either it doesn't support powers at all, or the busybox version does not. Bummer.

When I was about to give up, I got reminded that Kate once made a plot program in awk. Surely, awk has powers? ~~Maybe even super cow powers?~~ It turns out that it does!

    $ echo '' | awk '{print (2**-1)}'
    0.5

With this knowledge, I scribbled a working implementation and attached it to data decoded from the Player Move packet. In a trial run, the client sent around 50-100 packets like that, each one with three doubles (X, Y, Z). It turned out that the conversion function was so slow, that the server wasn't done with that workload after multiple minutes - something rather unacceptable for a real-time game.

The easiest solution to lowering the response time would be lowering the amount of calls to external binaries, such as awk. As most of my workload was already inside a bash `for` loop, I just moved the loop inside `awk`, which saved me literally tens of calls to awk.

# (...) asdf=$(cut -c 13-

The conversion is still quite slow (it takes ~10ms on my Xeon E5-2680v2), but this is to be expected with bash scripts. For a cheap comparison, the old version took around 350ms, but I don't have solid measurements to prove that. ~~still, 35x faster, woo!~~
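Since the snippet above is cut off, here is an independent sketch of the "one awk call per number" idea. It is not the author's implementation: `hex2double` is a made-up name, the bit fields are sliced out in bash, and awk evaluates the closed-form formula instead of looping over bits. It only covers normal numbers and zero (no subnormals, infinities or NaN).

```bash
#!/usr/bin/env bash
# hex2double HEX: decode 16 hex digits (big-endian) as an IEEE 754 double.
hex2double() {
    local hex=$1
    local top=$(( 16#${hex:0:3} ))      # 1 sign bit + 11 exponent bits
    local sign=$(( top >> 11 ))
    local exp=$(( top & 0x7FF ))
    local mant=$(( 16#${hex:3:13} ))    # 52 fraction bits
    # A single awk invocation does all of the floating point math.
    LC_ALL=C awk -v s="$sign" -v e="$exp" -v m="$mant" 'BEGIN {
        if (e == 0 && m == 0) { printf "%.6f\n", 0; exit }
        v = (1 + m / 2^52) * 2^(e - 1023)
        printf "%.6f\n", (s ? -v : v)
    }'
}

hex2double 4059000000000000   # prints 100.000000
hex2double c000000000000000   # prints -2.000000
```

Splitting the string into a 3-digit and a 13-digit part keeps both pieces well below 64 bits, so bash's integer arithmetic never has to deal with the sign bit.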

"Position" data type

Finally, something made up by Mojang themselves! Position is a 64-bit Long value, where three coordinates are stored alongside each other: X gets 26 most significant bits, Z gets 26 middle bits, Y gets 12 least significant bits. I'm not the biggest fan of weird data types like this one, but it was crazy easy to implement in Bash, because it has all the needed bitshift operators.
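To show how little is needed, here is a sketch of packing and unpacking that layout with nothing but shifts and masks. `pos_encode` and `pos_decode` are hypothetical names, not functions from Witchcraft, and the value is handled as two 32-bit halves so that no shift ever touches bash's sign bit.

```bash
#!/usr/bin/env bash
# pos_encode X Y Z: print the Position as 16 hex digits.
pos_encode() {
    local x=$(( $1 & 0x3FFFFFF )) y=$(( $2 & 0xFFF )) z=$(( $3 & 0x3FFFFFF ))
    local hi=$(( (x << 6) | (z >> 20) ))          # 26 bits of X + top 6 bits of Z
    local lo=$(( ((z & 0xFFFFF) << 12) | y ))     # low 20 bits of Z + 12 bits of Y
    printf '%08x%08x' "$hi" "$lo"
}

# pos_decode HEX: print "X Y Z" for a 16-digit hex Position.
pos_decode() {
    local hi=$(( 16#${1:0:8} )) lo=$(( 16#${1:8:8} ))
    local x=$(( hi >> 6 ))
    local z=$(( ((hi & 0x3F) << 20) | (lo >> 12) ))
    local y=$(( lo & 0xFFF ))
    # Sign-extend the 26-bit and 12-bit two's complement fields.
    (( x >= 0x2000000 )) && x=$(( x - 0x4000000 ))
    (( z >= 0x2000000 )) && z=$(( z - 0x4000000 ))
    (( y >= 0x800 )) && y=$(( y - 0x1000 ))
    printf '%d %d %d' "$x" "$y" "$z"
}

pos_encode 1 2 3                 # prints 0000004000003002
echo
pos_decode 0000004000003002      # prints 1 2 3
echo
```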

The worst part about this data type is that it doesn't actually get used much. Around half of the packets store X, Y and Z coordinates as separate Double values. This means that:

  • the location data suddenly grew from 64 bits to 192 bits per packet
  • we get 9 digits of floating point accuracy, if we assume that we're only ever going to need numbers up to 30 000 000 (where the world border is at by default)
  • the protocol gets messy, with ~~two~~ three (or more) different number formats

I kinda see the reasoning as to why it's like that, but I still don't like the current state. Normal server communication uses zlib anyways, and you realistically won't ever need more than two (or maybe three max) digits of decimal precision to describe a position within a block.

Named Binary Tag

Lastly, there's the NBT format, also an internal thing made by ~~Mojang~~ Hatsune Miku herself. NBT is like JSON, but for binary data. Not unlike JSON, it gets abused to store arbitrary data beyond spec - for example, Mojang stores bitstreams of variable length as an array of Longs; if such an array isn't long-aligned, or even byte-aligned, the last Long is padded with zeroes.

At one point I had an NBT parser almost fully implemented, but I decided it was not worth the hassle to finish it. The code is currently lost, due to my extensive use of tmpfs as a project directory, and a system crash.

Writing the actual server

With all the math out of the way, here comes the *actually fun* part. I documented some of my adventures on Twitter, but that thread was merely a glimpse of the actual development process. Also, let's assume that we already have the Server Ping packet out of our way, and it's a matter of actually making the game joinable now.

To allow a client to join your server, it has to complete the handshake process and send a few extra packets (chunks, player position, inventory, join game). The two biggest obstacles on that course were the Join Game packet, and the data structure within the Chunk packets.

Join game

The Join Game packet sends some initial metadata: the player's entity ID, gamemode, some information about the world and, since ~1.16, a "Dimension codec". This is an NBT Compound, containing the following fields:

Picture 4 - What if a client gets an empty NBT Compound? It helpfully dumps out all the required values

The Dimension codec part was a major pain to implement. For my purposes, I decided to retrieve that NBT field from a vanilla server. It's the only binary blob in this implementation, and while it could be reimplemented, I don't see any reason to reimplement something that I don't necessarily need (or want) to customize.

Chunk Data And Update Light

At first glance, this packet looks huge and scary! If you have the table from the link above open side by side with this article, I invite you to imagine that every one of those huge BitSet fields is actually just `0x00`, and that you don't need to send the Block Entity field at all. This leaves us with the chunk X and Z coordinates, heightmaps (which are fancily encoded repetitions of `b000000010`, and could be virtually anything), and the ominous Data field. Less scary, right?

What's the Data?

The Data field is actually an array of Chunk Section. A Chunk Section is 16 by 16 by 16 blocks, and multiple sections can be stacked together to create a Chunk. For our purposes, this array only has a single element, just to simplify the code a bit.

A Chunk Section contains a block count, a block states container, and a biome container. Both of these containers use a palette structure to encode possible block values - this means that before the real block data, the server has to define a mapping from the "local" block IDs to "global" block IDs. This aims to squish as much data as possible inside a small space - a block definition can be as small as 4 bits.

As in my opinion the wiki page doesn't explain it well enough to quickly comprehend, here's another drawing:

Picture 5 - left, a visualized palette to global ID mapping; right, a simplified example of how the displayed blocks correspond to the encoded data at 4-bit per block.

Data management-wise, the easiest way I found to send those fields was to use an 8-bit (instead of the minimal 4-bit) block definition length. This would give me a whopping 256 possible palette entries, out of all available blocks to choose from. Then, writing actual chunk data would be as easy as sending hex numbers referencing those palette entries. A 4-bit palette would be equally easy (a byte represented as a hex string is two characters, representing 4 bits each, so `0x01` would represent two blocks - one with id 0, and another with id 1), but it would limit me to 16 blocks per chunk.
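To illustrate why 8 bits per block is so convenient with hex strings, here is a toy sketch. Everything in it is made up for the example (the three-entry palette, the `index_of` lookup table and the `encode_blocks` helper), and it says nothing about how the protocol orders entries inside its Long array; it only shows that each block becomes exactly two hex characters.

```bash
#!/usr/bin/env bash
# Hypothetical local palette: position in the array = local block ID.
palette=(air stone dirt)

# Reverse lookup from block name to local ID (needs bash 4+).
declare -A index_of
for i in "${!palette[@]}"; do
    index_of[${palette[i]}]=$i
done

# encode_blocks NAME...: print one two-digit hex palette index per block.
encode_blocks() {
    local name out=""
    for name in "$@"; do
        printf -v out '%s%02x' "$out" "${index_of[$name]}"
    done
    printf '%s' "$out"
}

encode_blocks stone stone dirt air   # prints 01010200
echo
```

With a 4-bit palette the same loop would use `%x` instead of `%02x`, at the cost of only having 16 entries to pick from.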

The standard actually allows for anything from 4 bits per block to 9bpb, otherwise it assumes a direct palette mapping with 15bpb - I too have no idea why it isn't byte-aligned.

The biome palette works a bit differently in my implementation - it just sends an empty palette, and then maps biome ID 0x01 (minecraft:plains) directly to all regions inside the chunk. This was based on my reverse engineering of how vanilla works - I suspect that the existing documentation of this part of the packet is incorrect, as I'm either getting too much data, or missing a few bytes every time.

Picture 6 - after fully implementing everything from that list and sending a few chunk packets, we have chunks showing up!

"Plugin" system

By now, we only have a plain chunk, not anything special by any means. I definitely wanted to make a few demos to show that the server can do *more* than just load and show a chunk, but I didn't want to create a separate source tree for every demo I spewed out. My solution is a series of overridable functions I called `hooks`, and an option for the server to load your own code. This allows for anything from changing how the world looks, to hooking up a pkt_effect so your player makes ticking noises while you move the mouse. Below I attach a simple (unoptimized) "plugin" that generates a chunk with random blocks from the default palette, which makes for an oddity that's kinda interesting visually.

    #!/usr/bin/env bash
    # map.sh - simple map modification showcase
    function hook_chunks() {
        chunk_header
        for (( i=0; i $TEMP/world/0000000000000000
        pkt_chunk FFFFFFFF FFFFFFFF 00
        pkt_chunk FFFFFFFF 00000000 00
        pkt_chunk FFFFFFFF 00000001 00
        pkt_chunk 00000000 FFFFFFFF 00
        pkt_chunk 00000000 00000000
        pkt_chunk 00000000 00000001 00
        pkt_chunk 00000001 FFFFFFFF 00
        pkt_chunk 00000001 00000000 00
        pkt_chunk 00000001 00000001 00
    }

Picture 7 - output of the code displayed above
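The "overridable function" mechanism itself is plain Bash: defining a function a second time replaces the first definition. A minimal sketch of that idea, assuming a loader called `load_plugin` (a hypothetical name; only `hook_chunks` comes from the post) and a throwaway plugin file:

```bash
#!/usr/bin/env bash
# Default hook shipped with the server; a plugin may redefine it.
hook_chunks() {
    echo "default chunk"
}

# load_plugin FILE: sourcing the file lets any function it defines
# replace the default of the same name.
load_plugin() {
    [[ -f $1 ]] && source "$1"
}

# Write a tiny plugin to a temporary file for the demonstration.
plugin=$(mktemp)
cat > "$plugin" <<'EOF'
hook_chunks() {
    echo "plugin chunk"
}
EOF

hook_chunks              # prints: default chunk
load_plugin "$plugin"
hook_chunks              # prints: plugin chunk
rm -f "$plugin"
```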

Another demo worth taking a look at is digmeout - it's a simple highscore based game, which throws you onto a chunk with randomly placed stone and ores. Dig out the most valuable ores until the timer runs out!

Picture 8 - you know the game, and you're gonna play it

Witchcraft's (that's the project name!) Quirks

  • Bash is notoriously bad at handling decimal numbers. It's *ok* with Integers (as long as you don't do too advanced maths on them), but the only way to handle a decimal number is by multiplying it on input, and somehow placing a dot in the correct place for output. Because of this, most (if not all?) numbers handled by Witchcraft are ints.
  • The multiplayer doesn't really work? I mean, it kinda does, but I never really took time to finish it and polish it up.
  • Witchcraft is technically a multi-threaded server!
  • ... which means that it has to use terrible hacks to communicate between threads. Currently, most global data is stored under `/dev/shm/witchcraft`, internally referred to as `$TEMP`.
  • Witchcraft is slow, especially in terms of data exchange between multiple threads. Don't expect to be able to send massive amounts of data; generating and sending 16 solid chunks can take as long as a second.
  • Witchcraft currently runs *only* if you have the latest BusyBox (1.35.0) installed. I haven't tested it with GNU coreutils, but I expect it won't work.
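The file-based communication mentioned in the list above boils down to one process writing a small file and another reading it. The following is only a sketch of that pattern: the file name `pos` and its contents are invented for the example, and a `mktemp -d` directory stands in for `/dev/shm/witchcraft` so it runs anywhere.

```bash
#!/usr/bin/env bash
TEMP=$(mktemp -d)   # assumption: stand-in for /dev/shm/witchcraft

# "Writer" job: publish state by writing to a temporary name and renaming
# it, so a reader never sees a half-written file.
(
    printf '%s\n' "12 64 -3" > "$TEMP/pos.tmp"
    mv "$TEMP/pos.tmp" "$TEMP/pos"
) &
wait

# "Reader": pick up whatever the other job last published.
read -r x y z < "$TEMP/pos"
echo "player at x=$x y=$y z=$z"

rm -rf "$TEMP"
```

A real server would poll or re-read such files on every tick instead of calling `wait`, which is part of why this style of data exchange is slow.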

FAQ

Q: Why?

A: Because I could. And it was fun!

Q: Where do the block IDs come from?

A: Witchcraft-internal IDs are defined in src/palette.sh, and can be redefined in "plugins". The external IDs to which the internal ones are mapped can be acquired from the vanilla server. Check out this reference page on Data Generators.

Q: Why "WitchCraft"?

A: selfisekai came up with that name, possibly because I'm a (bash) witch, and I thought it was *great*

Resources


Big thanks to Lauren, Mae and cadence for proofreading this post! :3


Support me on ko-fi!

Comments:

This project is wicked cool but your font choices make it super hard to read. And this black-on-dark-gray comment box is insane! :D

Oh dear, that is cursed. Much more so than the HTTP(s) server in bash that I have seen around... I love it~! ...wait is the font for this the minecraft font. And agree with previous comment, the black-on-gray is hard to read q-q

as someone that knows very little bash, this was extremely fun to read. love the website too! :)

I've had this idea a few months ago and I didn't think it was possible. This is awesome! good work.

That's crazy. I wrote a small MC network implementation in C++ and gave up after I started to see how they randomly change packets in different versions. I didn't want to keep up with that. But do this in bash for MC is crazy. I went back and wrote a small reverse proxy server for MC though.

Your website is great. Your posts are great. Your everything is great. Keep up the work, It's definitely worth it...

ilove this. gonna play with it, if i manage to make something worthwhile i let you know. thanks for this interesting unconventional project.

Cool project! May I ask how much time went into it? I don't really know how complex the protocol is or how long time each test takes, like if you need to restart the client and stuff.

this is a glorious write-up of the process tho. one of the best ways to learn coding is to do or see "what if i did this stupid pointless thing" then seeing aaaaaall the pitfalls

I do a lot of Bash and lemmie tell you, this is pretty badass! Some people would say it's cursed, but I think its just cool - Thanks for sharing ^w^

Wow for the hackery of the "in Bash", and nice font reminds me of some VGA stuff.

You can hackily handle floating point numbers in a bash script by piping the equation through BCMath. In the case that I've had to do it, it's pretty much: var=$(echo "scale=9;$num1/$num2" | bc) Where the 9 is the number of decimal places and / is the operand and can be replaced by whatever. This is probably highly inefficient but hey it works.

This is exactly the kind of insane (in a good way) stuff I love to read about! Looking forward to more of this! :3

You can absolutely read and write streams containing NULs in bash -- the trick is to store them as arrays (with the terminal element containing everything after the last NUL) instead of as strings. Also, echo is an abomination in general, and even the POSIX spec describing it says that printf should be used instead -- search for the excellent answer by Stéphane Chazelas to "Why is printf better than echo?" on unix.stackexchange.com

Great piece. Terminus font in the screenshots also, the best terminal font by far.

I got rickrolled at the ending.

nice work! I was puzzled with the line: a=$(dd count=2 bs=$len status=none | xxd -p) as dd cannot will hangs here on an empty stdin. or maybe it's lacking context and your script stdin is filled already with data...? or are you using some function like this? function a() { dd count=$len bs=1 status=none | xxd -p; } anyway it is a clever idea to use dd to deal with null chars! Another way would be to use tempfiles (in ram) instead of strings variables, but that would probably perform slower.

that is, quite literally, witchcraft. I'm amazed. I mean I made an HTTP server in browser Javascript, and that's NOTHING compared to this. this is a fully functional, real time minecraft server. a full-on online 3D game. Wow. also, lovely font choice ;)

This is sooo cool! I once tried to write an http server in bash and gave up on it after writing file serving code became too hard for me

This is a beautiful abomination. I hope you're proud of yourself. 🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥

Agree that the font makes this super hard to read. If i zoom in or out at all in Chrome it becomes blurry. BUT - persevered to read the content of the article which was super cool! Awesome work

    function stdin2hexdump() {
        local LC_ALL= LC_CTYPE=C LC_COLLATE=C LANG=
        local IFS="" c b h o=0 hs= cs= i1=$'\033[7m' i0=$'\033[27m'
        while read -r -s -d "" -n 1 c; do
            printf -v b "%d" "'${c}"
            printf -v h "%02X" "${b}"
            case "${c}" in
                $'\000' ) c="${i1}@${i0}";;
                $'\001' ) c="${i1}A${i0}";;
                $'\002' ) c="${i1}B${i0}";;
                $'\003' ) c="${i1}C${i0}";;
                $'\004' ) c="${i1}D${i0}";;
                $'\005' ) c="${i1}E${i0}";;
                $'\006' ) c="${i1}F${i0}";;
                $'\007' ) c="${i1}G${i0}";;
                $'\010' ) c="${i1}H${i0}";;
                $'\011' ) c="${i1}I${i0}";;
                $'\012' ) c="${i1}J${i0}";;
                $'\013' ) c="${i1}K${i0}";;
                $'\014' ) c="${i1}L${i0}";;
                $'\015' ) c="${i1}M${i0}";;
                $'\016' ) c="${i1}N${i0}";;
                $'\017' ) c="${i1}O${i0}";;
                $'\020' ) c="${i1}P${i0}";;
                $'\021' ) c="${i1}Q${i0}";;
                $'\022' ) c="${i1}R${i0}";;
                $'\023' ) c="${i1}S${i0}";;
                $'\024' ) c="${i1}T${i0}";;
                $'\025' ) c="${i1}U${i0}";;
                $'\026' ) c="${i1}V${i0}";;
                $'\027' ) c="${i1}W${i0}";;
                $'\030' ) c="${i1}X${i0}";;
                $'\031' ) c="${i1}Y${i0}";;
                $'\032' ) c="${i1}Z${i0}";;
                $'\033' ) c="${i1}[${i0}";;
                $'\034' ) c="${i1}\\${i0}";;
                $'\035' ) c="${i1}]${i0}";;
                $'\036' ) c="${i1}^${i0}";;
                $'\037' ) c="${i1}_${i0}";;
                $'\177' ) c="${i1}?${i0}";;
            esac
            (( b > 127 )) && c="${i1}.${i0}"
            hs+=" ${h}"
            cs+="${c}"
            (( o++ ))
            (( o % 16 )) || { printf "%08X |%s | %s |\n" "$(( ${o} - 16 ))" "${hs}" "${cs}"; hs=; cs=; }
        done
        (( o % 16 )) && printf "%08X |%-48s | %s |\n" "$(( ${o} - ( ${o} % 16 ) ))" "${hs}" "${cs}"
    }
