MongoBleed Explained Simply

Original link: https://bigdata.2minutestreaming.com/p/mongobleed-explained-simply

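## MongoBleed: A Critical MongoDB Vulnerability

MongoBleed (CVE-2025-14847) is a critical vulnerability affecting almost all MongoDB versions released since 2017, caused by a flaw in its zlib compression handling. It allows an attacker to read uninitialized heap memory, potentially leaking sensitive data such as passwords, API keys, and customer information, *without* authentication. The exploit works by sending a carefully crafted compressed message that claims an abnormally large size. This forces MongoDB to allocate a large memory buffer and then mishandle the actual data size, leaving unreferenced heap space accessible. By sending an invalid BSON object, the attacker can trigger the server to leak data from this memory into an error message. The vulnerability existed for roughly eight years before it was discovered, and fixes began in mid-December 2025. Although MongoDB states there is no evidence that the vulnerability has been exploited, its ease of exploitation and long exposure window raise concerns. Patches are available for supported versions, but older, end-of-life versions (3.6, 4.0, 4.2) remain vulnerable. Mitigations include updating to the latest patch or disabling zlib network compression. More than 213,000 publicly accessible MongoDB instances may be at risk.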

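## MongoBleed & Exposed Databases - Summary

A vulnerability named "MongoBleed" recently emerged, affecting MongoDB databases. A discussion on Hacker News highlighted that these databases are frequently exposed to the internet, leaving them open to attack. While SQL databases are less often publicly accessible, commenters argued that organizations using MongoDB tend to prioritize ease of setup (for example, schemaless design) over strong security measures, which can lead to widespread exposure. Shodan data indicates a large number of exposed MongoDB instances: a recent scan found more than 213,000. This raises concerns about the potential impact of vulnerabilities like MongoBleed, since potentially lax security practices create a large attack surface. The conversation suggests a link between choosing MongoDB for convenience and neglecting critical security configuration.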

Original text

MongoBleed, officially CVE-2025-14847, is a recently uncovered, extremely severe vulnerability affecting basically all versions of MongoDB since ~2017.

It is a bug in the zlib message compression path in MongoDB.

It allows an attacker to read off any uninitialized heap memory, meaning anything that was allocated to memory from a previous database operation could be read.

The bug was introduced in 2017. It is dead-easy to exploit - it only requires connectivity to the database (no auth needed). It is fixed as of writing, but some EOL versions (3.6, 4.0, 4.2) will not get it.

Let’s get a few basics out of the way before we explain the bug:

MongoDB uses its own TCP wire protocol instead of, e.g., HTTP. This is standard for databases, especially ones chasing high performance.
Mongo uses the BSON format for messages. It’s basically binary JSON but with some key optimizations. We will talk about one later because it is essential to the exploit.
Mongo doesn’t have endpoints or RPCs. Requests are sent under a single opcode called OP_MSG.
The OP_MSG command contains a BSON message. The contents of the message denote what type of request it is. Concretely, it’s the first field of the message that marks the request type.
The request can be compressed. In that case, an OP_COMPRESSED message is sent which wraps the now-compressed OP_MSG BSON.
The request then looks like this:

     OP_COMPRESSED message
┌────────────────────────────┐
│ standard header (16 bytes) │
├────────────────────────────┤
│ originalOpcode (int32)     │
│ uncompressedSize (int32)   │
│ compressorId (int8)        │
│ compressed OP payload      │
└────────────────────────────┘
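To make the layout above concrete, here is a minimal Python sketch that packs and unpacks such a frame. The opcode numbers (2012 for OP_COMPRESSED, 2013 for OP_MSG), the zlib compressor id (2) and the four int32 fields of the standard header are taken from the MongoDB wire protocol documentation as I understand it. The function names are hypothetical, and the inner body is an opaque placeholder rather than a real OP_MSG.

```python
import struct
import zlib

OP_COMPRESSED = 2012  # opcode of the wrapper message
OP_MSG = 2013         # opcode of the wrapped message
ZLIB_ID = 2           # compressorId for zlib

def build_op_compressed(inner_body: bytes, request_id: int = 1) -> bytes:
    """Wrap an already-serialized message body in an OP_COMPRESSED frame."""
    compressed = zlib.compress(inner_body)
    # originalOpcode (int32), uncompressedSize (int32), compressorId (1 byte)
    fields = struct.pack("<iiB", OP_MSG, len(inner_body), ZLIB_ID)
    total_len = 16 + len(fields) + len(compressed)
    # standard header: messageLength, requestID, responseTo, opCode
    header = struct.pack("<iiii", total_len, request_id, 0, OP_COMPRESSED)
    return header + fields + compressed

def parse_op_compressed(frame: bytes) -> dict:
    """Split a frame back into its header fields and compressed payload."""
    length, request_id, response_to, opcode = struct.unpack_from("<iiii", frame, 0)
    original_opcode, uncompressed_size, compressor_id = struct.unpack_from("<iiB", frame, 16)
    return {
        "length": length,
        "opcode": opcode,
        "original_opcode": original_opcode,
        "uncompressed_size": uncompressed_size,  # supplied by the client
        "compressor_id": compressor_id,
        "payload": frame[25:],  # 16-byte header + 9 bytes of fields
    }

frame = build_op_compressed(b"x" * 100)
info = parse_op_compressed(frame)
```

Note that `uncompressed_size` is simply whatever number the sender wrote into the frame; nothing in the format ties it to the real size of the payload.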

The first part of the exploit is to get the server to wrongfully think that an overly-large OP_MSG is coming.

An attacker can send a falsely large `uncompressedSize` field, say 1MB, when in reality the underlying message is 1KB uncompressed.

This will make the server allocate a 1MB buffer in memory to decompress the message into. This is fine.

The critical bug here is that, once finished decompressing, the server does NOT check the actual resulting size of the newly-uncompressed payload.

Instead, it trusts the user’s input and uses that as the canonical size of the payload, even if it got a different number.

The result is an in-memory representation of the BSON message which looks something like this:

[ 1KB of REAL DATA |      999KB of UNREFERENCED HEAP GARBAGE       ]
                   ↑                                               ↑
        actual length (1KB)                     user input length (1MB)
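The missing check is easy to picture. The sketch below is not MongoDB's actual patch, just an illustration in Python of validating the claimed size against what the decompressor really produced; `decompress_checked` is a hypothetical name.

```python
import zlib

def decompress_checked(payload: bytes, claimed_size: int) -> bytes:
    """Decompress payload and reject it unless it is exactly claimed_size bytes."""
    if claimed_size < 0:
        raise ValueError("negative uncompressedSize")
    decomp = zlib.decompressobj()
    # Ask for at most one byte more than claimed, so an oversized
    # payload is detected without inflating all of it.
    data = decomp.decompress(payload, claimed_size + 1)
    if len(data) != claimed_size:
        raise ValueError(f"claimed {claimed_size} bytes, got {len(data)}")
    return data

payload = zlib.compress(b"a" * 1024)   # really 1KB
ok = decompress_checked(payload, 1024)

try:
    decompress_checked(payload, 1024 * 1024)  # claims 1MB
    rejected = False
except ValueError:
    rejected = True
```

With a check like this, the 1KB-versus-1MB mismatch from the diagram is caught before any later code treats the buffer as 1MB long.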

As in every programming language, when a variable in the code goes out of scope, the runtime marks the memory it previously took up as available.

In most modern, memory-safe languages, you never get to see the old bytes: memory is zeroed out or initialized before your code can read it.

In C/C++, this doesn’t happen. When you allocate memory via `malloc()`, you get whatever was previously there.

Since Mongo is written in C++, that unreferenced heap garbage part can represent anything that was in memory from previous operations, including:

Cleartext passwords and credentials
Session tokens / API keys
Customer data and PII
Database configs and system info
Docker paths and client IP addresses

[ REAL BSON DATA | password: 123 | apiKey: jA2sa | ip: 219.117.127.202 ]
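Python cannot hand out raw `malloc()` memory, so the following is only a toy model of the effect: a scratch buffer that is reused between requests without being zeroed. All names and the secret are made up for illustration.

```python
import zlib

# Toy "heap": one scratch buffer reused across requests and never zeroed.
scratch = bytearray(64)

def handle_previous_request() -> None:
    """Some earlier operation leaves sensitive bytes behind in the buffer."""
    secret = b"password: 123 | apiKey: jA2sa"
    scratch[16:16 + len(secret)] = secret

def buggy_decompress(payload: bytes, claimed_size: int) -> bytes:
    """Decompress into the scratch buffer, then trust the claimed size."""
    data = zlib.decompress(payload)
    scratch[:len(data)] = data
    # BUG: the message is treated as claimed_size bytes long,
    # although only len(data) bytes were actually written.
    return bytes(scratch[:claimed_size])

handle_previous_request()
message = buggy_decompress(zlib.compress(b'{ "a'), claimed_size=64)
```

`message` now starts with the four bytes the client sent, followed by whatever the earlier request left in the buffer.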

Now that the server has wrongfully attributed some potentially sensitive data to the input message, the only thing left for the attacker is to somehow get the server to return that data.

As mentioned, BSON is Mongo’s way of serializing JSON. According to its site, it was designed with efficiency in mind:

3. Efficient
Encoding data to BSON and decoding from BSON can be performed very quickly in most languages due to the use of C data types.

C famously uses null-terminated strings. A null-terminated string means that a null byte is used to mark the end of the string:

const char *s = "hello";
// in memory, this is represented as an array of characters with the last element being the null terminator: h e l l o \0

The way such strings get parsed is very simple - the deserializer reads every character until it finds a null terminator.

If you recall, I said earlier that the first field of the BSON message denotes what type of “RPC” the command is.

As such, the first thing a server does when handling an incoming message over the wire is… parse the first field!

Because field names are strings, and strings are null-terminated CStrings, the deserializing logic in the MongoDB server reads the field name until it finds the first null terminator.
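For reference, this is what a well-formed field name looks like on the wire. The sketch hand-encodes a one-field BSON document following the layout in the BSON spec (int32 total length, a type byte, the field name as a CString, the value, and a trailing zero byte). The helper names are hypothetical, and only the int32 element type (0x10) is handled.

```python
import struct

def encode_int32_doc(name: str, value: int) -> bytes:
    """Encode {name: value} as BSON, for an int32 value only."""
    # element: type byte 0x10 (int32), field name as a CString, 4-byte value
    element = b"\x10" + name.encode("utf-8") + b"\x00" + struct.pack("<i", value)
    # document: total length (counting itself), elements, trailing 0x00
    return struct.pack("<i", 4 + len(element) + 1) + element + b"\x00"

def first_field_name(doc: bytes) -> str:
    """Read the first field name by scanning to its null terminator."""
    start = 5  # skip the 4-byte length and the 1-byte element type
    end = doc.index(b"\x00", start)
    return doc[start:end].decode("utf-8")

doc = encode_int32_doc("ping", 1)
name = first_field_name(doc)
```

The only thing marking where the name ends is the `\x00` byte; there is no length prefix for field names.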

An attacker can send a compressed, invalid BSON object that does NOT contain a null terminator. This forces the server to continue scanning through foreign data in the wrongly-allocated memory buffer until it finds the first null terminator (\0).

# Conceptual
[ REAL DATA |             UNREFERENCED HEAP GARBAGE                 ]
# Practical Example
[ { "a      | password: 123\0 | apiKey: jA2sa | ip: 219.117.127.202 ]

As the first null terminator is right after the password, the server would now think that the first field of the BSON is:

"a      | password: 123"

Obviously that is an invalid BSON field, so the server responds with an error to the client. In order to be helpful, the response contains an error message that shows which field was invalid:

{
  "ok": 0,
  "errmsg": "invalid BSON field name 'a      | password: 123'",
  "code": 2,
  "codeName": "BadValue"
}

Boom. The attacker successfully got the server to leak data to it.
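The whole leak can be modeled in a few lines of Python. This is a simplified simulation on a hand-built byte string, not code that talks to a server. The buffer contents mirror the example above, and the function names are hypothetical.

```python
def scan_field_name(buf: bytes, trusted_len: int) -> bytes:
    """Return the bytes before the first NUL found within trusted_len bytes."""
    end = buf.index(b"\x00", 0, trusted_len)  # ValueError if no NUL in range
    return buf[:end]

def error_reply(field_name: bytes) -> dict:
    """Build an error that helpfully echoes the offending field name."""
    shown = field_name.decode("latin-1")
    return {"ok": 0, "errmsg": f"invalid BSON field name '{shown}'"}

real = b'{ "a'                                   # client bytes, no terminator
stale = b" | password: 123\x00 | apiKey: jA2sa"  # leftover heap contents
buf = real + stale

# Trusting the claimed length lets the scan run into the stale bytes.
reply = error_reply(scan_field_name(buf, len(buf)))

# Bounding the scan by the real length finds no terminator and fails cleanly.
try:
    scan_field_name(buf, len(real))
    bounded_ok = True
except ValueError:
    bounded_ok = False
```

The difference between the two calls is only the length the parser is told to trust, which is exactly the value the attacker controls.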

Any serious attacker would then run this over and over again, thousands of times a second, until they believe they’ve scanned the majority of the database’s heap. They can then repeat this ad infinitum.

The impact of this is particularly nasty, because the request-response parsing cycle happens before any authentication can be made. This makes sense, since you cannot begin to authenticate a request you still haven’t deserialized.

This allows any attacker to gain access to any piece of potentially-sensitive data. The only thing they need is internet access to the database.

Exposing your database to the internet is a practice that’s heavily frowned upon. At the same time, Shodan shows that there are over 213,000 publicly-accessible Mongo databases.

The PR that introduced the bug was from May 2017. This means that, roughly from version 3.6.0, any publicly-accessible MongoDB instance has been vulnerable to this.

It is unknown whether the exploit was known and exploited by actors prior to its disclosure. Given the simplicity of it, I bet it was.

Since the exploit’s disclosure, which happened on the 19th of December, it has been a race to patch the database.

Sifting through Git history, it seems like the fix was initially committed on the 17th of December. Interestingly enough, it was only merged a full 5 days after - on the 22nd of December (1-line fix btw).

That being said, MongoDB 8.0.17 containing the fix was released on Dec 19, consistent with the CVE publish date. But JIRA activity shows that patches went out on the 22nd of December.

Because there’s no official timeline posted, members of the community like me have to guess. As of writing, 10 days later on Dec 28, 2025, Mongo have still NOT properly addressed the issue publicly.

They only issued a community disclosure of the CVE a full five days after its publication. It was then, on the 24th of December, that they announced that all database instances in their cloud service, Atlas, were fully patched.

I believe this implies that all Atlas databases exposed to the internet were vulnerable to this issue for almost a week. By default, Atlas databases use an IP allowlist for connectivity. But users could configure it to allow connections from anywhere.

Mongo says that they haven’t verified exploitation so far:

“at this time, we have no evidence that this issue has been exploited or that any customer data has been compromised”

Mitigation is admittedly very easy. You have one of two choices:

Update to the newest patch
Disable zlib network compression

I found the latter wasn’t circulated a lot in online talk, but as I understand it, it is just as good as a short-term mitigation.
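As far as I know, the server's compressor list is controlled by the `net.compression.compressors` setting (or the `--networkMessageCompressors` flag), with `disabled` as the value that turns compression off entirely; check the documentation for your version before relying on this. Under that assumption, a small hypothetical helper that computes the value to configure could look like this:

```python
def without_zlib(compressors: str) -> str:
    """Drop zlib from a comma-separated compressor list.

    Returns "disabled" when nothing is left, which is assumed here to be
    the value that switches network compression off.
    """
    kept = [c.strip() for c in compressors.split(",")]
    kept = [c for c in kept if c and c != "zlib"]
    return ",".join(kept) if kept else "disabled"

new_value = without_zlib("snappy,zstd,zlib")
```

The resulting string is what you would put in the compressor setting, followed by a restart of the server.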

The tech lead for Security at Elastic coined the name MongoBleed by posting a Python script that acts as a proof of concept for exploiting the vulnerability: https://github.com/joe-desimone/mongobleed

https://cyberplace.social/@GossiTheDog/115786817774728155

🎅mongobleed - poc for CVE-2025-14847. Leaks data from mongodb instances due to flaw in zlib message decompression. Reminiscent of heartbleed ❤️

12:39 AM · Dec 26, 2025 · 68.6K Views


This is particularly interesting, because despite being different systems, Mongo competes with Elastic on Vector Search, Text Search and Analytical use cases.

The exploit allows attackers to read arbitrary heap data, including user data, plaintext passwords, API keys/secrets, and more.
It is performed by leveraging a simple, malformed zlib-compressed request.
MongoDB versions from 2017-2025 are vulnerable to this exploit.
Rough timeline:
On Dec 24th, MongoDB reported they have no evidence of anybody exploiting the CVE. Given the fact this exploit lived on for ~8 years, and their honey-pot cloud service Atlas took a full 5 days to patch since the official CVE publish date… I find that hard to believe.
MongoDB have not apologized yet.
There are over 213k potentially vulnerable internet-exposed MongoDB instances, ensuring that this exploit is web scale.
