Security through obscurity is not bad

原始链接: https://mobeigi.com/blog/security/security-through-obscurity-is-not-bad/

## Security through obscurity: still relevant in the modern world

A recent online debate highlighted the enduring value of "security through obscurity" — the practice of protecting a system by hiding its inner workings. Although it is often dismissed with "security through obscurity is unreliable", the author argues this is a misreading. Real security should never rely *only* on obscurity, but using it as an *additional* layer can significantly raise the bar for attackers.

The core idea is to increase the time and resource cost for malicious actors. Obscuring details, such as using a non-default database table prefix in WordPress or stripping debug symbols from game binaries (as Valve does for CS:GO), forces attackers to work harder. Even with advances in AI, deobfuscation and reverse engineering still demand substantial compute. The author cites a CTF challenge solved with AI assistance at a cost of roughly $300 USD and 4.5 hours of compute time, showing that while AI is improving, it has not made obfuscation useless. Attackers must weigh the potential payoff against the cost of clearing these hurdles. A defence-in-depth strategy should therefore combine obscurity with sound, conventional security measures.

## Security through obscurity: a nuanced view

A Hacker News discussion revolves around the often-dismissed concept of "security through obscurity". The core argument is not that obscurity *by itself* constitutes security — that position is widely considered flawed (it violates Kerckhoffs's principle). Instead, the conversation stresses its value as an *additional* defensive layer.

Users note that obscurity can reduce noise in security logs, lowering alert fatigue and storage costs. Examples such as changing the default WordPress database prefix or login URL show how simple obscurity measures can significantly cut automated attack attempts. A counterpoint, however, is that relying on obscurity may reduce the incentive to implement sound, fundamental security measures. Another user compared this to the shift away from forced password resets, noting that the interaction between policy and human behaviour can inadvertently *weaken* security. Ultimately, the consensus leans toward treating obscurity as a useful, but not primary, component of security.

Original article

Escaping the crowded echo chamber

I was recently reading a post by a user on a web development forum. This user, whom we’ll call Mini, was asking the community whether it was worth using JavaScript obfuscation for some of the scripts running on their website. Their main goal was to make it harder for data-scraping bots to reverse engineer and replicate the API requests powering the page.

Then I saw it: like a solo LGTM comment on a +4,156/-1,640 line PR, a comment from another user whom we'll call Echo:

Security through obscurity is bad

What was worse was that this comment had many upvotes, likely from others who had heard the phrase once and simply channelled their inner parrot to repeat it forever.

I decided to reply to Echo's comment and share my thoughts:

Security through obscurity is NOT bad.
Security ONLY through obscurity is bad (Kerckhoffs's Principle).
Security through obscurity, as an additional layer, is good!

At first, I thought this was what Echo actually meant, but to my surprise, Echo believed that all forms of obscurity were redundant and should not be used at all. They also specifically argued that, in the modern day, AI had made getting around any sort of obscurity trivial.

In this post, I will explain why Echo is wrong and why security through obscurity has its place.

Don't show your working out

Security through obscurity is the practice of reducing exposure by keeping an application's inner workings or implementation details less visible to attackers. Unlike in mathematics, you do not want to show your working out.

It's the digital equivalent of hiding a spare key under the doormat instead of leaving it in the lock. In this scenario, a malicious actor might not bother looking under the doormat and might just leave. Congratulations, obscurity just saved you a break-in. They might still find the key, but they may check a nearby potted plant or mailbox first. That costs time, and time is money. To a malicious actor, the longer they spend chasing dead ends, the more likely they are to give up and move on.

Now, of course, proper security here would be not hiding a spare key near the door at all, but instead leaving it with a trusted family member or friend. Relying only on obscurity for security is bad. You should always secure your applications to the degree warranted, then sprinkle some obscurity on top to make the endeavour of attacking you more expensive. This is simply one part of a defence-in-depth strategy.

Four-panel infographic about security through obscurity using a house key analogy: key left in the door, key hidden under a doormat, key hidden under a pot, and finally a burglar shrugging while the caption says proper security should come first.

Obscurity in the real world

Some concrete examples might help drive the point home. Here are a few I have personally encountered.

WordPress database table prefix

There is a long-standing security recommendation to change WordPress's default database table prefix to a random one. For example, wp_users becomes wp_8df7b8_users. This is often dismissed as "worthless" because it is security through obscurity.

I used to run this very blog on WordPress. Back in 2015, one of the plugins I used had an SQL injection vulnerability that allowed malicious actors to dump the databases of websites using it. These actors had bots scouring the web for vulnerable WordPress targets.

My website was vulnerable. However, I was not impacted by any attacks, and I updated the plugin to a patched version a few days later. While other sites were "nulled" and destroyed, I was spared. I later found a PoC script on GitHub showcasing the exploitation. Using that PoC on my own site failed with a generic error like Table 'wordpress.wp_users' doesn't exist. Therefore, while I was likely still vulnerable and could have been exploited with different SQL queries, the standard query targeting most users did not impact me.
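That failure mode is easy to reproduce in miniature. The sketch below is purely illustrative (it is not the 2015 plugin exploit, and it uses SQLite in place of MySQL): a PoC hardcoded against the default `wp_users` table name fails outright once the prefix has been randomised.

```python
import sqlite3

# The site owner created tables under a randomised prefix instead of the default "wp_".
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE wp_8df7b8_users (id INTEGER, user_login TEXT, user_pass TEXT)")
db.execute("INSERT INTO wp_8df7b8_users VALUES (1, 'admin', 'phpass-hash')")

# A mass-exploitation bot typically hardcodes the default table name.
try:
    db.execute("SELECT user_login, user_pass FROM wp_users")
    print("dumped!")
except sqlite3.OperationalError as e:
    print(e)  # → no such table: wp_users
```

The data is still there and still reachable by an attacker who enumerates table names, but the one-size-fits-all query that hits thousands of sites bounces off yours.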

I was spared thanks to that additional layer of security through obscurity. AI tooling today could keep trying different queries, and it may produce good results for malicious actors, but tokens still cost money. The more time and money the bot spends, the more likely it is to give up and move on. It's a battle of sustained resistance.

CS:GO's debug symbol leak

I ran an Australian and New Zealand-based CSGO community server called Invex Gaming for several years. As a server operator, I tried to distinguish the servers I ran by adding unique custom mods. To do this, I used an amazing platform called SourceMod, which allowed you to write custom plugins and extensions. To write useful mods, you would often have to find and call functions directly in the game's binaries. CSGO ships with binaries such as engine.dll, client.dll, and server.dll. These binaries contained much of the game logic we wanted to invoke.

For example, I might want one of my mods to programmatically set a player's health. To do this, my script, running on the CSGO server, had to call the right function in the game. In this case, CBaseEntity::SetHealth is one such function, and calling it programmatically would allow me to set the health of any entity.

This is what the function signature looks like:

class CBaseEntity {
  public:
    virtual void SetHealth(int health) = 0;
};

Now, this is a common and well-known function that SourceMod maps for us correctly. But there are many functions in the game that are less common or not well known. How do we find and use these functions in our scripts? We have to find the function in the binary by reverse engineering it with tools like IDA Pro or Ghidra. Once we have identified it, we can build a stable reference to it using signature scanning or an offset in a virtual function table.
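Signature scanning boils down to searching the binary for a short, distinctive byte pattern, with wildcards standing in for the bytes the compiler may change between builds. Here is a minimal sketch in Python; the byte values are invented for illustration, since real signatures come from disassembling the target in a tool like IDA Pro or Ghidra.

```python
def sig_scan(binary: bytes, pattern: list) -> int:
    """Return the offset of the first match of `pattern` in `binary`.

    `pattern` entries are either an int (exact byte) or None (wildcard).
    Returns -1 if no match is found.
    """
    n = len(pattern)
    for base in range(len(binary) - n + 1):
        if all(p is None or binary[base + i] == p for i, p in enumerate(pattern)):
            return base
    return -1

# Hypothetical signature for a function prologue (push ebp / mov ebp, esp / ...).
# The None entries are wildcards covering bytes that shift between game updates.
signature = [0x55, 0x8B, 0xEC, None, None, 0x83]

blob = bytes([0x90, 0x90, 0x55, 0x8B, 0xEC, 0x11, 0x22, 0x83, 0xC4, 0x04])
print(sig_scan(blob, signature))  # → 2
```

The wildcards are what keep a signature stable across patches: the fewer exact bytes you pin down, the less likely a recompile is to break your reference.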

Unfortunately, we cannot access Valve's source code for the game. When looking at reverse-engineered game code, we see a mess of compiled code that takes significant effort to reverse and document properly. Function names, variable names, and data types or structures are not included. This is because it is common practice to strip away debug symbols from game binaries. Debug symbols are metadata generated during compilation that map a program's machine code back to its original human-readable source code. They are extremely useful for reverse engineering and understanding the code.

One day, Valve accidentally pushed an update for the macOS version of CS:GO that included the full, unstripped Mach-O debug symbols in the .dylib binaries. This exposed much more of the game's internals at the time. This led to a rush of server operators using the new information to create new and exciting scripts. Unfortunately, cheat developers also used it to further develop their cheats. A classic double-edged sword.

This is a prime example of Valve valuing the additional layer of security provided by obscurity. Valve has to ship the game in binary form, because the game runs on our machines when we play it. Valve chooses to strip debug symbols from its binaries because doing so is highly effective at reducing the efficiency of cheat developers. Shortly after this release, Valve realised its mistake and re-released the same version with the debug symbols stripped.

Obfuscated code

I do my fair share of malware analysis and CTFs every now and again for fun. It is extremely common to run into obfuscated code, which is source code that has been intentionally complicated to make it harder for humans and tools to understand while remaining fully functional. The malware industry is a billion-dollar industry, and nobody relies on obscurity more than malicious actors. The more obfuscated the malicious payload, the less likely security researchers and tools are to understand what is going on.

On the flip side, enterprises like Google also use JavaScript obfuscation to hide sensitive logic in the browser. A great example is Google reCAPTCHA, where the obfuscation is often heavy and sophisticated in order to make it harder for bots to understand the checks being performed and automate solving them. Netflix also uses obfuscation in its browser-side DRM components to help protect the logic that lets your browser play the video without exposing everything needed to easily extract and save a playable copy. Riot Games also uses obfuscation around parts of the communication between its kernel-level anti-cheat system, Riot Vanguard, and its servers to make it harder for cheaters to fake a clean signal while cheats are running.

Now, there was a suggestion that advances in AI have made obfuscation obsolete. I disagree. While AI tools are good at deobfuscating code, it is still often a slow and expensive process. I do believe a strong model will eventually reach a solution, but it will take time and money. Again, the longer and more expensive it is, the more likely people are to give up and move on.

I do not have concrete data to share on this topic, but I do have some anecdotal evidence. I attempted a hard PWN-style CTF challenge last year that I was not able to solve on my own. Using an LLM, Claude Opus 4.5, and giving it all the information, binaries, and local tools needed to solve the challenge, it still failed at first.

It was not until 4.5 hours of non-stop token burning and many trial-and-error iterations later that the LLM was able to find a solution. This endeavour used 61 million input tokens and 11 million output tokens, or roughly $300 USD. While I was willing to spend that much to gauge the model's ability to solve the challenge, it is important to keep in mind that we already knew a solution existed because this was a CTF challenge with an intended solution. Would malicious actors be willing to spend that much per attempt across a large enterprise attack surface, where results are far from guaranteed? How long are they willing to iterate on one specific angle? That uncertainty is exactly where obscurity still has value.
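The economics behind that last question can be made concrete with a back-of-the-envelope expected-value calculation. All of the numbers below are hypothetical; the point is only the shape of the trade-off, namely that obscurity raises the cost per attempt and lowers the odds of success.

```python
def expected_profit(payout: int, success_pct: int, cost_per_attempt: int) -> float:
    """Expected profit of one attack attempt, given a payout, a success
    probability in whole percent, and the cost of mounting the attempt."""
    return payout * success_pct / 100 - cost_per_attempt

# Plain target: cheap automated attempt, decent odds.
print(expected_profit(payout=10_000, success_pct=5, cost_per_attempt=5))    # → 495.0
# Obscured target: ~$300 of AI-assisted effort per attempt, slim odds.
print(expected_profit(payout=10_000, success_pct=1, cost_per_attempt=300))  # → -200.0
```

Once the expected value goes negative, a rational attacker moves on to a softer target — which is the whole game obscurity is playing.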

Spread the word

It should be clear by now that security through obscurity still has its place as an additional security layer in the modern world, even with AI-assisted tooling.

So I no longer want to hear the phrase "security through obscurity is bad".

From now on, let's spread these two statements instead:

Security ONLY through obscurity is bad
Security through obscurity, as an additional layer, is good!