What's up with all those equals signs anyway?

Original link: https://lars.ingebrigtsen.no/2026/02/02/whats-up-with-all-those-equals-signs-anyway/

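Over the last few days, lots of excerpts from old emails riddled with equals signs (=) have been turning up on Twitter, and people are curious. This is neither a code nor a scanning error; it is a leftover from an old email encoding method. In the early days of email, long lines of text were a problem for servers. "Quoted printable" encoding solves this by inserting an "=" character followed by a line break to say that the line continues. However, when these emails were converted between line-ending formats (Windows to Unix), decoding went wrong: a flawed algorithm tried to delete the characters after the equals sign, which caused letters to go missing. Equals signs are *also* used to represent special characters in email, such as accented letters or non-breaking spaces. It looks like whoever processed these emails used a simple search-and-replace rather than a proper decoder, which made things worse. In short, these equals signs are both a marker of technical history *and* a sign of sloppy processing.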


Original text

For some reason or other, people have been posting a lot of excerpts from old emails on Twitter over the last few days. The most vital question everybody’s asking themselves is: What’s up with all those equals signs?!

And that’s something I’m somewhat of an expert on. I mean, having written mail readers and stuff; not because I’ve been to Caribbean islands.

I’ve seen people confidently claim that it’s a code, or that it’s an artefact of scanning and then using OCR, but it’s neither: it’s just that the people who converted these emails to a readable format were morons.

What’s that you say? “Converted?! Surely emails are just text!!” Well, if you lived in the stone age (i.e., the 80s), they mostly were, but then people invented things like “long lines” and “rock döts”, and computers had to “encode” the mail before sending.

The artefact we see here is from something called “quoted printable”, or as we used to call it when it was introduced: “Quoted unreadable”.

Take the first line. Whoever wrote this typed the following into their mail reader:

we talked about designing a pig with different non- cloven hoofs in order  to make kosher bacon

We see that that’s quite a long line. Mail servers don’t like that, so mail software will break it into two lines, like so:

we talked about designing a pig with different non- =
cloven hoofs in order  to make kosher bacon

See? There’s that equals sign! Yes, the equals sign is used to say “this should really be one single line, but I’ve broken it in two so that the mail server doesn’t get mad at me”.
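To make the wrapping step concrete, here is a minimal Python sketch. The `soft_wrap` helper is made up for illustration and is not how any particular mail program does it: it assumes plain ASCII text with no literal "=" in it, cuts at a fixed width rather than at a convenient space, and uses the 76-character line limit from the MIME specification.

```python
def soft_wrap(line: str, limit: int = 76) -> str:
    """Break one long line into several, marking each break with '='.

    Hypothetical helper: assumes ASCII input that contains no '='.
    """
    chunk = limit - 1  # leave room for the trailing "=" on each broken line
    pieces = [line[i:i + chunk] for i in range(0, len(line), chunk)]
    # Each break is "=" followed by CR LF; the last piece gets no marker.
    return "=\r\n".join(pieces)


long_line = ("we talked about designing a pig with different non- "
             "cloven hoofs in order  to make kosher bacon")
wrapped = soft_wrap(long_line)
print(wrapped)
```

Real mail software picks its own break position (in the example above it broke after "non- "), but the marker at the end of the broken line is the same.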

The formal definition here is important, though, so I have to be a bit technical here: To say “this is a continuation line”, you insert an equals sign, then a carriage return, and then a line feed.

Or,

=CRLF

Three characters in total, i.e.:

... non- =CRLF
cloven hoofs...

When displaying this, we remove all three of these characters, and end up with:

... non- cloven hoofs...
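For comparison, this is roughly what using a proper decoder looks like. The sketch uses Python's standard `quopri` module, which is just one decoder among many and not necessarily what any given mail reader uses; the input bytes are the example from above.

```python
import quopri

# The wrapped text as it travels over the wire: "=" + CR + LF marks the break.
encoded = b"... non- =\r\ncloven hoofs..."

# A real quoted-printable decoder removes all three characters of the break.
decoded = quopri.decodestring(encoded)
print(decoded)  # b'... non- cloven hoofs...'
```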

So what’s happened here? Well, whoever collected these emails first converted from CRLF (i.e., “Windows” line ending coding) to “NL” (i.e., “Unix” line ending coding). This is pretty normal if you want to deal with email. But you then have one byte fewer:

... non- =NL
cloven hoofs...

If your algorithm to decode this is, stupidly, “find equals signs at the end of the line, and then delete two characters, and then finally the equals sign”, you should end up with:

... non- loven hoofs...
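The stupid algorithm described above can be sketched in a few lines of Python. The function name `naive_unwrap` is made up, and this is a guess at the logic rather than anybody's actual code: it blindly assumes that an equals sign at the end of a line is always followed by exactly two line-ending characters.

```python
import quopri


def naive_unwrap(text: str) -> str:
    """Hypothetical buggy decoder: drop '=' plus the next two characters."""
    out = []
    i = 0
    while i < len(text):
        if text[i] == "=" and text[i + 1:i + 2] in ("\r", "\n"):
            i += 3  # skip "=" and two more characters, whatever they are
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


crlf = "... non- =\r\ncloven hoofs..."
lf = crlf.replace("\r\n", "\n")  # the CRLF-to-NL conversion

print(naive_unwrap(crlf))  # ... non- cloven hoofs...
print(naive_unwrap(lf))    # ... non- loven hoofs...

# Python's own decoder copes with the converted line ending.
print(quopri.decodestring(lf.encode("ascii")))  # b'... non- cloven hoofs...'
```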

I.e., you lose the “c”. That’s almost what happened here, but not quite: Why does the equals sign still remain?

This StackOverflow post from 14 years ago explains the phenomenon, sort of:

Obviously the client notices that = is not followed by a proper CR LF sequence, so it assumes that it is not a soft line break, but a character encoded in two hex digits, therefore it reads the next two bytes. It should notice that the next two bytes are not valid hex digits, so its behavior is wrong too, but we have to admit that at that point it does not have a chance to display something useful. They opted for the garbage in, garbage out approach.
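The behaviour that quote describes can be sketched as follows. This is a hypothetical reconstruction, not the code of any real client: `gigo_decode` is a made-up name, and for simplicity it turns each hex pair into a single character instead of working on bytes.

```python
HEX_DIGITS = set("0123456789ABCDEFabcdef")


def gigo_decode(text: str) -> str:
    """Hypothetical 'garbage in, garbage out' quoted-printable decoder."""
    out = []
    i = 0
    while i < len(text):
        if text[i] != "=":
            out.append(text[i])
            i += 1
            continue
        pair = text[i + 1:i + 3]  # the two characters after the "="
        if pair == "\r\n":
            pass  # a proper soft line break: emit nothing
        elif len(pair) == 2 and all(c in HEX_DIGITS for c in pair):
            out.append(chr(int(pair, 16)))  # an encoded character like =C2
        else:
            out.append("=")  # not valid hex: keep the "=", the pair is lost
        i += 3
    return "".join(out)


print(gigo_decode("... non- =\r\ncloven hoofs..."))  # ... non- cloven hoofs...
print(gigo_decode("... non- =\ncloven hoofs..."))    # ... non- =loven hoofs...
```

With a proper CRLF the output is clean; after the line ending conversion, the equals sign survives and the "c" is eaten.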

That is, equals signs are also used for something else besides wrapping long lines, and that’s what we see later in the post:

   =C2   please note

If the equals sign is not at the end of a line, it’s used to encode “funny characters”, like what you use with “rock döts”. =C2 is 194, which is the first byte of a two-byte UTF-8 sequence, and the following byte is most likely =A0: =C2=A0 is a “non-breaking space”, which is something people often use to indent text (and the “please note” is indented), and you see =A0 in many other places in these emails.
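The numbers are easy to check in Python. This sketch again uses the standard `quopri` module; the input string is an assumed example of what the original encoded text may have looked like, not something taken from the actual emails.

```python
import quopri

print(int("C2", 16))  # 194

# Decode the quoted-printable escapes to raw bytes, then read them as UTF-8.
raw = quopri.decodestring(b"=C2=A0please note")
text = raw.decode("utf-8")

print(raw)         # b'\xc2\xa0please note'
print(repr(text))  # '\xa0please note'
```

The two escaped bytes come out as the single character U+00A0, the non-breaking space.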

My guess is that whoever did this part just did a search-replace for =C2 and/or =A0 instead of using a proper decoder, but other explanations are certainly possible. Any ideas?
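To illustrate that guess (and it is only a guess), here is what a search-replace for =A0 alone would do to an assumed input, next to what a proper decoder gives. The input string is invented for the example.

```python
import quopri

encoded = "=C2=A0=C2=A0please note"

# Hypothetical shortcut: swap every "=A0" for a plain space and stop there.
half_done = encoded.replace("=A0", " ")
print(half_done)  # =C2 =C2 please note

# What a real quoted-printable decoder followed by UTF-8 decoding gives.
proper = quopri.decodestring(encoded.encode("ascii")).decode("utf-8")
print(repr(proper))  # '\xa0\xa0please note'
```

The shortcut leaves the =C2 half of each pair stranded in the text, which looks a lot like the excerpt above.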

Anyway, that’s what’s up with those equals signs: 1) “it’s technical”, and 2) “whoever processed these mails is incompetent”. I don’t think 2) should be very surprising at this point, do you?
