A peek into Reddit's anti-spam internals

Original link: https://lyra.horse/blog/2026/06/reddit-spam-internals/

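In 2021, a glitch briefly exposed Reddit's internal anti-spam metadata to the author, a subreddit moderator, through the Relay mobile app. Normally Reddit hides the specific reason a post was removed and shows only generic labels such as "Auto" or "spam". The brief leak offered a rare look at how Reddit's backend systems, in particular "spamurai" and the older Python-based filters, flag and remove content. The author found that Reddit relied on a range of heuristics, including the Perspective API (spam scoring), domain-wide bans, regex matching, and inspection of link targets after following redirects (for example, looking for embedded Google Analytics IDs). The data also shows that Reddit's internal systems track user fingerprints, account age, and ISP data to identify likely spammers. The author held the research back for years so that bad actors could not exploit it, and decided to publish it in 2026 on the grounds that advances in large language models (LLMs) have forced platforms to overhaul their spam detection, leaving these older techniques largely obsolete. The findings offer a historical view of how Reddit's moderation infrastructure evolved, from the early CRM114 filter toward modern AI-driven systems.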

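The Hacker News thread discusses a blog post about Reddit's anti-spam infrastructure. Commenters compare the technical challenges Reddit faces, such as rule-based filters, machine learning, and IP fingerprinting, with the evolution of email spam filtering. A central theme is the complexity of bot activity that is not strictly spam. Commenters speculate that Reddit may be balancing automated bot traffic against official human moderation, and note that large language models (LLMs) have lowered the barrier to producing convincing fraudulent content at scale. Some users argue that Reddit may deliberately tolerate certain bot activity to inflate user numbers or shape opinion. The thread also includes meta-discussion of Hacker News's own "second-chance pool", a mechanism that occasionally resurfaces older posts to give them more visibility, which often leaves users suspecting a glitch or feeling déjà vu. Several users also voice frustration with opaque account shadowbanning and want more transparency from platform moderation tools.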

5 years ago, back when I still used Reddit, something unusual happened. My app of choice, Relay for reddit, was bombarding me with a bunch of weird notifications about removed spam.

Getting these notifications wasn’t unusual in and of itself - I was a moderator of a few fairly small subreddits that’d from time to time get posts automatically removed for spam. However, when I went to actually look at the removed spam, I saw something I was never meant to see.

I saw Reddit’s anti-spam internals.

so that's about it.

Removed: spamurai (*Removing potential spam content from unproved user*: comment `t1_pupp13` (0.7294469 perspective spam) by u/GoodBoyBacon (0.06 days old, spammy: 11, hosted: false, -1 karma, 4 reports, org: `ComcastCable`, email: gmail.com) in r/GoodBoysOnly (guest) posting nil from `oauth.reddit.com` via `nil` from UA: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.54 Safari/537.36 Edg/95.0.1020.30, RHS: oc:ac:kT:lw:bV:aX:af:a6:l5:y3:aT:m9:pt:f3:hZ:az:aR:aQ, LANG: en-US,en,q=0.9, TLS: j7bXVc3l/qer8FRj2aEiqOrx1ro=DDZ0TViWlY5HYgOPw1SZqDxwiO8= - referrer: https://www.reddit.com/, thumbnail: `` - ) • GoodBoyBacon • 1 points • 27 min

You see u/BadGuy67? He's the same guy as https://www.reddit.com/r/ReallyBadGuys/comments/qw3rt1/if_ur_a_bad_guy_post_here_please/

Removed: Reddit (shadowban applied on 10-27-2021) • GoodBoyBacon • 0 points • 1 hr

I'm not the same guy as that other guy please read my comment

How Reddit moderation works

So Reddit is a site composed of smaller sub-communities, which are called subreddits. For example, /r/mylittlepony is a subreddit for fans of My Little Pony. These subreddits can be created by anyone, and they are moderated by a group of community moderators appointed by the creator of the subreddit.

If we go on /r/mylittlepony we can see the list of moderators on the sidebar:

These moderators can remove posts, ban users, manage modmail etc, but they are just normal Reddit users.

If you’re a moderator you can see who removed a post or comment:

Removed: rebane2001 • ExampleUser • 1 points • 1 hr

I'm breaking the rules 😈

This includes the automod - a rules-based moderation system:

Removed: AutoModerator • ExampleUser • 1 points • 1 hr

bad word

But then you’ll sometimes also see the mysterious “Auto”:

Removed: Auto • ExampleUser • 1 points • 1 hr

hi

This is what happens when something gets caught in Reddit’s mysterious spam filters, or when Reddit’s sitewide admins remove something manually.

In the moderator log, they’ll show up as “reddit” and “Anti-Evil Operations”:

These sitewide spam removals are what the rest of this post is going to be about.

Oopsie

What happened to me back in 2021 was that due to some kind of an error on Reddit’s side, the usual Removed: Auto text had been replaced with the actual removal reason. Why this happened to me I do not know - it went back to normal after an hour or so. All I was left with was a bunch of screenshots I managed to take while this stuff was still going on.

But that doesn’t mean we can’t speculate!

Up until 2017, Reddit’s source code was publicly available. Of course, a lot has changed since then, but we can still analyze the archived code and hypothesize what might be happening.

The function responsible for moderator removals is POST_remove:

def POST_remove(self, thing, spam):
    """Remove a link, comment, or modmail message."""
    ...
    admintools.spam(thing, auto=False,
                    moderator_banned=not c.user_is_admin,
                    banner=c.user.name,
                    train_spam=train_spam)

We can see it calls admintools.spam with a few arguments, notably: moderator_banned, which marks whether something was removed by a moderator or an admin, and banner, which notes down the username of whoever did the ban action.

Poking around a bit more, we find the get_mod_attributes function:

# Comments added by me for the blogpost
def get_mod_attributes(item):
    data = {}
    # If user is logged in and a moderator
    if c.user_is_loggedin and item.can_ban:
        data["num_reports"] = item.reported
        data["report_reasons"] = Report.get_reasons(item)

        ban_info = getattr(item, "ban_info", {})
        # If post was removed
        if item._spam:
            data["approved_by"] = None
            # If post was removed by a mod
            if ban_info.get('moderator_banned'):
                # Show the banner name
                data["banned_by"] = ban_info.get("banner")
            else: # else, if post was removed by an admin
                # Hide the banner name
                data["banned_by"] = True
        else:
            data["approved_by"] = ban_info.get("unbanner")
            data["banned_by"] = None
    else:
        data["num_reports"] = None
        data["report_reasons"] = None
        data["approved_by"] = None
        data["banned_by"] = None
    return data

This is the part of the API that actually returns us the information about removals - the banner in ban_info is the red text I was seeing in Relay. And it seems like it will only get returned if the removal was by a moderator, not an admin. But where does that Auto text come from? Reddit’s API only returns an actual username, or True.

Turns out that it’s actually coming from Relay itself:

// reddit/news/oauth/reddit/model/base/RedditLinkComment.java
if (this.bannedBy.equalsIgnoreCase("true")) {
    this.bannedBy = "Auto";
} else if (this.bannedBy.equalsIgnoreCase("null")) {
    this.bannedBy = "";
}
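Putting the two halves together, here is a rough Python sketch of the logic shown above. It is not Reddit's or Relay's actual code: the function names are mine, and it only mirrors the branches visible in the two snippets.

```python
def banned_by_for_moderator(is_removed, ban_info):
    """Mirror of the banned_by branch in get_mod_attributes above."""
    if not is_removed:
        return None
    if ban_info.get("moderator_banned"):
        # Moderator removal: the API shows who did it.
        return ban_info.get("banner")
    # Anything else: the banner is hidden behind a bare True.
    return True


def relay_label(banned_by):
    """Mirror of Relay's mapping: 'true' becomes 'Auto', 'null' becomes ''."""
    text = "null" if banned_by is None else str(banned_by)
    if text.lower() == "true":
        return "Auto"
    if text.lower() == "null":
        return ""
    return text


# A sitewide removal carries a banner, but moderators only ever get True back.
print(relay_label(banned_by_for_moderator(True, {"banner": "domain (spam)"})))
```

With this shape, the only way a moderator could ever see a string like "domain (spam)" is if the server stopped replacing the banner with True.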

Okay, that explains that. But where am I getting these internal messages from?

Well, it seems like Reddit is re-using the banner field for internal removal reasons:

def POST_submit(self, form, jquery, url, selftext, kind, title,
                sr, extension, sendreplies, resubmit):
    """Submit a link to a subreddit."""
    ...
    if not is_self:
        ban = is_banned_domain(url)
        if ban:
            g.stats.simple_event('spam.domainban.link_url')
            admintools.spam(l, banner = "domain (%s)" % ban.banmsg)
            hooks.get_hook('banned_domain.submit').call(item=l, url=url,
                                                        ban=ban)

The above code snippet runs whenever a new link is posted. It checks whether the domain is banned and, if so, removes the post with the banner set to “domain (REASON)”.

We can see it in action with this removed post for example:

1

I_EAT_PONIES in MyLittleOutOfContext

Removed: domain (banned as an experiment to see what happens with tubmlr spam ring. - em 5/31/12) • 0 Comments • 24.media.tumblr.com • 9 yrs

Seems like em was playing around with auto-removing all tubmlr [sic] links on Reddit in 2012?

Anyways, it seems like Reddit is stuffing its internal spam removal reasons in the banner field, but making it so that only sitewide admins can see them. And something in a codepath similar to get_mod_attributes was broken for a couple hours, allowing me to see those reasons.

Let’s take a look at the kinds of reasons I managed to get a glimpse of!

domain (2012 - present)

The first category is the domain removals, as shown earlier. Nearly all of these are just Removed: domain (spam), though I did find this gem in there:

1

presafur in MyLittleOutOfContext

Removed: domain (le sexxxxy sex spam) • 0 Comments • www.example.com • 5 yrs

Perhaps I’m just childish, but I find the idea of someone going le sexxxxy sex spam while working on a spamfilter rather amusing.

Reddit probably had some issues with Tumblr spam, because in addition to the tubmlr removal we saw earlier there was also this:

1

JackofH3art in MyLittleOutOfContext

Removed: domain (ban - 11/12/12 mg ) • NSFW • 0 Comments • bartl3by.tumblr.com • 8 yrs

I’m quite certain that this removal was targeted at Tumblr in general, and not the specific blog linked, since bartl3by.tumblr.com seems to be a legitimate (although somewhat perverse) blog.
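To illustrate how a single ban on a parent domain could catch every blog under it, here is a minimal sketch. Everything in it is an assumption for illustration: the ban table, the function names, and the idea that matching walks up through parent domains. The ban message is the one quoted above, and attaching it to tumblr.com as a whole is only my guess.

```python
from urllib.parse import urlparse

# Hypothetical ban table: domain -> ban message.
BANNED_DOMAINS = {"tumblr.com": "ban - 11/12/12 mg"}


def domain_permutations(hostname):
    """Yield the hostname and each parent domain: a.b.c, then b.c, then c."""
    labels = hostname.lower().split(".")
    for i in range(len(labels)):
        yield ".".join(labels[i:])


def find_domain_ban(url):
    """Return the removal reason if the URL's host or a parent is banned."""
    hostname = urlparse(url).hostname or ""
    for candidate in domain_permutations(hostname):
        if candidate in BANNED_DOMAINS:
            return "domain (%s)" % BANNED_DOMAINS[candidate]
    return None


print(find_domain_ban("http://bartl3by.tumblr.com/"))
print(find_domain_ban("https://example.com/"))
```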

I believe domain removals are the only type of anti-spam we can actually see in the public Reddit source code. Though, even that is partially hidden.

spammit (2012 - present)

The next category is spammit, which somehow analyses a post and gives it a percentage rating:

1

Kyderra in MyLittleOutOfContext

Removed: spammit(72.98% spammy) • 0 Comments • dashie.mylittlefacewhen.com • 8 yrs

Yes, there’s no space between spammit and the parenthesis.

The percentages of removed posts were generally fairly high, with the lowest one being 39.71% spammy and highest one 98.19%.

That being said, spammit doesn’t seem like a very accurate anti-spam measure for the subs I moderate because it seemed to hit a lot of legitimate Imgur posts with a 70-98% spammy rating.

bans (2016 - present)

Next, we have post removals for banned users.

1

kaitlynwwrettin in MyLittleOutOfContext

Removed: banned user • 0 Comments • www.example.com • 3 yrs

Some of them are marked with just a Removed: banned user, though others get a fancy Removed: Reddit (banall performed).

1

KerryVinebt403 in MyLittleOutOfContext

Removed: Reddit (banall performed) • 0 Comments • example.com • 3 yrs

The posts I saw being removed like this were all very obvious spam. Mostly just ads for all kinds of services. I suspect this is the admins seeing an obvious spambot and just nuking it from orbit.

shadowbans (2016 - present)

It’s known that Reddit shadowbans its users. A shadowban is a silent ban where seemingly nothing happens to your account and you’re still able to post/comment, but nobody else will be able to see your posts and comments. In fact, there’s even a subreddit for checking whether you’re shadowbanned.

But now we can actually see what a shadowban looks like to admins:

1

pickertramontana in trixiemasterrace

Removed: Reddit (shadowban applied on 11-06-2019) • 0 Comments • self.trixiemasterrace • 1 yr

I’m not going to share the specific conversation here, but there was a really funny comments thread going on where a person was blaming mods for removing all of their comments while in reality being shadowbanned by Reddit.

spamurai (2020 - present)

Now we get to the most interesting part of the entire spam filter thing. Unlike spammit, spamurai is a system that does have some public references to it. According to slide #28, Reddit uses Minsky for “ML”, and Spamurai for “Rules”. I’m not sure how this is calculated into the removal reasons, so I’m just going to ignore it and assume everything is spamurai.

First up, there seems to be some sort of a spamurai subsystem called echelon. It seems to remove certain keywords, such as the EqG elsagate spam seen below, and lewd (OF? Snapchat?) stuff like puppyvids.69.

1

mypham71375 in Pony_irl

Removed: spamurai (echelon: Equestria Girls Princess Animation Series) • 1 Comments • youtube.com • 2 yrs

Then, there’s some targeted removals, such as this one that targets clothing spam.

Removed: spamurai (approval required on hyperlink comment from high spam score account (suspected shirt affiliate spam)) • Adventurous-ties • 1 points • 5 months

Dog-Women consulting company will open the Ukrainian pharmaceutical market for you!

This market has been opened to you by the Dog-Women consulting company!

So what would u like to order?

star ratings

Limited time deal

-50%349KARMA

699 KARMA

About this item

With this item you can see the Perspective scores of absolutely everything!

Buy New

349KARMA

In stock

Buy now

No refunds after purchase.

No refunds.

Legal note: This is a joke, no pharmaceuticals can actually be ordered from this blog post about Reddit spam filters.

And some more general rules-based filters, such as this one for account age.

Removed: spamurai (comment from account under 30 minutes matching spam conditions) • NewUser67 • 1 points • 23 min

fuck you fuck you fuck you

But alright, let’s try to figure out what’s going on with the infodump removals like the one I put in the banner art of this post.

1

AnywhereAlone6851 in Pony_irl

Removed: spamurai (*Removing potential spam content from unproved user*: link `t3_phc4xx` (0.12571795 perspective spam) by u/AnywhereAlone6851 (2.948587962963 days old, spammy: 4.5, hosted: false, 28 karma, 5 reports, org: `Skyinfo Online`, email: gmail.com) in r/Pony_irl (guest) posting pinterest.com from `oauth.reddit.com` via `nil` from UA: Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.63 Safari/537.36, RHS: oc:ac:kT:lw:bV:aX:af:a6:l5:y3:aT:m9:pt:f3:hZ:az:aR:aQ, LANG: en-US,en,q=0.9, TLS: SwxwvfHLtTxt/9qbo1dvBLEMSIQ=tT1LosI8/xDmUS7LMVuhb/olIJQ= - referrer: https://www.reddit.com/, thumbnail: `https://b.thumbs.redditmedia.com/K_Q91r66a3AEopEbzGkjkxHOpisoQbxa3hIoHxDerjc.jpg` - ```18 Random Facts That Will Blo ``` https://www.reddit.com/r/Pony_irl/comments/phc4xx/18_random_facts_that_will_blo/ ) • 0 Comments • pinterest.com • 1 month

That is a lot of information in there! Let’s break it down bit by bit:

link t3_phc4xx: this is the “fullname” ID of the post, it’s what shows up in urls except it is prefixed: t1 is comment, t2 is user, t3 is post, t4 is private message, and t5 is subreddit.

0.12571795 perspective spam: this is almost certainly using the Perspective API. Perspective is a free Google service that uses machine learning to “reduce toxicity online”.

I’m sure of this because perspective is a pretty unique word, the Perspective docs display sample results with similar score numbers (e.g. 0.24173126 and 0.4445836), and Perspective’s case studies page contains this quote from the CTO of Reddit:

“As Reddit scales, the integrity of our platform and ensuring healthy discourse among users and communities remains a priority. Perspective has been a valuable tool as we continue to strengthen the safety measures and tooling that we have in place.”

—Chris Slowe, Chief Technology Officer at Reddit

It seems like Reddit is using Perspective’s “experimental” SPAM attribute here though, which is intended to detect spam instead of toxicity. The data for this is trained on a SINGLE DATASET of the comments and moderation in the New York Times, which I find pretty interesting.

Unfortunately, since February 2026, we can no longer create a new Perspective API project on Google Cloud, so it is not possible to try it out anymore.

Well, that is unless we can find some leaked API keys :3. Which I may or may not have teehee..

$ curl 'https://commentanalyzer.googleapis.com/v1alpha1/comments:analyze?key=AIzac29ycnkgdGhpcyBpcyBjZW5zb3JlZCBsb2w' \
    --request POST \
    --header "Content-Type: application/json" \
    --data '{
              "comment": {
                "text":"18 Random Facts That Will Blo "
              },
              "requested_attributes": {
                "SPAM": {"score_type": "PROBABILITY"}
              }
            }'
{
  "attributeScores": {
    "SPAM": {
      "spanScores": [
        {
          "begin": 0,
          "end": 30,
          "score": {
            "value": 0.12571794,
            "type": "PROBABILITY"
          }
        }
      ],
      "summaryScore": {
        "value": 0.12571794,
        "type": "PROBABILITY"
      }
    }
  },
  "languages": [
    "en"
  ],
  "detectedLanguages": [
    "en"
  ]
}

..and thus, we can be all but 100% sure that this is the API Reddit used, because we get back essentially the same score: 0.12571794, which matches the 0.12571795 we saw in spamurai earlier to seven decimal places.
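For reference, the curl call above can be sketched in Python. The endpoint, payload, and response shape are copied from that call; the function names are mine, and `spam_score` needs network access plus a working API key, so treat it as an untested outline rather than a known-good client.

```python
import json
import urllib.request

ENDPOINT = "https://commentanalyzer.googleapis.com/v1alpha1/comments:analyze"


def build_payload(text):
    """Request body asking only for the experimental SPAM attribute."""
    return {
        "comment": {"text": text},
        "requested_attributes": {"SPAM": {"score_type": "PROBABILITY"}},
    }


def extract_spam_score(response):
    """Pull the summary SPAM probability out of a decoded JSON response."""
    return response["attributeScores"]["SPAM"]["summaryScore"]["value"]


def spam_score(text, api_key):
    """Send the request and return the score (network and key required)."""
    request = urllib.request.Request(
        ENDPOINT + "?key=" + api_key,
        data=json.dumps(build_payload(text)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as reply:
        return extract_spam_score(json.load(reply))
```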

This is interesting because it means that this entire time it was possible for a bad actor to bypass one of the primary spamurai criteria by just changing their message until it’s non-spammy for Perspective’s free API.

It’s not even hard to do so, as SPAM score is extremely sensitive to changes of just a few characters:

$ query='Puppygirl Consulting is the best way to grow your revenue'
$ ./perspective.sh "$query"
0.8638981: Puppygirl Consulting is the best way to grow your revenue
$ for letters in {a..z}{a..z}
    do ./perspective.sh "$query $letters" | grep "0.0"
  done
0.010811162: Puppygirl Consulting is the best way to grow your revenue qp

You can see how going through all 2-letter combinations got us from an 86% spam score down to 1%, which is significantly less than pretty much any normal text.

It also ignores numbers and case for some reason:

$ ./perspective.sh 'Hi there, please call my work phone at 567890'
0.81438655: Hi there, please call my work phone at 567890
$ ./perspective.sh 'hi THEre, pleaSE Call my woRk phonE aT 022102'
0.81438655: hi THEre, pleaSE Call my woRk phonE aT 022102

As well as alternate alphabets:

$ ./perspective.sh 'привет'
0.35077864: привет
$ ./perspective.sh 'наххуи'
0.35077864: наххуи

Which means you can sometimes lower your spam score just by using cyrillic characters:

$ query='Buy my product'
$ ./perspective.sh "$query"
0.6473346: Buy my product
$ ./perspective.sh "$(echo -n $query | sed s/p/р/)"
0.4452748: Buy my рroduct

Anyways, moving on…

by u/AnywhereAlone6851: username, self-explanatory

2.948587962963 days old: account age in days, which is a pretty good indicator of spam accounts and ban evaders. But it does give us one interesting detail - I believe the account age is stored in seconds, because all the examples I have come out to a round number when multiplied by 86400 (the number of seconds per day).

spammy: 4.5: not sure what this is, could be the Minsky thing from earlier? Or the spammit score from earlier? Or a combination of multiple spamurai rules?

hosted: false: not sure, maybe to detect known hosting provider ip ranges?

28 karma: self-explanatory, karma is often used as a measure of an account’s presence

5 reports: total number of reports an account and its posts have received

org: Skyinfo Online: the ISP of the user. This can tell you where the user is from and whether they’re using a VPN. In this case we can see that the spam is coming from Bangladesh, because that’s where SkyInfo Online operates from. Their website is incredible.

email: gmail.com: e-mail domain of the user

in r/Pony_irl (guest): the subreddit that the post is in. I believe the (guest) means that the user is not a subscriber of the subreddit. I assume that it would say something like “member”, “approved”, or “moderator” in other cases, but I don’t actually have any examples of that.

posting pinterest.com: the domain that’s being linked to

from oauth.reddit.com via nil: the user was authenticated with Reddit’s oauth flow, which is the default, and I believe the nil (Lua’s null) is where the name of a custom client would go, e.g. Relay in my case. I don’t have any examples of this being anything other than nil, so this is just speculation.

from UA: Mozilla/5.0 (Win…: this is the user agent string of the browser that’s being used. It tells us that the person is posting from the Chrome browser on Windows 8.1.

RHS: oc:ac:kT:lw:bV:aX…: this seems to be some sort of a fingerprinting hash Reddit uses. I believe this is Reddit’s own engineering and not an existing open-source solution. This hash is the exact same between this Chrome 93 example, and the Edge 95 example from the beginning. This leads me to conclude that the hash fingerprints browsers (Edge and Chrome are both Chromium) and is meant to detect scripts pretending to be a browser.

LANG: en-US,en,q=0.9: the value of the accept-language header, it tells websites what languages you’d like to see websites in. This can be used to detect potential VPN usage, e.g. if someone has Latvian as their language but is joining from a New York IP.

TLS: SwxwvfHLtTxt/9qbo…: this is TLS fingerprinting similar to JA3. It seems to be Reddit’s own engineering though, not an existing implementation.

referrer: https://www.reddit.com/: this is the page the user got onto Reddit from. Sometimes when opening Reddit links directly from other sites, your votes are not counted to discourage brigading, and this is what the referer is used for. In the case of spamurai it might be useful if the referer is something like buy-reddit-comments.info (or more realistically, a platform such as Fiverr).

thumbnail: https://…: the auto-generated thumbnail

- ```18 Random Facts That Will Blo ```: the markdown body of the post/comment

https://www.reddit.com/r/Pony_irl…: link to the post/comment
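The breakdown above can be turned into a small parser. This is only a sketch: the regular expression is my own, it covers just the first half of the dump, and the field names are labels I picked rather than anything Reddit uses. It also does the days-to-seconds multiplication mentioned for the account age.

```python
import re

# Fullname prefixes as listed in the breakdown above.
FULLNAME_KINDS = {
    "t1": "comment", "t2": "user", "t3": "post",
    "t4": "private message", "t5": "subreddit",
}

# Hypothetical pattern for the start of a spamurai infodump.
DUMP_PATTERN = re.compile(
    r"`(?P<fullname>t\d_[0-9a-z]+)` "
    r"\((?P<perspective>[\d.]+) perspective spam\) "
    r"by u/(?P<user>[\w-]+) "
    r"\((?P<age_days>[\d.]+) days old, spammy: (?P<spammy>[\d.]+), "
    r"hosted: (?P<hosted>\w+), (?P<karma>-?\d+) karma, "
    r"(?P<reports>\d+) reports, org: `(?P<org>[^`]*)`, "
    r"email: (?P<email>[^)]+)\) "
    r"in r/(?P<subreddit>\w+) \((?P<membership>\w+)\)"
)


def parse_dump(text):
    """Return the dump's fields as a dict, or None if it does not match."""
    match = DUMP_PATTERN.search(text)
    if match is None:
        return None
    fields = match.groupdict()
    fields["kind"] = FULLNAME_KINDS.get(fields["fullname"][:2], "unknown")
    # Account age in days times 86400 seconds per day.
    fields["age_seconds"] = float(fields["age_days"]) * 86400
    return fields


SAMPLE = (
    "link `t3_phc4xx` (0.12571795 perspective spam) by u/AnywhereAlone6851 "
    "(2.948587962963 days old, spammy: 4.5, hosted: false, 28 karma, "
    "5 reports, org: `Skyinfo Online`, email: gmail.com) in r/Pony_irl (guest)"
)
fields = parse_dump(SAMPLE)
print(fields["kind"], fields["org"], round(fields["age_seconds"]))
# post Skyinfo Online 254758
```

The age comes out at 254758 seconds to within floating-point noise, which fits the guess that the value is stored in whole seconds and only divided into days for display.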

So that’s the full spamurai infodump with no clear reason for removal. There are also examples of spamurai clearly using the same data but with specific rules, such as the use of the spammy score here:

Removed: spamurai (URL-only comment from account with high spammy score) • GoodBoyBacon • 0 points • 24 min

https://www.reddit.com/r/ReallyBadGuys/comments/qw3rt1/if_ur_a_bad_guy_post_here_please/

Or the use of the perspective score here:

Removed: spamurai (REPORT: High spam perspective score on comment with hyperlink reported for spam. Removed but can be re-approved by mod.) • BrattyErmine12 • 0 points • 11 months

Coins are a virtual good you can use to award exemplary posts or comments. Support Reddit and encourage your favorite contributors to keep making Reddit better.

GET COINS

Here’s what you can buy with coins

Spend your coins on these Awards reserved exclusively for the finest Reddit contributors. Awarding a post or comment highlights it for all to see, and some Awards also grant the honoree special bonuses.

📷

Silver Award

Shows a Silver Award on the post or comment and ... that’s it. You’ll need 100 Coins.

The perspective spam score for the above post is either 0.9761621 or 0.9782609. Also what’s interesting is that the specific rule there got triggered by someone reporting it for spam - thus we learn that sometimes user reports have an effect even without moderator intervention.

It’s also interesting how some of the removals adjust based on mod actions:

1

muyuwobsoq9q in Pony_irl

Removed: spamurai (High karma-to-spam ratio on link content from 6+ spammy score account; mod approval of this content will reduce future removals) • 1 Comments • youtube.com • 2 yrs

misc

There’s also a bunch of removals that don’t really neatly fit into any of the above categories.

For example, Pinterest redirect links get removed:

Removed: pinterest redirect • 22_ghost_22 • 1 points • 4 months

https://pin.it/Sc4mUr1

As do mega.nz links:

Removed: streamer spam • EPIC_Gamer67 • 1 points • 11 months

https://mega.nz/folder/Ep1cV1d30s

The decryption key is

SW52YWxpZCBiYXNlNjQgc3RyaW5n

In the case of the comment above, it was actually a legitimate link to some archived YouTube videos, so it was wrongly removed.

Another banned kind of link is a freely available subdomain:

1

cpsryan in UnusAnnusArchival

Removed: freely available subdomains • (Unus Annus Archiving) • 0 Comments • self.UnusAnnusArchival • 11 months

In the above case the post didn’t contain those kinds of links per se, but it did contain a magnet link, and Reddit found and linkified the 2ftracker.opentrackr.org inside of it. I’m not sure why opentrackr gets matched under “freely available subdomains” though.

But speaking of trackers, certain strings are straight up regex banned:

1

e4e5x0q8e1p in MyLittleOutOfContext

Removed: Matched forbidden regex u'torenteu' • 2 Comments • self.MyLittleOutOfContext • 10 yrs

Now, this one is super interesting to me because nowhere in the post does the string torenteu appear, yet we still somehow match our regex?

The reason this happens is that Reddit uses the unidecode library to convert post titles into ASCII:

$ python2
Python 2.7.18 (default, Dec  9 2024, 19:35:20)
[GCC 9.4.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import unidecode
>>> unidecode.unidecode(u"레인보우 대시 프레젠츠 26화 토렌.트 150927 26화 torrent HD 고화질")
'reinbou daesi peurejenceu 26hwa toren.teu 150927 26hwa torrent HD gohwajil'
>>>

It then processes the string a bit more and arrives at "reinbou_daesi_peurejenceu_26hwa_torenteu_150927", which does match the u’torenteu’ regex.
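The "a bit more" step can be sketched as follows. This is my guess at the processing, not confirmed code: I assume whitespace becomes underscores, anything other than ASCII letters, digits, and underscores is dropped, the result is lowercased, and it is truncated to 50 characters at a word boundary. The input is the unidecode output shown above, so the sketch does not need the unidecode package.

```python
import re


def title_to_slug(ascii_title, max_length=50):
    """Hypothetical slug step: underscores, strip unsafe chars, truncate."""
    slug = re.sub(r"\s+", "_", ascii_title)        # whitespace -> underscore
    slug = re.sub(r"[^A-Za-z0-9_]", "", slug)      # drops the "." in toren.teu
    slug = re.sub(r"_+", "_", slug).strip("_").lower()
    if len(slug) > max_length:
        slug = slug[:max_length]
        cut = slug.rfind("_")                      # back up to a whole word
        if cut > 0:
            slug = slug[:cut]
    return slug


transliterated = (
    "reinbou daesi peurejenceu 26hwa toren.teu 150927 "
    "26hwa torrent HD gohwajil"
)
slug = title_to_slug(transliterated)
print(slug)
print(re.search("torenteu", slug) is not None)
```

Under those assumptions the period in "toren.teu" disappears before the regex runs, which is exactly the kind of evasion the transliteration step defeats.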

I was curious as to whether this filter still exists, so I made some test posts on a subreddit I moderate using an alt account:

overview comments submitted

sorted by:

there doesn't seem to be anything here

It’s hard to say? It seems like the string “torenteu” by itself does not get removed, so I assume the other removals are based on various other kinds of spam heuristics?

Something I did find interesting is that UA-12345678- got removed, but UA-49307539- did not! It’s interesting because there used to be a filter for that specific phrase too:

Removed: Failed inspection: Phrase(s) [u'UA-49307539-'] • c4c3u5o8c7n • 0 points • 11 months

다시보기 강아지들 토렌.트 torrent 토렌 DVD 1080p 720p HD Full HD DVD 1080p MKV

강아지들 토렌.트 file

1080p MKV 다시보기 강아지들 토렌.트 토렌.트 토렌 Torrent Comprehensive 720p HD

Coverage aggregated from sources all 토렌.트 파일 (Torrent) :

파일 받기 : 다시보기 강아지들 토렌.트 Torrent

.

Though, this case is a little more curious than just that. Once again, the removal phrase does not appear in the comment, but this time not even after running the text transformations!

The trick here is that the comment contains a link that goes through several redirects and then ends up on some Korean forum. And looking at the source code of said forum:

<script>
  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
  })(window,document,'script','//www.google-analytics.com/analytics.js','ga');

  ga('create', 'UA-49307539-2', 'auto');
  ga('send', 'pageview');

</script>

Aha! So the “inspection” means that Reddit literally opens the URL, follows redirects, and looks for the pattern on the page. In this case, the pattern matched is a Google Analytics ID, so that if the same spam ring were to change their IP and domain, the spam filter would still catch them.
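The matching half of such an inspection could be as simple as the sketch below. It is a guess at the shape of the check, with hypothetical names, and it uses the redacted placeholder from this post rather than a real ID. Fetching the page is left out; for what it's worth, Python's `urllib.request.urlopen` follows HTTP redirects by default, so the fetch step would not need much more than that.

```python
# The redacted placeholder used in this post, not a real tracking ID.
FORBIDDEN_PHRASES = ["UA-49307539-"]


def failed_inspection(page_source, phrases=FORBIDDEN_PHRASES):
    """Return a removal reason if any forbidden phrase is in the page."""
    hits = [phrase for phrase in phrases if phrase in page_source]
    if hits:
        return "Failed inspection: Phrase(s) %r" % hits
    return None


page = "ga('create', 'UA-49307539-2', 'auto');"
print(failed_inspection(page))
```

Python 3 prints the phrase list without the `u` prefix seen in the real removal message, which came from Python 2.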

I wanted to try this on my own account, so I put the string <pre>UA-49307539-2</pre> on a website and posted a link to it on Reddit.

My test account (5 years old!) got banned immediately, and all of its post history got wiped too. RIP /u/popstonia.

For this reason, I changed the real number in this blogpost to UA-49307539-, which is in reality a random number - I would rather not put a piece of text out there that can kill people’s accounts through just posting it.

I tried to recreate the ban with a friend’s account which had a little more history on it, and that one ended up being fine. So I’m guessing this string only killed my test account because it was already a certain level of suspicious for the anti-spam filters.

I don’t actually know for sure whether the filter is still active, or whether my account getting nuked was a coincidence, but I’m choosing not to publicize the specific string used just to be safe.

Alright so this next one has a pretty interesting removal message:

1

Redstoner7 in UnusAnnusArchival

Removed: spam https://www.reddit.com/message/messages/8edha7 • (Unus Annus Archiving) • 0 Comments • self.UnusAnnusArchival • 11 months (edited 11 months)

The post here isn’t all that interesting, but what is is that it got immediately removed. I have no idea why the removal message would link to a specific Reddit PM? Is this the message that was sent to the user? Was it sent to admins? Was it sent to modmail? Is it a DMCA message?

I wanted to get to the bottom of this so I did some OSINTing and tracked down the person who made that post, Aria. I asked her to check the message link, but it turns out that it was not sent to her account. So this got me even more curious.

Now, something we do have is the id of the message - 8edha7. Just like the other ids on Reddit, it is sequential. This meant I could figure out when this linked message was sent based on the messages in my own message history. And this message appears to land in the latter half of May 2017!
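A minimal sketch of that estimate, assuming the IDs are base-36 counters and that messages arrive at a roughly steady rate between two known points. The anchor pairs would come from your own message history; none are included here, and the function names are mine.

```python
def message_id_to_int(message_id):
    """Decode a Reddit-style base-36 ID into an integer."""
    return int(message_id, 36)


def estimate_sent_time(target_id, earlier, later):
    """Linearly interpolate a Unix time between two (id, unix_time) anchors."""
    (id_a, time_a), (id_b, time_b) = earlier, later
    target = message_id_to_int(target_id)
    a = message_id_to_int(id_a)
    b = message_id_to_int(id_b)
    fraction = (target - a) / (b - a)
    return time_a + fraction * (time_b - time_a)


print(message_id_to_int("8edha7"))  # 507872959
```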

I still don’t know what this message is and why it is linked in the removal reason, but it is a rather old message from before the Reddit account or the subreddit were even created.

0

neynime in MyLittleOutOfContext

Removed: Janitor russian girls chat: Submitted by banned user neynime • 0 Comments • example.com • 5 yrs

I’m not sure what’s up with this one. Like obviously it’s just sex spam or whatever, but what’s up with that removal message? Is it talking about Janitor russian girls who chat, or is there a reddit janitor who did the removal? Why is it submitted by banned user? Is this like a banall from before that was a thing? So many questions, and unlike the previous post I can’t even reach out to anyone to ask about it.

Okay, but there’s one more removal I found pretty interesting, and it’s this one here:

Removed: some pages have personal info - 11/15/12 mg • gnbman • 1 points • 8 years

https://encyclopediadramatica.se/thumb/8/8a/Woll_Smoth_original.jpg/180px-Woll_Smoth_original.jpg

For those out of the loop, Encyclopedia Dramatica is a parody wiki site centered around internet culture and making fun of people. It’s pretty much like if 4chan was in charge of Wikipedia. A lot of the pages are pretty mean to their subjects, sometimes - as you might deduce from the removal message above - to the point of digging up and documenting their personal history.

Thus, it seems like mg decided to ban the entire domain and auto-remove any links to it. I believe this is noteworthy as it is the only removal here that is not just spam, but instead a legitimate website that Reddit did not like the content of.

Reddit engineering

So that was all I was able to deduce from what I saw myself, but as it turns out, Reddit has been writing about their anti-spam systems too!

There’s a post from 2023 on /r/RedditEng titled Protecting Reddit Users in Real Time at Scale that talks about internal systems called Rule-Executor-V1 (REV1), REV2, and Snooron.

The timeline is a bit messy, but how I understand it is that REV1 was created in 2016, then Snooron was developed in 2021 to modernize REV1, and two years later everything was migrated to REV2? I wonder if that migration is what led to me seeing the admin removal messages back in 2021.

Both REV1 and REV2 run off of Lua rules such as this:

if body_match("some bad text") then
  action(user)
end

This leads me to believe that REV1 is what we know as spamurai. The timeline seems to match, and we’ve seen spamurai emit strings such as “nil” that you’d expect from Lua.
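For readers who don't speak Lua, here is a rough Python analogue of how such a rule might map onto one of the removal reasons seen earlier. It is purely illustrative: the event fields, the `run_rules` helper, and the pairing of this condition with that reason are all my own assumptions.

```python
def body_match(body, needle):
    """Case-insensitive substring check, standing in for the Lua helper."""
    return needle in body.lower()


def run_rules(event, rules):
    """Return the reasons of every rule whose condition matches the event."""
    return [reason for condition, reason in rules if condition(event)]


# Hypothetical rule table: (condition, removal reason).
RULES = [
    (
        lambda e: e["kind"] == "comment"
        and e["account_age_minutes"] < 30
        and body_match(e["body"], "some bad text"),
        "comment from account under 30 minutes matching spam conditions",
    ),
]

event = {"kind": "comment", "account_age_minutes": 23, "body": "Some bad text"}
print(run_rules(event, RULES))
```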

There have been fairly recent user reports of posts getting removed by the users /u/Safety_Spamurai and /u/bot-bouncer, so the spamurai name is still at least somewhat in use, even for REV2 or snooron.

But we also saw removals such as u’torenteu’ and u’UA-49307539-’, which are clearly Python2.7 unicode strings. The former was way before 2016, so that makes sense, but what about the latter removal that we only saw in like 2020?

Well, REV1 also ran on Python2.7, so I think there are two possible conclusions: either the REV1 code calls out to the URL inspection code written in Python2.7, or the inspection code is entirely separate from REV1/spamurai. I suspect the latter, because all of the spamurai removal messages seem to be prefixed with “spamurai”.
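For readers who never used Python 2.7: the u'…' prefix is what that version prints when a unicode string is formatted with its repr rather than its plain text. The sketch below imitates that in Python 3 (which dropped the prefix from repr output). Both helper names and the "matched" message format are hypothetical, and the imitation only holds for ASCII input.

```python
def py2_style_repr(text: str) -> str:
    """Approximate Python 2.7's repr() of a unicode string (ASCII-only input)."""
    # Python 3's repr() gives 'text'; Python 2.7 printed u'text' for unicode objects.
    return "u" + repr(text)


def removal_message(matched: str) -> str:
    """Build a hypothetical log line that embeds the repr instead of the raw string."""
    return "matched %s" % py2_style_repr(matched)


print(removal_message("torenteu"))
```

The same kind of tell works for the missing values: a Python process formatting an empty field would print `None`, whereas Lua's `tostring(nil)` gives `nil`, which is the string seen in the spamurai messages.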

I also learned that, according to this talk, Snooron runs on Flink Stateful Functions, classifies posted images, runs OCR on said images, and uses Python3 for its workers.

I also found this Australian eSafety PDF which lists Reddit as using, as of 2024, the Hive AI for OCR and image/video classification, but also the Google Vision OCR API.

They explain that Hive’s OCR supports 12 languages, and thus they also need Google’s OCR to support a lot more of them. They also mentioned that they’re working on an internal tool that would support 80 languages.

The text classification itself, though, is done in-house using Snooron. Snooron also has internal image hash-matching functionality. I don’t know whether this is just using existing anti-abuse/anti-terrorism hash databases, or if it’s also Reddit’s own hashes for common spam and such.

Going back in time, I also found this ticket from 2009 where spez[A] confirms that a user called crm114 is a spam filter that can be trained by moderators. CRM114 is an old open-source spam classification software that, among other things, lets you “train” it with data to make its detection more reliable.

This is also why the admintools.spam method in Reddit’s source code has a train_spam keyword - it decides whether the anti-spam filters should be trained off of the performed moderator action. So, approve good posts in your sub if you want fewer false positives?
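As a rough illustration of what “training off of a moderator action” can mean, here is a toy trainable filter. It is a sketch under heavy assumptions: it uses simple token counts with add-one smoothing, which is neither CRM114’s actual algorithm nor Reddit’s code, and `TinySpamFilter` and `moderate` are made-up names. Only the `train_spam` flag name comes from the source.

```python
import math
from collections import Counter


class TinySpamFilter:
    """Toy token-count classifier; NOT CRM114 and not Reddit's implementation."""

    def __init__(self) -> None:
        self.counts = {"spam": Counter(), "ham": Counter()}

    def train(self, text: str, is_spam: bool) -> None:
        """Add the text's tokens to the spam or ham counts."""
        self.counts["spam" if is_spam else "ham"].update(text.lower().split())

    def spam_score(self, text: str) -> float:
        """Sum of per-token log-odds with add-one smoothing; above 0 leans spam."""
        spam, ham = self.counts["spam"], self.counts["ham"]
        spam_total, ham_total = sum(spam.values()), sum(ham.values())
        vocab = len(set(spam) | set(ham)) or 1
        score = 0.0
        for tok in text.lower().split():
            p_spam = (spam[tok] + 1) / (spam_total + vocab)
            p_ham = (ham[tok] + 1) / (ham_total + vocab)
            score += math.log(p_spam / p_ham)
        return score


def moderate(filt: TinySpamFilter, text: str, remove: bool, train_spam: bool = True) -> str:
    """Hypothetical moderator action: remove or approve, optionally training the filter."""
    if train_spam:
        filt.train(text, is_spam=remove)
    return "removed" if remove else "approved"
```

In this sketch, approving a post feeds its tokens into the “ham” side, so later posts that look similar score lower, which is the intuition behind approving good posts to reduce false positives. Passing `train_spam=False` performs the action without touching the filter.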

Why now?

So why release all this information now and not 5 years ago? I believe the information in this post, if released back in 2021, would’ve been catastrophic for Reddit’s spam issues. I don’t care too much about large companies, but covering internet forums in spam is not something I strive to do. In 2026 however, I believe this information is no longer dangerous to publicly share.

First of all, the Perspective API is shutting down by the end of this year. I doubt Reddit is still using this API, and even if they are, they’ll have to migrate off of it soon anyways. Secondly, there’s that elephant in the room. LLMs have changed the game and revolutionized… the spam industry. And thus, I think it’s safe to assume that Reddit has had to overhaul a lot in their anti-spam systems to make it work in the year of 2026.

afterword

hiii! probably not the blogpost you were expecting, but hopefully a fun one nonetheless! ^^

as usual, i did the whole “handwritten html/css, no images, no external resources, no javascript” thing for this post too (46kB gzipped btw!), but while recreating the old reddit ui i was pleasantly surprised by just how nice its code is! it feels like it was written by someone who actually loves html and css and wants to give me a warm hug. i was amused by the css actually using the orangered color by name, a rare sight these days!

anyways, some other updates - many of you are probably awaiting my x86css blog post, and it is hopefully coming out at some point, but in the meanwhile i gave a talk about the project at css day (which was a really fun event!!). unfortunately, the recordings for the talk will initially be behind a paywall, but they should become public eventually. i’m also trying to get the same talk accepted at 40c3, in which case the recordings will be available immediately. besides that, i’ll likely be doing a few other talks this year too - check my slides page for up to date info.

other than that i’m really hoping to host x3ctf again this year. we’re still not sure when it is happening, but i think we are all aiming to make it happen before the end of the year.

thank you so much for reading <3

If you’d like to reach out, feel free to message me on my socials or at lyra.horse [at] gmail.com.
