How the Lobsters front page works

Original link: https://atharvaraykar.com/lobsters/

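## Lobsters front page algorithm: summary

Lobsters, a computing-focused community, ranks front page stories by a "hotness" score: **hotness = -1 x (base + order x sign + age)**. The lower (more negative) the hotness, the higher the rank.

**Base** is a small initial adjustment derived from the story's tags (some tags, such as "rant", are penalised), with a slight boost for self-submitted links. **Order** reflects engagement and is computed logarithmically from the story score (upvotes) and comment points (each worth half a story upvote). Controversial stories with many comments are not boosted disproportionately. **Sign** penalises stories with a negative score (flagged stories), cancelling out comment upvotes. **Age** increases linearly over time, pushing older stories down.

The algorithm favours new, engaging content while penalising negativity. The author considers the algorithm solid but not the deciding factor in the quality of Lobsters, which owes more to moderation, focus and community culture. Still, the author found that individual participation does matter, especially early votes, and encourages users to contribute actively to shape the community's climate.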

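A Hacker News discussion focuses on the ranking algorithms used by online communities. The original post links to an article detailing how Lobsters, a link aggregation site, ranks its front page. The conversation also covers Bear Blog, a blogging platform with a similar ranking system that weighs upvotes (U) against post age (S, counted in seconds since January 1, 2020), adjusted by a "buoyancy" modifier (B): Score = log10(U) + (S / (B * 86,400)). The thread links to an existing Hacker News post explaining how its front page works. Users discuss how hard it is to join Lobsters, with some suggesting contributions to open source projects as a possible route to an invite. In short, the thread explores the mechanics behind surfacing popular and timely content online.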

Original text

Lobsters is a computing-focused community centered around link aggregation and discussion.

The code is open source, so I had a look at how the front page algorithm works.

This is it:

$$\textbf{hotness} = -1 \times (\text{base} + \text{order} \times \text{sign} + \text{age})$$

$$\text{hotness} \downarrow \implies \text{rank} \uparrow$$

The page is sorted in ascending order by \( \textbf{hotness} \). The more negative the value of \( \textbf{hotness} \), the higher the story ranks.
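To make the sort direction concrete, here is a minimal Python sketch. The story titles and hotness values are made up for illustration; only the ascending sort reflects the description above.

```python
# Hypothetical stories with precomputed hotness values (illustrative numbers only).
stories = [
    {"title": "A", "hotness": -20100.5},
    {"title": "B", "hotness": -20101.2},
    {"title": "C", "hotness": -20099.9},
]

# Ascending sort: the most negative hotness lands at the top of the page.
front_page = sorted(stories, key=lambda story: story["hotness"])
titles = [story["title"] for story in front_page]
print(titles)  # ['B', 'A', 'C']
```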

You can skip straight to the interactive front page to help get a feel for the front page dynamics.

Base

The \( \textbf{base} \) is added to the order term to incentivise certain types of posts and to influence the initial ranking. It is the sum of the hotness modifiers of all the tags on the story (each a value between \( -10 \) and \( +10 \)), plus a small boost if the link is self-authored.

$$\textbf{base} = \sum_{t \in \text{tags}} \text{hotness\_mod}_t + \begin{cases} 0.25 & \text{if self-authored link} \\ 0 & \text{otherwise} \end{cases}$$

Some tags (like culture or rant) have negative "hotness modifiers", which penalise the initial rank of stories carrying them. Authors submitting their own content get a tiny boost, which is mildly surprising given the otherwise strict self-promo rules. The \( \textbf{base} \) has a modest effect on the hotness compared to \( \textbf{order} \) and \( \textbf{age} \).
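As a rough sketch, the base term could be written like this in Python. The function name, the parameter names and the example modifier values are assumptions made for illustration, not the site's actual code or configuration.

```python
def base(tag_hotness_mods, is_self_authored_link):
    """Sum of the tag modifiers plus a 0.25 boost for self-authored links."""
    boost = 0.25 if is_self_authored_link else 0.0
    return sum(tag_hotness_mods) + boost


# Hypothetical modifier values; the real per-tag numbers are set on the site.
rant_base = base([-0.25], is_self_authored_link=False)
own_post_base = base([0.0], is_self_authored_link=True)
print(rant_base, own_post_base)
```

With these assumed numbers, the penalised story starts with a negative base and the self-authored one starts slightly ahead.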

Order

The value of \( \textbf{order} \) is derived from the engagement that a story gets.

$$\textbf{order} = \log_{10}\left(\max\left(|\text{score} + 1| + \text{cpoints}, 1\right)\right)$$

The progression of the order term is logarithmic—this means going from 0 to 100 votes increases the rank far more than going from 1000 to 1100 votes.
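A quick numerical check of that claim, as a Python sketch of the order formula above (the function name and the default `cpoints` of zero are choices made for this example):

```python
import math


def order(score, cpoints=0.0):
    """Order term as given in the formula above."""
    return math.log10(max(abs(score + 1) + cpoints, 1))


gain_early = order(100) - order(0)     # first hundred votes
gain_late = order(1100) - order(1000)  # a hundred more votes on a big story
print(round(gain_early, 3), round(gain_late, 3))
```

The first hundred votes add about 2 units of order, while the hundred votes from 1000 to 1100 add only about 0.04.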

The \( \textbf{cpoints} \) term is added to the story score to account for upvotes on comments not written by the submitter (a comment upvote is worth half a story upvote). If the \( \textbf{base} \) is negative (as is the case for a freshly submitted rant), then this term is zeroed, making the comments effectively contribute nothing to the rank.

$$ \text{comment\_points} = \begin{cases} 0 & \text{if } \text{base} < 0 \\ \frac{1}{2}\sum(\text{comment\_scores} + 1) & \text{otherwise} \end{cases} $$

$$ \textbf{cpoints} = \min(\text{comment\_points}, \text{story\_score}) $$

The \( \textbf{cpoints} \) can never exceed the story score. Therefore, stories that have a low score but lots of highly upvoted comments—perhaps a signature of controversy-generating low-quality submissions—do not get boosted by comment upvotes.
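The cap can be sketched in Python as follows. This follows the simplified formulas above, not the site's source; the function name and the example scores are hypothetical, and `comment_scores` is assumed to hold the scores of comments by users other than the submitter.

```python
def cpoints(base, story_score, comment_scores):
    """Comment points: zeroed for a negative base, capped at the story score."""
    if base < 0:
        return 0.0
    comment_points = 0.5 * sum(score + 1 for score in comment_scores)
    return min(comment_points, story_score)


# Low-scoring story with heavily upvoted comments: capped at the story score.
capped = cpoints(base=0.0, story_score=3, comment_scores=[20, 15, 9])
# Well-received story with a few comments: the comment points count in full.
uncapped = cpoints(base=0.0, story_score=50, comment_scores=[4, 2])
# Negative base (for example a penalised tag): comments contribute nothing.
zeroed = cpoints(base=-0.25, story_score=50, comment_scores=[4, 2])
print(capped, uncapped, zeroed)
```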

There are some details around merged stories that I am leaving out for the sake of simplifying this explanation. But it roughly does what you'd expect.

Sign

If a story gets flagged enough to make the story score negative (a flag is effectively a downvote), the \( \textbf{sign} \) becomes negative.

$$ \textbf{sign} = \begin{cases} -1 & \text{if score} < 0 \\ +1 & \text{if score} > 0 \\ 0 & \text{otherwise} \end{cases}$$

The \( \textbf{sign} \) negates the effect of comment upvotes when the story scores zero, and makes them contribute negatively to the rank when the story scores below zero.
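A small Python sketch of how the sign flips the order term. The helper names are made up for this example, and the scores are chosen so that the logarithm comes out to a round number.

```python
import math


def sign(score):
    """-1 for a negative score, +1 for a positive one, 0 otherwise."""
    if score < 0:
        return -1
    if score > 0:
        return 1
    return 0


def order_term(score, cpoints=0.0):
    """The order * sign part of the hotness formula."""
    order = math.log10(max(abs(score + 1) + cpoints, 1))
    return order * sign(score)


positive = order_term(9)    # log10(10) * +1
neutral = order_term(0)     # log10(1) * 0
negative = order_term(-11)  # log10(10) * -1
print(positive, neutral, negative)
```

Because hotness is the negated sum, a negative order term raises the hotness value and pushes the story down the page.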

Age

The value of \( \textbf{age} \) is fixed at the time of submission. It is the Unix timestamp at which the story was created, divided by a configurable \( \textbf{hotness\_window} \) time. The \( \textbf{hotness\_window} \) is 22 hours by default, which means that a story submitted 22 hours after another has an \( \textbf{age} \) that is \( 1 \) unit larger.

$$\textbf{age} = \frac{\text{created\_at\_timestamp}}{\text{hotness\_window}}$$

Because timestamps keep increasing, each newer story starts with a larger \( \textbf{age} \), pushing older stories down the rankings. The main tension in this algorithm is the fact that the \( \textbf{order} \) (dictated by score) grows logarithmically, so upvotes need to increase exponentially over time to counter the effect of \( \textbf{age} \) in order to stay on the front page. Father Time comes for us all.
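To put numbers on that tension, here is a Python sketch. It assumes timestamps in seconds and the default 22-hour window; the function names and the example timestamp are invented for illustration.

```python
HOTNESS_WINDOW = 22 * 60 * 60  # assumed: default window, expressed in seconds


def age(created_at_timestamp):
    """Age term: creation timestamp divided by the hotness window."""
    return created_at_timestamp / HOTNESS_WINDOW


def engagement_multiplier(windows_older):
    """Factor by which |score + 1| + cpoints must grow to offset the age gap.

    Assumes equal base and a positive sign for both stories: since order is a
    base-10 logarithm, closing a gap of n units takes 10**n times the engagement.
    """
    return 10 ** windows_older


older = 1_700_000_000            # arbitrary example timestamp
newer = older + HOTNESS_WINDOW   # submitted one window later
age_gap = age(newer) - age(older)
print(round(age_gap, 6), engagement_multiplier(1), engagement_multiplier(2))
```

Under these assumptions, a story that is one window older needs about ten times the engagement of a fresh story to tie with it, and about a hundred times after two windows.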

In a nutshell

$$\textbf{hotness} = -1 \times (\text{base} + \text{order} \times \text{sign} + \text{age})$$

$$\text{hotness} \downarrow \implies \text{rank} \uparrow$$

Here, \( \textbf{base} \) is initialised from the story's tags and from whether the submitter authored the link. The \( \textbf{age} \) grows linearly with submission time, so every new submission starts ahead of older ones, while the \( \textbf{order} \) for a story, as determined by votes, increases logarithmically.
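Putting the pieces together, the whole calculation fits in a few lines of Python. This is a sketch of the simplified formulas in this post, not the site's source: it ignores merged stories, assumes timestamps in seconds, and all parameter names and example values are hypothetical.

```python
import math

HOTNESS_WINDOW = 22 * 60 * 60  # assumed: default window, expressed in seconds


def hotness(score, comment_scores, tag_mods, self_authored, created_at):
    """Simplified hotness; lower (more negative) values rank higher."""
    base = sum(tag_mods) + (0.25 if self_authored else 0.0)
    if base < 0:
        comment_points = 0.0
    else:
        comment_points = 0.5 * sum(s + 1 for s in comment_scores)
    cpoints = min(comment_points, score)
    order = math.log10(max(abs(score + 1) + cpoints, 1))
    sign = (score > 0) - (score < 0)  # evaluates to -1, 0 or +1
    age = created_at / HOTNESS_WINDOW
    return -1 * (base + order * sign + age)


now = 1_700_000_000  # arbitrary example timestamp
fresh = hotness(1, [], [0.0], False, now)
popular_older = hotness(99, [], [0.0], False, now - HOTNESS_WINDOW)
quiet_older = hotness(1, [], [0.0], False, now - HOTNESS_WINDOW)
print(popular_older < fresh < quiet_older)
```

In this example, a 99-point story that is one window old still outranks a fresh 1-point story, while an equally old 1-point story has already fallen behind it.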

Explore

Heads up—enable JavaScript to make this part work. This was mostly vibecoded, with me verifying that the results match the algorithm.

There's a gisthost link if you want to play with it as a standalone tool.

Thoughts

The algorithm is solid. It allows new stories to get their time in the sun, and correctly penalises low-quality content that generates a lot of heated discussion. If there is heated discussion, it's usually over highly-upvoted posts. Over time, age always dominates upvotes, so no story can really stick around that long on the front page. There are gates that stop overly flagged stories from making any progress up the ranks.

That said, I don't think the algorithm really makes the site what it is. The character of the site is more the result of its opinionated moderation, narrow computing focus and the gradual acculturation through the invite system. Compared to many other forums, there's less junk and also little outright hostility, racism, sexism or other isms. The community has surfaced lots of niche topics and writers, which I enjoy.

Yet, my experience on the website has been far from ideal. For me, this is rooted in a disconnect of values with the group most engaged on the site, whose votes and discussions drive the climate. I do not appreciate the cynicism worn with pride, the unproductive gotchas, the long polemics that reveal that the commenter hasn't read beyond the title, the throwaway venting and the debates where it is clear that neither side wants to actually refine their world model. It has driven me away from engaging more on the site.

Studying the algorithm has shown me that disengaging would make my problem worse—a single user's participation can be worth a lot. Early upvotes really count, and can easily boost a post to the front page. If you are lurking on the site and are unsatisfied, consider exercising your votes and submissions more. Post the more nuanced, friendly and curious comments that you'd like to see more of. It really does matter. I will likely change how I participate on the site as a result of this.
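One way to see how much an early vote is worth is to convert the gain in the order term into hours of age. The Python sketch below assumes the default 22-hour window, ignores comment points, and uses helper names invented for this example.

```python
import math

HOTNESS_WINDOW_HOURS = 22  # default window according to the description above


def order(score):
    """Order term with comment points left out for simplicity."""
    return math.log10(max(abs(score + 1), 1))


def hours_of_age_offset(score_before, score_after):
    """Hours of age that the gain in order makes up for."""
    return (order(score_after) - order(score_before)) * HOTNESS_WINDOW_HOURS


early_vote = hours_of_age_offset(1, 2)     # second upvote on a new story
late_vote = hours_of_age_offset(100, 101)  # one more upvote on a popular story
print(round(early_vote, 2), round(late_vote, 2))
```

Under these assumptions, the second upvote on a new story buys it close to four hours against newer submissions, while one more vote on a 100-point story buys only a few minutes.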

After all, there aren't all that many relatively quiet and straightforwardly serious public forums to contrast the twitters and HNs of the world that can surface niche computing curiosities.
