(comments)

Original link: https://news.ycombinator.com/item?id=41406453

The small-web page at kagi.com aims to collect independent, distinctive websites via RSS feeds, but a design flaw makes scrolling on the page constrained and confusing. To deter attacks against non-JavaScript sites, commenters suggest adding a CAPTCHA and security measures such as rate limiting or a firewall. One related feed saves only basic information from each update and is refreshed manually, aiming for weekly, so it lags behind the latest content. There is a plan for ArchiveTeam to use it to discover blog posts to archive, though its content may be too diverse for most feed readers. Several commenters suggest that modern search engines deliberately hide smaller sites in favor of commercial content, and encourage users to discover these niche sites themselves, comparing the practice to old-style "surfing" or "webrings". Some users prefer small, ad-free sites and criticize Google's biased results for prioritizing commercial, ad-laden pages. The thread also mentions a collaborative directory of Geminispace, which requires a dedicated Gemini reader, and expresses appreciation for anonymous payment options. Overall, the discussion stresses the importance of preserving independent, personal content outside the influence of large companies and social-media platforms.

Original text


Our contribution to the small web: https://kagi.com/smallweb

After opening this web page, I pressed down arrow a few times to scroll the page. At first, I didn't understand why it only scrolled a few pixels.

It looks like there's a scrollable area within a scrollable area. The outermost scrollable area only scrolls a few pixels.

This is a badly designed web page.



I know the friction to add websites is the point, but might I recommend a way to add our own websites without having to promote two others. My rinkydink website qualifies, but all the other small websites I know are all already on the list.



Careful: if you show that noscript/basic (X)HTML does a good enough job, you will get attacked by big-tech shadow-paid hackers (or idiotic ones) to force you to use their grotesquely massive and complex JavaScripted web engines.

...



Well, maybe not for a static document, but as soon as you have some basic HTML forms doing a good enough job, I would not be surprised if it got attacked by big-tech shadow-paid hackers (or idiotic ones) to push forward their massive and complex JavaScripted web engines, which they control.

Look at who benefits from the crime in the end.



What? This can easily be averted by adding a captcha to the form (validated server-side, so no JS is needed) and/or some sort of rate limiter or firewall, e.g. blocking any IP address that sends too many requests too quickly.
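A minimal sketch of the sort of sliding-window limiter described here, assuming hypothetical limits of a few requests per minute per IP (the class and parameter names are illustrative, not from any real deployment):

```python
import time
from collections import defaultdict, deque

class RateLimiter:
    """Allow at most `max_requests` per `window` seconds per IP."""

    def __init__(self, max_requests=5, window=60.0):
        self.max_requests = max_requests
        self.window = window
        self.hits = defaultdict(deque)  # ip -> timestamps of recent requests

    def allow(self, ip, now=None):
        now = time.monotonic() if now is None else now
        q = self.hits[ip]
        # Drop timestamps that have fallen out of the window.
        while q and now - q[0] > self.window:
            q.popleft()
        if len(q) >= self.max_requests:
            return False  # too many requests too quickly: block
        q.append(now)
        return True
```

A form handler would simply reject (or tarpit) any POST for which `allow(ip)` returns False, with no JavaScript involved anywhere.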



Yes, semi-regularly; I did a fresh update since your message.

To reduce storage size, only the title/link/metadata of the latest post from each feed is saved. I run the crawler manually, aiming for weekly, but sometimes less frequently, so this won't catch every post and it lags behind quite a bit. I'm hitting some hosting sites faster than I'd like, especially ones that support custom domain names, so I'm planning on fixing up the rate-limiting strategy before I put it in a daily cron job.
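The per-host politeness fix mentioned here could look roughly like this; `fetch`, `min_delay`, and the injectable clock/sleep hooks are all hypothetical stand-ins, not the author's actual crawler:

```python
import time
from urllib.parse import urlparse

def crawl(feed_urls, fetch, min_delay=10.0, sleep=time.sleep, clock=time.monotonic):
    """Fetch each feed, spacing out requests that share a host.

    Feeds on the same host (e.g. a hosting site serving many
    custom-domain blogs behind one origin) are separated by at
    least `min_delay` seconds.
    """
    last_hit = {}  # host -> time of the last request to it
    results = {}
    for url in feed_urls:
        host = urlparse(url).netloc
        elapsed = clock() - last_hit.get(host, float("-inf"))
        if elapsed < min_delay:
            sleep(min_delay - elapsed)
        last_hit[host] = clock()
        results[url] = fetch(url)  # keep only title/link/metadata of latest post
    return results
```

Sorting the URL list by host first would also let unrelated hosts fill the gaps instead of sleeping through them.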

There's a plan for ArchiveTeam to use the RSS feed as another way of discovering blog posts to archive. I don't think it's generally useful to point your feed reader at it as there's quite a diverse collection of content.



I feel like the internet needs a giant directory of indie websites. So you can actually surf around and find them.

The big modern search engines almost have to be intentionally hiding these websites because they're nearly impossible to find without using an alternative engine like wiby.me or search.marginalia.nu.



I was just going to post a comment similar to this. We've swung towards walled gardens of piles of content instead of graphs of individually curated links.

Exactly that "surfing" or "webring" or "StumbleUpon" style of actually browsing across a larger web of content, rather than searching or push-promotion within one big pile of it.



The walled gardens are better for many of the internet's main uses.

If I need to find out what vodka to buy I Google with site:reddit.com and pick the post that's obviously written by alcoholics. The small web can't touch that.



I don’t think Google hides small sites as much as people are really good at SEO for Google specifically.

Like my blog has literally 0 SEO and you’ll never find it, but a friend of mine has a blog where he does not post very often, but spends a lot of effort on SEO and it’s very easy to run into his blog.

The SEO meta destroyed small blogs.



It's impossible to say whether this is something they do deliberately, but it's worth noting that Google also has an economic incentive to mostly show commercial/ad-ridden results: leading users to blogs with no AdSense on them makes Google less money. So it would at least be in their interest to let the search results look the way they do.

To fully understand Google, you need to look at them not as a service that brings websites to people, but as one that directs people to websites.



Contains the quote above and “The goals of the advertising business model do not always correspond to providing quality search to users.”

So what we observe in the deterioration of Google search was predicted by its creators, who made the deliberate decision to let this happen by accepting advertising.



Google went public in 2004, after that I don't think any amount of founder idealism could have saved it from shareholder pressure. If anything it's remarkable they held out as long as they did.



The unfortunate thing is that no one inside Google is making any single decision that clearly makes things worse. It's simply the structure of their business that is fundamentally wrong, and their founders correctly identified the problem right at the start.



ooh.directory is fantastic. I particularly liked the stance that only a few sites a week are added, which gives you time to "digest" them. Sadly, no new sites have been added for nearly three months. I assume this is just an instance of "life happens" (it is a single-person venture, after all), but if there were a dozen similar attempts at hand-picking and cataloguing the "good web", it would not hurt.



A lot of the good stuff got sucked into walled gardens. People's personal home pages or tacky MySpace pages were definitely more fun than the current semiprofessional content scroll. Forums like this very one were mostly subsumed into Reddit. Never mind the death of the BBS (not actually the internet, I realise).



There was a push during Covid towards Gemini pages. I did that for a while, but the lack of real formatting and not being able to cross-link articles became a show-stopper.

You can get to some of them here:

Collaborative Directory of Geminispace: gemini://cdg.thegonz.net/

But you need a Gemini reader.
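For context on why a dedicated reader is needed: Gemini is a deliberately tiny protocol, not HTTP. The client opens a TLS connection to port 1965, sends the full URL followed by CRLF as the entire request, and receives a one-line "<status> <meta>" header before the body. A rough sketch, with `fetch` as an illustrative client rather than any real library:

```python
import socket
import ssl
from urllib.parse import urlparse

def gemini_request_line(url):
    # A Gemini request is just the absolute URL terminated by CRLF.
    return (url + "\r\n").encode("utf-8")

def parse_gemini_header(line):
    # Response header: a two-digit status, a space, then the meta
    # field (a MIME type on success, a message or redirect URL otherwise).
    status, _, meta = line.rstrip("\r\n").partition(" ")
    return int(status), meta

def fetch(url, timeout=10):
    # Gemini servers commonly use self-signed certificates, so
    # verification is relaxed here (real clients do trust-on-first-use).
    host = urlparse(url).hostname
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    with socket.create_connection((host, 1965), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            tls.sendall(gemini_request_line(url))
            raw = tls.makefile("rb").read().decode("utf-8", "replace")
    header, _, body = raw.partition("\r\n")
    return parse_gemini_header(header), body
```

The body on success is usually "gemtext", a line-oriented format even simpler than Markdown, which is where the formatting limitations mentioned above come from.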



Seems like a small web deserves a small client. Why use a "big web" client to read the small web? "Big web" clients are funded by advertising or advertising companies.

Bias disclosure: I have used a text-only client for the last 30 years.



Wonderful collection of how-to’s to run your own server. Thanks for sharing.

Might I suggest (in the interest of privacy) that you give donors the option to use a Silent Payment address instead of a naked BTC address? I noticed you have a Monero address as well, so I assume you care about privacy.



i've been publishing things as html2 pages, but not interconnected in any way. so each page (or sometimes group of pages) will be dedicated to an exploration of a single subject. i then send those pages to people who i think might be interested in them. that's all; they otherwise don't see the greater internet. of course people are free to add them to link aggregators, etc., but i don't police this practice. i simply don't care for my output to be consumed by the general public, or by llms, or by corporate media, or by whomever is not my friend or in my immediate circle of friends



Thank you for this. It has inspired me to delete my Reddit account and create an HN account. This gives me hope that the web can survive the social media era.



mobile devices, app-ification and social media are what really started to kill the small web, kind of ironically.

and if you're a front-end developer, it was apple launching the meta viewport tag in 2007 that killed the simple front end.
