Amazonbot is finally respecting robots.txt

Original link: https://xeiaso.net/notes/2026/amazonbot-respecting-robots-txt/

Hacker News: Amazonbot is finally respecting robots.txt (xeiaso.net)
34 points by xena 1 hour ago | 2 comments

jacobn 2 minutes ago
I just complained to them the other day! They were scraping our weather website to no end, very much including the disallowed path prefixes.

Did end up just adding them to our WAF blocklist, which is weirdly ironic - hosting on their infra & using their services to block their AI scraper...

bstsb 4 minutes ago
> Get Outlook for Mac

this bit made me laugh. was the email drafted in Outlook? was it sent to some sort of forwarding mailbox, or did they just BCC every customer in?


Original text

You are seeing this because the administrator of this website has set up Anubis to protect the server against the scourge of AI companies aggressively scraping websites. This can and does cause downtime for the websites, which makes their resources inaccessible for everyone.

Anubis is a compromise. Anubis uses a Proof-of-Work scheme in the vein of Hashcash, a proposed proof-of-work scheme for reducing email spam. The idea is that for an individual visitor the additional load is negligible, but at mass-scraper scale it adds up and makes scraping much more expensive.
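
To make the Hashcash-style idea concrete, here is a minimal sketch in Python. It is not Anubis's actual challenge format, which this page does not describe; it assumes SHA-256 as the hash, a difficulty expressed as a number of leading zero hex digits, and hypothetical function names `solve` and `verify`. The client searches for a nonce, and the server checks the result with a single hash.

```python
import hashlib
from itertools import count


def _digest(challenge: str, nonce: int) -> str:
    # Assumption: the proof is the SHA-256 hex digest of challenge + nonce.
    return hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()


def solve(challenge: str, difficulty: int) -> int:
    """Client side: brute-force the smallest nonce whose digest
    starts with `difficulty` zero hex digits."""
    prefix = "0" * difficulty
    for nonce in count():
        if _digest(challenge, nonce).startswith(prefix):
            return nonce


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: one hash is enough to check the claimed nonce."""
    return _digest(challenge, nonce).startswith("0" * difficulty)


if __name__ == "__main__":
    challenge = "example-challenge"  # hypothetical per-visitor value
    nonce = solve(challenge, 3)
    print(nonce, verify(challenge, nonce, 3))
```

Each extra zero digit multiplies the expected search cost by 16, while verification stays at one hash. That asymmetry is what keeps the cost small for a single visitor and large for a scraper requesting many pages.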

Ultimately, this is a placeholder solution so that more time can be spent on fingerprinting and identifying headless browsers (e.g., via how they render fonts) so that the proof-of-work challenge page doesn't need to be presented to users who are much more likely to be legitimate.

Please note that Anubis requires the use of modern JavaScript features that plugins like JShelter will disable. Please disable JShelter or other such plugins for this domain.
