(Comments)

Original link: https://news.ycombinator.com/item?id=43398837

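A Hacker News thread discusses an article about AI crawlers failing to respect website etiquette. Commenters voice concern that AI companies, especially well-funded ones, put data acquisition ahead of ethical considerations such as copyright and site stability. One commenter notes that the crawlers stopped once their behavior drew public attention, which suggests the operators knowingly ignore the rules and act only when consequences become likely. Another hopes data poisoning will be adopted more widely as a defense against this aggressive scraping. Overall, the view is that AI crawlers deliberately ignore established norms and deal with consequences only when forced to, which makes countermeasures necessary.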

  • Original text
    AI crawlers haven't learned to play nice with websites (theregister.com)
    21 points by belter 1 hour ago | 3 comments

    The crawlers' owners don't even care about copyright or its deep-pocketed holders. They will certainly not care about playing nice with random websites, or with any conventions that may exist in any domain.


    Is there any indication that the owners of those crawlers intend for them to play nice?

    I think this from the article is telling:

    > The Register asked Schubert about this in early January. "Funnily enough, a few days after the post went viral, all crawling stopped," he responded at the time. "Not just on the Diaspora wiki, but on my entire infrastructure. I'm not entirely sure why, but here we are."

    So the owners of those crawlers certainly have the ability to stop the traffic; they're just choosing not to... until publicity makes their actions risky.



    That's intentional.

    They take the data and deal with the consequences later, because there have been none so far. I hope data poisoning gains more traction as a possible countermeasure.
