(comments)

Original link: https://news.ycombinator.com/item?id=43392875

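A small team of three engineers built Jitty, a search engine for photos of homes for sale in the UK. Dissatisfied with traditional property portals, they created a crawler that collects listings from estate agents' websites. Jitty uses large language models and other models to understand property details such as floor type, location, and area. Its key feature is image-based search: every photo is embedded into a vector space, so users can find homes by specific visual elements. Example searches include "Beautiful church conversions up to £1m" and "Bathtubs with an epic view". One user asked whether the crawler respects robots.txt and implements request delays. Another praised the work and asked about the tech stack and monetization plans.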


  • Original text
    Show HN: We made a photo search engine for homes for sale
    9 points by travisleestreet 2 hours ago | 2 comments
    We're a small team of 3 engineers, and wanted to make a better way to look for properties online. Traditional portals (esp. outside of the US) are just a bit rubbish for doing anything other than basic searching.

    So we've:

    - Created a crawler that reads through estate agents' websites to find homes for sale

    - Parsed those listings through a series of LLMs and other models to understand each home in depth (e.g. floor type, location, total sqft)

    - Run every photo through an embeddings vector space so that people can search for whatever they want.
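
The photo search step above can be sketched roughly as a nearest-neighbour lookup over embedding vectors. This is only an illustration, not Jitty's actual implementation: the post does not say which embedding model or vector store is used, so the 3-dimensional vectors, the `PHOTO_INDEX` dictionary, and the function names below are all hypothetical stand-ins. In a real system the query text and the photos would be embedded by the same multimodal model.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def search_photos(query_vec, photo_index, top_k=3):
    """Return the top_k (photo_id, score) pairs, most similar to the query first."""
    scored = [
        (cosine_similarity(query_vec, vec), photo_id)
        for photo_id, vec in photo_index.items()
    ]
    scored.sort(reverse=True)
    return [(photo_id, score) for score, photo_id in scored[:top_k]]

# Hypothetical toy index: real embeddings have hundreds of dimensions.
PHOTO_INDEX = {
    "listing-1/bathroom.jpg": [0.9, 0.1, 0.0],
    "listing-2/library.jpg": [0.1, 0.9, 0.1],
    "listing-3/garden.jpg": [0.0, 0.2, 0.9],
}

# Hypothetical embedding of a query such as "bathtubs with an epic view".
query = [0.8, 0.2, 0.1]
results = search_photos(query, PHOTO_INDEX, top_k=2)
for photo_id, score in results:
    print(f"{photo_id}: {score:.3f}")
```

A brute-force scan like this is fine for a toy index; a catalogue covering every photo of every listing would more plausibly use an approximate nearest-neighbour index.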

    Check it out: https://jitty.com

    Currently only for home sales (not rentals), and only in the UK.

    Some examples:

    - "Beautiful church conversions up to £1m" https://jitty.com/for-sale/price-up-to-1000000-gbp/detached-...

    - "Floor to ceiling libraries with a ladder" https://jitty.com/for-sale/look-for-floor_to_ceiling_librari...

    - "Home in london, under £1m, with big beautiful windows" https://jitty.com/for-sale/london/price-up-to-1000000-gbp/de...

    - "Bathtubs with an epic view" https://jitty.com/for-sale/look-for-bathtubs_with_an_epic_vi...

    Does your crawler respect robots.txt? Does it request pages with nice delays so that it doesn't bring down servers? Related: https://news.ycombinator.com/item?id=43422413
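
For readers wondering what the two checks in this question look like in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. Nothing in the thread says how Jitty's crawler actually behaves; the user agent string, the sample robots.txt, and the `example.com` URLs are all assumptions made for illustration.

```python
import time
import urllib.robotparser

USER_AGENT = "example-crawler"  # hypothetical user agent name

# Hypothetical robots.txt; a real crawler would download it from the site.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 2
Disallow: /private/
"""

def build_parser(robots_text):
    """Parse robots.txt text without making a network request."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser

def plan_fetches(parser, urls, default_delay=1.0):
    """Keep only URLs robots.txt allows, paired with the delay to wait after each."""
    delay = parser.crawl_delay(USER_AGENT)
    if delay is None:
        delay = default_delay  # fall back when the site sets no Crawl-delay
    return [(url, float(delay)) for url in urls if parser.can_fetch(USER_AGENT, url)]

def crawl(parser, urls, fetch, sleep=time.sleep):
    """Fetch each allowed URL in turn, pausing between requests."""
    for url, delay in plan_fetches(parser, urls):
        fetch(url)
        sleep(delay)

parser = build_parser(ROBOTS_TXT)
urls = [
    "https://example.com/listings/1",
    "https://example.com/private/admin",
]
print(plan_fetches(parser, urls))
```

The `fetch` and `sleep` arguments are passed in so the loop can be exercised without touching the network; a real crawler would also want per-host rate limiting and backoff on errors.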


    Nice work! Curious about your tech-stack, if you don't mind sharing.

    Also what are your plans for monetizing, if any?






