(comments)

Original link: https://news.ycombinator.com/item?id=43392875

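A small team of three engineers built Jitty, a photo search engine for homes for sale in the UK. Dissatisfied with traditional property portals, they built a crawler that collects listings from estate agents' websites. Jitty uses large language models and other models to understand property details such as floor type, location, and total area. Its key feature is image-based search: every photo is embedded into a vector space, so users can find homes by specific visual elements. Example searches include "church conversions up to £1m" and "bathtubs with an epic view". One commenter asked whether the crawler respects robots.txt and spaces out its requests. Another praised the work and asked about the tech stack and monetization plans.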

Original text

Show HN: We made a photo search engine for homes for sale
9 points by travisleestreet 2 hours ago | 2 comments

We're a small team of 3 engineers, and wanted to make a better way to look for properties online. Traditional portals (esp. outside of US) are just a bit rubbish for doing anything other than basic searching. So we've:

- Created a crawler that reads through estate agents' websites to find homes for sale
- Parsed that through a series of LLMs and other models to understand each home in depth (e.g. floor type, location, total sqft)
- Parsed every photo through an embeddings vector space so that people can search for whatever they want.

Check it out: https://jitty.com

Currently only for home sales (not rentals), and only in the UK.

Some examples:

- "Beautiful church conversions up to £1m" https://jitty.com/for-sale/price-up-to-1000000-gbp/detached-...
- "Floor to ceiling libraries with a ladder" https://jitty.com/for-sale/look-for-floor_to_ceiling_librari...
- "Home in london, under £1m, with big beautiful windows" https://jitty.com/for-sale/london/price-up-to-1000000-gbp/de...
- "Bathtubs with an epic view" https://jitty.com/for-sale/look-for-bathtubs_with_an_epic_vi...
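The post does not say how the photo search is implemented. As a rough illustration of the general idea of searching photos in an embedding vector space, here is a minimal sketch that ranks photos by cosine similarity to a query vector. Everything in it is an assumption for illustration: the three-dimensional vectors are made up, the photo IDs are hypothetical, and in a real system both the photos and the text query would be embedded by a shared text-image model that is not shown here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query_vec, photo_index, top_k=2):
    """Return the IDs of the top_k photos most similar to the query."""
    scored = [
        (cosine_similarity(query_vec, vec), photo_id)
        for photo_id, vec in photo_index.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [photo_id for _, photo_id in scored[:top_k]]


# Hypothetical, hand-written embeddings standing in for model output.
PHOTO_INDEX = {
    "listing-1/library.jpg": [0.9, 0.1, 0.0],
    "listing-2/bathtub.jpg": [0.1, 0.9, 0.2],
    "listing-3/garden.jpg": [0.0, 0.2, 0.9],
}

# Pretend this vector is the embedding of a "library" style text query.
QUERY_VEC = [0.8, 0.2, 0.0]

if __name__ == "__main__":
    print(search(QUERY_VEC, PHOTO_INDEX))
```

A linear scan like this is only reasonable for small collections; larger indexes typically use an approximate nearest-neighbour structure instead.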

gardaani 0 minutes ago

Does your crawler respect robots.txt? Does it request pages with nice delays so that it doesn't bring down servers? Related: https://news.ycombinator.com/item?id=43422413
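The thread does not include an answer, so how Jitty's crawler behaves is unknown. For readers wondering what the two checks in this question look like in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `example-bot` user agent, the URLs, and the one-second default delay are all assumptions for illustration; the `fetch` argument is a placeholder for whatever HTTP client a crawler would use.

```python
import time
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the sketch needs no network.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Crawl-delay: 2
"""


def build_parser(robots_txt):
    """Parse robots.txt text into a RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed_urls(parser, user_agent, urls):
    """Keep only the URLs that robots.txt lets this user agent fetch."""
    return [url for url in urls if parser.can_fetch(user_agent, url)]


def polite_delay(parser, user_agent, default=1.0):
    """Use the site's Crawl-delay if present, else an assumed default."""
    delay = parser.crawl_delay(user_agent)
    return float(delay) if delay is not None else default


def crawl(parser, user_agent, urls, fetch, sleep=time.sleep):
    """Fetch allowed URLs one at a time, pausing after each request."""
    delay = polite_delay(parser, user_agent)
    results = []
    for url in allowed_urls(parser, user_agent, urls):
        results.append(fetch(url))
        sleep(delay)
    return results
```

Passing `fetch` and `sleep` in as arguments keeps the politeness logic testable without making real requests or actually waiting.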

equilibrium 17 minutes ago

Nice work! Curious about your tech-stack, if you don't mind sharing.

Also what are your plans for monetizing, if any?
