How to implement a real estate recommendation engine?
I am thinking of something like movie/item recommendation, but real estate (RE) seems trickier. When a user visits a website and searches for RE, they should be presented with some suggestions. Let's split the problem into two tasks:
a) the user has not yet entered any personal info - item-based recommendation
b) the user has already entered his/her details such as income, location, etc. - item/user-based recommendation
The first thing that comes to mind for task a) is to model RE features using ranges instead of exact values. For example:
Area in m²:
- 40 - 50 is marked as "1"
- 50 - 70 is "2"
- etc.
Price:
- 20 - 30 thousand € is marked as "1"
- 30 - 40 thousand € is "2"
- etc.
Proximity to city center:
- 1 for the RE being within the city center
- 2 for Zone 2 or up to 2/3 kilometers from center
- 3 for Zone 3 or 7 kilometers from center
So having ranges lets us assign a vector to each RE property, which allows us to use Euclidean distance, Pearson correlation, and nearest-neighbor algorithms.
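To make the idea concrete, here is a minimal sketch of the binning plus a Euclidean nearest-neighbor lookup. Only the first two area and price ranges come from the lists above; the remaining bin edges, the function names (`bucket`, `to_vector`, `nearest`), and the sample listings are assumptions made up for illustration.

```python
from bisect import bisect_right
from math import dist

# Bin edges. 40-50 / 50-70 (area) and 20-30 / 30-40 (price) are from the post;
# the later edges are assumed for the sake of the example.
AREA_EDGES = [40, 50, 70, 100]    # m2
PRICE_EDGES = [20, 30, 40, 60]    # thousand EUR


def bucket(value, edges):
    """Return the 1-based bin index of value (0 means below the first edge)."""
    return bisect_right(edges, value)


def to_vector(area_m2, price_k_eur, zone):
    """Encode one property as (area bin, price bin, proximity zone)."""
    return (bucket(area_m2, AREA_EDGES), bucket(price_k_eur, PRICE_EDGES), zone)


def nearest(target_id, listings, k=3):
    """Return the ids of the k listings closest to target_id (Euclidean distance)."""
    target = listings[target_id]
    scored = [(dist(target, vec), rid)
              for rid, vec in listings.items() if rid != target_id]
    return [rid for _, rid in sorted(scored)[:k]]


# Hypothetical listings: (area in m2, price in thousand EUR, zone)
listings = {
    "A": to_vector(45, 25, 1),
    "B": to_vector(48, 28, 1),
    "C": to_vector(65, 35, 2),
    "D": to_vector(90, 55, 3),
}

print(nearest("A", listings, k=2))
```

Note that this treats all three features as equally important and maps "A" and "B" to the same vector, which is the price of using coarse ranges. Weighting the features or using finer bins would change the ranking.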
Please comment on my approach or suggest a new one.
Comments (1)
If you already have a website with enough traffic, you can try a pure collaborative filtering approach, i.e. "people who viewed this property also viewed these other properties". You could use the Pearson correlation there for good results.
Similarity between 2 RE can be defined as
When a user is viewing a property, you can sort all other RE properties by their similarity score with the property being shown and show the top few.
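As a rough sketch of that ranking step: one possible way to score two properties (an assumption here, not necessarily the exact definition intended above) is the Pearson correlation between their per-user "viewed / not viewed" indicators. The view log and the names `pearson`, `view_vector`, and `similar_properties` are hypothetical.

```python
from math import sqrt

# Hypothetical view log: views[user] is the set of property ids that user opened.
views = {
    "u1": {"A", "B"},
    "u2": {"A", "B", "C"},
    "u3": {"C", "D"},
    "u4": {"A", "B"},
    "u5": {"D"},
}


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def view_vector(prop, views, users):
    """1 if the user viewed the property, else 0, in a fixed user order."""
    return [1 if prop in views[u] else 0 for u in users]


def similar_properties(prop, views, top_n=3):
    """Rank all other properties by co-view correlation with prop."""
    users = sorted(views)
    all_props = set().union(*views.values())
    target = view_vector(prop, views, users)
    scored = [(pearson(target, view_vector(p, views, users)), p)
              for p in all_props if p != prop]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # best score first
    return scored[:top_n]


for score, prop in similar_properties("A", views):
    print(prop, round(score, 3))
```

On a real site the view matrix is large and sparse, so the similarities would normally be precomputed offline instead of on every page view.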
You could add some obvious filters on top of this, such as the location of the property, the price range, etc.
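Such filters could be applied to the already ranked list so the similarity order is preserved. The sketch below assumes a hypothetical `Listing` record and `apply_filters` helper; the field names and sample data are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    id: str
    city: str
    price_eur: int


def apply_filters(ranked_ids, catalog, city=None, min_price=None, max_price=None):
    """Keep ranked ids whose listing passes every filter that was supplied."""
    kept = []
    for rid in ranked_ids:
        item = catalog[rid]
        if city is not None and item.city != city:
            continue
        if min_price is not None and item.price_eur < min_price:
            continue
        if max_price is not None and item.price_eur > max_price:
            continue
        kept.append(rid)
    return kept


# Hypothetical catalog and a ranking produced by some similarity step.
catalog = {
    "B": Listing("B", "Sofia", 28_000),
    "C": Listing("C", "Sofia", 35_000),
    "D": Listing("D", "Plovdiv", 25_000),
}
ranked_ids = ["B", "C", "D"]

print(apply_filters(ranked_ids, catalog, city="Sofia", max_price=30_000))
```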
You can also define similarity from the property features, as you have suggested, and mix the results from both approaches. This gives good representation to new RE entries, which have little chance of being recommended if a pure collaborative filtering algorithm is used.
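One simple way to mix the two, shown here only as a sketch, is a weighted average where the weight on the collaborative score grows with the number of views a property has received. The `blend` function, the `full_trust_views` threshold, and the sample scores are assumptions, and both scores are assumed to be on a comparable 0 to 1 scale.

```python
def blend(cf_scores, content_scores, view_counts, full_trust_views=50):
    """Rank property ids by a mix of collaborative and feature-based scores.

    A property with no views relies entirely on its feature-based score;
    one with at least full_trust_views views relies entirely on the
    collaborative score.
    """
    blended = {}
    for rid, content in content_scores.items():
        weight = min(view_counts.get(rid, 0) / full_trust_views, 1.0)
        blended[rid] = weight * cf_scores.get(rid, 0.0) + (1 - weight) * content
    return sorted(blended, key=blended.get, reverse=True)


# Hypothetical scores relative to the property being viewed.
cf_scores = {"A": 0.9, "B": 0.2}                    # "NEW" has no view history yet
content_scores = {"A": 0.5, "B": 0.4, "NEW": 0.8}
view_counts = {"A": 100, "B": 100}

print(blend(cf_scores, content_scores, view_counts))
```

With these numbers the new listing still ranks between the two established ones, even though it has no collaborative score at all.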