Random noise in Solr scores
I am looking for a way of introducing random noise into my scoring function, and I'm at a loss on how to best proceed.
Some background:
We use Solr for a web application that manages large-ish sets of photos for agencies.
One customer has an interesting requirement for scoring:
- 'quality' field, maintained by editors, from 1 (highest) to 3 (lowest);
- 'date' field, boosting more recent photos; I would probably use a logarithmic function;
However, due to how the stock photo market works, this will likely result in many similar photos appearing together.
Their request is to give 'quality' a large boost, but introduce some randomness so that photos will not appear in a strict date order.
Any ideas?
EDIT: a key requirement is to have "stable" query results: if I search twice for "tropical island" I can get a slightly different result set each time, but if I ask for the first page, then the second, then the first again, I'd better get the same results :)
Comments (2)
You could do this with FunctionQueries. For each photo, add a field holding a random number close to 1 (e.g. 0.99 or 1.02) and use it in a product function query to scale the "natural" score.
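As a minimal sketch of this idea in Python: it assumes a hypothetical float field `noise_f` on each photo document and uses the eDisMax `boost` parameter, which multiplies the main query's score by a function's value. The field name, the 2% spread, and the helper names are illustrative assumptions, not something from the answer itself.

```python
import random
from urllib.parse import urlencode

NOISE_FIELD = "noise_f"  # hypothetical field name, not from the original post


def noise_factor(rng, spread=0.02):
    """Return a multiplier close to 1, e.g. 0.99 or 1.02."""
    return rng.uniform(1.0 - spread, 1.0 + spread)


def with_noise(doc, rng):
    """Copy a document and attach its noise factor before indexing."""
    noisy = dict(doc)
    noisy[NOISE_FIELD] = round(noise_factor(rng), 4)
    return noisy


def search_params(user_query):
    """Build query parameters that scale the score by the stored factor."""
    return urlencode({
        "q": user_query,
        "defType": "edismax",
        # eDisMax multiplies the main query score by this function's value.
        "boost": NOISE_FIELD,
    })


rng = random.Random(42)  # fixed seed only to make the sketch repeatable
photo = with_noise({"id": "photo-1", "quality_i": 1}, rng)
params = search_params("tropical island")
```

Because the factor is stored in the index, the ordering stays the same across pages and across repeated searches until the documents are re-indexed with new factors.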
Turns out my first approach to solving the problem was the correct one, and I had a trivial implementation bug. In case it helps others:
RandomSortField does have the characteristics I need (that is, returning repeatable results for the same query).
Leaving aside the FunctionQuery for a moment, even something trivial like:
sort=quality_i asc, date_d desc, random_12345 desc
will approximate my requirements.
However, when using the Sunspot Ruby gem, there's no way of passing the seed, and that's what was tricking me earlier: I ended up using a different seed each time, thus getting "true" random results.
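One way to keep paging stable while still varying results between searches is to derive the seed from something that stays fixed for the duration of a search session. A minimal sketch in Python, assuming a dynamic field `random_*` backed by `RandomSortField` as in the sort above; the `session_id` value and the helper names are hypothetical:

```python
import hashlib


def random_sort_field(query, session_id):
    """Derive a random_* field name that is stable for one query and session."""
    digest = hashlib.sha256(f"{session_id}:{query}".encode("utf-8")).hexdigest()
    # RandomSortField derives its seed from the field name, so reusing the
    # same name should give the same order on every page of one search.
    return "random_" + digest[:8]


def sort_clause(query, session_id):
    """Build the sort parameter: quality first, then date, then random."""
    field = random_sort_field(query, session_id)
    return f"quality_i asc, date_d desc, {field} desc"


page_1 = sort_clause("tropical island", "session-abc")
page_2 = sort_clause("tropical island", "session-abc")
```

Requests for page one and page two within the same session carry the same sort clause, while a new session id yields a different field name and therefore, most likely, a different order.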