Post-processing Solr faceted search results
I'm not sure how to approach the following problem, so I'm hoping to get some ideas here.
I'm using Lucene with Solr. Every document indexed in Lucene has a date field and a topic field (containing some keywords).
Using faceted search, I can calculate the frequency of every keyword on a specific date.
Example 1 (pseudo code):
1st search where date=today:
web=>70
apple=>35
blue=>32
2nd search where date=yesterday:
web=>65
blue=>55
apple=>5
But now I would like to combine the results into one Solr/Lucene query in order to calculate which word frequencies are growing strongly and which aren't.
A result could be:
Example 2:
one search merging both queries from example 1
web=>(70,65) <- growth +7.69%
blue=>(32,55) <- growth -41.82%
apple=>(35,5) <- growth +600%
Is it possible (and useful) to do this consolidation (and calculation) inside Solr, or is it better to run 2 Solr queries (see example 1) and post-process the results with PHP?
Thank you!
Comments (1)
If you have the facet values a priori, you could do this with facet queries, i.e. something like
facet.query=category:web AND date:[2011-06-14T00:00:00Z TO 2011-06-14T23:59:59Z]&facet.query=category:web AND date:[2011-06-13T00:00:00Z TO 2011-06-13T23:59:59Z]&...
so you would take the Cartesian product of facet values × dates. Otherwise, to do this inside Solr I think you'd have to write some custom Java faceting code. Or do it client-side, with multiple queries as you mentioned.
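To make the two options a bit more concrete, here is a minimal Python sketch (not PHP, and not a definitive implementation). It assumes the field names `category` and `date` from the example above; the helper names `day_range`, `build_facet_params`, `growth_percent` and `merge_counts` are made up for illustration. The first part builds one `facet.query` per (facet value, date) pair, the second part merges two sets of facet counts client-side and computes the relative growth as (today - yesterday) / yesterday.

```python
from urllib.parse import urlencode


def day_range(day):
    """Solr range clause covering one UTC day, e.g. day='2011-06-14'."""
    return f"date:[{day}T00:00:00Z TO {day}T23:59:59Z]"


def build_facet_params(terms, days, field="category"):
    """Build one facet.query per (term, day) pair (the cartesian product).

    'field' is an assumption; use whatever field holds your keywords.
    """
    params = [("q", "*:*"), ("rows", "0"), ("facet", "true")]
    for term in terms:
        for day in days:
            params.append(("facet.query", f"{field}:{term} AND {day_range(day)}"))
    return params


def growth_percent(current, previous):
    """Relative change in percent, or None when there is no baseline."""
    if previous == 0:
        return None
    return (current - previous) / previous * 100.0


def merge_counts(today, yesterday):
    """Merge two {term: count} dicts into {term: (today, yesterday, growth)}."""
    merged = {}
    for term in sorted(set(today) | set(yesterday)):
        cur = today.get(term, 0)
        prev = yesterday.get(term, 0)
        merged[term] = (cur, prev, growth_percent(cur, prev))
    return merged


# Query string for the facet-query approach (appended to your select URL).
query_string = urlencode(
    build_facet_params(["web", "apple", "blue"], ["2011-06-14", "2011-06-13"])
)

# Client-side merge, using the counts from Example 1 of the question.
today = {"web": 70, "apple": 35, "blue": 32}
yesterday = {"web": 65, "blue": 55, "apple": 5}
for term, (cur, prev, pct) in merge_counts(today, yesterday).items():
    label = "n/a" if pct is None else f"{pct:+.2f}%"
    print(f"{term}=>({cur},{prev}) growth {label}")
```

With the facet-query approach the number of `facet.query` parameters grows as terms × dates, so it is probably only practical for a small, known set of keywords; otherwise two ordinary facet requests merged client-side, as in `merge_counts`, may be simpler.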