How do I remove deleted items from the main Sphinx index?
I have a site that uses the main/delta indexing scheme for Sphinx. The main index is rebuilt daily and the delta index is rebuilt every 5 minutes. This works well for indexing newly submitted items.
The problem is that items need to be dropped from the index just as frequently as they are added, and the dropped items are typically older, so they already reside in the main index. After an item is deleted, it still appears in the search results for up to 24 hours (until 1am, when main is rebuilt).
How can I solve this issue?
Comments (3)
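See the documentation for sql_query_killlist: http://sphinxsearch.com/docs/current.html#conf-sql-query-killlist
In a nutshell, you need to set sql_query_killlist for the delta index so that it returns the IDs of documents that should be suppressed from the main index.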
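As a rough illustration of the kill-list idea, a sphinx.conf fragment might look like the sketch below. The table names (documents, documents_deleted, sph_counter) and column names are assumptions made for this example, not taken from the question; the layout follows the usual main+delta pattern from the Sphinx documentation.

```
# Hypothetical sphinx.conf fragment; all table and column names are assumptions.
source main
{
    # ... connection settings ...
    sql_query_pre = REPLACE INTO sph_counter SELECT 1, MAX(id) FROM documents
    sql_query     = SELECT id, title, body FROM documents \
                    WHERE id <= (SELECT max_doc_id FROM sph_counter WHERE counter_id = 1)
}

source delta : main
{
    # Overrides the inherited sql_query_pre so the counter is not reset here.
    sql_query_pre = SET NAMES utf8
    sql_query     = SELECT id, title, body FROM documents \
                    WHERE id > (SELECT max_doc_id FROM sph_counter WHERE counter_id = 1)

    # IDs returned here are suppressed from indexes listed before delta in a query.
    sql_query_killlist = SELECT id FROM documents_deleted
}
```

With a setup like this, the application would need to record each deleted ID in the (hypothetical) documents_deleted table, and searches would need to list main before delta, since a kill-list only affects the indexes that precede it in the query.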
Sphinx recently introduced real-time (RT) indexes, which let you add, update, and delete documents on the fly. However, the feature is only available as of version 1.10 and still seems pretty raw.
http://www.sphinxsearch.com/docs/current.html#rt-indexes
Alternatively, you could run a full re-index more often. Every 24 hours seems rather long if you have a lot of deletions. As a last-ditch effort, your application could check that the IDs being returned still exist and filter out the ones that do not.
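The last-ditch filtering step could be sketched roughly as follows. This is only an illustration: it assumes the source rows live in a table called documents with an integer id column, and it uses SQLite purely so the example is self-contained; neither detail comes from the question.

```python
import sqlite3


def filter_existing_ids(conn, doc_ids):
    """Return the IDs from doc_ids that still exist in the (hypothetical)
    documents table, keeping the order in which Sphinx returned them."""
    if not doc_ids:
        return []
    # One placeholder per ID, so the values are bound safely by the driver.
    placeholders = ",".join("?" for _ in doc_ids)
    rows = conn.execute(
        f"SELECT id FROM documents WHERE id IN ({placeholders})",
        list(doc_ids),
    ).fetchall()
    existing = {row[0] for row in rows}
    return [doc_id for doc_id in doc_ids if doc_id in existing]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO documents (id) VALUES (?)", [(1,), (2,), (3,)])
    # ID 5 stands for a document that was deleted after the last index rebuild.
    print(filter_existing_ids(conn, [3, 5, 1]))  # prints [3, 1]
```

The trade-off is that a page of results may come back shorter than requested when stale IDs are dropped, so this works best as a safety net alongside one of the other approaches.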
One option is to define an attribute in the index and then change its value to "ignore" certain documents.
For example, my indexes have a flag_ignore attribute. All searches are filtered so that only documents with flag_ignore=0 are matched.
When a document needs to disappear from the index right away, I call Sphinx->UpdateAttributes() and set the value to 1, which makes the document disappear from any subsequent search.
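A minimal sketch of this flag approach, written against the Python sphinxapi client that ships with Sphinx rather than the PHP one mentioned above, might look like this. The helper names and the index name in the usage note are hypothetical; the client object is passed in, so the snippet itself does not import sphinxapi or need a running searchd.

```python
def hide_document(client, index_name, doc_id):
    """Set flag_ignore=1 for one document in the given index.

    `client` is assumed to be a sphinxapi.SphinxClient (or anything with the
    same UpdateAttributes method). Returns whatever the client returns, which
    for sphinxapi is the number of updated documents, or -1 on error.
    """
    return client.UpdateAttributes(index_name, ["flag_ignore"], {doc_id: [1]})


def require_visible(client):
    """Add a filter so later queries only match documents with flag_ignore=0.

    Call this once per client before querying; repeated calls would add the
    same filter again.
    """
    client.SetFilter("flag_ignore", [0])
```

In use, the application would call something like hide_document(client, "main", 42) at the moment it deletes row 42 from the database, and call require_visible(client) before each search. The flag only needs to hold until the next rebuild, since the deleted row is no longer in the source data by then.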