How to update the Sphinx main and delta indexes

Posted on 2024-10-11 01:08:47

I've read the Sphinx documentation and various resources, but I am confused about the process of maintaining main and delta indexes. Please let me know if this is correct:

  • Have a table that partitions the search index by last_update_time (NOT id as in the tutorial http://sphinxsearch.com/docs/1.10/delta-updates.html)

  • Update the delta index every 15 minutes. The delta index only picks up records that were updated after the stored last_update_time:

    indexer --rotate --config /opt/sphinx/etc/sphinx.conf delta
    
  • Update the main index every hour by merging delta using:

    indexer --merge main delta --merge-dst-range deleted 0 0 --rotate
    

The pre-query SQL will update last_update_time to NOW(), which re-partitions the indexes.

Confusion: Will the merge run the pre-query SQL?

  • After the main index is updated, immediately update the delta index to clean it up:

    indexer --rotate --config /opt/sphinx/etc/sphinx.conf delta
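
The schedule above could be driven from cron by a small wrapper. The following is a minimal Python sketch, not a tested deployment: the index names `main` and `delta` and the config path come from the commands above, while the function names, and passing `--config` to the merge step, are assumptions made for illustration. It only builds and runs the argument lists and does not settle the pre-query question raised here.

```python
import subprocess

CONFIG = "/opt/sphinx/etc/sphinx.conf"  # path taken from the post


def delta_cmd(config=CONFIG):
    """Rebuild the delta index and rotate it in (the 15-minute job)."""
    return ["indexer", "--rotate", "--config", config, "delta"]


def merge_cmd(config=CONFIG):
    """Merge delta into main, keeping only main documents with deleted == 0."""
    return [
        "indexer", "--config", config,
        "--merge", "main", "delta",
        "--merge-dst-range", "deleted", "0", "0",
        "--rotate",
    ]


def hourly_cmds(config=CONFIG):
    """Hourly job: merge first, then rebuild delta to clean it up."""
    return [merge_cmd(config), delta_cmd(config)]


def run(cmds):
    """Run each command in order; stop at the first failure."""
    for cmd in cmds:
        subprocess.run(cmd, check=True)
```

An hourly cron entry would then call `run(hourly_cmds())`, and a 15-minute one `run([delta_cmd()])`. Stopping at the first failure is a deliberate choice in this sketch, so that the delta is not rebuilt if the merge did not succeed.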
    

EDIT: How would deletion of records even work? Since the delta index would contain deleted records, would records only be removed from search results after the delta index is merged into main?

Comments (2)

2024-10-18 01:08:47

To deal with deletes you need to take a look at the killlist, which basically defines the removal criteria:

http://sphinxsearch.com/docs/manual-1.10.html#conf-sql-query-killlist

In one setup I have, we build our main index daily in the early morning, then simply run a delta update (including the killlist) every 5 minutes.

On the merge stuff, I'm not sure, as I've never used it.

身边 2024-10-18 01:08:47

This is only half of the job. Deleted documents must be taken care of by the kill list (now called kbatch), and then the delta will not show the deleted results. But if you merge, they will reappear. To fix this, you have to run:

    indexer --merge main delta --merge-dst-range deleted 0 0 --rotate

For this to work, though, every document that was deleted needs a "deleted" attribute set on it. The merge process will then filter out documents that have deleted=1, and the main index will not contain deleted results.
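
To make the reappearance problem concrete, here is a toy Python model of the behavior described in this answer. It is not Sphinx code and rests on several assumptions: indexes are plain dicts keyed by document id, the kill list is a set of ids, and `search`, `merge`, and the sample documents are hypothetical names and data invented for illustration. It also assumes the application has already flagged the deleted document with deleted=1 in main.

```python
def search(main, delta, killlist):
    """Return ids visible when main and delta are searched together.

    Main documents whose id is in the delta's kill list are suppressed.
    """
    visible = {doc_id for doc_id in main if doc_id not in killlist}
    visible.update(delta)
    return sorted(visible)


def merge(main, delta, dst_range=None):
    """Merge delta into main and return the new main index.

    dst_range is (attr, lo, hi): only main documents whose attribute falls
    in [lo, hi] are kept, mimicking --merge-dst-range. The kill list is
    not consulted in this model, which is why suppressed documents can
    come back after a plain merge.
    """
    if dst_range is None:
        kept = dict(main)
    else:
        attr, lo, hi = dst_range
        kept = {i: d for i, d in main.items() if lo <= d[attr] <= hi}
    kept.update(delta)  # on duplicate ids the delta version wins
    return kept


# Document 2 was deleted in the database and flagged deleted=1 in main.
main = {1: {"deleted": 0}, 2: {"deleted": 1}}
delta = {3: {"deleted": 0}}
killlist = {2}

before = search(main, delta, killlist)
naive = search(merge(main, delta), {}, set())
filtered = search(merge(main, delta, ("deleted", 0, 0)), {}, set())
print(before, naive, filtered)  # [1, 3] [1, 2, 3] [1, 3]
```

In this model document 2 is hidden while the kill list is in effect, comes back after a plain merge, and stays out when the merge filters main on deleted = 0.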
