Does Sphinx automatically update its index when you add data to SQL?

Posted 2024-12-06 14:10:31

I am curious whether Sphinx automatically updates its index when you add new SQL data, or whether you have to tell it explicitly to reindex your database.

If it doesn't, does anyone have an example of how to automate this process when the database data changes?

Comments (3)

甜妞爱困 2024-12-13 14:10:31

The answer is no: you need to tell Sphinx to reindex your database.

There are some steps and requirements you need to know:

  1. A main index and a delta index are required.
  2. On the first run, build the main index.
  3. After the first run, you can index the delta with rotation (so that the service keeps running and the data on the web stays searchable in the meantime).
  4. Before going further, create a table that records your "last indexed row". The last indexed row ID is used for the next delta indexing run and for merging the delta into main.
  5. Merge the delta index into the main index,
    as described in the Sphinx documentation: http://sphinxsearch.com/docs/current.html#index-merging
  6. Restart the Sphinx service.

    TIP: Write your own program, in C# or another language, that runs the indexing. You can also use the Windows Task Scheduler.
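
The steps above could be automated along these lines. This is only a minimal sketch, not the answerer's actual program: it is written in Python rather than C#, the config path is a hypothetical placeholder, and it assumes the `indexer` binary is on the PATH. The index names Main and Delta are the ones used in this answer's conf.

```python
import subprocess

CONFIG = "C:/sphinx/sphinx.conf"  # hypothetical path; adjust to your install


def indexer_cmd(*args, config=CONFIG):
    """Build an argument list for Sphinx's indexer tool."""
    return ["indexer", "--config", config, *args]


def build_main():
    # Step 2: full build of the main index (first run).
    return indexer_cmd("Main")


def refresh_delta():
    # Step 3: rebuild only the delta; --rotate lets searchd keep serving.
    return indexer_cmd("Delta", "--rotate")


def merge_delta_into_main():
    # Step 5: fold the delta into main (indexer --merge DST SRC).
    return indexer_cmd("--merge", "Main", "Delta", "--rotate")


def run(cmd, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(" ".join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run(refresh_delta())
    run(merge_delta_into_main())
```

With `dry_run=True` the script only prints the commands, which makes it easy to check them before scheduling the script with the Windows Task Scheduler or cron.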

Here is my conf:

source Main
{
type            = mysql

sql_host        = localhost
sql_user        = root
sql_pass        = password
sql_db          = table1
sql_port        = 3306  # optional, default is 3306
sql_query_pre = REPLACE INTO table1.sph_counter SELECT 1, MAX(PageID) FROM table1.pages;
sql_query       = \
    SELECT  pd.`PageID`, pd.Status from table1.pages pd \
    WHERE pd.PageID>=$start AND pd.PageID<=$end \
    GROUP BY pd.`PageID`

sql_attr_uint       = Status

sql_query_info      = SELECT * FROM table1.`pages` pd WHERE pd.`PageID`=$id
sql_query_range     = SELECT MIN(PageID),MAX(PageID)\
              FROM table1.`pages`
sql_range_step      = 1000000
}


source Delta : Main
{
sql_query_pre = SET NAMES utf8

sql_query = \
    SELECT  PageID, Status from pages \
    WHERE PageID>=$start AND PageID<=$end 

sql_attr_uint       = Status

sql_query_info      = SELECT * FROM table1.`pages` pd WHERE pd.`PageID`=$id
sql_query_range     = SELECT (SELECT MaxDoc FROM table1.sph_counter WHERE ID = 1) MinDoc,MAX(PageID) FROM table1.`pages`;
sql_range_step      = 1000000
}


index Main
{
source          = Main
path            = C:/sphinx/data/Main
docinfo         = extern
charset_type        = utf-8
}


index Delta : Main
{
source = Delta
path = C:/sphinx/data/Delta
charset_type = utf-8
}
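
To see how the sph_counter table splits the work between Main and Delta, here is a small simulation. It is a sketch under stated assumptions: an in-memory SQLite database stands in for MySQL, the sample rows are made up, and only the table and column names (pages, sph_counter, PageID, Status, ID, MaxDoc) come from the conf above.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; names follow the conf above.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE pages (PageID INTEGER PRIMARY KEY, Status INTEGER);
    CREATE TABLE sph_counter (ID INTEGER PRIMARY KEY, MaxDoc INTEGER);
""")


def add_pages(first_id, last_id):
    """Insert made-up sample rows with consecutive IDs."""
    db.executemany("INSERT INTO pages VALUES (?, 0)",
                   [(i,) for i in range(first_id, last_id + 1)])


def index_main():
    """Mimic Main: record the high-water mark, then read every row."""
    db.execute("REPLACE INTO sph_counter SELECT 1, MAX(PageID) FROM pages")
    return [r[0] for r in db.execute("SELECT PageID FROM pages ORDER BY PageID")]


def index_delta():
    """Mimic Delta: read rows from the recorded mark up to the current max."""
    lo, hi = db.execute(
        "SELECT (SELECT MaxDoc FROM sph_counter WHERE ID = 1), MAX(PageID) FROM pages"
    ).fetchone()
    return [r[0] for r in db.execute(
        "SELECT PageID FROM pages WHERE PageID >= ? AND PageID <= ? ORDER BY PageID",
        (lo, hi))]


add_pages(1, 5)
main_docs = index_main()    # rows 1..5; counter is now 5
add_pages(6, 8)
delta_docs = index_delta()  # rows 5..8; the boundary row 5 is included
print(main_docs, delta_docs)
```

Because the range starts at MaxDoc itself, the boundary row ends up in both indexes. Starting the delta range at MaxDoc + 1 would avoid that overlap, if it matters for your setup.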

云归处 2024-12-13 14:10:31

As described in the section of the Sphinx documentation on real-time indexes:

Real-time indexes (or RT indexes for brevity) are a new backend that lets you insert, update, or delete documents (rows) on the fly.

So to update an index on the fly, you just need to run a query like:

{INSERT | REPLACE} INTO index [(column, ...)]
VALUES (value, ...)
[, (...)]
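
As an illustration of the syntax above, an application could build such a statement as follows. This is a hedged sketch: the index name rt_pages and its columns are hypothetical, and the `%s` placeholder style assumes a MySQL client library that substitutes parameters on the client side. Connecting to searchd is left out, since it depends on how your SphinxQL listener is configured.

```python
def build_replace(index, row):
    """Return (sql, params) for a SphinxQL REPLACE INTO statement.

    `row` maps column names to values; the values are passed separately
    as parameters instead of being pasted into the SQL text.
    """
    columns = list(row)
    placeholders = ", ".join(["%s"] * len(columns))
    sql = f"REPLACE INTO {index} ({', '.join(columns)}) VALUES ({placeholders})"
    return sql, tuple(row[c] for c in columns)


# rt_pages and its columns are hypothetical names for illustration.
sql, params = build_replace("rt_pages", {"id": 42, "title": "Hello", "status": 1})
print(sql)
print(params)
```

The returned pair can be handed to a DB-API style `cursor.execute(sql, params)` call whenever the application inserts or updates the corresponding database row.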

怎言笑 2024-12-13 14:10:31

To expand on Anne's answer - if you're using SQL indices, they won't update automatically. You can manage the process of reindexing after every change - but that can be expensive. One way to get around this is to have a core index with everything, and then a delta index with the same structure that indexes just the changes (this could be done with a boolean or timestamp column).

That way, you can just reindex the delta index (which is smaller, and thus faster) on a super-regular basis, and then process both core and delta together less regularly (but still, best to do it at least daily).
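
The two-speed schedule described above could be decided by a small function like this one. It is a sketch only: the five-minute delta interval is an assumed value, the daily interval follows the "at least daily" advice, and the job names are made up for illustration.

```python
from datetime import datetime, timedelta

DELTA_EVERY = timedelta(minutes=5)  # assumed interval; tune to your data
FULL_EVERY = timedelta(days=1)      # "at least daily", per the advice above


def next_job(now, last_delta, last_full):
    """Return "full", "delta", or None depending on what is due."""
    if now - last_full >= FULL_EVERY:
        return "full"   # reprocess core and delta together
    if now - last_delta >= DELTA_EVERY:
        return "delta"  # cheap, frequent refresh of the small index
    return None


now = datetime(2024, 1, 2, 12, 0)
print(next_job(now, now - timedelta(minutes=6), now - timedelta(hours=3)))
print(next_job(now, now - timedelta(minutes=1), now - timedelta(days=2)))
print(next_job(now, now - timedelta(minutes=1), now - timedelta(hours=3)))
```

The full job takes priority when both are due, since processing core and delta together also covers whatever the delta run would have picked up.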

But otherwise, the new RT indices are worth looking at - you still need to update things yourself, and it's not tied to the database, so it's a different mindset. Also: RT indices don't have all the features that SQL indices do, so you'll need to decide what's more important.
