How to automatically index data from a database into Solr
I have a MySQL database for my application. I implemented Solr search and used the DataImportHandler (DIH) to index data from the database into Solr. My question is: when the database is updated, is there any way for my Solr index to pick up the newly added data automatically? In other words, I do not want to run the indexing process manually every time the database tables change. If so, please tell me how I can achieve this.
Comments (4)
I don't think Solr has a feature that indexes data by itself whenever the database is updated.
But there are other possibilities. For example, with the help of triggers it is possible to run an external application.
Write a cron job that runs a PHP script which reads from the DB and indexes the data in Solr. Write a trigger (which calls this script) for CRUD operations and add it to the DB, so that whenever something changes in the DB, the trigger calls the above script and indexing can happen. Please see:
Invoking a PHP script from a MySQL trigger
Automatic Scheduling:
Please see the post "How can I Schedule data imports in Solr" for more information on scheduling. The second answer explains how to import using cron.
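
As a rough sketch of the indexing half of such a script (in Python rather than PHP, purely for illustration), the snippet below posts rows that have already been read from the database to Solr as JSON. The core name collection1, the /update URL, the function names, and the example field names are assumptions, not details from the answer. Reading the changed rows from MySQL is left to whatever database driver you use.

```python
import json
from urllib.request import Request, urlopen

# Assumed update URL; adjust host, port, and core name to your setup.
UPDATE_URL = "http://localhost:8983/solr/collection1/update?commit=true"


def build_update_request(docs, update_url=UPDATE_URL):
    """Build a POST request that sends a list of documents to Solr as JSON."""
    body = json.dumps(docs).encode("utf-8")
    return Request(update_url, data=body,
                   headers={"Content-Type": "application/json"})


def index_rows(rows, update_url=UPDATE_URL):
    """Send rows (a list of dicts keyed by Solr field name) to Solr.

    Returns the HTTP status code of the response.
    """
    with urlopen(build_update_request(rows, update_url), timeout=30) as resp:
        return resp.status
```

A cron-driven script would fetch the rows changed since its last run and pass them to index_rows, for example index_rows([{"id": "42", "title": "example"}]), where the field names must match your Solr schema.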
Since you used the DataImportHandler to load your data into Solr initially, you could set up a DIH delta import that is executed using curl from a cron job to periodically add changes in the database to the index. Also, if you need updates closer to real time, then as @Rakesh suggested, you could use a trigger in your database and have it kick off the curl call to the delta import.
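
The request that curl would send can be sketched as follows. This is a minimal example, assuming the handler is registered at /dataimport on a core named collection1 and that data-config.xml already defines the delta queries. The function names are made up for this illustration.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def dih_url(solr_base, core, command, **params):
    """Build a DataImportHandler URL such as .../<core>/dataimport?command=..."""
    query = urlencode({"command": command, **params})
    return f"{solr_base.rstrip('/')}/{core}/dataimport?{query}"


def run_delta_import(solr_base="http://localhost:8983/solr", core="collection1"):
    """Ask DIH to run a delta import and return the raw response body."""
    # clean=false keeps the documents that are already in the index.
    url = dih_url(solr_base, core, "delta-import",
                  clean="false", commit="true", wt="json")
    with urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8")
```

A small script that calls run_delta_import() and is scheduled by cron does the same job as the curl call described above.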
You can import the data using your browser and Task Scheduler.
Do the following steps on a Windows server:
Go to Administrative Tools => Task Scheduler.
Click "Create Task".
The Create Task screen opens with the tabs General, Triggers, Actions, Conditions, and Settings.
In the General tab, enter the task name "Solrdataimport" and in the description enter "Import mysql data".
Go to the Triggers tab and click New. Under Settings, check Daily. Under Advanced settings, check "Repeat task every ..." and set whatever interval you want. Click OK.
Go to the Actions tab and click New. Under Settings, set Program/script to "C:\Program Files (x86)\Google\Chrome\Application\chrome.exe", which is the installation path of the Chrome browser. In Add arguments, enter http://localhost:8983/solr/#/collection1/dataimport//dataimport?command=full-import&clean=true and click OK.
With the steps above, the data import runs automatically. To stop the import process, follow the same steps, but under the Actions tab set Program/script to "taskkill" instead of "C:\Program Files (x86)\Google\Chrome\Application\chrome.exe" and in Add arguments enter "/f /im chrome.exe".
Set the trigger timing according to your requirements.
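
If launching a browser from a scheduled task is not practical, the task could run a small script instead. The following is a minimal sketch, assuming the DataImportHandler can be called directly at /solr/collection1/dataimport (the request-handler path, without the #/ admin UI part of the URL above) and that its status response reports "busy" or "idle" when requested with wt=json. The function names are made up for this example.

```python
import json
import time
from urllib.request import urlopen

# Assumed handler URL; adjust host, port, and core name to your setup.
HANDLER = "http://localhost:8983/solr/collection1/dataimport"


def fetch_json(url):
    """GET a URL and parse the response body as JSON."""
    with urlopen(url, timeout=30) as resp:
        return json.load(resp)


def full_import_and_wait(fetch=fetch_json, handler=HANDLER,
                         poll_seconds=5, max_polls=120):
    """Start a full import, then poll the status command until DIH is idle.

    Returns True if the import finished within max_polls checks, else False.
    """
    fetch(handler + "?command=full-import&clean=true&wt=json")
    for _ in range(max_polls):
        if fetch(handler + "?command=status&wt=json").get("status") == "idle":
            return True
        time.sleep(poll_seconds)
    return False
```

The scheduled task would then run a script that calls full_import_and_wait(). To stop a running import on the Solr side, DIH also accepts command=abort on the same handler.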
What you're looking for is a "delta-import", and many of the other posts already cover it. I created a Windows WPF application and service that issues commands to Solr on a recurring schedule, because cron jobs and Task Scheduler are a bit difficult to maintain if you have a lot of cores / environments.
https://github.com/systemidx/SolrScheduler
You basically just drop a JSON file into a specified folder, and it uses a REST client to issue the commands to Solr.