DIH scheduling in Solr

Posted on 2024-11-16 13:20:46

I have just started playing around with Solr and I have it deployed and running on Tomcat. I have the schema and data import handler set up and it indexes the files just fine. Now I want to schedule this dataImportHandler to run every hour or so.

There is a wiki page detailing the files here.

But there are no instructions on where to create the files or how to deploy them.

A similar question has been asked on Stack Overflow before here.

The answer was to "Create classes ApplicationListener, HTTPPostScheduler and SolrDataImportProperties". I don't know where I should create the classes, so I took a guess: I downloaded the latest nightly build and created the classes in the org.apache.solr.handler.dataimport.scheduler package (copy-pasting them from the wiki page). I compiled and ran the ant dist command to create the deployable jar files.

I configured dataimport.properties as per the instructions in the wiki and then added the listener to the web.xml file as instructed in the answer above. But when I started Tomcat, Solr was not deployed.

I see this error message in the log file:

INFO: Starting Servlet Engine: Apache Tomcat/7.0.14
Jun 21, 2011 5:20:47 PM org.apache.catalina.startup.HostConfig deployDescriptor
INFO: Deploying configuration descriptor solr.xml from /home/sabman/programs/apache-tomcat-7.0.14/conf/Catalina/localhost
Jun 21, 2011 5:20:47 PM org.apache.catalina.startup.HostConfig deployDescriptor
WARNING: A docBase /home/sabman/programs/apache-tomcat-7.0.14/webapps/solr.war inside the host appBase has been specified, and will be ignored
Jun 21, 2011 5:20:47 PM org.apache.catalina.startup.SetContextPropertiesRule begin
WARNING: [SetContextPropertiesRule]{Context} Setting property 'debug' to '0' did not find a matching property.
Jun 21, 2011 5:20:48 PM org.apache.catalina.core.StandardContext startInternal
SEVERE: Error listenerStart

I had to remove the listener code from web.xml for it to work as it did before.

Any idea about what I could be doing wrong?


Comments (3)

墟烟 2024-11-23 13:20:46

I got this reply from the Solr mailing list:

The Wiki page describes a design for a scheduler, which has not been committed to Solr yet (I checked). I did see a patch the other day (see https://issues.apache.org/jira/browse/SOLR-2305) but it didn't look well tested.

I think that you're basically stuck with something like cron at this time. If your application is written in java, take a look at the Quartz scheduler - http://www.quartz-scheduler.org/
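If cron is not available, the same effect can be approximated from any Java process with the JDK's own ScheduledExecutorService. The following is only a minimal sketch, not the scheduler described on the wiki: the class name DihHourlyTrigger is hypothetical, the base URL http://localhost:8080/solr is a placeholder, and the "/dataimport" handler path is an assumption that must match whatever is registered in your solrconfig.xml. It uses java.net.http, so it assumes Java 11 or later.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class DihHourlyTrigger {

    // Builds the DIH request URL. The "/dataimport" path is an assumption;
    // use the handler name registered in your own solrconfig.xml.
    public static String importUrl(String solrBase, String command) {
        String base = solrBase.endsWith("/")
                ? solrBase.substring(0, solrBase.length() - 1)
                : solrBase;
        return base + "/dataimport?command=" + command;
    }

    // Sends one GET request and returns the HTTP status code.
    public static int trigger(HttpClient client, String url) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.statusCode();
    }

    public static void main(String[] args) {
        // Placeholder host and port; adjust to your deployment.
        String url = importUrl("http://localhost:8080/solr", "delta-import");
        HttpClient client = HttpClient.newHttpClient();
        ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor();

        // Run immediately, then once per hour.
        scheduler.scheduleAtFixedRate(() -> {
            try {
                System.out.println("DIH responded with HTTP " + trigger(client, url));
            } catch (Exception e) {
                // Catch everything: an uncaught exception would cancel all future runs.
                System.err.println("DIH request failed: " + e);
            }
        }, 0, 1, TimeUnit.HOURS);
    }
}
```

The executor thread is not a daemon thread, so the process keeps running until it is stopped. A cron entry that issues the same HTTP request would achieve the same thing without a long-running process.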

抚你发端 2024-11-23 13:20:46

See my TimerHttpTask for a simple WAR that periodically calls any HTTP link. For example, the link can be a DIH link that starts a delta import. The project is licensed under the LGPL. JNDI is used to schedule jobs without rebuilding the WAR. The examples below direct TimerHttpTask to call a URL using Fixed Delay (FD), with an initial delay of 15 seconds and a delay of 60 seconds between calls thereafter.

Jetty JNDI Configuration

<Call name="setProperty">
    <Arg>TIMEAPI-UTC-NOW</Arg> 
    <Arg>FD|15000|60000|http://www.timeapi.org/utc/now.json</Arg>
</Call>

Tomcat JNDI Configuration

TIMEAPI-UTC-NOW="FD|15000|60000|http://www.timeapi.org/utc/now.json"
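To make the meaning of the FD|15000|60000|URL value concrete, here is a rough sketch of how such a string could be parsed and mapped onto the JDK's scheduleWithFixedDelay. It is not TimerHttpTask's actual code. The class FixedDelaySpec and its methods are hypothetical, and the millisecond unit is inferred from the description above (15000 for 15 seconds, 60000 for 60 seconds).

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical holder for a "FD|initialDelay|delay|url" value.
public class FixedDelaySpec {
    public final long initialDelayMs;
    public final long delayMs;
    public final String url;

    private FixedDelaySpec(long initialDelayMs, long delayMs, String url) {
        this.initialDelayMs = initialDelayMs;
        this.delayMs = delayMs;
        this.url = url;
    }

    public static FixedDelaySpec parse(String spec) {
        // A limit of 4 keeps any '|' characters inside the URL intact.
        String[] parts = spec.split("\\|", 4);
        if (parts.length != 4 || !parts[0].equals("FD")) {
            throw new IllegalArgumentException("Expected FD|initial|delay|url: " + spec);
        }
        return new FixedDelaySpec(
                Long.parseLong(parts[1]), Long.parseLong(parts[2]), parts[3]);
    }

    // Fixed delay: the next run starts delayMs after the previous run finishes.
    public ScheduledFuture<?> schedule(ScheduledExecutorService executor, Runnable task) {
        return executor.scheduleWithFixedDelay(
                task, initialDelayMs, delayMs, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) {
        FixedDelaySpec spec =
                parse("FD|15000|60000|http://www.timeapi.org/utc/now.json");
        ScheduledExecutorService executor =
                Executors.newSingleThreadScheduledExecutor();
        // The task only prints; a real job would issue the HTTP request here.
        spec.schedule(executor, () -> System.out.println("would call " + spec.url));
    }
}
```

With a fixed delay, a slow request pushes the next call back, so calls never overlap. For a DIH import that may run longer than the interval, that is probably the safer choice than a fixed rate.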

蝶舞 2024-11-23 13:20:46

If you copied the source for ApplicationListener, etc., and ran a build, you may want to check that the files are actually being compiled into your distribution. You can do that by opening the war file and checking whether it contains a jar with .class files for the classes you mentioned, or by looking in the classes directory inside the .war. If they are not there, they won't get loaded into the web app, which would explain the failed deployment.
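The check described above can also be scripted. The sketch below scans a WAR for a class by its simple name, looking both under WEB-INF/classes and inside every jar under WEB-INF/lib. The class WarClassFinder is hypothetical, and the default file name solr.war is only a placeholder; pass the real path as the first argument.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipInputStream;

public class WarClassFinder {

    // True if the entry is the .class file for the given simple class name.
    public static boolean matches(String entryName, String simpleClassName) {
        String file = simpleClassName + ".class";
        return entryName.equals(file) || entryName.endsWith("/" + file);
    }

    // Returns the locations inside the WAR where the class was found.
    public static List<String> find(String warPath, String simpleClassName)
            throws IOException {
        List<String> hits = new ArrayList<>();
        try (ZipFile war = new ZipFile(warPath)) {
            Enumeration<? extends ZipEntry> entries = war.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                String name = entry.getName();
                if (name.startsWith("WEB-INF/classes/") && matches(name, simpleClassName)) {
                    hits.add(name);
                } else if (name.startsWith("WEB-INF/lib/") && name.endsWith(".jar")) {
                    // Read the nested jar as a stream without extracting it.
                    try (ZipInputStream jar = new ZipInputStream(war.getInputStream(entry))) {
                        ZipEntry inner;
                        while ((inner = jar.getNextEntry()) != null) {
                            if (matches(inner.getName(), simpleClassName)) {
                                hits.add(name + "!/" + inner.getName());
                            }
                        }
                    }
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        String war = args.length > 0 ? args[0] : "solr.war"; // placeholder path
        String[] classes = {
            "ApplicationListener", "HTTPPostScheduler", "SolrDataImportProperties"
        };
        for (String cls : classes) {
            List<String> hits = find(war, cls);
            System.out.println(cls + ": " + (hits.isEmpty() ? "NOT FOUND" : hits));
        }
    }
}
```

A class reported as NOT FOUND would be consistent with the listener failing to start, although the full stack trace in the Tomcat logs remains the more reliable source for the actual cause.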

You may have to compile them on your own (create your own jar file that has compiled classes) and include the jar file in the war file manually (this would be a good test, at least).

You could also just use the second answer from that Stack Overflow post, which was to call the command line from cron or the task scheduler.
