ASP.NET search index building strategy
This is what I'm planning to do and I'd appreciate anyone's input:
I've built a forum in ASP.NET MVC and now want to add Lucene.Net for search. My plan is to run an index builder thread every 5-10 minutes to update the search index with the changes made to each discussion.
The way it will work is that I keep the date and time of the index builder thread's last run in the search index. On every execution of the index builder, I read this date back from the search index, then index any changes made since that date and time. Once I'm done, I update the last-run entry.
Is this a good approach? Can someone suggest a better way to incrementally index changes in a forum app?
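For illustration, the plan described above can be sketched roughly as follows. This is a language-neutral Python sketch, not Lucene.Net code: `InMemoryIndex`, `Post`, and `index_changes` are hypothetical names, and a plain dictionary stands in for the real search index.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Post:
    id: int
    body: str
    modified: datetime


class InMemoryIndex:
    """Hypothetical stand-in for the search index: documents plus a last-run entry."""

    def __init__(self):
        self.docs = {}
        # Start far in the past so the first run indexes everything.
        self.last_run = datetime.min.replace(tzinfo=timezone.utc)


def index_changes(index, posts, now):
    """Index every post modified since index.last_run, then update the last-run entry."""
    changed = [p for p in posts if p.modified > index.last_run]
    for post in changed:
        index.docs[post.id] = post.body  # upsert: replaces any older version
    # Store the time the scan started (passed in as `now`), not the time it
    # finished, so an edit that lands during this run is not skipped next time.
    index.last_run = now
    return len(changed)


if __name__ == "__main__":
    start = datetime(2024, 1, 1, tzinfo=timezone.utc)
    index = InMemoryIndex()
    posts = [Post(1, "first post", start), Post(2, "a reply", start)]
    index_changes(index, posts, start + timedelta(minutes=5))
    print(sorted(index.docs))
```

Because each changed post is written by id, re-indexing the same post twice is harmless in this sketch; a real index would need an equivalent update-by-key operation.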
Comments (1)
You will need to maintain a timer, and if an indexing run doesn't finish within 5 minutes, another one will start indexing the same changes, so you'll have to guard against that condition as well.
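One common way to guard against overlapping runs is a non-blocking lock: a timer tick that finds the previous run still in progress simply skips its turn. A minimal Python sketch, where `run_if_idle` and `job` are hypothetical names and the timer itself is left out:

```python
import threading

# Shared by every timer tick; held for the duration of one indexing run.
_index_lock = threading.Lock()


def run_if_idle(job):
    """Run job() unless a previous run still holds the lock; return whether it ran."""
    if not _index_lock.acquire(blocking=False):
        return False  # previous run still in progress: skip this tick
    try:
        job()
        return True
    finally:
        _index_lock.release()
```

This only protects runs inside a single process; if several processes could write to the same index, a cross-process lock would be needed instead.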
A slightly better way is to use a dedicated indexing thread that stays alive. This thread fetches the changes since the last run and processes them as you describe, but it does not wait on a timer. After an indexing pass finishes, it starts the next one right away, continually indexing items as they come in.
If there are no more items to index, the thread sleeps for 5 minutes and then checks for changes again when it wakes up.
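The loop described above could look something like this in Python. It is a sketch under assumptions: `index_batch` is a hypothetical callable that indexes whatever changes are pending and returns how many items it processed, and `start_indexer` is a hypothetical helper.

```python
import threading


def indexing_loop(index_batch, stop, idle_seconds=300.0):
    """Index continuously; sleep only when a pass finds nothing to do."""
    while not stop.is_set():
        if index_batch() == 0:
            # Event.wait acts as an interruptible sleep: it returns early
            # if stop is set, so shutdown does not take five minutes.
            stop.wait(idle_seconds)


def start_indexer(index_batch, idle_seconds=300.0):
    """Start the single indexing thread; set the returned event to stop it."""
    stop = threading.Event()
    thread = threading.Thread(
        target=indexing_loop,
        args=(index_batch, stop, idle_seconds),
        daemon=True,
    )
    thread.start()
    return thread, stop
```

Since only this one thread ever calls `index_batch`, no extra locking is needed to keep two indexing passes from overlapping.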
This way you can be sure that only one client modifies the index at a time. It also avoids the CPU spikes you could get from overlapping runs if you mismanaged the timer or suddenly got a flood of posts, and it scales as your forum grows without needing to adjust the indexing interval every now and then.
You will need to monitor the Thread's health though.
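One simple way to monitor the thread is a heartbeat: the indexing loop records a timestamp after each pass, and a separate check verifies that the thread is still alive and has reported in recently. A Python sketch with hypothetical names (`Heartbeat`, `is_healthy`):

```python
import threading
import time


class Heartbeat:
    """Records when the indexing thread last completed a pass."""

    def __init__(self):
        self._lock = threading.Lock()
        self._last_beat = time.monotonic()

    def beat(self):
        """Call from the indexing loop after every pass."""
        with self._lock:
            self._last_beat = time.monotonic()

    def age(self):
        """Seconds since the last beat."""
        with self._lock:
            return time.monotonic() - self._last_beat


def is_healthy(thread, heartbeat, max_age_seconds):
    """Healthy means the thread is alive and has reported in recently."""
    return thread.is_alive() and heartbeat.age() <= max_age_seconds
```

The `max_age_seconds` threshold should be longer than the 5-minute idle sleep plus the time a typical indexing pass takes, otherwise an idle but working thread would be reported as unhealthy.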