Lucene good practice and thread safety

Posted 2024-12-27 08:37:40

I'm using Lucene to index documents and perform a search, after which I immediately delete them.
All of this can be considered a somewhat atomic action that consists of the following steps:

index (writer) --> search (searcher) --> get docs by score (reader) --> delete docs (reader)

This action can be performed by multiple concurrent threads on the same index (using FSDirectory).

IMPORTANT NOTE: each thread handles a separate set of documents, so one thread will not touch another thread's documents.

For that purpose I have a few questions:

1) Should I use a single instance (for all threads) of IndexWriter, IndexReader and IndexSearcher? (They're supposed to be thread safe.)

2) Can an IndexWriter manipulate an index while an IndexReader deletes documents? Do I need to close one for the other to do its thing?
That is, can one thread write to an index while another one deletes from it? (As I mentioned earlier, I can guarantee that they handle separate sets of data.)

3) Any other good practices and suggestions you might have will be most appreciated.

Thanks a lot!

Comments (1)

深者入戏 2025-01-03 08:37:40

IndexWriter, IndexReader and IndexSearcher are thread-safe according to the API Javadoc:

NOTE: IndexSearcher instances are completely thread safe, meaning
multiple threads can call any of its methods, concurrently

NOTE: IndexReader instances are completely thread safe, meaning multiple
threads can call any of its methods, concurrently.

NOTE: IndexWriter instances are completely thread safe, meaning
multiple threads can call any of its methods, concurrently

Multiple read-only IndexReaders can be opened, but it's better to share one (for performance reasons).

Only a single IndexWriter can be opened (it creates a write lock to prevent others from being opened on the same index). You can use IndexReader to delete documents while IndexWriter holds this lock. An IndexReader always sees the index as it was at the time the reader was opened; changes made by the writer become visible only after the writer commits them and the reader is reopened.

Any number of IndexSearchers can be opened, but again it's better to share one. They can be used even while the index is being modified. Visibility works the same as for IndexReader: changes are not visible until the searcher is reopened.
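
The visibility rule above (a reader keeps seeing the index as it was when opened, until the writer commits and a new reader is opened) can be illustrated with a small plain-Java model. This is only a sketch and deliberately does not use the Lucene API: `ToyIndex`, its `Reader`, and all method names are hypothetical stand-ins. Roughly, `commit()` plays the role of `IndexWriter.commit()`, and `openReader()` the role of opening or reopening a reader.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;

/** Toy stand-in for an index (NOT Lucene): one shared writer side, point-in-time readers. Needs Java 10+. */
final class ToyIndex {
    // Uncommitted state; safe to modify from many threads through the one shared instance.
    private final Map<String, String> pending = new ConcurrentHashMap<>();
    // Last committed state; replaced wholesale on commit and never mutated afterwards.
    private volatile Map<String, String> committed = Map.of();

    void add(String id, String text) { pending.put(id, text); }

    void delete(String id) { pending.remove(id); }

    /** Publishes the current pending state as a new immutable snapshot. */
    synchronized void commit() { committed = Map.copyOf(pending); }

    /** Opens a reader pinned to whatever was committed at this moment. */
    Reader openReader() { return new Reader(committed); }

    static final class Reader {
        private final Map<String, String> snapshot;

        Reader(Map<String, String> snapshot) { this.snapshot = snapshot; }

        /** Returns the ids (sorted) of documents whose text contains the given word. */
        Set<String> search(String word) {
            Set<String> ids = new TreeSet<>();
            snapshot.forEach((id, text) -> {
                if (text.contains(word)) {
                    ids.add(id);
                }
            });
            return ids;
        }
    }
}
```

In this model, a reader opened before `commit()` never sees documents added afterwards, and a reader opened before a delete is committed keeps returning the deleted document. That is why, in the index-search-delete flow from the question, each thread would need a fresh (reopened) reader or searcher after its own writes are committed before it can find its documents.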
