Lucene.NET lifecycle management

Posted 2024-11-02 08:46:16

Let's assume that I have a basic understanding of adding and searching documents.

What would be the best practice for managing instances of IndexWriter and IndexReader?

Currently, my application creates a singleton instance of an IndexWriter. Whenever I need to run a search, I create an IndexSearcher from the IndexWriter as follows:

var searcher = new IndexSearcher(writer.GetReader());

I am doing this because creating a new IndexReader causes the index to be loaded into memory, and that memory is not released until the GC reclaims it. This was causing out-of-memory errors.

Is this current implementation considered ideal? It has solved the memory issue, but the write.lock file now always exists (because the IndexWriter is always instantiated and open). Here is the stack trace of the errors I get in the app:

Lock obtain timed out: NativeFSLock@C:\inetpub\wwwroot\htdocs_beta\App_Data\products3\write.lock: System.IO.IOException: The process cannot access the file 'C:\inetpub\wwwroot\htdocs_beta\App_Data\products3\write.lock' because it is being used by another process.
   at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)
   at System.IO.FileStream.Init(String path, FileMode mode, FileAccess access, Int32 rights, Boolean useRights, FileShare share, Int32 bufferSize, FileOptions options, SECURITY_ATTRIBUTES secAttrs, String msgPath, Boolean bFromProxy, Boolean useLongPath)
   at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess access)
   at Lucene.Net.Store.NativeFSLock.Obtain()

I'm thinking it might be best to create a singleton instance of IndexSearcher for searching, and then create an IndexWriter only when needed. That way, the write.lock file would be created and deleted each time the index is updated. The only issue I see is that the IndexSearcher instance will become outdated, so I would need a task that reloads the IndexSearcher whenever the index has been updated.

What do you think?

How do you handle a large index with live updating?

Comments (1)

最初的梦 2024-11-09 08:46:16

You should use only one IndexWriter to avoid your locking issues. Have a look at: Lucene.Net writing/reading synchronization
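
As a rough illustration of the "one writer" advice, the sketch below keeps a single IndexWriter open for the lifetime of the application and serves searches from a near-real-time reader obtained via writer.GetReader(), reopening it only when the index has changed. It is not taken from the linked thread. It assumes the Lucene.NET 3.0.3 API (in 2.9.x the Dispose() calls would be Close()), and the class name SharedIndex and its method names are hypothetical helpers, not part of Lucene.NET.

```csharp
using System;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Search;
using Lucene.Net.Store;
using LuceneVersion = Lucene.Net.Util.Version;

// Hypothetical helper (not a Lucene.NET type): one writer and one shared
// near-real-time reader per index directory, created once per application.
public sealed class SharedIndex : IDisposable
{
    private readonly object _sync = new object();
    private readonly Directory _directory;
    private readonly IndexWriter _writer;
    private IndexReader _reader;

    public SharedIndex(string indexPath)
    {
        _directory = FSDirectory.Open(new System.IO.DirectoryInfo(indexPath));
        // The single writer holds write.lock for as long as it is open.
        _writer = new IndexWriter(_directory,
            new StandardAnalyzer(LuceneVersion.LUCENE_30),
            IndexWriter.MaxFieldLength.UNLIMITED);
        _reader = _writer.GetReader();
    }

    public void Add(Document doc)
    {
        _writer.AddDocument(doc);
        _writer.Commit();
    }

    public TopDocs Search(Query query, int maxHits)
    {
        IndexReader reader = AcquireReader();
        try
        {
            // Wrapping an existing reader is cheap; the searcher does not own it.
            var searcher = new IndexSearcher(reader);
            return searcher.Search(query, maxHits);
        }
        finally
        {
            reader.DecRef(); // release this search's reference
        }
    }

    private IndexReader AcquireReader()
    {
        lock (_sync)
        {
            // Reopen() returns the same instance when nothing has changed.
            IndexReader fresh = _reader.Reopen();
            if (!ReferenceEquals(fresh, _reader))
            {
                _reader.DecRef(); // old reader closes once in-flight searches finish
                _reader = fresh;
            }
            _reader.IncRef(); // reference held by the caller
            return _reader;
        }
    }

    public void Dispose()
    {
        lock (_sync) { _reader.DecRef(); }
        _writer.Dispose();
        _directory.Dispose();
    }
}
```

With this arrangement the write.lock file stays present while the application runs, which is expected; the lock timeout in the stack trace would only appear if a second IndexWriter (for example, in another process or application pool worker) tried to open the same directory. Stored fields would have to be loaded before the reader reference is released, so a fuller version might return documents rather than a bare TopDocs.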
