Lucene.NET lifecycle management

Posted on 2024-11-02 08:46:16

Let's assume that I have a basic understanding of adding and searching documents.

What would be the best practice for managing instances of IndexWriter and IndexReader?

Currently, my application creates a singleton instance of IndexWriter. Whenever I need to run a search, I create an IndexSearcher from that IndexWriter like this:

var searcher = new IndexSearcher(writer.GetReader());

I am doing this because creating a new IndexReader loads the index into memory, and that memory is only freed once the GC reclaims it. This was causing out-of-memory errors.

Is the current implementation considered ideal? It has solved the memory issue, but the write.lock file now always exists (because the IndexWriter is always instantiated and open). Here is the stack trace of the errors I get in the app:

Lock obtain timed out: NativeFSLock@C:\inetpub\wwwroot\htdocs_beta\App_Data\products3\write.lock: System.IO.IOException: The process cannot access the file 'C:\inetpub\wwwroot\htdocs_beta\App_Data\products3\write.lock' because it is being used by another process.
   at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)
   at System.IO.FileStream.Init(String path, FileMode mode, FileAccess access, Int32 rights, Boolean useRights, FileShare share, Int32 bufferSize, FileOptions options, SECURITY_ATTRIBUTES secAttrs, String msgPath, Boolean bFromProxy, Boolean useLongPath)
   at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess access)
   at Lucene.Net.Store.NativeFSLock.Obtain()

I'm thinking it might be better to keep a singleton IndexSearcher for searching and create an IndexWriter only when the index needs updating. That way, the write.lock file would be created and deleted around each update. The only issue I see is that the IndexSearcher instance will become stale, so I would need a task that reloads the IndexSearcher whenever the index has been updated.
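To make the singleton-searcher idea concrete, here is a minimal sketch of the reload logic, written without any Lucene.NET dependency so it can stand alone. `SearcherHolder`, its three delegates, and the method names are all hypothetical names made up for illustration; `TSearcher` stands in for IndexSearcher. In the 2.9/3.0-era Lucene.NET API, the staleness check would presumably map to `IndexReader.IsCurrent()` and the reopen to `IndexReader.Reopen()`, but that wiring is an assumption and is not shown here.

```csharp
using System;

// Hypothetical helper: shares one searcher and swaps in a fresh one when the
// index has changed. A retired searcher is closed only after the last
// in-flight search that uses it has finished.
public sealed class SearcherHolder<TSearcher> where TSearcher : class
{
    private sealed class Entry
    {
        public TSearcher Searcher;
        public int RefCount; // 1 for the holder itself, +1 per in-flight search
    }

    private readonly Func<TSearcher> _open;            // opens a searcher over the current index
    private readonly Func<TSearcher, bool> _isCurrent; // true if the searcher sees the latest commit
    private readonly Action<TSearcher> _close;         // releases the searcher's resources
    private readonly object _gate = new object();
    private Entry _current;

    public SearcherHolder(Func<TSearcher> open,
                          Func<TSearcher, bool> isCurrent,
                          Action<TSearcher> close)
    {
        _open = open;
        _isCurrent = isCurrent;
        _close = close;
        _current = new Entry { Searcher = open(), RefCount = 1 };
    }

    // Runs a query against the shared searcher, pinning it for the duration.
    public TResult Search<TResult>(Func<TSearcher, TResult> query)
    {
        Entry entry;
        lock (_gate)
        {
            entry = _current;
            entry.RefCount++;
        }
        try { return query(entry.Searcher); }
        finally { Release(entry); }
    }

    // Call from a timer or after each writer commit.
    // Returns true if a new searcher was swapped in.
    public bool RefreshIfStale()
    {
        Entry old;
        lock (_gate)
        {
            if (_isCurrent(_current.Searcher)) return false;
            old = _current;
            _current = new Entry { Searcher = _open(), RefCount = 1 };
        }
        Release(old); // drop the holder's own reference to the retired searcher
        return true;
    }

    private void Release(Entry entry)
    {
        bool close;
        lock (_gate) { close = --entry.RefCount == 0; }
        if (close) _close(entry.Searcher);
    }
}
```

The reference count is there because a background reload could otherwise close a searcher while a request is still reading from it. Whether this is preferable to keeping the IndexWriter open and calling writer.GetReader() is exactly the open question here.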

What do you think?

How do you handle a large index with live updating?
