How can I delete a Lucene index without affecting other non-index files in the directory?
I want to write an in-memory Lucene index back to disk, over the top of the originally loaded index. Currently, if I call Directory.Copy( _ramDirectory, _fileSystemDirectory, false ), it simply adds the new files to the directory but leaves the old (stale) ones there.
I tried calling:
new IndexWriter( _fsd, _analyzer, true, IndexWriter.MaxFieldLength.UNLIMITED ).Close();
...(to create a new empty index in the directory) but this has strange behavior and sometimes results in the entire index being wiped clean on the next run of the program.
Is there any way I can simply get a list of the files a file system index is currently using so I can delete them manually? I don't want to blindly erase all files in the directory in case there are some non-index files there.
Apparently FSDirectory.ListAll() lists all files in the physical directory, whether or not they are actually part of the index. Is there any way I can tell if a particular file is used/created by the index? I mean, I can't even check file extensions due to Lucene's bizarre file naming conventions.
Answers (2)
I'd definitely recommend that you don't mix other files into a Lucene index folder.
The best solution would be to create a new index using the IndexWriter constructor that takes the create parameter, which creates a new index at that location. Then use the IndexWriter.AddIndexesNoOptimize(Directory[] dirs) method to add your RAMDirectory to the FSDirectory.

Call IndexWriter.DeleteAll() and then optimize.
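The approach recommended above (recreate the on-disk index with create set to true, then merge the in-memory index into it) might look roughly like the sketch below. It assumes the Lucene.Net 2.9/3.0-era API that the question's constructor signature suggests; newer releases renamed or removed AddIndexesNoOptimize and Optimize. The class and method names IndexPersistence and OverwriteIndex are hypothetical, introduced only for illustration.

```csharp
using Lucene.Net.Analysis;
using Lucene.Net.Index;
using Directory = Lucene.Net.Store.Directory;

// Hypothetical helper, not part of Lucene.Net.
public static class IndexPersistence
{
    // Replaces the index stored in 'target' with the contents of 'source'.
    // Assumes no other IndexWriter currently holds the write lock on 'target'.
    public static void OverwriteIndex(Directory source, Directory target, Analyzer analyzer)
    {
        // create: true starts a fresh, empty index in 'target' instead of
        // appending to whatever index is already there.
        var writer = new IndexWriter(target, analyzer, true,
                                     IndexWriter.MaxFieldLength.UNLIMITED);
        try
        {
            // Merge the in-memory segments into the new on-disk index.
            writer.AddIndexesNoOptimize(new Directory[] { source });

            // Optional: collapse the result into a single segment.
            writer.Optimize();
        }
        finally
        {
            // Close() commits pending changes and releases the write lock.
            writer.Close();
        }
    }
}
```

With the fields from the question, the call would be IndexPersistence.OverwriteIndex( _ramDirectory, _fsd, _analyzer ). Because the writer itself decides which old files to drop, there is no need to list and delete index files by hand. Whether unrelated files in the same folder are left alone is worth verifying against your Lucene.Net version before relying on it.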