What is the correct way to rebuild a Lucene index?
I have a forum-like web application written in ASP.NET MVC, and I'm trying to use Lucene.Net as its search engine. When I build my index, every now and then I get exceptions saying that Lucene cannot rename the deletable file. I think it's because I empty the index every time I want to rebuild it. Here is the code that deals with indexing:
    public class SearchService : ISearchService
    {
        Directory IndexFileLocation;
        IndexWriter Writer;
        IndexReader Reader;
        Analyzer Analyzer;

        public SearchService(String indexLocation)
        {
            IndexFileLocation = FSDirectory.GetDirectory(indexLocation, System.IO.Directory.Exists(indexLocation) == false);
            Reader = IndexReader.Open(IndexFileLocation);
            Writer = new IndexWriter(IndexFileLocation, Analyzer, IndexFileLocation.List().Length == 0);
            Analyzer = new StandardAnalyzer();
        }

        public void ClearIndex()
        {
            var DocumentCount = Writer.DocCount();
            if (DocumentCount == 0)
                return;
            for (int i = 0; i < DocumentCount; i++)
                Reader.DeleteDocument(i);
        }

        public void AddToSearchIndex(ISearchableData Data)
        {
            Document Doc = new Document();
            foreach (var Entry in Data)
            {
                Field field = new Field(Entry.Key,
                    Entry.Value,
                    Lucene.Net.Documents.Field.Store.NO,
                    Lucene.Net.Documents.Field.Index.TOKENIZED,
                    Lucene.Net.Documents.Field.TermVector.WITH_POSITIONS_OFFSETS);
                Doc.Add(field);
            }
            Field KeyField = new Field(
                SearchField.Key.ToString(),
                Data.Key,
                Lucene.Net.Documents.Field.Store.YES,
                Lucene.Net.Documents.Field.Index.NO);
            Doc.Add(KeyField);
            Writer.AddDocument(Doc);
        }

        public void Dispose()
        {
            Writer.Optimize();
            Writer.Close();
            Reader.Close();
        }
    }
And here is the code that executes it all:
    private void btnRebuildIndex_Click(object sender, EventArgs e)
    {
        using (var SearchService = new SearchService(Application.StartupPath + @"\indexs\"))
        {
            SearchService.ClearIndex();
        }
        using (var SearchService = new SearchService(Application.StartupPath + @"\indexs\"))
        {
            Int32 BatchSize = 50;
            Int32 Current = 0;
            var TotalQuestions = SubmissionService.GetQuestionsCount();
            while (Current < TotalQuestions)
            {
                var Questions = SubmissionService.ListQuestions(Current, BatchSize, "Id", Qsparx.SortOrder.Asc);
                foreach (var Question in Questions)
                {
                    SearchService.AddToSearchIndex(Question.ToSearchableData());
                }
                Current += BatchSize;
            }
        }
    }
Why does Lucene complain about renaming the "deletable" file?
Comments (2)
Not sure why you are recreating the index every time. You can append to the existing index instead: pass false as the last argument to the IndexWriter constructor, which tells the IndexWriter to open in append mode (i.e. not overwrite).
That might make your problem go away.
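
A minimal sketch of what opening the writer in append mode could look like, assuming the Lucene.Net 2.x API used in the question. The class name, the method name, and the "Key" and "Body" field names are hypothetical, and the index at indexPath is assumed to exist already.

```csharp
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;

public static class AppendExample
{
    // Hypothetical helper: adds one document to an index that already exists.
    public static void AppendDocument(string indexPath, string key, string body)
    {
        // Build the analyzer first so the writer never receives null.
        Analyzer analyzer = new StandardAnalyzer();

        // Last argument false: open the existing index and append to it
        // instead of creating a new, empty one.
        IndexWriter writer = new IndexWriter(indexPath, analyzer, false);
        try
        {
            Document doc = new Document();
            doc.Add(new Field("Key", key, Field.Store.YES, Field.Index.NO));
            doc.Add(new Field("Body", body, Field.Store.NO, Field.Index.TOKENIZED));
            writer.AddDocument(doc);
        }
        finally
        {
            // Closing the writer flushes changes and releases the write lock.
            writer.Close();
        }
    }
}
```

Opening with false against a directory that holds no index is expected to fail, so this only suits incremental updates, not the very first build.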
It turned out that if no index files exist, creating an IndexReader before an IndexWriter is not a good idea. I also realized that even though the AddDocument method of IndexWriter has two overloads (one with and one without an Analyzer parameter), only the one with the Analyzer parameter works for me.
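
One possible reading of the second observation: in the question's constructor the Analyzer field is assigned only after the IndexWriter has been constructed, so the writer receives a null analyzer and only the overload that takes an explicit analyzer can work. The sketch below is one way a full rebuild might be arranged, again assuming the Lucene.Net 2.x API. RebuildingIndexer is a hypothetical name, not the author's final code. It creates the analyzer first, opens the writer with the create flag set to true so that any old index is replaced, and uses no IndexReader at all.

```csharp
using System;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;

// Hypothetical wrapper for rebuilding an index from scratch.
public sealed class RebuildingIndexer : IDisposable
{
    private readonly Analyzer analyzer;
    private readonly IndexWriter writer;

    public RebuildingIndexer(string indexPath)
    {
        // The analyzer must exist before it is handed to the writer.
        analyzer = new StandardAnalyzer();

        // Last argument true: start a fresh index, replacing any old one,
        // so no per-document delete pass through an IndexReader is needed.
        writer = new IndexWriter(indexPath, analyzer, true);
    }

    public void Add(Document doc)
    {
        // The writer already holds a valid analyzer, so the overload
        // without an Analyzer parameter is used here.
        writer.AddDocument(doc);
    }

    public void Dispose()
    {
        writer.Optimize();
        writer.Close();
    }
}
```

With a wrapper like this, the click handler in the question would need a single using block for the rebuild, with no separate ClearIndex step.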