Lucene indexing gets slower on every run
This code tests indexing with Lucene.NET.
for (int i = 0; i < 10; i++)
{
    var stopwatch = Stopwatch.StartNew();
    string indexPath = Path.Combine("C:\\lucene\\");
    var directory = FSDirectory.Open(new DirectoryInfo(indexPath));
    var analyzer = new StandardAnalyzer(LuceneConfiguration.Version);
    IndexWriter indexWriter = null;
    try
    {
        indexWriter = new IndexWriter(directory, analyzer, true,
            IndexWriter.MaxFieldLength.UNLIMITED);
        indexWriter.DeleteAll();
        for (int documentNumber = 0; documentNumber < 100; documentNumber++)
        {
            var document = new Document();
            for (int fieldNumber = 0; fieldNumber < 10; fieldNumber++)
            {
                document.Add(new Field("Field" + fieldNumber, "asdf qwerty Value" + fieldNumber, Field.Store.YES,
                    Field.Index.ANALYZED));
            }
            indexWriter.AddDocument(document);
        }
        indexWriter.Optimize();
    }
    finally
    {
        if (indexWriter != null)
        {
            indexWriter.Close();
        }
    }
    stopwatch.Stop();
    Console.WriteLine("Index time: " + stopwatch.Elapsed.TotalMilliseconds);
    var reader = IndexReader.Open(directory, true);
    var searcher = new IndexSearcher(reader);
    var parser = new QueryParser(LuceneConfiguration.Version, "Field0", analyzer);
    var query = parser.Parse("asdf");
    var collector = TopScoreDocCollector.create(10, true);
    searcher.Search(query, collector);
    Console.WriteLine("Hits: " + collector.GetTotalHits());
}
Console.ReadKey();
Each time the indexing runs, it gets slower. If I skip the search after indexing, it doesn't slow down. This only occurs when I start the program with debugging, not when I start it without debugging.
What might cause this?
I'm wondering if it's the first-chance IOExceptions that occur when Lucene tries to auto-clear the index directory. These would occur because your readers/searchers are still open and lock the segment files, which prevents their deletion.
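If the open readers are indeed the cause, closing them at the end of each iteration should let Lucene delete the old segment files without throwing. Here is a minimal sketch of that idea, assuming the Lucene.NET 2.9-style API used in the question. `SearchHelper` and `CountHits` are hypothetical names of mine, not part of Lucene.NET, and newer releases may expose `Dispose()` instead of `Close()`, so check your version.

```csharp
using Lucene.Net.Analysis;
using Lucene.Net.Index;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;
using Lucene.Net.Store;

// Hypothetical helper, not part of Lucene.NET.
static class SearchHelper
{
    // Runs the question's query and releases the reader and searcher
    // before returning, so no file handles on the segment files outlive the call.
    public static int CountHits(Directory directory, Analyzer analyzer,
        Lucene.Net.Util.Version version)
    {
        IndexReader reader = null;
        IndexSearcher searcher = null;
        try
        {
            reader = IndexReader.Open(directory, true); // read-only, as in the question
            searcher = new IndexSearcher(reader);
            var parser = new QueryParser(version, "Field0", analyzer);
            var query = parser.Parse("asdf");
            var collector = TopScoreDocCollector.create(10, true);
            searcher.Search(query, collector);
            return collector.GetTotalHits();
        }
        finally
        {
            // A searcher built on an existing reader does not close that reader,
            // so both are closed explicitly.
            if (searcher != null) searcher.Close();
            if (reader != null) reader.Close();
        }
    }
}
```

In the loop from the question, the search part would then become `Console.WriteLine("Hits: " + SearchHelper.CountHits(directory, analyzer, LuceneConfiguration.Version));`. If the slowdown under the debugger disappears with this change, that would support the locked-file explanation. If it doesn't, the cause is something else.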