How to set the indexed term length in Lucene.Net
How can I restrict Lucene.Net to index only terms whose length is greater than x?
I am indexing the document as follows:
String indexDirectory = @"C:\Users\user\Desktop\Index";
String dataDirectory = @"C:\Users\user\Desktop\Data";
StandardAnalyzer analyzer = new StandardAnalyzer();
IndexWriter writer = new IndexWriter(indexDirectory, analyzer);
Document doc = new Document();
Field fPath = new Lucene.Net.Documents.Field("path", dataDirectory, Field.Store.NO, Field.Index.TOKENIZED, Field.TermVector.NO);
Field fContent = new Field("content", ReadTextFile(dataDirectory), Field.Store.NO, Field.Index.TOKENIZED, Field.TermVector.YES);
doc.Add(fPath);
doc.Add(fContent);
I am using the following code to read the indexed terms back from the Lucene index:
TermFreqVector[] vectors = IndexReader.Open(indexDirectory).GetTermFreqVectors(0);
foreach (Lucene.Net.Index.TermFreqVector vector in vectors)
{
    String[] terms = vector.GetTerms();
    foreach (String term in terms)
    {
        // loop through indexed terms
    }
}
Comments (1)
You could implement your own Analyzer, or extend the StandardAnalyzer.
Example: a TokenFilter plus an Analyzer, with the same Analyzer used for both indexing and searching.
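A minimal sketch of that idea, under some assumptions: it targets the older Lucene.Net API the question appears to use (2.x/3.0 era), where `Analyzer.TokenStream(string, TextReader)` is the method to override and the built-in `LengthFilter(TokenStream, int min, int max)` lives in `Lucene.Net.Analysis`. Newer releases (4.8) moved and changed both, so check your version. The class name `MinLengthAnalyzer` and its parameters are hypothetical, and the built-in `LengthFilter` is used here in place of a hand-written TokenFilter. It wraps an existing analyzer rather than subclassing `StandardAnalyzer`.

```csharp
using System.IO;
using Lucene.Net.Analysis;

// Hypothetical wrapper: delegates tokenization to another analyzer,
// then drops every token shorter than minLength characters.
public class MinLengthAnalyzer : Analyzer
{
    private readonly Analyzer inner;
    private readonly int minLength;

    // minLength is inclusive; to keep terms with length > x, pass x + 1.
    public MinLengthAnalyzer(Analyzer inner, int minLength)
    {
        this.inner = inner;
        this.minLength = minLength;
    }

    public override TokenStream TokenStream(string fieldName, TextReader reader)
    {
        // LengthFilter keeps tokens whose length is within [min, max].
        TokenStream stream = inner.TokenStream(fieldName, reader);
        return new LengthFilter(stream, minLength, int.MaxValue);
    }
}
```

With the question's code, this would be used as `new IndexWriter(indexDirectory, new MinLengthAnalyzer(new StandardAnalyzer(), x + 1))`. Passing the same analyzer to the query parser at search time keeps short query terms from being looked up against an index that no longer contains them.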