Problem with TermFreqVector in Lucene.NET
I've been trying to get the term frequencies of a document using TermFreqVector.
Here is my code:
LuceneStore.Directory dir = LuceneStore.FSDirectory.GetDirectory("e:/indexDir", true);
IndexWriter writer = new IndexWriter(dir, new StandardAnalyzer(), true);
Document doc = new Document();
doc.Add(new Field("Content", "This is a beautiful house", Field.Store.YES, Field.Index.TOKENIZED, Field.TermVector.YES));
writer.AddDocument(doc);
writer.Optimize();
writer.Close();
IndexReader reader = IndexReader.Open(dir);
TermFreqVector termFreq = reader.GetTermFreqVector(0, "content");
string[] term = termFreq.GetTerms();
But I get the error message "Object reference not set to an instance of an object" on this line:
string[] term = termFreq.GetTerms();
Can anyone help?
The GetTermFreqVector method is documented to return null if the storeTermVector flag hasn't been set. Are you sure it's set in your case?

EDIT: I've just noticed that you're using "Content" as the field name in the constructor, and then "content" when you're asking for the term frequency vector. That could easily be the problem if field names are case-sensitive. I suggest you create a constant string and use it everywhere you want to refer to the field, for consistency.
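
The constant-string suggestion could look roughly like the sketch below. It reuses the calls from the question and assumes the same Lucene.NET 2.x API. The class name TermVectorExample, the constant ContentField, and the `using LuceneStore = ...` alias are illustrative assumptions, not taken from the original code. The null check is there because GetTermFreqVector can return null when no term vector is available for the requested field.

```csharp
using System;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using LuceneStore = Lucene.Net.Store; // alias assumed from the question's code

class TermVectorExample // hypothetical class name
{
    // One constant for the field name, so indexing and lookup cannot drift apart.
    private const string ContentField = "Content";

    static void Main()
    {
        // Same index location as in the question; 'true' creates a fresh index.
        LuceneStore.Directory dir = LuceneStore.FSDirectory.GetDirectory("e:/indexDir", true);

        IndexWriter writer = new IndexWriter(dir, new StandardAnalyzer(), true);
        Document doc = new Document();
        doc.Add(new Field(ContentField, "This is a beautiful house",
            Field.Store.YES, Field.Index.TOKENIZED, Field.TermVector.YES));
        writer.AddDocument(doc);
        writer.Optimize();
        writer.Close();

        IndexReader reader = IndexReader.Open(dir);

        // Look the vector up with the same constant that was used when indexing.
        TermFreqVector termFreq = reader.GetTermFreqVector(0, ContentField);

        if (termFreq == null)
        {
            // No term vector was found for this field name in document 0.
            Console.WriteLine("No term vector stored for field '" + ContentField + "'.");
        }
        else
        {
            string[] terms = termFreq.GetTerms();
            int[] freqs = termFreq.GetTermFrequencies();
            for (int i = 0; i < terms.Length; i++)
            {
                Console.WriteLine(terms[i] + ": " + freqs[i]);
            }
        }

        reader.Close();
    }
}
```

With the field name defined in one place, a "Content" versus "content" mismatch can no longer creep in, and the null check turns a NullReferenceException into a readable message if the vector is still missing for some other reason.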