TermFreqVector lucene .net
I can get docs by category like this:
IndexSearcher searcher = new IndexSearcher(dir);
Term t = new Term("category", "Feline");
Query query = new TermQuery(t);
Hits hits = searcher.Search(query);
for (int c = 0; c < hits.Length(); c++)
{
Document d = hits.Doc(c);
Console.WriteLine(c + " " + d.GetField("category").StringValue());
}
Now I would like to obtain the TermFreqVector for the docs in hits. I would usually do this like so:
for (int c = 0; c < searcher.MaxDoc(); c++)
{
TermFreqVector TermFreqVector = IndexReader.GetTermFreqVector(c, "content");
String[] terms = TermFreqVector.GetTerms();//get the terms
int[] freqs = TermFreqVector.GetTermFrequencies();//
}
However, I am not sure how to do it in my scenario (i.e. just get them for the docs in hits). The docs also have a db pk.
Thanks.
Christian
Comments (1)
The first parameter to IndexReader.GetTermFreqVector ("c" in your example) is the document number. hits.Id(c) will return the ID of the cth result. So, inside your loop over hits, you'd do something like reader.GetTermFreqVector(hits.Id(c), "content"), where reader stands for your IndexReader instance.
(As a side note: the Hits class is deprecated; you probably want to use something like HitCollector or a different search overload instead.)
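The approach described in this answer could be sketched roughly as follows. This is only a minimal sketch, assuming a Lucene.Net 2.x release where the Hits class is still available and assuming the "content" field was indexed with term vectors enabled. The class name HitTermVectors and the method name Print are hypothetical, introduced only for illustration.

```csharp
using System;
using Lucene.Net.Index;
using Lucene.Net.Search;

// Hypothetical helper, not part of Lucene.Net.
public static class HitTermVectors
{
    // Prints the term frequencies of the "content" field for each hit only,
    // instead of looping over every document up to MaxDoc().
    public static void Print(IndexSearcher searcher, Query query)
    {
        // Assumption: the reader behind the searcher is the one to ask for term vectors.
        IndexReader reader = searcher.GetIndexReader();
        Hits hits = searcher.Search(query);

        for (int c = 0; c < hits.Length(); c++)
        {
            // Index-internal document number of the c-th hit (not the db pk).
            int docId = hits.Id(c);

            TermFreqVector vector = reader.GetTermFreqVector(docId, "content");
            if (vector == null)
            {
                // No term vector was stored for this document and field.
                continue;
            }

            string[] terms = vector.GetTerms();
            int[] freqs = vector.GetTermFrequencies();
            for (int i = 0; i < terms.Length; i++)
            {
                Console.WriteLine(docId + " " + terms[i] + " " + freqs[i]);
            }
        }
    }
}
```

The document number returned by hits.Id(c) is internal to the index and can change after a merge or optimize, so if you need to tie the frequencies back to the database, reading the stored db pk field from hits.Doc(c) is probably the safer key.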