How do I perform an AND search in Lucene.net when the search uses multiple words?
I am playing around with Lucene.net to try to get a handle on how to implement it in my application.
I have the following code:
.....
// Add 2 documents
var doc1 = new Document();
var doc2 = new Document();
doc1.Add(new Field("id", "doc1", Field.Store.YES, Field.Index.ANALYZED));
doc1.Add(new Field("content", "This is my first document", Field.Store.YES, Field.Index.ANALYZED));
doc2.Add(new Field("id", "doc2", Field.Store.YES, Field.Index.ANALYZED));
doc2.Add(new Field("content", "The big red fox jumped", Field.Store.YES, Field.Index.ANALYZED));
writer.AddDocument(doc1);
writer.AddDocument(doc2);
writer.Optimize();
writer.Close();
// Search for doc2
var parser = new QueryParser(Lucene.Net.Util.Version.LUCENE_29, "content", new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_29));
var query = parser.Parse("big abcdefg test1234");
var searcher = new IndexSearcher(indexDirectory, true);
var hits = searcher.Search(query);
Assert.AreEqual(1, hits.Length());
var document = hits.Doc(0);
Assert.AreEqual("doc2", document.Get("id"));
Assert.AreEqual("The big red fox jumped", document.Get("content"));
This test passes, which dismays me a bit. I assume this means that Lucene.Net uses OR rather than AND between search terms, but I can't find any information on how to actually perform an AND search.
The end result I am going for is this: if someone searches for "Matthew Anderson", I don't want it to bring up documents that refer to "Matthew Doe", as that isn't relevant in any way, shape or form.
Comments (2)
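What do you get when you parse the query like this?
var query = parser.Parse("+big +abcdefg +test1234");
This should make the parser require every term to be present in a matching document. Another possibility is to construct the query programmatically.
// Every clause is MUST, so a document has to contain all three terms.
// TermQuery bypasses the analyzer, so the terms must already be lowercase.
// "content" is the field indexed in the question. Lucene.Net 3.x spells the flag Occur.MUST.
BooleanQuery query = new BooleanQuery();
query.Add(new TermQuery(new Term("content", "big")), BooleanClause.Occur.MUST);
query.Add(new TermQuery(new Term("content", "abcdefg")), BooleanClause.Occur.MUST);
query.Add(new TermQuery(new Term("content", "test1234")), BooleanClause.Occur.MUST);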
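If you would rather not prefix every term with +, the query parser can also be told to treat the space between terms as AND. Below is a minimal sketch, assuming Lucene.Net 2.9 (the version constant used in the question). DefaultOperatorSketch and BuildAndQuery are hypothetical names made up for this example, and in Lucene.Net 3.x the same setting is, as far as I know, exposed as the DefaultOperator property rather than a setter method.

```csharp
using Lucene.Net.Analysis.Standard;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;

// Hypothetical helper, not part of Lucene.Net.
public static class DefaultOperatorSketch
{
    // Parses free-text input so that every term is required (AND semantics).
    public static Query BuildAndQuery(string userInput)
    {
        var version = Lucene.Net.Util.Version.LUCENE_29;

        // Same parser setup as in the question: default field "content".
        var parser = new QueryParser(version, "content", new StandardAnalyzer(version));

        // The default operator is OR; switching it to AND makes
        // "big abcdefg test1234" behave like "+big +abcdefg +test1234".
        parser.SetDefaultOperator(QueryParser.Operator.AND);

        return parser.Parse(userInput);
    }
}
```

With a query built this way, "big abcdefg test1234" should no longer match doc2, because abcdefg and test1234 do not occur in it.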
A. If you require all words to be in a document but don't require the words to be consecutive and in the order you specify: The query
matches
but does not match
B. If you want to match a phrase (i.e. all words required; the words have to be consecutive and in the order specified) instead: The query
matches
but does not match
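To make cases A and B concrete for the "Matthew Anderson" search from the question, here is a sketch that assumes the standard Lucene query parser syntax, where a leading + marks a term as required and double quotes mark a phrase. QuerySyntaxSketch and its method names are hypothetical, and the query strings are illustrations rather than the ones from the original answer.

```csharp
using Lucene.Net.Analysis.Standard;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;

// Hypothetical helper, not part of Lucene.Net.
public static class QuerySyntaxSketch
{
    // Same parser setup as in the question: default field "content".
    private static QueryParser NewParser()
    {
        var version = Lucene.Net.Util.Version.LUCENE_29;
        return new QueryParser(version, "content", new StandardAnalyzer(version));
    }

    // Case A: both words are required, in any order and any position.
    public static Query AllWordsAnyOrder()
    {
        return NewParser().Parse("+matthew +anderson");
    }

    // Case B: the words must be adjacent and in this order (a phrase).
    public static Query ExactPhrase()
    {
        return NewParser().Parse("\"matthew anderson\"");
    }
}
```

Under these assumptions, case A would match a document containing "Anderson, Matthew" but not one that only mentions "Matthew Doe". It would still match a document that mentions "Matthew Doe" and some unrelated "Anderson" elsewhere, which is why the phrase form in case B is the stricter choice when the full name matters.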