Searching a URL field in a Lucene.Net index
I want to search a Lucene.Net index for a stored URL field. I write to the index with this code:
Field urlField = new Field("Url", url.ToLower(), Field.Store.YES, Field.Index.TOKENIZED);
document.Add(urlField);
indexWriter.AddDocument(document);
And I search for the URL in the index with this code:
Lucene.Net.Store.Directory _directory = FSDirectory.GetDirectory(Host, false);
IndexReader reader = IndexReader.Open(_directory);
KeywordAnalyzer _analyzer = new KeywordAnalyzer();
IndexSearcher indexSearcher = new IndexSearcher(reader);
QueryParser parser = new QueryParser("Url", _analyzer);
Query query = parser.Parse("\"" + downloadDoc.Uri.ToString() + "\"");
TopDocs hits = indexSearcher.Search(query, null, 10);
if (hits.totalHits > 0)
{
//statements....
}
But whenever I search for a URL, for example http://www.xyz.com/, I get no hits.
I found a workaround, but it only works when the index contains a single document. With more documents, the code below does not return the correct result. Any ideas?
When writing the index, use KeywordAnalyzer:
KeywordAnalyzer _analyzer = new KeywordAnalyzer();
indexWriter = new IndexWriter(_directory, _analyzer, true, IndexWriter.MaxFieldLength.UNLIMITED);
Then, when searching, also use KeywordAnalyzer:
IndexReader reader = IndexReader.Open(_directory);
KeywordAnalyzer _analyzer = new KeywordAnalyzer();
IndexSearcher indexSearcher = new IndexSearcher(reader);
QueryParser parser = new QueryParser("Url", _analyzer);
Query query = parser.Parse("\"" + url.ToString() + "\"");
TopDocs hits = indexSearcher.Search(query, null, 1);
This is because the KeywordAnalyzer "tokenizes" the entire stream as a single token.
Please help. It's urgent.
Cheers
Sunil...
Comments (3)
This worked for me:
This answer helped me: Lucene search by URL
Try putting quotes around the query, e.g. like this:
Using the whitespace or keyword analyzer should work.
Would anyone actually search for "http://www.Google.com"? Seems more likely that a user would search for "Google" instead.
You can always return the entire URL if there is a partial match. I think the standard analyzer would be more appropriate for searching and retrieving a URL.
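For the exact-match case the question asks about, here is a minimal sketch of one way the "single token" idea could look. It is not taken from any of the answers above. It assumes a Lucene.Net 2.9.x-style API (the question's use of IndexWriter.MaxFieldLength and hits.totalHits suggests that generation), uses an in-memory RAMDirectory and two made-up URLs purely for illustration, and swaps the QueryParser for a TermQuery so the URL is never re-parsed or re-analyzed at search time. The class name is hypothetical.

```csharp
using System;
using Lucene.Net.Analysis;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Search;
using Lucene.Net.Store;

// Hypothetical sketch: index each URL as one untokenized term, then look it up exactly.
class UrlExactMatchSketch
{
    static void Main()
    {
        // Assumption: an in-memory index keeps the sketch self-contained;
        // the question uses an FSDirectory instead.
        Directory directory = new RAMDirectory();

        // create: true starts a fresh, empty index. Pass false to append to an existing one.
        IndexWriter writer = new IndexWriter(directory, new KeywordAnalyzer(), true,
                                             IndexWriter.MaxFieldLength.UNLIMITED);

        // Illustrative URLs only.
        foreach (string url in new[] { "http://www.xyz.com/", "http://www.example.com/" })
        {
            Document document = new Document();
            // NOT_ANALYZED indexes the whole lowercased URL as a single term.
            document.Add(new Field("Url", url.ToLowerInvariant(),
                                   Field.Store.YES, Field.Index.NOT_ANALYZED));
            writer.AddDocument(document);
        }
        writer.Close();

        // Search with a TermQuery, lowercasing the same way as at index time.
        IndexSearcher searcher = new IndexSearcher(directory, true);
        Query query = new TermQuery(new Term("Url", "http://www.xyz.com/".ToLowerInvariant()));
        TopDocs hits = searcher.Search(query, null, 10);

        Console.WriteLine("Matching documents: " + hits.totalHits);
        searcher.Close();
    }
}
```

Two things to check against the code in the question, both guesses from the posted snippets rather than confirmed causes. First, the field is indexed with url.ToLower() but searched with Uri.ToString() as-is, so any uppercase characters would not match. Second, passing true as the create argument to the IndexWriter constructor starts a new index each time the writer is opened, which could explain why the workaround only appeared to work with one document.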