Escaping special characters and querying with wildcards in Lucene

Posted on 2024-12-10 23:23:02

I run into a problem when I query with a wildcard in a term that contains a special character.
For example, I index "Test::Here" and then search with the ? wildcard for "TE?T\:\:Here" (note that I escaped the ':'). I get no results. I use the standard analyzer and the query parser for both indexing and searching.

Has anyone encountered a similar issue?

Comments (3)

月依秋水 2024-12-17 23:23:02

StandardAnalyzer uses StandardTokenizer, so Test::Here is split into two tokens: Test and Here. Wildcard queries are not run through the analyzer, so you end up matching a pattern that contains colons against terms that do not contain them. You need to use a different tokenizer, for example WhitespaceTokenizer.
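
To make the mismatch concrete, here is a toy sketch in plain Java. It does not use Lucene at all: standardLike and whitespaceLike are hypothetical helpers that only roughly imitate StandardTokenizer-style and WhitespaceTokenizer-style splitting, and wildcardMatches is a simplified stand-in for how a wildcard pattern is compared against a single indexed term. The pattern is written with matching case so that only the effect of tokenization is visible.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Pattern;

// Toy model only: none of these helpers are Lucene classes, and they skip
// most of the real tokenization rules.
class TokenizerToyDemo {

    // Rough stand-in for StandardAnalyzer: split on anything that is not a
    // letter or digit, then lowercase each token.
    static List<String> standardLike(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.split("[^\\p{L}\\p{N}]+")) {
            if (!part.isEmpty()) {
                tokens.add(part.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // Rough stand-in for WhitespaceTokenizer: split on whitespace only.
    static List<String> whitespaceLike(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.trim().split("\\s+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    // Compare a raw (unanalyzed) wildcard pattern with one term:
    // '?' matches exactly one character, '*' matches any run of characters.
    static boolean wildcardMatches(String pattern, String term) {
        StringBuilder regex = new StringBuilder();
        for (char c : pattern.toCharArray()) {
            if (c == '?') {
                regex.append('.');
            } else if (c == '*') {
                regex.append(".*");
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return term.matches(regex.toString());
    }

    // True if the pattern matches at least one of the indexed terms.
    static boolean anyMatch(String pattern, List<String> terms) {
        for (String term : terms) {
            if (wildcardMatches(pattern, term)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String text = "Test::Here";
        String pattern = "Te?t::Here"; // the query Te?t\:\:Here once unescaped

        System.out.println(standardLike(text));                      // [test, here]
        System.out.println(whitespaceLike(text));                    // [Test::Here]
        System.out.println(anyMatch(pattern, standardLike(text)));   // false
        System.out.println(anyMatch(pattern, whitespaceLike(text))); // true
    }
}
```

In this model the colons never reach the index when the text is split the standard way, so no term can match a pattern that contains them; with whitespace-only splitting the whole string survives as one term and the pattern matches.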

隔纱相望 2024-12-17 23:23:02

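You can't search for what you haven't indexed. The snippet below prints the tokens your analyzer actually produces, so you can see what ends up in the index.

using System;
using System.IO;
using Lucene.Net.Analysis;

// Written against the older Lucene.Net API that exposes TokenStream.Next() and Token.TermText().
// AnyAnalyzer is a placeholder: substitute the analyzer you index with.
var analyzer = new AnyAnalyzer();
TokenStream tokenStream = analyzer.TokenStream("", new StringReader("Test::Here"));
Lucene.Net.Analysis.Token token = tokenStream.Next();
while (token != null)
{
    // Print each token in brackets so the token boundaries are visible.
    Console.Write("[" + token.TermText() + "] ");
    token = tokenStream.Next();
}

雪落纷纷 2024-12-17 23:23:02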

Artur is right, but there is another issue to consider: wildcard terms are not analyzed at all in Lucene, so you have to make sure that the case of your query term matches the case of the indexed term (after analysis). For example, StandardAnalyzer lowercases tokens, so Test is indexed as test.
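
The case issue can be illustrated the same way. The sketch below is again a toy in plain Java, not Lucene code: wildcardMatches is a simplified, hypothetical matcher, and normalizePattern assumes that lowercasing is the only change the analyzer made to the indexed text, which will not hold for every analyzer.

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Toy model only: shows why an unanalyzed wildcard pattern must already be in
// the same case as the indexed term. No Lucene classes are involved.
class WildcardCaseToyDemo {

    // '?' matches exactly one character, '*' matches any run of characters;
    // everything else is compared literally and case-sensitively.
    static boolean wildcardMatches(String pattern, String term) {
        StringBuilder regex = new StringBuilder();
        for (char c : pattern.toCharArray()) {
            if (c == '?') {
                regex.append('.');
            } else if (c == '*') {
                regex.append(".*");
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return term.matches(regex.toString());
    }

    // Assumption: the analyzer only lowercased the text, so lowercasing the
    // pattern by hand is enough to line it up with the indexed terms.
    static String normalizePattern(String pattern) {
        return pattern.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        String indexedTerm = "test"; // what a lowercasing analyzer stores for "Test"

        System.out.println(wildcardMatches("TE?T", indexedTerm));                   // false
        System.out.println(wildcardMatches(normalizePattern("TE?T"), indexedTerm)); // true
    }
}
```

The wildcard characters are unaffected by lowercasing, so normalizing the whole pattern string is safe in this simplified setting.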
