Solr / SolrNet - Using wildcards for letter-by-letter search
Hey guys,
I'm trying to implement some search functionality in an application we're writing.
Solr 1.4.1 running on Tomcat7
JDBC connection to an MS SQL Server, with the view I'm indexing
Solr has finished indexing and the index is working.
To search and communicate with Solr I have created a little test WCF service (to be implemented with our main service later).
The purpose is to implement a text field in our main application. In this text field the users can start typing something like Paintbrush and gradually filter through the list of objects as more and more characters are input.
This is working just fine and dandy with Solr up to a certain point. I'm using the wildcard asterisk at the end of my query, and as such I'm throwing a lot of requests like
p*
pa*
pain*
paint*
etc. at the server, and it's returning results just fine (quite impressively fast, actually). The only problem is that once the user types the whole word the query is paintbrush*, at which point Solr returns 0 results.
So it seems that query+wildcard can only be query+something and not query+nothing.
I managed to get this working under Lucene.Net, but Solr isn't doing things the same way, it seems.
Any advice you can give me on implementing such a feature?
There isn't much code to look at since I'm using SolrNet: http://pastebin.com/tXpe4YUe
I figure it has something to do with the Analyzer and Parser, but I'm not yet deep enough into Solr to know where to look :)
Comments (2)
Stemming seems to be what caused the problem. I fixed it using a clone of text_ws instead of text for the type.
My changes to schema.xml: http://pastebin.com/xaJZDgY4
Stemming is disabled and lowercase indexing is enabled. As long as all queries are in lower case they should always give results (if there are any at all).
The issue seems to be that analyzers don't work with wildcards, so the logic that would make Johnny the result of Johni or Johnni is "broken" when using wildcards.
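To make that mismatch concrete, here is a minimal toy sketch in Python. It is not Solr's analyzer: `toy_stem` is a hypothetical stand-in that only lowercases and rewrites a trailing "y" to "i" (chosen to mirror the Johnny/Johnni example), and `prefix_search` assumes the wildcard prefix is compared against the indexed terms without being analyzed.

```python
def toy_stem(token: str) -> str:
    """Hypothetical stand-in for a stemmer: lowercase, then trailing 'y' -> 'i'."""
    token = token.lower()
    if len(token) > 2 and token.endswith("y"):
        return token[:-1] + "i"
    return token


def build_index(words, analyzer):
    """Run every word through the index-time analyzer and keep the resulting terms."""
    return {analyzer(word) for word in words}


def prefix_search(index, query):
    """Assumed wildcard behaviour: the raw prefix is matched as-is, with no analysis."""
    prefix = query.rstrip("*")
    return sorted(term for term in index if term.startswith(prefix))


stemmed_index = build_index(["Johnny"], toy_stem)     # stores "johnni"
lowercase_index = build_index(["Johnny"], str.lower)  # stores "johnny"

print(prefix_search(stemmed_index, "john*"))      # partial prefix still matches
print(prefix_search(stemmed_index, "johnny*"))    # full word no longer matches
print(prefix_search(lowercase_index, "johnny*"))  # lowercase-only index matches
```

With the stemmed index, every prefix up to "johnn" matches the stored term, but the complete word does not, which is the same pattern as paintbrush* returning 0 results.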
If you're facing similar problems and my solution here doesn't quite work, you can add debugQuery=on to your query string and see a bit more about what's going on. That helped me narrow down the problem.
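As a rough sketch of what such a request could look like, the Python snippet below builds a select URL with a lowercased prefix query and debugQuery=on. The base URL is an assumption (host, port, and path depend on your Tomcat/Solr setup), and `build_debug_url` is a hypothetical helper, not part of SolrNet.

```python
from urllib.parse import urlencode

# Assumed base URL; adjust host, port, and path to match your own Solr instance.
SOLR_SELECT_URL = "http://localhost:8080/solr/select"


def build_debug_url(user_input: str, base_url: str = SOLR_SELECT_URL) -> str:
    """Build a prefix-wildcard query URL with debug output enabled."""
    # Lowercase the input to match a lowercase-only index, then append the wildcard.
    query = user_input.strip().lower() + "*"
    params = {"q": query, "debugQuery": "on"}
    return base_url + "?" + urlencode(params)


print(build_debug_url("Paintbrush"))
```

Opening the resulting URL in a browser should show the normal response plus a debug section describing how the query was parsed.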
I wouldn't implement suggestions with prefix wildcard queries in Solr. There are other mechanisms better suited to do this. See: