Solr Arabic Search
I want to implement Arabic search in my Solr instance. I am able to index the documents but not able to search them. When I refer to the documents by ID I get them back, but not when I search by Arabic words.
Search URL
http://122.166.9.144:8080/solr/tw/select/?q=تأجير الاهلي
Search Response
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">18</int>
<lst name="params">
<str name="q">تأجÙر اÙاÙÙÙ</str>
</lst>
</lst>
<result name="response" numFound="0" start="0"/>
</response>
What could be the problem?
Thanks,
Rohit
Edit: Request/Response Headers
Response Headers
Server: Apache-Coyote/1.1
Content-Type: application/xml;charset=UTF-8
Transfer-Encoding: chunked
Date: Mon, 15 Aug 2011 15:37:25 GMT
Request Headers
Host: 122.166.9.144:8080
User-Agent: Mozilla/5.0 (Windows NT 6.0; rv:5.0) Gecko/20100101 Firefox/5.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip, deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Connection: keep-alive
Apparently the server fails to decode the Arabic text in the URL using the right charset. It looks as though it received UTF-8 bytes but interpreted them as Latin-1. Have you tried capturing the conversation with Wireshark to see exactly which URL bytes get sent to the server?
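
To illustrate the suspected mismatch, here is a minimal Python sketch (not Solr code) that percent-encodes the query as UTF-8 and then decodes the same bytes both ways. It assumes the browser sends the query as UTF-8, which a packet capture would confirm or refute. The helper name `recover_utf8_read_as_latin1` is made up for this example.

```python
from urllib.parse import quote, unquote_to_bytes

# The query from the question.
query = "تأجير الاهلي"

# Percent-encode the query as UTF-8 (assumption: this is what the browser sends).
encoded = quote(query, safe="")

# What the server receives after undoing the percent-encoding: raw bytes.
raw = unquote_to_bytes(encoded)

# Correct interpretation: the bytes are UTF-8.
as_utf8 = raw.decode("utf-8")

# Suspected server behaviour: the same bytes read as Latin-1 (ISO-8859-1).
# Every byte becomes one character, so the Arabic text turns into mojibake.
as_latin1 = raw.decode("latin-1")


def recover_utf8_read_as_latin1(text):
    """Undo a UTF-8-read-as-Latin-1 mix-up, or return None if it does not apply."""
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return None


if __name__ == "__main__":
    print("encoded URL parameter:", encoded)
    print("decoded as UTF-8     :", as_utf8)
    print("decoded as Latin-1   :", repr(as_latin1))
    print("recovered            :", recover_utf8_read_as_latin1(as_latin1))
```

If the capture shows the UTF-8 bytes arriving intact, the decoding step on the server is the more likely culprit. The `Apache-Coyote/1.1` header suggests Tomcat, where the connector's `URIEncoding` attribute controls how query-string bytes are decoded, so that setting would be worth checking.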