Sunspot unicode search works locally but not in production
I have my app working with Sunspot Solr locally, supporting unicode
with no issues. In production however, with Heroku and Websolr, all
unicode queries return zero results. I have confirmed with Websolr
support I can query directly against their Solr system with unicode
and it works fine. When I query from my production app however, they
saw something like this in the log: q=أرسنا
So it doesn't seem to be related to Websolr. I also tried running the
local app in production mode (pointing to Websolr), and once I do
that, queries return no results again!
Has anyone faced a similar problem, and where should I be looking for
answers? I tried setting the Solr production log level to INFO or
higher to see what's being sent to Solr, but for some reason nothing
shows up in the server log either.
Thanks
When Sunspot switched to use HTTP POST for its requests, it (and its dependency, RSolr) unfortunately did not specify a charset for its Content-type header. This causes Tomcat to default to ISO-8859-1 as per the servlet spec, resulting in incorrect decoding for UTF-8 characters.
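To make the decoding problem concrete, here is a minimal sketch in plain Ruby. It does not talk to Solr or Tomcat; it only assumes that the client sends UTF-8 bytes and that the server reads them as ISO-8859-1, as described above. The variable names are illustrative.

```ruby
# The query from the question, written with escapes so the example
# does not depend on the source file's encoding.
query = "\u0623\u0631\u0633\u0646\u0627"

# The raw bytes a client would put on the wire when it encodes as UTF-8.
wire_bytes = query.b

# What a server sees if it assumes ISO-8859-1: every byte becomes its
# own character, so the Arabic text turns into unrelated Latin-1 symbols.
misdecoded = wire_bytes.dup
                       .force_encoding(Encoding::ISO_8859_1)
                       .encode(Encoding::UTF_8)

# What the server sees when it is told the charset is UTF-8.
decoded_ok = wire_bytes.dup.force_encoding(Encoding::UTF_8)

puts "query length:      #{query.length}"
puts "misdecoded length: #{misdecoded.length}"
puts "round trip ok:     #{decoded_ok == query}"
```

The misdecoded string no longer matches anything in the index, which would be consistent with every unicode query returning zero results.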
A more recent version of RSolr, 1.0.7, has fixed this by specifying the correct content-type header with a UTF-8 charset. So Sunspot users who see this error should ensure that their RSolr gem dependency has been updated to 1.0.7 or greater.
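As a rough sketch of how to act on this, the version constraint can be pinned in the Gemfile (shown as a comment below), and the installed version can be compared against 1.0.7. The helper name `rsolr_has_charset_fix?` is hypothetical and not part of RSolr or Sunspot; `Gem::Version` comes from RubyGems.

```ruby
require "rubygems"

# In the Gemfile, the dependency could be constrained like this:
#   gem "rsolr", ">= 1.0.7"

# First RSolr release reported above to send a UTF-8 charset.
MIN_RSOLR_VERSION = Gem::Version.new("1.0.7")

# Hypothetical helper: true when the given version string is new enough.
def rsolr_has_charset_fix?(version_string)
  Gem::Version.new(version_string) >= MIN_RSOLR_VERSION
end

puts rsolr_has_charset_fix?("1.0.6")
puts rsolr_has_charset_fix?("1.0.7")
```

In an app that already loads the gem, the installed version could be looked up with `Gem.loaded_specs["rsolr"].version` and passed to the helper as a string.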
I am not sure, but it may be that when you make a request, the character set to use is not being sent along with it, so the application server (I am not sure whether it is JBoss or Tomcat) assumes it should use its default character set, which can be ISO-8859-1. I think this is a bug in the product.