Zend_Search_Lucene: UTF-8 madness

Posted on 12-09 06:45

I had some problems with Zend_Search_Lucene and non-English characters such as the German ÄÖÜ.
My database returns UTF-8 encoded strings, so I thought everything would work just fine.

After running into serious encoding problems, I searched the web and found that the following lines of code solved the problem for most people:

Zend_Search_Lucene_Search_QueryParser::setDefaultEncoding('utf-8');
Zend_Search_Lucene_Analysis_Analyzer::setDefault(new Zend_Search_Lucene_Analysis_Analyzer_Common_Utf8_CaseInsensitive());

In fact, this did not solve my problem.
Today I figured out a solution that works (note the utf8_decode):

$doc->addField(Zend_Search_Lucene_Field::keyword('division', utf8_decode($contact->division)), 'utf-8');

Well, this is working perfectly fine, but frankly it looks quite odd. Why do I have to convert strings back and forth?
Maybe I'm using Lucene wrong? Or is this a bug?

Comments (1)

何以笙箫默, 2024-12-16 06:45:53

Querying and storing data are two different things. If your query is encoded in UTF-8, then your data (the document) should also be UTF-8 encoded so that it matches the query.

Lastly,

$doc->addField(Zend_Search_Lucene_Field::keyword('division', utf8_decode($contact->division)), 'utf-8');

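should be the following, with 'utf-8' passed to keyword() as its third argument and the utf8_decode call removed:

$doc->addField(Zend_Search_Lucene_Field::keyword('division', $contact->division, 'utf-8'));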
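To see why the utf8_decode workaround appeared to help, here is a minimal sketch in plain PHP. It does not call Zend_Search_Lucene at all: toUtf8() is a hypothetical helper that converts a value to UTF-8 from whatever encoding the caller declares for it. Treating ISO-8859-1 as the fallback encoding is an assumption. In the original call 'utf-8' is passed to addField() rather than to keyword(), so keyword() receives no encoding and presumably falls back to a locale-dependent default.

```php
<?php
// Hypothetical helper (not part of Zend): normalise a value to UTF-8,
// given the encoding the caller *declares* for it.
function toUtf8(string $value, string $declaredEncoding): string
{
    if (strcasecmp($declaredEncoding, 'UTF-8') === 0) {
        return $value; // already UTF-8, nothing to convert
    }
    return iconv($declaredEncoding, 'UTF-8', $value);
}

// "ÄÖÜ" as UTF-8 bytes (6 bytes).
$utf8 = "\xC3\x84\xC3\x96\xC3\x9C";
// The same text as ISO-8859-1 (3 bytes); for this input these are the
// bytes utf8_decode() would produce.
$latin1 = iconv('UTF-8', 'ISO-8859-1', $utf8);

// Case 1: UTF-8 bytes declared as UTF-8 (the corrected call).
$declaredCorrectly = toUtf8($utf8, 'UTF-8');

// Case 2: Latin-1 bytes treated as Latin-1 (the utf8_decode workaround,
// ASSUMING the undeclared default encoding is ISO-8859-1).
$workaround = toUtf8($latin1, 'ISO-8859-1');

// Case 3: UTF-8 bytes treated as Latin-1 (the original problem):
// every byte is re-encoded, so the text ends up double-encoded.
$doubleEncoded = toUtf8($utf8, 'ISO-8859-1');

var_dump($declaredCorrectly === $utf8); // bool(true)
var_dump($workaround === $utf8);        // bool(true)
var_dump($doubleEncoded === $utf8);     // bool(false)
```

Cases 1 and 2 end up with the same UTF-8 text, which would explain why the workaround matched the UTF-8 queries. Declaring the real encoding, as in the corrected call, avoids the round trip and does not depend on the server's locale.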