How do I search by zip code with Zend Lucene?
I have a very simple company index in Zend Lucene, built with the following code:
// store company primary key to identify it in the search results
$doc->addField(Zend_Search_Lucene_Field::Keyword('pk', $this->getId()));
// index company fields
$doc->addField(Zend_Search_Lucene_Field::Unstored('zipcode', $this->getZipcode(), 'utf-8'));
$doc->addField(Zend_Search_Lucene_Field::Unstored('name', $this->getName(), 'utf-8'));
I can search on the company name but not on the zipcode. Does Zend Lucene Search have a problem indexing integers? If someone with experience could shed some light on this, please help me out. I imagine searching by zipcode with Lucene is pretty common.
Comments (3)
I believe the default text analyzer for Zend Lucene ignores digits, so numbers never make it into the index. Zend ships with several text analyzers. Use the TextNum analyzer to index and search both digits and letters. There are also a handful of other analyzers in the Zend/Search/Lucene/Analysis/Analyzer/Common folder that you may find useful.
You can change the default analyzer with Zend_Search_Lucene_Analysis_Analyzer::setDefault().
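A minimal sketch of that call, assuming Zend Framework 1 is on the include path and its autoloader is available. The case-insensitive TextNum variant is a choice made here, not something the answer specifies. The same analyzer should be in effect both when building the index and when running queries, so an index built with the old analyzer most likely has to be rebuilt.

```php
<?php
// Assumption: Zend Framework 1 is on the include path.
require_once 'Zend/Loader/Autoloader.php';
Zend_Loader_Autoloader::getInstance();

// Switch the default analyzer to one that keeps digits as well as letters.
// Call this before indexing documents and before parsing queries.
Zend_Search_Lucene_Analysis_Analyzer::setDefault(
    new Zend_Search_Lucene_Analysis_Analyzer_Common_TextNum_CaseInsensitive()
);
```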
I believe your problem is with the analyzer.
I suggest you use Zend_Search_Lucene_Field::Keyword instead of Zend_Search_Lucene_Field::Unstored for the zip code field. This way, the Lucene analyzer will not modify the zip code while indexing. Note that, unlike Unstored, a Keyword field is also stored in the index.
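As a rough sketch of the Keyword approach, assuming Zend Framework 1 with its autoloader: the index path and the sample values below are hypothetical. Because a query string would still pass through the default analyzer, this sketch builds a term query through the API so the zip code is matched exactly as it was indexed.

```php
<?php
// Assumption: Zend Framework 1 is on the include path.
require_once 'Zend/Loader/Autoloader.php';
Zend_Loader_Autoloader::getInstance();

// Hypothetical index location and sample data, for illustration only.
$index = Zend_Search_Lucene::create('/tmp/company-index');

$doc = new Zend_Search_Lucene_Document();
$doc->addField(Zend_Search_Lucene_Field::Keyword('pk', '42'));
// Keyword fields are indexed as a single untokenized term.
$doc->addField(Zend_Search_Lucene_Field::Keyword('zipcode', '90210'));
$doc->addField(Zend_Search_Lucene_Field::Unstored('name', 'Acme Corp', 'utf-8'));
$index->addDocument($doc);
$index->commit();

// Build the query programmatically so the zip code bypasses the analyzer.
$term  = new Zend_Search_Lucene_Index_Term('90210', 'zipcode');
$query = new Zend_Search_Lucene_Search_Query_Term($term);

foreach ($index->find($query) as $hit) {
    echo $hit->pk, "\n"; // stored primary key of each matching company
}
```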
Java Lucene has explain(), which can be used to debug searches.
With Zend Lucene you may have to print some interim values to simulate explain() and see whether this is indeed the problem.
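One way to print such interim values is to dump the tokens the current default analyzer produces for a sample string. This is only a sketch, assuming Zend Framework 1 with its autoloader; the sample text is hypothetical. If the zip code digits are missing from the output, the analyzer is dropping them before they reach the index.

```php
<?php
// Assumption: Zend Framework 1 is on the include path.
require_once 'Zend/Loader/Autoloader.php';
Zend_Loader_Autoloader::getInstance();

// Hypothetical sample value; substitute a real zip code and name from your data.
$sample = '90210 Acme Corp';

// Tokenize with whatever analyzer is currently the default
// and print each term that would be written to the index.
$analyzer = Zend_Search_Lucene_Analysis_Analyzer::getDefault();
foreach ($analyzer->tokenize($sample, 'utf-8') as $token) {
    echo $token->getTermText(), "\n";
}
```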
If you search for 123, you will get all hits with 123 as well as 34123, for instance. So you have to make sure that both your index and your query string are unambiguous.
I suggest indexing the zipcode as a fixed-width string such as "000123". You can then search the index with "000123" and get the correct result set, with nothing like 34123 in it. You only have to translate the zipcode into the "correct" query string, using the same padding at index time and at query time.
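The padding step can be sketched in plain PHP. The helper name normalize_zipcode and the width of 6 are assumptions made here to mirror the "000123" example; the same helper would be applied to the value before indexing and to the user's input before querying.

```php
<?php
// Hypothetical helper: pad a zip code to a fixed width with leading zeros.
// The width of 6 mirrors the "000123" example; adjust it to your data.
function normalize_zipcode($zipcode, $width = 6)
{
    return str_pad(trim((string) $zipcode), $width, '0', STR_PAD_LEFT);
}

echo normalize_zipcode(123), "\n";     // prints 000123
echo normalize_zipcode('34123'), "\n"; // prints 034123
```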