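MySQL LIKE %string% isn't forgiving enough. Is there anything else I can use?
I have a client who is asking whether their search can find company names entered in several formats, depending on the user's input. For example, the company stored in the database is A J R Kelly Ltd. If the user searches for "A J R Kelly", it is found, using: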
<cfif pctermsCount gt 0>
AND (LOWER(p.name) LIKE '%#pcTerms#%')
</cfif>
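If they search for "Kelly", the company is found, but if they search for a broken version of the string such as "A J Kelly" or "AJ Kelly", it is not.
Is there anything I can do to make it a bit more forgiving?
Thanks.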
Comments (4)
If you're using MyISAM, you can use full-text indexing. See this tutorial.
If you're using a different storage engine, you could use a third-party full-text engine like Sphinx, which can act as a storage engine for MySQL or as a separate server that can be queried.
With MySQL full-text indexing, a search on "A J Kelly" would match "AJ Kelly" (not to confuse matters, but A, J and AJ would be ignored because they are too short by default, so it would match on Kelly). Generally, full-text search is much more forgiving (and usually faster) than LIKE '%string%' because it allows partial matches, which can then be ranked by relevance.
You can also use SOUNDEX to make searches more forgiving: index the phonetic equivalents of words, apply SOUNDEX to your search terms, and use the results to search the index. With Soundex, mary, marie, and marry will all match, for example.
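To make the phonetic idea concrete, here is a minimal Python sketch of the classic four-character American Soundex code. It is only an illustration: the function name `soundex` and the `_CODES` table are my own, and MySQL's built-in SOUNDEX() is a variant that can return longer strings, so its output may differ in detail from this sketch.

```python
# Illustrative sketch of classic American Soundex (not MySQL's exact variant).
_CODES = {}
for _letters, _digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                         ("l", "4"), ("mn", "5"), ("r", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(word: str) -> str:
    """Return the 4-character Soundex code of `word`, or "" if it has no ASCII letters."""
    letters = [c for c in word.lower() if c.isascii() and c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = _CODES.get(ch, "")
        # Emit a digit only when it differs from the previous letter's code.
        if code and code != prev:
            result += code
        # 'h' and 'w' do not separate letters that share a code; vowels do.
        if ch not in "hw":
            prev = code
    # Pad with zeros or truncate to exactly four characters.
    return (result + "000")[:4]


if __name__ == "__main__":
    for name in ("mary", "marie", "marry"):
        print(name, soundex(name))
```

Storing such a code next to each word and comparing codes instead of raw strings is what lets spelling variants of the same sound find each other.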
If you are indeed running ColdFusion, you have access to CF's full-text indexing using either Verity or Solr/Lucene. Either of those should give you good "fuzzy matching" capability for strings.
Having to use MyISAM tables is a bitter pill just for full-text indexing: you give up a lot of peace of mind, and things like foreign key constraints.
You could create a new column and make it the searchable version of the name by removing the whitespace, then add a FULLTEXT index to that column (will only work with MyISAM). You may want to look into Lucene/Solr as well. Solr provides a number of tokenizers which work very well in this type of situation. The learning curve is fairly steep, but worth it in the long run.
Tricky. I suppose a simple method would be to remove whitespace from the database term when searching, so AJRKelly is used instead of A J R Kelly. Then use whitespace as a separator for individual search terms. That way, A J Kelly would search for A, J and Kelly separately, and AJ Kelly would search for AJ and Kelly separately. Both would match the whitespace-stripped database term AJRKelly.
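The whitespace-stripping idea above can be sketched roughly as follows. This is a hedged Python illustration, not the poster's code: the helper names (`normalize`, `matches`, `build_where`) are hypothetical, the column `p.name` is taken from the question, and the `%s` placeholders assume a DB-API driver that uses the "format" parameter style (as common MySQL drivers for Python do).

```python
def normalize(name: str) -> str:
    """Lowercase a name and drop all whitespace: 'A J R Kelly Ltd' -> 'ajrkellyltd'."""
    return "".join(name.lower().split())


def matches(stored_name: str, user_input: str) -> bool:
    """True if every whitespace-separated search term occurs in the normalized name."""
    haystack = normalize(stored_name)
    terms = user_input.lower().split()
    return bool(terms) and all(term in haystack for term in terms)


def _escape_like(term: str) -> str:
    """Escape LIKE wildcards so user input is matched literally (backslash escape)."""
    return term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")


def build_where(user_input: str):
    """Build the equivalent SQL fragment: one parameterized LIKE per search term.

    Returns (clause, params); the values are passed separately to the driver,
    never concatenated into the SQL string.
    """
    terms = user_input.lower().split()
    clause = " AND ".join(
        "REPLACE(LOWER(p.name), ' ', '') LIKE %s" for _ in terms
    )
    params = ["%" + _escape_like(t) + "%" for t in terms]
    return clause, params


if __name__ == "__main__":
    print(matches("A J R Kelly Ltd", "A J Kelly"))
    print(build_where("AJ Kelly"))
```

Note that this matching ignores term order and that single-letter terms match very loosely, so it trades precision for forgiveness. Applying REPLACE to the column in every query also prevents index use, which is one reason to store the stripped name in its own column instead.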