支持模糊搜索的最容易实现的站点搜索应用程序是什么?
我有一个网站需要搜索大约 20-30k 条记录,其中大部分是电影和电视节目名称。该站点使用 memcache 运行 php/mysql。
我希望用我当前拥有的 soundex()
搜索替换 FULLTEXT
,这可以工作......有点,但在很多情况下都不是很好。
是否有任何像样的搜索脚本易于实现,并且将提供像样的搜索功能(表中的 3 列)。
I have a site that needs to search thru about 20-30k records, which are mostly movie and TV show names. The site runs php/mysql with memcache.
Im looking to replace the FULLTEXT
with soundex()
searching that I currently have, which works... kind of, but isn't very good in many situations.
Are there any decent search scripts out there that are simple to implement, and will provide a decent searching capability (of 3 columns in a table).
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
ewemli 的答案是正确的方向,但您应该结合 FULLTEXT 和 soundex 映射,而不是替换全文,否则您的 LIKE 查询可能会非常慢。
这提供了相当好的结果(在 soundex 算法的限制内),同时最大限度地利用索引(任何像 '%foo' 这样的查询都必须扫描表中的每一行)。
请注意对每个单词而不是整个短语运行 soundex 的重要性。您还可以对每个单词运行您自己的 soundex 版本,而不是让 SQL 执行此操作,但在这种情况下,请确保在存储和检索时都执行此操作,以防算法之间存在差异(例如,MySQL 的算法不限制本身到标准4个字符)
ewemli's answer is in the right direction but you should be combining FULLTEXT and soundex mapping, not replacing the fulltext, otherwise your LIKE queries are likely be very slow.
That gives pretty good results (within the limits of the soundex algo) while taking maximum advantage of an index (any query LIKE '%foo' has to scan every row in the table).
Note the importance of running soundex on each word, not on the entire phrase. You could also run your own version of soundex on each word rather than having SQL do it but in that case make sure you do it both when storing and retrieving in case there are differences between the algorithms (for instance, MySQL's algo doesn't limit itself to the standard 4 chars)
如果您正在寻找简单的现有解决方案而不是创建自己的解决方案,请查看
If you are looking for a simple existing solution instead of creating your own solution check out
mysql中有一个函数SOUNDEX。如果你想搜索电影标题:
当然,在文本中搜索是不行的,比如电影或剧情摘要。
Soundex 是一个相对简单的算法。您还可以决定在应用级别处理所有这些,这可能会更容易:
LIKE
。There is a function SOUNDEX in mysql. If you want to search for a movie title :
Of course it doesn't work to search in text, such as movie or plot summary.
Soundex is a relatively simple algo. You can also decide to handle all that at the applicative level, it may be easier:
LIKE
at the db level.Soundex 在处理模糊搜索方面存在局限性。一个更好的功能是编辑距离,可以使用UDF集成到MySQL中。检查 http://flamingo.ics.uci.edu/toolkit/ 的 C++ 实现Linux 上的 MySQL。
Soundex has limitations to deal with fuzzy search. A better function is edit distance, which can be integrated into MySQL using UDF. Check http://flamingo.ics.uci.edu/toolkit/ for a C++ implementation for MySQL on Linux.