Word suggestion program
Can you suggest a program or an approach for building a word correction / suggestion system?
- For example, if the input is 'Suggset', it should suggest 'Suggest'.
Thanks in advance. I'm using Python and AJAX. Please don't recommend any jQuery modules, because I need the algorithmic part.
Comments (2)
The algorithm that solves your problem is called "edit distance". Given a list of words in some language and a mistyped or incomplete word, you build a list of the dictionary words that are closest to it. For example, the distance between "suggest" and "suggset" is 2: you need one deletion and one insertion. As a refinement, you can assign a different weight to each operation. For example, you can say that substitution is cheaper than deletion, and that substitution between two letters that lie close together on the keyboard (for example 'v' and 'b') is cheaper than substitution between letters that are far apart (for example 'q' and 'l').
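To make the idea above concrete, here is a minimal sketch in plain Python. It is not taken from any of the libraries mentioned in this thread. The names `edit_distance`, `suggest`, `unit_cost`, `keyboard_cost` and the tiny `NEIGHBOURS` table are all made up for illustration, and the 0.5 cost for neighbouring keys is an arbitrary assumption.

```python
from typing import Callable, Iterable, List

# Hypothetical, deliberately incomplete QWERTY adjacency table (illustration only).
NEIGHBOURS = {
    "v": "cfgb",
    "b": "vghn",
    "e": "wsdr",
    "s": "awedxz",
}


def unit_cost(a: str, b: str) -> float:
    """Plain edit distance: every substitution costs 1, a match costs 0."""
    return 0.0 if a == b else 1.0


def keyboard_cost(a: str, b: str) -> float:
    """Assumed weighting: swapping neighbouring keys costs 0.5 instead of 1."""
    if a == b:
        return 0.0
    if b in NEIGHBOURS.get(a, "") or a in NEIGHBOURS.get(b, ""):
        return 0.5
    return 1.0


def edit_distance(source: str, target: str,
                  sub_cost: Callable[[str, str], float] = unit_cost,
                  indel_cost: float = 1.0) -> float:
    """Dynamic-programming edit distance, keeping only two table rows."""
    # Row 0: turning the empty prefix into target[:j] takes j insertions.
    previous = [j * indel_cost for j in range(len(target) + 1)]
    for i, s_char in enumerate(source, start=1):
        # Column 0: turning source[:i] into the empty string takes i deletions.
        current = [i * indel_cost]
        for j, t_char in enumerate(target, start=1):
            current.append(min(
                previous[j] + indel_cost,                   # delete s_char
                current[j - 1] + indel_cost,                # insert t_char
                previous[j - 1] + sub_cost(s_char, t_char)  # substitute or match
            ))
        previous = current
    return previous[-1]


def suggest(word: str, dictionary: Iterable[str],
            max_distance: float = 2.0,
            sub_cost: Callable[[str, str], float] = unit_cost) -> List[str]:
    """Return dictionary words within max_distance of word, closest first."""
    scored = [(edit_distance(word, candidate, sub_cost), candidate)
              for candidate in dictionary]
    return [candidate for dist, candidate in sorted(scored) if dist <= max_distance]
```

With unit costs, `edit_distance("suggest", "suggset")` gives 2.0, matching the example above. Passing `sub_cost=keyboard_cost` makes a 'v'/'b' slip cheaper than a 'q'/'l' one. Comparing the input against every dictionary word is fine for small word lists; a large dictionary would probably need some pre-filtering, which this sketch does not attempt.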
The first description of an algorithm for spelling correction appeared in 1964. In 1974 an efficient algorithm based on dynamic programming appeared in the paper "The String-to-String Correction Problem" by Robert A. Wagner and Michael J. Fischer. Any algorithms book has a more or less detailed treatment of it.
For Python there is a library that does this: the Levenshtein distance library.
Also check this earlier discussion on Stack Overflow.
Building one of these yourself takes a lot of work. There is a spell checker library for Python called PyEnchant that I've found to be quite nice. Here's an example from their website: