使用 trie 的 Python 拼写检查器
我正在尝试使用 trie 数据结构实现拼写检查器。我目前对 Node
有以下概要:
class Node:
def __init__(self):
self.next = {}
self.word_marker = False
def add_item(self, string):
#if the length of the string is 0 then return. When the end of the word
#comes set the word_marker to true
if len(string) == 0:
self.word_marker = True
return
#set the key to the first letter of the string and reformat the string to reflect the first letter taken out
#ultimately going to store the letters for each node as the key
key = string[0]
string = string[1:]
#if there is a key in the dictionary, then recursively call add_item with new string
if key in self.next:
self.next[key].add_item(string)
else:
node = Node()
self.next[key] = node
node.add_item(string)
我想做的下一件事是编写函数来搜索 string
,然后返回建议的拼写。 (def Correct(self, string)
)。我将如何通过这个 trie 来实现搜索?假设我已经通过定义根节点 root
将单词列表添加到 trie,然后对列表中的每个单词使用 add_item
。
I am trying to implement a spell checker using a trie data structure. I currently have the following outline for a Node
:
class Node:
def __init__(self):
self.next = {}
self.word_marker = False
def add_item(self, string):
#if the length of the string is 0 then return. When the end of the word
#comes set the word_marker to true
if len(string) == 0:
self.word_marker = True
return
#set the key to the first letter of the string and reformat the string to reflect the first letter taken out
#ultimately going to store the letters for each node as the key
key = string[0]
string = string[1:]
#if there is a key in the dictionary, then recursively call add_item with new string
if key in self.next:
self.next[key].add_item(string)
else:
node = Node()
self.next[key] = node
node.add_item(string)
The next thing that I want to do is write the function to search for a string
and then return a suggested spelling. (def correct(self, string)
). How would I go through this trie to implement the search? Assume that I already added a list of words to the trie by defining a root node root
, then using add_item
for each of the words in the list.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
如果您还没有阅读过,您可能需要查看 Norvig 的“如何编写拼写校正器” '
If you haven't already, you might want to check out Norvig's 'How to Write a Spelling Corrector'
这里有与您的问题相关的答案:https://stackoverflow.com/a/11016430/793956。
还可以考虑使用诸如
https://github.com/kmike/marisa-trie#readme 或
https://github.com/kmike/datrie#readme
There's an answer to related to your question here: https://stackoverflow.com/a/11016430/793956 .
Also consider using libraries such as
https://github.com/kmike/marisa-trie#readme or
https://github.com/kmike/datrie#readme
虽然不是直接答案,但您可能希望从更开发的 Trie 实现开始,如下所示:
Though not a direct answer, you might want to start with a more developed Trie implementation like this: