Iterative character replacement using embedded dictionaries
I'm trying to understand an iterative function that takes a string "12345" and returns all the possible misspellings, based on a dictionary of the keys close to each character in the string.
outerDic = {}
Dict1 = {'1':'2','2':'q'}
outerDic['1'] = Dict1
Dict1 = {'1':'1','2':'q','3':'w','4':'3'}
outerDic['2'] = Dict1
Dict1 = {'1':'2','2':'w','3':'e','4':'4'}
outerDic['3'] = Dict1
Dict1 = {'1':'3','2':'e','3':'r','4':'5' }
outerDic['4'] = Dict1
Dict1 = {'1':'4','2':'r','3':'t','4':'6' }
outerDic['5'] = Dict1
outerDic
The output should be a list of strings:
12345
22345
q2345
11345
1q345
13345
12245
12e45
12445
and so on...
I've set the function up as follows:
def split_line(text):
    # Split the input on whitespace and process each word in turn.
    words = text.split()
    for current_word in words:
        getWordsIterations(current_word)
I'd like to understand how to set up the getWordsIterations() function to go through the dictionary and systematically replace the characters.
Comments (1)
I'm not sure what the inner dicts, all with keys '1', '2', etc, signify -- are they basically just stand-ins for lists presenting possible typos? But then some (but not all) would also include the "right" character... a non-typo...?! Sorry, but you're really being extremely confusing in this presentation -- the example doesn't help much (why is there never a "w" in the second position, which is supposed to be a possible typo there if I understand your weird data structure...? etc, etc).
So, while awaiting clarification, let me assume that all you want is to represent for each input character all possible single-character typos for it -- lists would be fine but strings are more compact in this case, and essentially equivalent:
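The code that originally followed this sentence is not present on this page. As a minimal sketch of what such a mapping could look like, the block below uses a hypothetical name `typos` and takes its neighbor characters from the values of the question's `outerDic`; it also shows that the string form can be derived from the nested-dict form (relying on dicts preserving insertion order, Python 3.7+).

```python
# Hypothetical name `typos`: each character maps to a string of nearby keys.
# The neighbor characters are the values of the inner dicts in the question.
typos = {
    '1': '2q',
    '2': '1qw3',
    '3': '2we4',
    '4': '3er5',
    '5': '4rt6',
}

# The nested-dict form from the question, written as a single literal.
outerDic = {
    '1': {'1': '2', '2': 'q'},
    '2': {'1': '1', '2': 'q', '3': 'w', '4': '3'},
    '3': {'1': '2', '2': 'w', '3': 'e', '4': '4'},
    '4': {'1': '3', '2': 'e', '3': 'r', '4': '5'},
    '5': {'1': '4', '2': 'r', '3': 't', '4': '6'},
}

# Joining the inner values gives the compact string form.
derived = {ch: ''.join(inner.values()) for ch, inner in outerDic.items()}
```

Iterating over a string yields its characters one at a time, so `'1qw3'` can be used wherever a list such as `['1', 'q', 'w', '3']` would be.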
Now, if you only care about cases with exactly one misspelling:
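The function body is also missing from this page. One way a `one_typo` generator could be written is sketched below, assuming a `typos` mapping (a hypothetical name) from each character to a string of its neighboring keys. It replaces one position at a time and restores the original character before moving on, so every yielded string differs from the input in exactly one place.

```python
# Assumed mapping: character -> string of nearby keys (values from the question).
typos = {
    '1': '2q',
    '2': '1qw3',
    '3': '2we4',
    '4': '3er5',
    '5': '4rt6',
}

def one_typo(word):
    """Yield every variant of `word` with exactly one character replaced."""
    chars = list(word)
    for i, original in enumerate(chars):
        # Characters with no entry in `typos` produce no variants.
        for replacement in typos.get(original, ''):
            chars[i] = replacement
            yield ''.join(chars)
        # Restore the original character before moving to the next position.
        chars[i] = original
```

Note that this sketch never yields the unmodified word itself, unlike the sample output in the question, which lists "12345" first.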
So, for example,
for w in one_typo("12345"): print(w)
emits one string per line, each differing from the input in exactly one character.
"Any number of typos" would produce an enormous list -- is that what you want? Or "0 to 2 typos"? Or what else exactly...?
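If the "0 to 2 typos" reading turns out to be the intended one, a bounded version could be sketched as follows. The function name `up_to_n_typos`, its `max_typos` parameter, and the `typos` mapping are all assumptions made for illustration. The sketch picks every set of positions with `itertools.combinations` and every combination of neighbor characters for those positions with `itertools.product`.

```python
from itertools import combinations, product

# Assumed mapping: character -> string of nearby keys (values from the question).
typos = {
    '1': '2q',
    '2': '1qw3',
    '3': '2we4',
    '4': '3er5',
    '5': '4rt6',
}

def up_to_n_typos(word, max_typos=2):
    """Yield variants of `word` with between 0 and `max_typos` replacements."""
    yield word  # the zero-typo case
    for count in range(1, max_typos + 1):
        # Choose which positions receive a typo.
        for positions in combinations(range(len(word)), count):
            # One pool of candidate replacements per chosen position.
            pools = [typos.get(word[p], '') for p in positions]
            for replacements in product(*pools):
                chars = list(word)
                for p, r in zip(positions, replacements):
                    chars[p] = r
                yield ''.join(chars)
```

Because the number of results grows quickly with both word length and `max_typos`, a generator keeps memory use flat; wrap the call in `list(...)` only when the full list is actually needed.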