Iterative character replacement using a nested dictionary

Posted 2024-08-06 13:42:51

I'm trying to understand an iterative function that takes a string "12345" and returns all the possible misspellings, based on a dictionary of the keys close to each character in the string.

outerDic = {}
Dict1 = {'1':'2','2':'q'}
outerDic['1'] = Dict1
Dict1 = {'1':'1','2':'q','3':'w','4':'3'}
outerDic['2'] = Dict1
Dict1 = {'1':'2','2':'w','3':'e','4':'4'}
outerDic['3'] = Dict1
Dict1 = {'1':'3','2':'e','3':'r','4':'5' }
outerDic['4'] = Dict1
Dict1 = {'1':'4','2':'r','3':'t','4':'6' }
outerDic['5'] = Dict1
outerDic

The output should return a list of strings

12345
22345
q2345
11345
1q345
13345
12245
12e45
12445

and so on...

I've set the function up as follows:

def split_line(text):
 words = text.split()
 for current_word in words:
  getWordsIterations(current_word)

I'd like to understand how to set up the getWordsIterations() function to go through the dictionary and systematically replace the characters.

柠檬色的秋千 2024-08-13 13:42:51

I'm not sure what the inner dicts, all with keys '1', '2', etc, signify -- are they basically just stand-ins for lists presenting possible typos? But then some (but not all) would also include the "right" character... a non-typo...?! Sorry, but you're really being extremely confusing in this presentation -- the example doesn't help much (why is there never a "w" in the second position, which is supposed to be a possible typo there if I understand your weird data structure...? etc, etc).

So, while awaiting clarification, let me assume that all you want is to represent for each input character all possible single-character typos for it -- lists would be fine but strings are more compact in this case, and essentially equivalent:

possible_typos = {
  '1': '2q',
  '2': '1qw3',
  '3': '2we4',
  '4': '3er5',
  '5': '4rt6',
}
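
If the nested structure from the question has to stay as the source of truth, the compact mapping could be derived from it rather than typed by hand. The following is only a sketch under one assumption: the inner keys '1', '2', ... are just ordering slots and carry no other meaning. The helper name flatten_typos is made up for this illustration.

```python
# The nested dictionary from the question, written as a single literal.
outerDic = {
    '1': {'1': '2', '2': 'q'},
    '2': {'1': '1', '2': 'q', '3': 'w', '4': '3'},
    '3': {'1': '2', '2': 'w', '3': 'e', '4': '4'},
    '4': {'1': '3', '2': 'e', '3': 'r', '4': '5'},
    '5': {'1': '4', '2': 'r', '3': 't', '4': '6'},
}

def flatten_typos(nested):
    """Collapse {char: {slot: neighbour}} into {char: 'neighbours'}.

    Hypothetical helper: assumes the inner keys are only ordering slots.
    """
    flat = {}
    for ch, inner in nested.items():
        # Walk the slots in key order and skip any neighbour that is
        # identical to the character itself (that would not be a typo).
        flat[ch] = ''.join(inner[k] for k in sorted(inner) if inner[k] != ch)
    return flat

possible_typos = flatten_typos(outerDic)
```

For the data in the question this reproduces the hand-written mapping above, so either form can feed the generator that follows.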

Now, if you only care about cases with exactly one misspelling:

def one_typo(word):
  L = list(word)
  for i, c in enumerate(L):
    for x in possible_typos[c]:
      L[i] = x
      yield ''.join(L)
    L[i] = c

So, for example, for w in one_typo("12345"): print(w) emits:

22345
q2345
11345
1q345
1w345
13345
12245
12w45
12e45
12445
12335
123e5
123r5
12355
12344
1234r
1234t
12346

"Any number of typos" would produce an enormous list -- is that what you want? Or "0 to 2 typos"? Or what else exactly...?

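If the answer to that turns out to be "0 to 2 typos", one possible generalisation is sketched below. It is not part of the original answer: the function name up_to_n_typos and its max_typos parameter are invented here, and it assumes a typo is always a substitution of one character by one of its neighbours (no insertions, deletions or swaps). It picks the positions to mistype with itertools.combinations and the replacement characters with itertools.product.

```python
from itertools import combinations, product

# Same neighbour table as in the answer above.
possible_typos = {
    '1': '2q',
    '2': '1qw3',
    '3': '2we4',
    '4': '3er5',
    '5': '4rt6',
}

def up_to_n_typos(word, max_typos=2):
    """Yield every variant of word with 0..max_typos substituted characters.

    Hypothetical generalisation: substitutions only, neighbours taken
    from possible_typos; characters missing from the table are never mistyped.
    """
    for n in range(max_typos + 1):
        # Choose which n positions get mistyped.
        for positions in combinations(range(len(word)), n):
            # For each chosen position, the candidate replacement characters.
            choices = [possible_typos.get(word[i], '') for i in positions]
            # Try every combination of replacements for those positions.
            for replacements in product(*choices):
                chars = list(word)
                for i, r in zip(positions, replacements):
                    chars[i] = r
                yield ''.join(chars)
```

With max_typos=0 this yields only the original word, and with max_typos=1 it yields the original word followed by the same 18 variants listed above. The count grows quickly with max_typos, which is the reason for asking how many typos are actually wanted.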