Word jumble algorithm
Given a word jumble (e.g. ofbaor), what would be an approach to unscramble the letters to create a real word (e.g. foobar)? I can see a couple of possible approaches, and I think I know how I'd do it in .NET, but I'm curious to see what other solutions look like (I'm always happy to find out whether my solution is optimal or not).
This isn't homework or anything like that; I just saw a word jumble in the local comics section of the paper (yes, good ol' fashioned newsprint), and the engineer in me started thinking.
Edit: please post some pseudocode or real code if you can; it's always nice to expand language knowledge by seeing examples like this.
Comments (6)
Have a dictionary that's keyed by the letters of each word in sorted order. Then take your jumble and sort its letters, and look up all the words in the dictionary under that sorted-letter string.
So, as an example, the words 'bear' and 'bare' would both be stored in the dictionary under the key 'aber'.
And if you're given the jumble 'earb', you'd sort the letters to 'aber' and be able to look up both possible words in the dictionary.
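For illustration, here is a minimal Python sketch of the sorted-letter lookup described above. The names build_index, unscramble, WORDS and INDEX are made up for this example, and the tiny word list is a stand-in for a real dictionary file.

```python
from collections import defaultdict


def build_index(words):
    """Map each sorted-letter key to every word that shares those letters."""
    index = defaultdict(list)
    for word in words:
        key = "".join(sorted(word.lower()))
        index[key].append(word)
    return index


def unscramble(jumble, index):
    """Return all known words that use exactly the letters of the jumble."""
    return index.get("".join(sorted(jumble.lower())), [])


# Hypothetical word list; a real one would probably be loaded from a file.
WORDS = ["bear", "bare", "foobar", "bar"]
INDEX = build_index(WORDS)
print(unscramble("earb", INDEX))  # ['bear', 'bare']
```

Building the index costs one sort per dictionary word, but it only has to be done once; after that each jumble needs a single sort and a single hash lookup.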
CodeProject has a couple of articles here and here. The second uses recursion. Wikipedia also outlines a couple of algorithms here. The Wikipedia article also mentions a program called Jumbo that uses a more heuristic approach that solves the problem like a human would. There seem to be a few approaches to the problem.
Depending on the length of the string, WhirlWind's approach could be faster, but an alternative with more or less O(n) complexity, where n is the number of words in the dictionary, works the other way around: instead of creating all the permutations of the string and looking them up, you go through all the words in the dictionary and see if each can be built from the input string.
A really smart algorithm that knows the number of words in the dictionary in advance could choose between the two, generating permutations when there are fewer of them than dictionary words and scanning the dictionary otherwise.
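The original snippet for this answer isn't preserved here, so the following Python is only a sketch of one way that choice could look. The function names are hypothetical, and it assumes the dictionary is held in memory as a set of strings.

```python
from collections import Counter
from itertools import permutations
from math import factorial


def scan_dictionary(jumble, words):
    """Check every dictionary word against the jumble's letter counts."""
    target = Counter(jumble)
    return [w for w in words if len(w) == len(jumble) and Counter(w) == target]


def try_permutations(jumble, words):
    """Generate each distinct permutation and look it up in the word set."""
    candidates = {"".join(p) for p in permutations(jumble)}
    return sorted(candidates & words)


def solve(jumble, words):
    """Pick whichever strategy needs fewer steps for this jumble."""
    if factorial(len(jumble)) > len(words):
        return sorted(scan_dictionary(jumble, words))
    return try_permutations(jumble, words)


# Hypothetical word set standing in for a real dictionary.
WORDS = {"bear", "bare", "foobar", "bar"}
print(solve("earb", WORDS))  # ['bare', 'bear']
```

With a realistic dictionary of a few hundred thousand words, the permutation branch would only win for jumbles of up to about eight letters, since 9! is already 362,880.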
Create all the permutations of the string, and look them up in a dictionary.
You can optimize by looking up the shorter strings that begin each permutation: if no word of suitable length in the dictionary starts with a given prefix, you can eliminate every permutation starting with those letters from further consideration.
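A rough Python sketch of that pruning idea follows. It assumes the prefixes of all words of the right length are precomputed into a set; a trie or a sorted list with binary search would serve the same purpose. The names build_prefixes and unscramble_pruned are invented for this example.

```python
def build_prefixes(words, length):
    """Collect every prefix of every word that has the required length."""
    prefixes = set()
    for word in words:
        if len(word) == length:
            for i in range(1, length + 1):
                prefixes.add(word[:i])
    return prefixes


def unscramble_pruned(jumble, words):
    """Depth-first permutation search that abandons dead-end prefixes."""
    if not jumble:
        return []
    prefixes = build_prefixes(words, len(jumble))
    found = set()

    def extend(prefix, remaining):
        if not remaining:
            found.add(prefix)  # a full-length prefix is itself a word
            return
        for i, letter in enumerate(remaining):
            if letter in remaining[:i]:
                continue  # same letter already tried at this position
            candidate = prefix + letter
            if candidate in prefixes:  # otherwise prune this whole branch
                extend(candidate, remaining[:i] + remaining[i + 1:])

    extend("", jumble)
    return sorted(found)


print(unscramble_pruned("earb", ["bear", "bare", "foobar", "bar"]))
```

For 'earb' and this word list, only the branch starting with 'b' survives the first step, so the other three quarters of the permutations are never generated.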
http://www.codeproject.com/KB/game/Anagrams2.aspx
One approach is to split your dictionary into sorted sub-dictionaries by word length: 1-letter words, 2-letter words, and so on.
When you search for the words of a given jumble, compare the number of possible permutations with the number of words in the corresponding sub-dictionary. If the former is larger, compare the words in that sub-dictionary against the jumble; if the latter is larger, create the permutations and then search for them in the sub-dictionary.
You can also optimize further by dividing the dictionaries into smaller subsets based on their first letters and how frequently those letters appear, and then dividing further based on the second letter. More division might significantly complicate the database and slow down searching, however.
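As a sketch only, the length-based split and the size comparison might look like this in Python. The function names are hypothetical, and counting distinct permutations (dividing out repeated letters) is an assumption added here rather than something the answer specifies.

```python
from collections import Counter, defaultdict
from math import factorial


def bucket_by_length(words):
    """Split the word list into sorted sub-dictionaries keyed by length."""
    buckets = defaultdict(list)
    for word in words:
        buckets[len(word)].append(word)
    return {length: sorted(group) for length, group in buckets.items()}


def distinct_permutations(jumble):
    """Count distinct orderings: n! divided by k! for each repeated letter."""
    total = factorial(len(jumble))
    for count in Counter(jumble).values():
        total //= factorial(count)
    return total


def choose_strategy(jumble, buckets):
    """Return 'scan' or 'permute', whichever touches fewer candidates."""
    bucket_size = len(buckets.get(len(jumble), []))
    if distinct_permutations(jumble) > bucket_size:
        return "scan"      # fewer words than permutations: walk the bucket
    return "permute"       # fewer permutations than words: generate them


# Hypothetical word list standing in for a real dictionary.
BUCKETS = bucket_by_length(["bear", "bare", "foobar", "bar", "a", "i"])
print(choose_strategy("earb", BUCKETS))  # scan
```

The first-letter and second-letter subdivision is left out here; in memory it would amount to nesting another dictionary level inside each length bucket.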