Finding anagrams of a dictionary word
How can I take an input word (or sequence of letters) and output a word from a dictionary that contains exactly those letters?
Does Java have an English dictionary class (a list of words) that I can use, or are there open-source implementations of one?
How can I optimize my code if this needs to be done repeatedly?
Comments (5)
Convert your dictionary into an anagram dictionary. In an anagram dictionary, words are indexed by their letters sorted in alphabetical order. To look up the anagrams of a word, sort its letters and look up the matching entry in the anagram dictionary.
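As a rough illustration of the anagram dictionary described above, here is a minimal sketch in Java. The class name AnagramDictionary and its methods are hypothetical names chosen for this example, and lower-casing the words is an assumption, not something the answer specifies.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper class; the name and API are illustrative only.
class AnagramDictionary {
    // Maps the sorted letters of a word to every word made of those letters.
    private final Map<String, List<String>> index = new HashMap<>();

    // Index key: the word's letters, lower-cased (assumption) and sorted.
    static String sortedLetters(String word) {
        char[] letters = word.toLowerCase(Locale.ROOT).toCharArray();
        Arrays.sort(letters);
        return new String(letters);
    }

    void add(String word) {
        index.computeIfAbsent(sortedLetters(word), k -> new ArrayList<>()).add(word);
    }

    // Returns all stored words that use exactly the given letters.
    List<String> lookup(String letters) {
        return index.getOrDefault(sortedLetters(letters), List.of());
    }

    public static void main(String[] args) {
        AnagramDictionary dict = new AnagramDictionary();
        for (String w : List.of("listen", "silent", "enlist", "google")) {
            dict.add(w);
        }
        System.out.println(dict.lookup("tinsel")); // prints [listen, silent, enlist]
    }
}
```

Building the index costs one sort per dictionary word; each later lookup costs one sort of the input plus a hash lookup.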
Two words are said to be anagrams if they contain exactly the same letters, each the same number of times.
To check whether two words are anagrams, sort the letters of both words and compare the results for equality.
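The sort-and-compare check can be sketched as follows. The class and method names are hypothetical, and the comparison here is case-sensitive, which is an assumption of this example.

```java
import java.util.Arrays;

// Hypothetical example class illustrating the sort-and-compare test.
class AnagramCheck {
    static boolean isAnagram(String a, String b) {
        // Words of different length can never be anagrams.
        if (a.length() != b.length()) {
            return false;
        }
        char[] x = a.toCharArray();
        char[] y = b.toCharArray();
        Arrays.sort(x);
        Arrays.sort(y);
        // Anagrams have identical letter sequences once sorted.
        return Arrays.equals(x, y);
    }

    public static void main(String[] args) {
        System.out.println(isAnagram("listen", "silent"));  // prints true
        System.out.println(isAnagram("listen", "listens")); // prints false
    }
}
```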
Now, to find all the anagrams of a given dictionary word, say word1, I would find all the words in the dictionary for which the above test holds. To optimize the search, we can restrict it to words of the same length.
If we have to do this repeatedly, it is better to do some preprocessing. We can build something like a HashMap that maps a string to the set of strings that are its anagrams. Then, given any word, I can look in the hashMap to get all its anagrams.
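The preprocessing step might look like the sketch below, which reads a word list once and groups the words in a HashMap. It assumes a plain-text file with one word per line and uses the sorted letters of each word as the map key; the class name, method names, and file format are assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical preprocessing helper; names and file format are assumptions.
class AnagramPreprocessor {
    // Key shared by all anagrams of a word: its letters in sorted order.
    static String signature(String word) {
        char[] letters = word.toCharArray();
        Arrays.sort(letters);
        return new String(letters);
    }

    // Reads one word per line and groups the words by signature.
    static Map<String, Set<String>> build(Path wordList) throws IOException {
        Map<String, Set<String>> hashMap = new HashMap<>();
        for (String line : Files.readAllLines(wordList)) {
            String word = line.trim();
            if (!word.isEmpty()) {
                hashMap.computeIfAbsent(signature(word), k -> new TreeSet<>()).add(word);
            }
        }
        return hashMap;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: AnagramPreprocessor <word-list-file> <word>");
            return;
        }
        Map<String, Set<String>> hashMap = build(Path.of(args[0]));
        // One hash lookup replaces a scan over the whole dictionary.
        System.out.println(hashMap.getOrDefault(signature(args[1]), Set.of()));
    }
}
```

The map is built once at startup, so the cost of sorting every dictionary word is paid a single time, however many lookups follow.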
You can use the Anagrams2 example from the Sun site as a starting point.
For improved performance, you can keep a cache of anagrams for frequently or recently used words. Consider using WeakHashMap for this purpose.
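A minimal sketch of such a cache is shown below. The AnagramCache class and the finder function passed to it are hypothetical; the finder stands in for whatever slower anagram search you already have. Note that a WeakHashMap entry can be dropped once its key is no longer strongly referenced elsewhere, so this is only a best-effort cache, and it is not thread-safe.

```java
import java.util.Map;
import java.util.Set;
import java.util.WeakHashMap;
import java.util.function.Function;

// Hypothetical wrapper that caches the results of a slower anagram search.
class AnagramCache {
    // Entries may be removed by the garbage collector once the key is unreachable.
    private final Map<String, Set<String>> cache = new WeakHashMap<>();
    private final Function<String, Set<String>> finder;

    // finder: the underlying (assumed expensive) anagram search.
    AnagramCache(Function<String, Set<String>> finder) {
        this.finder = finder;
    }

    Set<String> anagramsOf(String word) {
        // Runs the search only when the word is not cached yet.
        return cache.computeIfAbsent(word, finder);
    }

    public static void main(String[] args) {
        Map<String, Set<String>> known = Map.of("listen", Set.of("silent", "enlist"));
        AnagramCache cache = new AnagramCache(w -> {
            System.out.println("searching for " + w);
            return known.getOrDefault(w, Set.of());
        });
        String word = "listen";
        cache.anagramsOf(word); // runs the search
        cache.anagramsOf(word); // answered from the cache
    }
}
```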
As unicornaddict mentioned, you can fairly easily determine whether two words are anagrams by sorting; however, this is inefficient, especially if you are doing it repeatedly.
A prepared hash table would probably be the best solution: load your dictionary into it at the beginning of the program. An algorithm for hashing and comparing the words is fairly easy to write.
My Java is pretty rusty, but I think that would do it.
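One possible hashing/comparing scheme that avoids sorting is a letter-count signature, sketched below. This is not necessarily the algorithm this answer had in mind. It assumes words made of the letters a-z and ignores every other character; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Locale;

// Hypothetical example of a letter-count signature (assumes letters a-z).
class LetterCounts {
    // Counts how often each letter a-z occurs; other characters are ignored.
    static int[] counts(String word) {
        int[] c = new int[26];
        for (char ch : word.toLowerCase(Locale.ROOT).toCharArray()) {
            if (ch >= 'a' && ch <= 'z') {
                c[ch - 'a']++;
            }
        }
        return c;
    }

    // Two words are anagrams when their letter counts match.
    static boolean isAnagram(String a, String b) {
        return Arrays.equals(counts(a), counts(b));
    }

    // String form of the counts, usable as a hash table key.
    static String key(String word) {
        int[] c = counts(word);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 26; i++) {
            if (c[i] > 0) {
                sb.append((char) ('a' + i)).append(c[i]);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(key("banana"));                 // prints a3b1n2
        System.out.println(isAnagram("Listen", "Silent")); // prints true
    }
}
```

Counting takes time linear in the word length, whereas sorting takes O(n log n); for ordinary dictionary words the difference is small.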
From my point of view, the key to this assignment is to find a function (hashFunc) that maps strings to numbers so that 1) two anagrams are mapped to the same number and 2) two non-anagrams are mapped to different numbers. Once the function is found, it can simply be applied to the inputs, thus avoiding tedious string comparisons.
On Unix systems, you can start with the words file.
Turn the dictionary into a hash table using precalculated hashFunc values.
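One function with both properties assigns a distinct prime to each letter and multiplies the primes of a word's letters: by unique factorization, two words get the same product exactly when they use the same letters the same number of times. The sketch below is one possible choice of hashFunc, not necessarily the one this answer intended. It assumes words made only of the letters a-z, and the PrimeHash class and buildTable method are hypothetical names.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical example: hashFunc as a product of per-letter primes.
class PrimeHash {
    // The first 26 primes, one for each letter a-z.
    private static final int[] PRIMES = {
        2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
        43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101
    };

    // BigInteger avoids overflow for long words.
    static BigInteger hashFunc(String word) {
        BigInteger product = BigInteger.ONE;
        for (char ch : word.toLowerCase(Locale.ROOT).toCharArray()) {
            if (ch < 'a' || ch > 'z') {
                throw new IllegalArgumentException("only a-z supported: " + word);
            }
            product = product.multiply(BigInteger.valueOf(PRIMES[ch - 'a']));
        }
        return product;
    }

    // Precalculates hashFunc for every purely alphabetic word; others are skipped.
    static Map<BigInteger, List<String>> buildTable(Collection<String> words) {
        Map<BigInteger, List<String>> table = new HashMap<>();
        for (String w : words) {
            if (w.matches("[a-zA-Z]+")) {
                table.computeIfAbsent(hashFunc(w), k -> new ArrayList<>()).add(w);
            }
        }
        return table;
    }

    public static void main(String[] args) {
        Map<BigInteger, List<String>> table =
                buildTable(List.of("listen", "silent", "google"));
        System.out.println(table.get(hashFunc("tinsel"))); // prints [listen, silent]
    }
}
```

A fixed-width long would be faster but can overflow for long words, which would break property 2; that is why this sketch uses BigInteger.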