C++ Boggle Solver:查找集合中的前缀

发布于 2025-01-01 03:05:34 字数 616 浏览 3 评论 0原文

这是一项家庭作业,所以我不需要确切的代码,但希望任何可以帮助我指明正确方向的想法。

作业是编写一个令人难以置信的解决程序。我觉得递归部分已经完成,但我需要一些关于如何将当前字符序列与字典进行比较的见解。

我需要将字典存储在集合或排序列表中。我一直在尝试一种使用集合来实现这一点的方法。为了使程序运行得更快并且不遵循死胡同,我需要检查并查看当前的字符序列是否作为集合(字典)中任何内容的前缀存在。

我发现 set.find() 操作仅在字符串完全匹配时返回 true。在实验室要求中,教授提到:

“如果字典存储在Set中,许多数据结构库都提供了一种方法来查找Set中与您要搜索的字符串最接近的字符串。这样的操作可以是用于快速查找具有给定前缀的单词。”

我今天一直在寻找教授所描述的内容。我找到了很多有关尝试的信息,但由于我需要使用列表或集合,所以我认为这行不通。

我还尝试查找自动完成功能的算法,但我发现的算法对于我在这里想要完成的任务来说似乎非常复杂。

我还考虑使用 strncmp() 将当前序列与字典集中的单词进行比较,但同样,我不知道在这种情况下它到底如何发挥作用(如果有的话)。

是否值得继续研究这在集合中如何工作,或者我应该尝试使用排序列表来存储我的字典?

谢谢

This is for a homework assignment, so I don't want the exact code, but would appreciate any ideas that can help point me in the right direction.

The assignment is to write a boggle solving program. I've got the recursive part down I feel, but I need some insight on how to compare the current sequence of characters to the dictionary.

I'm required to store the dictionary in either a set or sorted list. I've been trying a way to implement this using a set. In order to make the program run faster and not follow dead end paths, I need to check and see if the current sequence of characters exists as a prefix to anything in the set (dictionary).

I've found that set.find() operation only returns true if the string is an exact match. In the lab requirements, the professor mentioned that:

"If the dictionary is stored in a Set, many data structure libraries provide a way to find the string in the Set that is closest to the one you are searching for. Such an operation could be used to quickly find a word with a given prefix."

I've been searching today for a what the professor is describing. I've found a lot of information on tries, but since I'm required to use a list or set, I don't think that will work.

I've also tried looking up algorithms for autocomplete functions, but the ones that I've found seem extremely complicated for what I'm trying to accomplish here.

I also was thinking of using strncmp() to compare the current sequence to a word from the dictionary set, but again, I don't know how exactly that would function in this situation, if at all.

Is it worth it to continue investigating how this would work in a set or should I just try using a sorted list to store my dictionary?

Thanks

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

愁以何悠 2025-01-08 03:05:34

正如 @Raymond Hettinger 在他的回答中提到的, trie 在这里非常有用。但是,如果您不习惯编写 trie 或者更喜欢使用现成的组件,则可以使用单词按字母顺序排序的可爱属性来在 O(log n) 时间内检查给定前缀是否存在。这个想法如下 - 假设您正在检查前缀“thr”。如果您注意到,以前缀“thr”开头的每个单词都必须夹在字符串“thr”和“ths”之间。例如,thr ≤ 到 < ths,且 thr ≤ 喉部 <如此。如果您将单词存储在一个巨大的排序数组中,则可以使用二分搜索的修改版本来按字母顺序查找至少是您选择的前缀的第一个单词,并按字母顺序查找至少下一个前缀的第一个单词(通过取最后一个前缀形成)前缀字母并递增)。如果它们是同一个单词,则它们之间没有任何内容,并且前缀不存在。如果不是,则说明它们之间存在某种东西,而前缀则起到了作用。

由于您使用的是 C++,因此您可以使用 std::vectorstd::lower_bound 算法。您还可以将所有单词放入 std::set 中,并使用 setlower_bound 版本。例如:

std::set<std::string> dictionary;
std::string prefix = /* ... */

/* Get the next prefix. */
std::string nextPrefix = prefix;
nextPrefix[nextPrefix.length() - 1]++;

/* Check whether there is something with the prefix. */
if (dictionary.lower_bound(prefix) != dictionary.lower_bound(nextPrefix)) {
    /* ... something has that prefix ... */
} else {
    /* ... no word has that prefix ... */
}

也就是说,特里树可能是一个更好的结构。如果您感兴趣,还有另一种数据结构,称为 DAWG(有向非循环字图)与 trie 类似,但使用的内存要少得多;在斯坦福大学的计算机科学入门课程中(其中 Boggle 是一项作业),实际上为学生提供了一个包含该语言中所有单词的 DAWG。还有另一种数据结构,称为三元搜索树,它介于二叉搜索树之间如果您想研究一下,这里可能会有用的 trie。

希望这有帮助!

As @Raymond Hettinger mentions in his answer, a trie would be extremely useful here. However, if you either are uncomfortable writing a trie or would prefer to use off-the-shelf components, you can use a cute property of how words are ordered alphabetically to check in O(log n) time whether a given prefix exists. The idea is as follows - suppose for example that you are checking for the prefix "thr." If you'll note, every word that begins with the prefix "thr" must be sandwiched between the strings "thr" and "ths." For example, thr ≤ through < ths, and thr ≤ throat < ths. If you are storing your words in a giant sorted array, you can use a modified version of binary search to find the first word alphabetically at least the prefix of your choice and the first word alphabetically at least the next prefix (formed by taking the last letter of the prefix and incrementing it). If these are the same word, then nothing is between them and the prefix doesn't exist. If they're not, then something is between them and the prefix does it.

Since you're using C++, you can potentially do with a std::vector and the std::lower_bound algorithm. You could also throw all the words into a std::set and use the set's version of lower_bound. For example:

std::set<std::string> dictionary;
std::string prefix = /* ... */

/* Get the next prefix. */
std::string nextPrefix = prefix;
nextPrefix[nextPrefix.length() - 1]++;

/* Check whether there is something with the prefix. */
if (dictionary.lower_bound(prefix) != dictionary.lower_bound(nextPrefix)) {
    /* ... something has that prefix ... */
} else {
    /* ... no word has that prefix ... */
}

That said, the trie is probably a better structure here. If you're interested, there is another data structure called a DAWG (Directed Acyclic Word Graph) that is similar to the trie but uses substantially less memory; in the Stanford introductory CS courses (where Boggle is an assignment), students actually are provided a DAWG containing all the words in the language. There is also another data structure called a ternary search tree that is somewhere in-between a binary search tree and a trie that may be useful here, if you'd like to look into it.

Hope this helps!

离去的眼神 2025-01-08 03:05:34

trie 是解决此问题的首选数据结构。

如果您仅限于集合和字典,我会选择一个将前缀映射到可能匹配的数组的字典:

asp -> aspberger aspire
bal -> balloon balance bale baleen ...

The trie is the preferred data structure of choice for this problem.

If you're limited to sets and dictionaries, I would choose a dictionary that maps prefixes to an array of possible matches:

asp -> aspberger aspire
bal -> balloon balance bale baleen ...
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文