From what he is saying a language like PHP might not be suitable
How do you get that from the details he's published?
If anyone has a good English Dictionary...
There's one in the pspell extension, although given the nature of the algorithm presented it may be more efficient to push most of the logic (and the dictionary) into the database. IIRC pspell uses a custom format, albeit a documented one.
What if I had a table with columns 'word' (e.g. cat), 'length' (e.g. 3) and one column per letter A-Z (e.g. c=1, a=1, t=1)? That way, for the anagram 'atc', I could run a query like 'SELECT word FROM dictionary WHERE c <= 1 AND a <= 1 AND t <= 1 AND length <= 3' and it would return cat.
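A minimal sketch of that table idea, using Python's sqlite3 module rather than PHP/MySQL so it runs standalone. The function names `build_dictionary` and `find_words` and the `exact` flag are made up for illustration, and words are assumed to be lowercase a-z. One thing the sketch changes on purpose: the query constrains all 26 letter columns, not just c, a and t, because otherwise a word like 'cab' (c=1, a=1, t=0, length 3) would also match 'atc'.

```python
import sqlite3
import string
from collections import Counter

LETTERS = string.ascii_lowercase


def build_dictionary(conn, words):
    """Create a table with one count column per letter and fill it."""
    cols = ", ".join(f'"{c}" INTEGER NOT NULL' for c in LETTERS)
    conn.execute(
        f"CREATE TABLE dictionary (word TEXT PRIMARY KEY, length INTEGER NOT NULL, {cols})"
    )
    placeholders = ", ".join("?" for _ in range(len(LETTERS) + 2))
    for word in words:
        counts = Counter(word)
        row = [word, len(word)] + [counts.get(c, 0) for c in LETTERS]
        conn.execute(f"INSERT INTO dictionary VALUES ({placeholders})", row)


def find_words(conn, letters, exact=False):
    """Return words that can be spelled from `letters`.

    With exact=True only words using every letter (true anagrams) are returned.
    """
    available = Counter(letters)
    # Every letter column is constrained; letters not in the input must be 0.
    where = " AND ".join(f'"{c}" <= ?' for c in LETTERS)
    params = [available.get(c, 0) for c in LETTERS]
    if exact:
        where += " AND length = ?"
        params.append(len(letters))
    rows = conn.execute(
        f"SELECT word FROM dictionary WHERE {where} ORDER BY word", params
    )
    return [row[0] for row in rows]
```

With a dictionary of cat, car, tar, cab and act, `find_words(conn, "atc")` gives act and cat. Once every letter column is constrained, the 'length <= 3' condition becomes redundant; the length column is only needed for the exact-anagram case.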
I would have a table {letter} {word} {count} and, for each word, store it along with each of its component letters and how many times each letter appears in the word. Then a search for anagrams starts with searching for a set of letters and finding the intersection between the sets of words each letter is associated with. For example:
Input: rat
Table:
T tar 1
A tar 1
R tar 1
C cat 1
A cat 1
T cat 1
C car 1
A car 1
R car 1
Results for each letter
R car tar
A cat car tar
T cat tar
And then you join each query with an intersection!
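The lookup above can be sketched in memory, with a Python dict standing in for the {letter} {word} {count} table. The names `build_index`, `words_containing` and `anagrams` are hypothetical, and in a database each per-letter set would be its own query. Note that the intersection alone returns words that contain all the input letters, so the sketch adds a length check (an assumption, not part of the comment above) to keep only exact anagrams.

```python
from collections import Counter, defaultdict


def build_index(words):
    """Map letter -> {word: count}, mirroring the {letter} {word} {count} rows."""
    index = defaultdict(dict)
    for word in words:
        for letter, count in Counter(word).items():
            index[letter][word] = count
    return index


def words_containing(index, letters):
    """Intersect the per-letter word sets, honouring repeated letters."""
    needed = Counter(letters)
    result = None
    for letter, count in needed.items():
        # Words that contain this letter at least as often as the input does.
        matches = {w for w, c in index.get(letter, {}).items() if c >= count}
        result = matches if result is None else result & matches
    return result or set()


def anagrams(index, letters):
    """Keep only the matches that use exactly the input letters."""
    return {w for w in words_containing(index, letters) if len(w) == len(letters)}
```

For the table above, the sets for r, a and t are {car, tar}, {cat, car, tar} and {cat, tar}, and their intersection is {tar}.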
You could use a Trie data structure to walk through every combination of the input characters (and obviously stop at the current node if it has no subnodes).
This would produce a full list of all possible solutions in a fairly efficient way. With a limited starting character set I think it would work well.
At each node you can select the count of matching words and, once it is small enough, load those words into an array for comparison so you don't need to run a million selects.
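A rough in-memory sketch of the trie walk, assuming the whole dictionary fits in memory (the comment above seems to have a database-backed variant in mind, which this does not attempt). `TrieNode`, `build_trie` and `trie_search` are illustrative names only.

```python
from collections import Counter


class TrieNode:
    def __init__(self):
        self.children = {}    # next letter -> TrieNode
        self.is_word = False  # True if the path from the root spells a word


def build_trie(words):
    """Insert every word into a trie and return the root node."""
    root = TrieNode()
    for word in words:
        node = root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True
    return root


def trie_search(root, letters):
    """Return every dictionary word that can be spelled from `letters`."""
    available = Counter(letters)
    found = []

    def walk(node, prefix):
        if node.is_word:
            found.append(prefix)
        # Only follow branches whose letter is still unused; a node with no
        # usable children ends this path.
        for ch, child in node.children.items():
            if available[ch] > 0:
                available[ch] -= 1
                walk(child, prefix + ch)
                available[ch] += 1

    walk(root, "")
    return sorted(found)
```

Because the walk only descends into branches that exist in the trie, it never enumerates letter sequences that cannot lead to a word. Filtering the result by `len(word) == len(letters)` would leave only full anagrams.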
You might want to check out Xavier's Anagram Solver. It's written in PHP and MySQL.
There's a demo at: http://anagram.savjee.be/
The source code is located here: https://github.com/Savjee/Xavier-s-Anagram-Solver
It's quite simple to understand.
What exactly looks wrong in the algorithm Pablo suggested? I was going to suggest the same ;)
Please redirect any upvotes to his comment.
There is also a similar question: Algorithm to generate anagrams
It's also worth checking Google:
http://www.google.ru/search?q=anagram+solving+algorithms&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:ru:official&client=firefox
From your link...
Databases are very bad at storing hierarchical data like this, so I wouldn't recommend MySQL. You may be able to do some "clever" things with indexes and LIKE clauses, but I would expect that to be rather kludgey.
PHP has everything you need to do the coding for this, but there are probably better alternatives. Perl is known for its ability to do text manipulation. I'm not sure about scripting languages like Python or Ruby.