将一个字符串更改为另一个字符串的简单突变数量?
我相信你们都听说过“文字游戏”,在这种游戏中,您试图通过一次更改一个字母来将一个单词更改为另一个单词,并且只浏览有效的英语单词。我正在尝试实现一个 A* 算法来解决它(只是为了充实我对 A* 的理解),并且需要的东西之一是最小距离启发式。
也就是说,这三个突变之一可以将任意字符串 a 变成另一个字符串 b 的最小数量: 1)将一个字母改为另一个字母 2) 在任意字母之前或之后的某个位置添加一个字母 3)删除任何字母
示例
aabca => abaca:
aabca
abca
abaca
= 2
abcdebf => bgabf:
abcdebf
bcdebf
bcdbf
bgdbf
bgabf
= 4
我已经尝试了很多算法;我似乎找不到一个每次都能给出实际答案的人。事实上,有时我不确定我的人类推理是否能找到最佳答案。
有谁知道用于此类目的的算法?或者也许可以帮我找到一个?
(为了澄清,我要求一种算法,可以将任何任意字符串转换为任何其他字符串,而不管它们的英语有效性。)
I'm sure you've all heard of the "Word game", where you try to change one word to another by changing one letter at a time, and only going through valid English words. I'm trying to implement an A* Algorithm to solve it (just to flesh out my understanding of A*) and one of the things that is needed is a minimum-distance heuristic.
That is, the minimum number of one of these three mutations that can turn an arbitrary string a into another string b:
1) Change one letter for another
2) Add one letter at a spot before or after any letter
3) Remove any letter
Examples
aabca => abaca:
aabca
abca
abaca
= 2
abcdebf => bgabf:
abcdebf
bcdebf
bcdbf
bgdbf
bgabf
= 4
I've tried many algorithms out; I can't seem to find one that gives the actual answer every time. In fact, sometimes I'm not sure if even my human reasoning is finding the best answer.
Does anyone know any algorithm for such purpose? Or maybe can help me find one?
(Just to clarify, I'm asking for an algorithm that can turn any arbitrary string to any other, disregarding their English validity-ness.)
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
您需要最小编辑距离(或 Levenshtein 距离):
同一页面上有一种确定编辑顺序的算法 此处。
You want the minimum edit distance (or Levenshtein distance):
And one algorithm to determine the editing sequence is on the same page here.
关于“编辑距离”的一个很好的参考是 S. Dasgupta、CH Papadimitriou 和 UV Vazirani 所著的算法教科书第 6.3 节,该教科书的草稿可免费获取 此处。
An excellent reference on "Edit distance" is section 6.3 of the Algorithms textbook by S. Dasgupta, C. H. Papadimitriou, and U. V. Vazirani, a draft of which is available freely here.
如果您有一个大小合理(小)的字典,那么广度优先树搜索可能会起作用。
所以从你的单词可以变异成的所有单词开始,然后所有这些可以变异成的单词(除了原来的),然后一直到第三级......直到找到你要找的单词。
您可以消除发散的单词(远离目标的单词),但这样做可能会导致您在必须经历某些发散状态才能到达最短路径的情况下失败。
If you have a reasonably sized (small) dictionary, a breadth first tree search might work.
So start with all words your word can mutate into, then all those can mutate into (except the original), then go down to the third level... Until you find the word you are looking for.
You could eliminate divergent words (ones further away from the target), but doing so might cause you to fail in a case where you must go through some divergent state to reach the shortest path.