查找所有子字符串的编辑距离的算法

发布于 2024-12-15 20:56:19 字数 875 浏览 8 评论 0原文

给定 2 个字符串 s 和 t。我需要找到 s 到 t 的编辑距离（Levenshtein 距离）中的每个子字符串。实际上，我需要知道 s 中的每个 i 位置从位置 i 开始的所有子字符串的最小编辑距离是多少。

例如：

t = "ab"    
s = "sdabcb"

我需要得到类似：

{2,1,0,2,2}

解释：

1st position:
distance("ab", "sd") = 4 ( 2*subst )
distance("ab", "sda") = 3( 2*delete + insert )
distance("ab", "sdab") = 2 ( 2 * delete)
distance("ab", "sdabc") = 3 ( 3 * delete)
distance("ab", "sdabcb") = 4 ( 4 * delete)
So, minimum is 2

2nd position:
distance("ab", "da") = 2 (delete + insert)
distance("ab", "dab") = 1 (delete)
distance("ab", "dabc") = 2 (2*delete)
....
So, minimum is 1

3th position:
distance("ab", "ab") = 0
...
minimum is 0

等等。

当然，我可以使用暴力算法来解决这个任务。但有没有更快的算法呢？

感谢您的帮助。

原文

Given 2 strings s and t. I need to find for each substring in s edit distance(Levenshtein distance) to t. Actually I need to know for each i position in s what is the minimum edit distance for all substrings started at position i.

For example:

t = "ab"    
s = "sdabcb"

And I need to get something like:

{2,1,0,2,2}

Explanation:

1st position:
distance("ab", "sd") = 4 ( 2*subst )
distance("ab", "sda") = 3( 2*delete + insert )
distance("ab", "sdab") = 2 ( 2 * delete)
distance("ab", "sdabc") = 3 ( 3 * delete)
distance("ab", "sdabcb") = 4 ( 4 * delete)
So, minimum is 2

2nd position:
distance("ab", "da") = 2 (delete + insert)
distance("ab", "dab") = 1 (delete)
distance("ab", "dabc") = 2 (2*delete)
....
So, minimum is 1

3th position:
distance("ab", "ab") = 0
...
minimum is 0

and so on.

I can use brute force algorithm to solve this task, of course. But is there faster algorithm?

Thanks for help.

分享到QQ

分享到微博