最短异常子序列
给定两个字符串 s 和 t,确定最短字符串 z 的长度,使得 z 是 s 的子序列而不是 t 的子序列。
例如:
s :babab, t :babba
sol : 3 (aab)
不寻找复制粘贴代码,如果有人可以帮助直觉解决这个问题。
多谢 !
Given two strings s and t, determine length of shortest string z such that z is a subsequence of s and not a subsequence of t.
example :
s :babab, t :babba
sol : 3 (aab)
not looking for copy pastable code, please if anybody can help with intution for solving this.
thanks a lot !
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
您可以使用动态规划在二次时间内解决这个问题,就像最长公共子序列一样。我将展示该公式以及您将如何得出它。
首先,一些定义。令
m
为S
的长度,令n
为T
的长度。将DP(i, j)
定义为S[:i]
中非T[:j]
子序列的最短子序列的长度code> 或 'INF'(如果不存在)。这里,表达式S[:i]
是切片符号,意思是“S
的第一个i
个字符,因此S[:0” ]
为空,S[:m]
是S
的全部。我们需要DP(m, n)
。有两种简单的基本情况:由于
T[:0]
为空,因此S
中的任何字符都可以工作,因此如果 i 则DP(i, 0) = 1 > 0 。同样,对于所有 j,
DP(0, j) = 'INF'
。现在,我们只需为
DP(i, j)
编写一个通用公式,该公式仅取决于小于(i ,j)
。S[:i]
的最后一个字符只是某个字符S[i-1]
。S[:i], T[:j]
的最佳子序列要么以S[i-1]
结尾,要么不是。S[i-1]
结尾,那么我们可以从考虑中删除S[i-1]
,我们的答案是DP(i,j)= DP(i-1,j)
。如果我们的最佳子序列确实以
S[i-1]
结尾,那么我们需要知道S[i-1]
在T[ 中最右边出现的位置: j]
。S[i-1]
根本没有出现在T[:j]
中,则S[i-1]
本身就是最短子序列,因此DP(i, j) = 1
。Rightmost(c, j)
为T[:j]
的最右边索引,等于某个字符c
。由于我们使用S[i-1]
来结束最优子序列,因此我们可以忽略T[:j]
中最右边出现之后的所有字符S[i-1]
:它们不再影响字符串是否是T[:j]
的子序列。那么DP(i, j) = DP(i-1, Rightmost(S[i-1], j)) + 1
,其中+1
来自事实上,我们确实选择使用S[i-1]
。将它们放在一起,得到 i, j > 的一般公式: 0 变为:
由于根据定义,
Rightmost(c, j)
始终小于j
,因此我们仅使用小于(按字典顺序)的索引实现了一个公式(i, j)
,我们可以直接使用该公式进行递归算法。You can use dynamic programming to solve this in quadratic time just like the longest common subsequence. I'll show the formula and how you would come up with it.
First, some definitions. Let
m
be the length ofS
, and letn
be the length ofT
. DefineDP(i, j)
as the length of the shortest subsequence ofS[:i]
that is not a subsequence ofT[:j]
, or 'INF' if none exists. Here, the expressionS[:i]
is slice notation meaning 'the firsti
characters ofS
, soS[:0]
is empty andS[:m]
is all ofS
. We wantDP(m, n)
.There's two easy base cases: Since
T[:0]
is empty, any character inS
will work, soDP(i, 0) = 1 if i > 0
. Similarly,DP(0, j) = 'INF' for all j
.Now, we just have to write a general formula for
DP(i, j)
which only depends on the value ofDP()
on indices smaller than(i, j)
. The last character ofS[:i]
is just some characterS[i-1]
. Either our optimal subsequence forS[:i], T[:j]
ends withS[i-1]
or it doesn't.S[i-1]
, then we can deleteS[i-1]
from consideration, and our answer isDP(i, j) = DP(i-1, j)
.If our optimal subsequence does end with
S[i-1]
, then we need to know the rightmost occurrence ofS[i-1]
inT[:j]
.S[i-1]
does not occur inT[:j]
at all, thenS[i-1]
by itself is a shortest subsequence, soDP(i, j) = 1
.Rightmost(c, j)
be the rightmost index ofT[:j]
equal to some characterc
. Since we are usingS[i-1]
to end our optimal subsequence, we can ignore all the characters inT[:j]
after the rightmost occurrence ofS[i-1]
: they can no longer affect whether a string is a subsequence ofT[:j]
. So thenDP(i, j) = DP(i-1, Rightmost(S[i-1], j)) + 1
, where the+1
comes from the fact that we did choose to useS[i-1]
.Putting those together, the general formula for
i, j > 0
becomes:Since
Rightmost(c, j)
is always less thanj
by definition, we've achieved a formula using only indices smaller (lexicographically) than(i, j)
, and we can use that formula directly for a recursive algorithm.