
How to count the number of longest common subsequences

Posted on 2024-08-21 17:51:01

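I am trying to count the number of distinct longest common subsequences of two strings.

For example: string X = "efgefg"; string Y = "efegf";

Output: the number of longest common subsequences is 3 (namely efeg, efef, efgf; the algorithm does not need to produce these, they are listed here only for illustration).

I have managed to do this in O(|X|*|Y|) using dynamic programming, based on the general idea of the cheapest-path algorithm.

Can anyone think of a way to do this computation with a better running time?

Edit: updated in response to Jason's comment.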


无边思念无边月 2024-08-28 17:51:01

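I don't know, but here are some attempts at thinking aloud:

The worst case I was able to construct has an exponential number of longest common subsequences, 2**(0.5 |X|):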

X = "aAbBcCdD..."
Y = "AaBbCcDd..."

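where each longest common subsequence contains exactly one of {A, a}, exactly one of {B, b}, and so on. (Nitpick: if your alphabet is limited to 256 characters, this eventually breaks down, but 2**128 is already huge.)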
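To make the construction concrete, here is a small sketch that builds such a pair and enumerates its distinct longest common subsequences by brute force. The helper names `build_pair` and `all_lcs` are made up for this illustration, and the builder assumes at most 26 letter pairs (plain ASCII letters). The enumeration is deliberately the slow approach: it materialises every subsequence, so it is only usable for small inputs.

```python
from functools import lru_cache
import string


def build_pair(n):
    """Build X = "aAbB..." and Y = "AaBb..." from the first n letters (n <= 26, an assumption)."""
    letters = string.ascii_lowercase[:n]
    x = "".join(c + c.upper() for c in letters)
    y = "".join(c.upper() + c for c in letters)
    return x, y


def all_lcs(x, y):
    """Return the set of distinct longest common subsequences of x and y.

    This generates the actual strings, so it can take exponential time and space.
    """
    @lru_cache(maxsize=None)
    def rec(i, j):
        # Distinct LCS strings of the prefixes x[:i] and y[:j].
        if i == 0 or j == 0:
            return frozenset({""})
        if x[i - 1] == y[j - 1]:
            # Every LCS of these prefixes ends with the matching character.
            return frozenset(s + x[i - 1] for s in rec(i - 1, j - 1))
        a, b = rec(i - 1, j), rec(i, j - 1)
        # All strings inside one set have the same length, so one sample is enough.
        len_a, len_b = len(next(iter(a))), len(next(iter(b)))
        if len_a > len_b:
            return a
        if len_b > len_a:
            return b
        return a | b

    return rec(len(x), len(y))


if __name__ == "__main__":
    for n in range(1, 6):
        x, y = build_pair(n)
        print(n, len(x), len(all_lcs(x, y)))
```

For each n the printed count doubles, matching the 2**(0.5 |X|) estimate, since |X| = 2n.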

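However, you don't necessarily have to generate all the subsequences in order to count them.
If you already have O(|X| * |Y|), you are doing better than exponential enumeration! What this tells us is that any algorithm better than yours must not attempt to generate the actual subsequences.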
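As an illustration of counting without generating, here is a minimal sketch of one standard O(|X| * |Y|) dynamic program. It is not necessarily the method the question refers to, and the function name `count_lcs` is chosen only for this example. It keeps two tables: the LCS length of each pair of prefixes, and the number of distinct LCS strings of that pair, with an inclusion-exclusion correction so that strings reachable from both neighbours are not counted twice.

```python
def count_lcs(x, y):
    """Count the distinct longest common subsequences of x and y in O(|x| * |y|) time."""
    n, m = len(x), len(y)
    # length[i][j]: LCS length of x[:i] and y[:j].
    length = [[0] * (m + 1) for _ in range(n + 1)]
    # count[i][j]: number of distinct LCS strings of x[:i] and y[:j].
    # An empty prefix has exactly one LCS, the empty string.
    count = [[1] * (m + 1) for _ in range(n + 1)]

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if x[i - 1] == y[j - 1]:
                # Every LCS ends with this character; strip it to get a bijection.
                length[i][j] = length[i - 1][j - 1] + 1
                count[i][j] = count[i - 1][j - 1]
            else:
                best = max(length[i - 1][j], length[i][j - 1])
                length[i][j] = best
                total = 0
                if length[i - 1][j] == best:
                    total += count[i - 1][j]
                if length[i][j - 1] == best:
                    total += count[i][j - 1]
                if length[i - 1][j - 1] == best:
                    # These strings were counted in both neighbours above.
                    total -= count[i - 1][j - 1]
                count[i][j] = total
    return count[n][m]


if __name__ == "__main__":
    print(count_lcs("efgefg", "efegf"))   # the example from the question
    print(count_lcs("aAbBcC", "AaBbCc"))  # three letter pairs of the worst case
```

Because Python integers are unbounded, the count itself can grow exponentially without overflow, while the number of table cells stays at (|X| + 1) * (|Y| + 1).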

