多个序列的最长公共子序列

发布于 2024-11-02 16:36:30 字数 353 浏览 5 评论 0原文

我已经做了很多研究来寻找 M = 2 序列的最长序列，但我试图弄清楚如何为 M ≥ 2 序列做到这一点，

我得到了 N 和 M: M 序列，具有 N 个独特元素。 N 是 {1 - N} 的集合。我已经考虑过动态编程方法，但我仍然对如何实际合并它感到困惑。

示例输入

这里的最大序列可以看作

5 3 1

预期输出

Length = 3

原文

I have done a bunch of research for finding the longest for M = 2 sequences, but I am trying to figure out how to do it for M ≥ 2 sequences

I am being given N and M: M sequences, with N unique elements. N is the set of {1 - N}. I have thought about the dynamic programming approach, but I am still confused as to how to actually incorporate it.

Example input

The max sequence here can be seen to be

5 3 1

Expected output

Length = 3

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

浴红衣 2024-11-09 16:36:30

一个简单的想法。

对于 1 和 N 之间的每个数字 i，计算最后一个数字为 i 的最长子序列。（我们称之为a[i]）

为此，我们将从头到尾迭代第一个序列中的数字i。如果a[i]> 1，然后是数字 j，在每个序列中它都位于 i 之前。
现在我们可以检查 j 的所有可能值，并且（如果前面的条件成立）执行 a[i] = max(a[i], a[j] + 1)。

作为最后一位，因为在第一个序列中 j 出现在 i 之前，这意味着 a[j] 已经计算出来了。

for each i in first_sequence
    // for the OP's example, 'i' would take values [5, 3, 4, 1, 2], in this order
    a[i] = 1;
    for each j in 1..N
        if j is before i in each sequence
            a[i] = max(a[i], a[j] + 1)
        end
    end
end

如果你事先计算位置矩阵，它的复杂度是O(N^2*M)。

A simple idea.

For each number i between 1 and N, calculate the longest subsequence where the last number is i. (Let's call it a[i])

To do that, we'll iterate over numbers i in the first sequence from start to end. If a[i] > 1, then there's number j such that in each sequence it comes before i.
Now we can just check all possible values of j and (if previous condition holds) do a[i] = max(a[i], a[j] + 1).

As the last bit, because j comes before i in first sequence, it means a[j] is already calculated.

for each i in first_sequence
    // for the OP's example, 'i' would take values [5, 3, 4, 1, 2], in this order
    a[i] = 1;
    for each j in 1..N
        if j is before i in each sequence
            a[i] = max(a[i], a[j] + 1)
        end
    end
end

It's O(N^2*M), if you calculate matrix of positions beforehand.

回复收藏 0 原文

浊酒尽余欢 2024-11-09 16:36:30

由于您有独特的元素，@Nikita Rybak 的答案是可以使用的答案，但既然您提到了动态编程，那么当您有两个以上序列时，以下是使用 DP 的方法：

dp[i, j, k] = length of longest common subsequence considering the prefixes
              a[1..i], b[1..j], c[1..k].


dp[i, j, k] = 1 + dp[i - 1, j - 1, k - 1] if a[i] = b[j] = c[k]
            = max(dp[i - 1, j, k], dp[i, j - 1, k], dp[i, j, k - 1]) otherwise

要获取实际的子序列，请使用递归函数从 dp[a.Length, b.Length, c.Length] 开始，基本上反转了上面的公式：如果三个元素相等，则回溯到 dp[a.Length - 1 , b.Length - 1, c.Length - 1] 并打印该字符。如果不是，则按照上述值中的最大值回溯。

Since you have unique elements, @Nikita Rybak's answer is the one to go with, but since you mentioned dynamic programming, here's how you'd use DP when you have more than two sequences:

dp[i, j, k] = length of longest common subsequence considering the prefixes
              a[1..i], b[1..j], c[1..k].


dp[i, j, k] = 1 + dp[i - 1, j - 1, k - 1] if a[i] = b[j] = c[k]
            = max(dp[i - 1, j, k], dp[i, j - 1, k], dp[i, j, k - 1]) otherwise

To get the actual subsequence back, use a recursive function that starts from dp[a.Length, b.Length, c.Length] and basically reverses the above formulas: if the three elements are equal, backtrack to dp[a.Length - 1, b.Length - 1, c.Length - 1] and print the character. If not, backtrack according to the max of the above values.

回复收藏 0 原文

真心难拥有 2024-11-09 16:36:30

您可以查看“设计用于查找常见 DNA 子序列的新确定性算法”论文。您可以使用此算法构建 DAG（第 8 页，图 5）。从 DAG 中读取最大的公共不同子序列。然后尝试使用动态规划方法，使用 M 的值来决定每个序列需要构建多少个 DAG。基本上使用这些子序列作为键，并将相应的序列号存储在找到的地方，然后尝试找到最大的子序列（可以大于1）。

回复收藏 0 原文

~没有更多了~