两个字符串的最长公共子序列的长度

发布于 2024-09-25 18:23:14 字数 624 浏览 1 评论 0原文

我正在尝试编写一个函数来计算两个输入字符串 str1str2 的最长公共子序列的长度。

这就是我现在所拥有的,

(define LCS
  (lambda (str1 str2)
    (if (OR (equal? str1 "") (equal? str2 ""))
        0
        (if (equal? (string-contains str1 (string-ref str2 0)) #t) 
            (+ 1 (LCS (substring str1 1 (string-length str1)) 
                      (substring str2 1 (string-length str2))))
            (LCS (substring str1 1 (string-length str1)) 
                 (substring str2 1 (string-length str2)))))))

如果字符串中包含某个字符,则 string-contains 返回 true 。现在看来它可以工作,但我认为可能有一个错误。

I am trying to write a function which computes the length of the longest common subsequence of the two input strings str1 and str2.

This is what I have right now,

(define LCS
  (lambda (str1 str2)
    (if (OR (equal? str1 "") (equal? str2 ""))
        0
        (if (equal? (string-contains str1 (string-ref str2 0)) #t) 
            (+ 1 (LCS (substring str1 1 (string-length str1)) 
                      (substring str2 1 (string-length str2))))
            (LCS (substring str1 1 (string-length str1)) 
                 (substring str2 1 (string-length str2)))))))

Where string-contains returns true if a string has a certain character in it. Right now it seems like it works, but I think there might be a bug.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

单调的奢华 2024-10-02 18:23:14

如果您不介意相对较慢的算法,那么您的代码完全走在正确的轨道上;如果您需要更快的话,可以使用动态编程来解决该问题。

现在,代码中的错误是您在每次递归调用时同时向下移动两个字符串的长度。例如,它会失败(我想,我还没有尝试过),使用以下两个字符串: (LCS "scheme" "emesch") 原因是匹配的子字符串没有开始在第一个和第二个字符串中的相同位置。

我建议您将递归分成两个递归调用:一个是仅从第一个字符串的前面删除一个字符,另一个是从第二个字符串的前面删除一个字符。然后,您可以从这两个调用中获取最佳结果。以这种方式,您可以确定您会找到子字符串,无论它们位于另一个字符串中的哪个位置。

Your code is totally on the right track if you don't mind a relatively slow algorithm; there's a more sophisticated approach to the problem, with dynamic programming, if you need it to be faster.

Right now, the bug in your code is that you're moving down the length of both strings simultaneously with each recursive call. It would fail, for example (I think, I haven't tried it) with the following two strings: (LCS "scheme" "emesch") Reason being is that the matching substrings don't start at the same position in the first and second string.

I suggest that you split up the recursion into two recursive calls: one where you remove a character off the front of only the first string, and one where you remove a character off the front of only the second. Then, you take the best result from either of those calls. In that fashion, you can be sure that you'll find substrings no matter where they lie in the other string.

追风人 2024-10-02 18:23:14

并行扫描两个字符串。如果当前两个字符相同,则最长公共子序列的长度加一,继续扫描每个字符串的下一个字符;否则,在从一个输入序列或另一个输入序列中删除当前字符之后,有两种可能性需要递归地考虑,并且最长公共子序列的长度只是两种可能性中较大的一个。

更完整的解释和在Scheme中的实现可以在我的博客中找到。

The two strings are scanned in parallel. If the current two characters are identical, the length of the longest common subsequence increases by one, and the scan continues at the next character in each string; otherwise, there are two possibilities to consider recursively, after deleting the current character from either one input sequence or the other, and the length of the longest common subsequence is simply the greater of the two possibilities.

A more complete explanation and an implementation in Scheme are available at my blog.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文