Comparing music data

Posted 2024-09-01 10:54:44

Comments (2)

无法言说的痛 2024-09-08 10:54:44

As far as comparing names is concerned, you might want to take a look at the Levenshtein distance algorithm. Given two strings, it calculates a distance (the minimum number of single-character insertions, deletions, and substitutions needed to turn one into the other), which can be used as a basis for catching duplicates.

I personally have used it in a tool I developed for an application with a rather large database that had a large number of duplicates in it. Using it in conjunction with some other data comparisons relevant to my domain, I was able to point my tool at the application database and quickly find many of the duplicated records. Not going to lie, I thought it was pretty darn cool to see in action.

It's even quick to implement; here's a C# version:

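public int CalculateDistance(string s, string t) {
    int n = s.Length; // length of s
    int m = t.Length; // length of t

    // Step 1: if either string is empty, the distance is the length of the other
    if (n == 0) return m;
    if (m == 0) return n;

    // d[i, j] is the distance between the first i chars of s and the first j chars of t
    int[,] d = new int[n + 1, m + 1];

    // Step 2: fill the first column and first row (distance from an empty prefix)
    for (int i = 0; i <= n; i++) d[i, 0] = i;
    for (int j = 0; j <= m; j++) d[0, j] = j;

    // Step 3: walk each character of s
    for (int i = 1; i <= n; i++) {
        // Step 4: walk each character of t
        for (int j = 1; j <= m; j++) {
            // Step 5: substitution costs nothing when the characters match
            int cost = (s[i - 1] == t[j - 1]) ? 0 : 1;

            // Step 6: cheapest of deletion, insertion, and substitution
            d[i, j] = System.Math.Min(System.Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1), d[i - 1, j - 1] + cost);
        }
    }

    // Step 7: the bottom-right cell is the distance between the full strings
    return d[n, m];
}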
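
To show how a raw distance might be turned into a duplicate check, here is a minimal sketch. It is not the tool described above: the class name `DuplicateFinder`, the case and whitespace normalisation, and the idea of scaling the distance by the longer string's length are all assumptions made for illustration, and the distance routine is repeated only so the sketch compiles on its own.

```csharp
using System;
using System.Collections.Generic;

public static class DuplicateFinder
{
    // Same recurrence as the CalculateDistance method above, repeated here
    // only so that this sketch is self-contained.
    public static int Distance(string s, string t)
    {
        int n = s.Length, m = t.Length;
        if (n == 0) return m;
        if (m == 0) return n;

        int[,] d = new int[n + 1, m + 1];
        for (int i = 0; i <= n; i++) d[i, 0] = i;
        for (int j = 0; j <= m; j++) d[0, j] = j;

        for (int i = 1; i <= n; i++)
        {
            for (int j = 1; j <= m; j++)
            {
                int cost = s[i - 1] == t[j - 1] ? 0 : 1;
                d[i, j] = Math.Min(Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1),
                                   d[i - 1, j - 1] + cost);
            }
        }
        return d[n, m];
    }

    // Scale the distance into [0, 1], where 1.0 means the strings are identical.
    // Assumption: case and surrounding whitespace are ignored before comparing.
    public static double Similarity(string a, string b)
    {
        a = a.Trim().ToLowerInvariant();
        b = b.Trim().ToLowerInvariant();
        int longest = Math.Max(a.Length, b.Length);
        if (longest == 0) return 1.0; // two empty names count as identical
        return 1.0 - (double)Distance(a, b) / longest;
    }

    // Compare every pair of names and keep those at or above the threshold.
    // This is quadratic in the number of names, so it only suits small lists.
    public static List<Tuple<string, string, double>> FindLikelyDuplicates(
        IList<string> names, double threshold)
    {
        var pairs = new List<Tuple<string, string, double>>();
        for (int i = 0; i < names.Count; i++)
        {
            for (int j = i + 1; j < names.Count; j++)
            {
                double score = Similarity(names[i], names[j]);
                if (score >= threshold)
                    pairs.Add(Tuple.Create(names[i], names[j], score));
            }
        }
        return pairs;
    }
}
```

With a threshold of 0.8, the names "Hey Jude" and "Hey Jude!" would be paired (one edit across nine characters), while "Yesterday" would not be paired with either. The threshold is a made-up value; in practice it would need tuning against real data, alongside the other domain-specific comparisons mentioned above.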

像你 2024-09-08 10:54:44

I wrote a similar answer here: Music Recognition and Signal Processing.

In the research community, the problem of finding similarity between two signals (up to environmental distortions such as noise or mild variations in tempo, pitch, or bitrate) is known as audio (or music) fingerprinting. This topic has been studied heavily for at least a decade. An early (and oft-cited) paper by Haitsma and Kalker clearly describes the problem and proposes a simple solution.
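
To give a feel for what a fingerprint comparison can look like, here is a rough sketch loosely in the spirit of that paper as I understand it: each frame is reduced to a short bit string from the signs of energy differences between neighbouring frequency bands and consecutive frames, and two fingerprints are compared by the fraction of bits that differ. Treat it as an illustration rather than the paper's exact method. The band energies are assumed to be computed elsewhere, and the class and method names are hypothetical.

```csharp
using System;

public static class FingerprintSketch
{
    // energy[frame, band] holds the energy per frame and frequency band.
    // Assumption: these values were computed elsewhere (for example from an FFT).
    // Each frame after the first yields (bands - 1) bits packed into one uint,
    // so at most 33 bands are supported.
    public static uint[] SubFingerprints(double[,] energy)
    {
        int frames = energy.GetLength(0);
        int bands = energy.GetLength(1);
        if (bands < 2 || bands > 33)
            throw new ArgumentException("expected between 2 and 33 bands");
        if (frames < 2) return new uint[0];

        var result = new uint[frames - 1];
        for (int n = 1; n < frames; n++)
        {
            uint bits = 0;
            for (int m = 0; m < bands - 1; m++)
            {
                // Difference between adjacent bands, differenced again over time.
                double diff = (energy[n, m] - energy[n, m + 1])
                            - (energy[n - 1, m] - energy[n - 1, m + 1]);
                bits = (bits << 1) | (diff > 0 ? 1u : 0u);
            }
            result[n - 1] = bits;
        }
        return result;
    }

    // Count the set bits in x by clearing the lowest set bit until none remain.
    public static int CountBits(uint x)
    {
        int count = 0;
        while (x != 0) { x &= x - 1; count++; }
        return count;
    }

    // Fraction of differing bits between two equally long fingerprints:
    // 0.0 means identical, and unrelated audio should land near 0.5.
    public static double BitErrorRate(uint[] a, uint[] b, int bitsPerWord)
    {
        if (a.Length != b.Length || a.Length == 0)
            throw new ArgumentException("fingerprints must be non-empty and equally long");
        long errors = 0;
        for (int i = 0; i < a.Length; i++)
            errors += CountBits(a[i] ^ b[i]);
        return (double)errors / ((long)a.Length * bitsPerWord);
    }
}
```

Because only the signs of the differences are kept, small changes in level or mild noise tend to flip few bits, which is why a low bit error rate suggests the same recording. Where to put the decision threshold is left open here, since it depends on the data.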

The problem of finding musical similarity between two versions of the same song is known as cover song identification. This problem has also been studied heavily, but it is still considered open.

Perhaps the two most popular commercial solutions for content-based musical search are Midomi and Shazam.

I believe this addresses your question. Check Google Scholar for recent solutions to these problems. The ISMIR proceedings are available for free online.
