Dynamic Time Warping to compare two audio recordings

Posted 2024-08-19 08:31:14 · 160 words · 6 views · 0 comments

I'd like to use Dynamic Time Warping (DTW) to compare the feature vectors of two audio recordings (of course I'm doing all the necessary preprocessing first). My program should output the similarity between the two recordings as a percentage. For example, 100% means that the two recordings are completely identical, and the more the recordings differ, the lower the number. How do I go about this? DTW only gives me the length of the path or the cost of the alignment, and I don't know how to convert either of these numbers to a percentage.

Comments (1)

偷得浮生 2024-08-26 08:31:14

I'm not aware of any distance metric between signals that is measured in percent. If 100% has a meaning, then 0% must have one too. So first you need to ask yourself: what does 0% mean?

For DTW, I'm pretty sure that there is no established conversion of minimum distance to "percent match". If you must, then you need to define a heuristic quantity that is a function of the minimum DTW distance.

EDIT: Actually, you could sort of define a longest distance if you have two finite-length recordings. That would be the distance of a path that went (if looking at the cost matrix) all the way right then down, or all the way down then right. The best path, i.e. perfect match, goes down the main diagonal.
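One way to read the EDIT above, as a rough sketch rather than an established method: compute the usual minimum DTW cost, compute the cost of the two L-shaped border paths, and report how far the minimum falls below the more expensive border path. The function names, the use of 1-D feature sequences, the absolute difference as local cost, and the choice of the larger border path as the 0% reference are all assumptions made for illustration.

```python
def dtw_cost(a, b):
    """Minimum accumulated cost of aligning a with b using steps (0,1), (1,0), (1,1)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # acc[i][j] = cheapest accumulated cost of any path from (0, 0) to (i, j)
    acc = [[inf] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            local = abs(a[i] - b[j])  # assumed local cost: absolute difference
            if i == 0 and j == 0:
                best_prev = 0.0
            else:
                best_prev = min(
                    acc[i - 1][j] if i > 0 else inf,
                    acc[i][j - 1] if j > 0 else inf,
                    acc[i - 1][j - 1] if i > 0 and j > 0 else inf,
                )
            acc[i][j] = local + best_prev
    return acc[n - 1][m - 1]


def border_path_cost(a, b):
    """Cost of the more expensive of the two L-shaped paths along the matrix border."""
    n, m = len(a), len(b)
    # Along the first row, then down the last column.
    right_then_down = (sum(abs(a[0] - b[j]) for j in range(m))
                       + sum(abs(a[i] - b[m - 1]) for i in range(1, n)))
    # Down the first column, then along the last row.
    down_then_right = (sum(abs(a[i] - b[0]) for i in range(n))
                       + sum(abs(a[n - 1] - b[j]) for j in range(1, m)))
    return max(right_then_down, down_then_right)


def similarity_percent(a, b):
    """Heuristic score: 100 when the DTW cost is 0, 0 when it equals the border-path cost."""
    reference = border_path_cost(a, b)
    if reference == 0:
        return 100.0  # every local cost on the border is zero, so the DTW cost is zero too
    return 100.0 * (1.0 - dtw_cost(a, b) / reference)
```

Both border paths are valid warping paths, so the minimum DTW cost can never exceed the reference and the score stays between 0 and 100. The border path is only a convenient reference, not the costliest path that exists in the matrix.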

One simple idea: if using (0,1) (1,0) (1,1) as step candidates, you could maybe use the number of steps taken by (0,1) and (1,0) as a measure of badness. This measure certainly has a maximum and a minimum, so then it could be mapped to some desirable range like 0-100%.
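The step-counting idea could look roughly like the sketch below. With the step set (0,1), (1,0), (1,1) and sequences of lengths n and m, a path needs at least |n - m| non-diagonal steps and at most (n - 1) + (m - 1), which gives the minimum and maximum to map onto 0-100%. The function names, the 1-D inputs, the absolute-difference local cost, and the rule of preferring the diagonal on ties are assumptions for illustration; both sequences are assumed non-empty.

```python
def count_non_diagonal_steps(a, b):
    """Number of (0,1) and (1,0) steps on one optimal DTW path between a and b."""
    n, m = len(a), len(b)
    inf = float("inf")
    # Padded accumulated-cost matrix: row/column 0 means "before the start".
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local cost
            acc[i][j] = local + min(acc[i - 1][j - 1], acc[i - 1][j], acc[i][j - 1])

    # Backtrack from the last cell to the first, preferring the diagonal on ties.
    i, j, off_diagonal = n, m, 0
    while (i, j) != (1, 1):
        diag, up, left = acc[i - 1][j - 1], acc[i - 1][j], acc[i][j - 1]
        if diag <= up and diag <= left:
            i, j = i - 1, j - 1
        elif up <= left:
            i -= 1
            off_diagonal += 1
        else:
            j -= 1
            off_diagonal += 1
    return off_diagonal


def step_similarity_percent(a, b):
    """Map the non-diagonal step count linearly onto 0-100 (100 = fewest possible)."""
    n, m = len(a), len(b)
    fewest = abs(n - m)        # unavoidable when the lengths differ
    most = (n - 1) + (m - 1)   # a path with no diagonal step at all
    if most == fewest:
        return 100.0           # one sequence has length 1: only one path exists
    steps = count_non_diagonal_steps(a, b)
    return 100.0 * (most - steps) / (most - fewest)
```

Note that this score reflects only how much the timing had to be warped, not how different the aligned values are: two constant sequences of equal length with different values would still follow the diagonal and score 100%, so it may be worth combining it with the accumulated cost.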
