Separating Vocals from Music

Published 2024-07-12 13:34:52


Comments (5)

稚然 2024-07-19 13:34:52


Separating out distinct signals from audio is a very active area of research, and it is a very hard problem. This is often called Blind Signal Separation in the literature. (There is some MATLAB demo code in the previous link.)

Of course, if you know that there are vocals in the music, you can use one of the many vocal separation algorithms.

別甾虛僞 2024-07-19 13:34:52


As others have noted, solving this problem using only raw spectrum analysis is a dauntingly hard problem, and you're unlikely to find a good solution to it. At best, you might be able to extract some of the vocals and a few extra crossover frequencies from the mix.

However, if you can be more specific about the nature of the audio material you are working with here, you might be able to get a little bit further.

In the worst case, your material would be normal MP3s of regular songs -- i.e., a full band + vocalist. Given the nature of your question, I have a feeling this is probably the case you are looking at.

In the best case, you have access to the multitrack studio recordings and have at least a full mixdown and an instrumental track, in which case you could extract the vocal frequencies from the mix. You would do this by generating an impulse response from one of the tracks and applying it to the other.
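
Here is one rough way to do that in MATLAB -- a sketch under several assumptions: the file names (mix.wav, instrumental.wav, vocals_estimate.wav) are placeholders, the two tracks are already sample-aligned, and the transfer function is estimated crudely by averaging cross- and auto-spectra over fixed 4096-sample frames.

% Rough sketch -- placeholder filenames; assumes the two files are sample-aligned.
[m, fs]   = audioread('mix.wav');             % full mixdown
[inst, ~] = audioread('instrumental.wav');    % instrumental track
m = mean(m, 2);  inst = mean(inst, 2);        % mono for simplicity
frame = 4096;
nF = floor(min(length(m), length(inst)) / frame);
n  = nF * frame;
m = m(1:n);  inst = inst(1:n);

num = zeros(frame, 1);  den = zeros(frame, 1);
for k = 1:nF                                  % average cross/auto spectra over frames
    idx = (k-1)*frame + (1:frame);
    Ik  = fft(inst(idx));
    num = num + fft(m(idx)) .* conj(Ik);
    den = den + abs(Ik).^2;
end
H = num ./ (den + 1e-8);                      % transfer function: instrumental -> mix

vocals = zeros(n, 1);                         % re-filter the instrumental and subtract it
for k = 1:nF
    idx = (k-1)*frame + (1:frame);
    vocals(idx) = m(idx) - real(ifft(H .* fft(inst(idx))));
end
audiowrite('vocals_estimate.wav', vocals / (max(abs(vocals)) + eps), fs);

Averaging over many frames keeps H from reproducing the mix bin-for-bin, which is why the vocal (uncorrelated with the instrumental) survives the subtraction; a real implementation would use windowed, overlapping frames.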

In the middle case, you are dealing with simpler music to which you could apply some sort of algorithm tuned to the music's parameters. For instance, if you are dealing with electronic music, you can use the stereo width of the track to your advantage: eliminate all mono elements (i.e., basslines + kicks) to extract the vocals + other panned instruments, and then apply some type of filtering and spectrum analysis from there.
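
A minimal sketch of that mid/side trick, assuming a stereo file with the placeholder name track.wav and that the mono elements really are panned dead-center:

% Mid/side split -- 'track.wav' is a placeholder for a stereo file.
[x, fs] = audioread('track.wav');
side = (x(:,1) - x(:,2)) / 2;   % center-panned (mono) elements such as bassline + kick cancel out
mid  = (x(:,1) + x(:,2)) / 2;   % the center content itself, kept for comparison
audiowrite('side.wav', side, fs);
audiowrite('mid.wav',  mid,  fs);

The filtering and spectrum analysis mentioned above would then be applied to side.wav.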

In short, if you are planning on making an all-purpose algorithm to generate clean acapella cuts from arbitrary source material, you're probably biting off more than you can chew here. If you can specifically limit your source material, then you have a number of algorithms at your disposal depending on the nature of those sources.

迷迭香的记忆 2024-07-19 13:34:52


This is hard. If you can do this reliably you will be an accomplished computer scientist. The most promising method I have read about used the lyrics to generate a voice-only track for comparison. Again, if you can do this and write a paper about it you will be famous (amongst computer scientists). Plus, you could make a lot of money by automatically generating timings for karaoke.

彡翼 2024-07-19 13:34:52


If you just want to decide whether a block of music is clean a cappella or has an instrumental background, you could probably do that by comparing the bandwidth of the signal to that of a normal human singer. You could also check the fundamental frequency, which can only fall within a fairly limited range for human voices.

Still, it probably won't be easy. Hearing aids do this all the time, though, so it is clearly doable. (Although they typically look for speech, not singing.)
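
A hedged sketch of that kind of check, assuming a file with the placeholder name block.wav, base MATLAB plus xcorr from the Signal Processing Toolbox, and thresholds that are pure guesses:

% Crude "voice alone vs. accompaniment" check -- the thresholds need tuning.
[x, fs] = audioread('block.wav');
x = mean(x, 2);                                    % mono

X = abs(fft(x)).^2;                                % bandwidth: share of energy above 5 kHz
f = (0:length(X)-1)' * fs / length(X);
keep = f < fs/2;
hfRatio = sum(X(keep & f > 5000)) / sum(X(keep));

r = xcorr(x, 'coeff');                             % periodicity / fundamental frequency
r = r(length(x):end);                              % lags 0..N-1, r(1) is lag 0
lagMin = round(fs/1000);  lagMax = round(fs/80);   % search 80-1000 Hz
[pk, idx] = max(r(lagMin+1 : lagMax+1));
f0 = fs / (lagMin + idx - 1);                      % f0 estimate, constrained to 80-1000 Hz by the search

looksLikeVoiceAlone = (hfRatio < 0.05) && (pk > 0.3);

Both tests are crude, but together they roughly mirror the bandwidth and fundamental-frequency checks described above.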

如此安好 2024-07-19 13:34:52


First sync the instrumental with the original: make sure they are the same length and sample rate and that they start and end at exactly the same time, then convert them both to .wav.

Then do something like:

[N, fs] = audioread('normal.wav');        % full mix; audioread/audiowrite replace the older wavread/wavwrite
[I, ~]  = audioread('instrumental.wav');  % time-aligned instrumental, same length and sample rate
A = N - I;                                % phase cancellation; if it sounds wrong, play around (e.g. try N + I)
audiowrite('acapella.wav', A, fs);

That should do it. A little linear algebra goes a long way.
