Programmatically increase the pitch of an array of audio samples

Posted on 2024-10-19 18:19:38

Hello kind people of the audio computing world,

I have an array of samples that represent a recording. Let us say that it is 5 seconds at 44100 Hz. How would I play this back at an increased pitch? And is it possible to increase and decrease the pitch dynamically? For example, have the pitch slowly increase to double the speed and then come back down.

In other words I want to take a recording and play it back as if it is being 'scratched' by a d.j.

Pseudocode is always welcomed. I will be writing this up in C.

Thanks,


EDIT 1

Allow me to clarify my intentions. I want to keep the playback at 44100 Hz, so I need to manipulate the samples before playback. This is also because I want to mix the audio that has an increased pitch with audio that is running at a normal rate.

Expressed in another way, maybe I need to shrink the audio over the same number of samples somehow? That way when it is played back it will sound faster?


EDIT 2

Also, I would like to do this myself. No libraries please (unless you feel I could pick through the code and find something interesting).


EDIT 3

A sample piece of code written in C that takes 2 arguments (array of samples and pitch factor) and then returns an array of the new audio would be fantastic!


PS I've started a bounty on this, not because I think the answers already given are invalid. I just thought it would be good to get more feedback on the subject.



AWARD OF BOUNTY

Honestly I wish I could distribute the bounty over several different answers, as there were quite a few that I thought were super helpful. Special shoutout to Daniel for passing me some code, and to AShelly and Hotpaw2 for putting in such detailed responses.

Ultimately though I used an answer from another SO question referenced by datageist and so the award goes to him.

Thanks again everyone!

Comments (6)

甜柠檬 2024-10-26 18:19:38

Take a look at the "Elephant" paper in Nosredna's answer to this (very similar) SO question:
How do you do bicubic (or other non-linear) interpolation of re-sampled audio data?

Sample implementations are provided starting on page 37, and for reference, AShelly's answer corresponds to linear interpolation (on that same page). With a little tweaking, any of the other formulas in the paper could be plugged into that framework.

For evaluating the quality of a given interpolation method (and understanding the potential problems with using "cheaper" schemes), take a look at this page:

http://www.discodsp.com/highlife/aliasing/

For more theory than you probably want to deal with (with source code), this is a good reference as well:

https://ccrma.stanford.edu/~jos/resample/

柠檬心 2024-10-26 18:19:38

One way is to keep a floating point index into the original wave, and mix interpolated samples into the output wave.

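// Simulate scratching of `inwave`:
// `rate` is the speedup/slowdown factor (must be > 0).
// `inputLen` is the number of samples in `inwave`.
// The result is mixed into `outwave`, which needs room for about inputLen/rate samples.
// "Sample" is a typedef for the raw audio type.
void ScratchMix(Sample* outwave, const Sample* inwave, int inputLen, float rate)
{
   float index = 0;   // a float loses fractional precision on long clips; consider double
   while (index < inputLen - 1)   // stop one short so inwave[i+1] stays in bounds
   {
      int i = (int)index;
      float frac = index - i;      // will be between 0 and 1
      Sample s1 = inwave[i];
      Sample s2 = inwave[i + 1];
      *outwave++ += s1 + (s2 - s1) * frac;   // do clipping here if needed
      index += rate;
   }
}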

If you want to change rate on the fly, you can do that too.
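As a rough sketch of what changing the rate on the fly could look like, the version below recomputes the step on every output sample. It assumes `float` samples, and the names `scratch_mix_ramp` and `ramp_rate` are made up for illustration. The rate ramps linearly from 1.0 up to `peak_rate` at the middle of the input and back down to 1.0, which is just one possible curve.

```c
#include <stddef.h>

/* Hypothetical helper: playback rate as a function of the read position.
   Ramps linearly from 1.0 at the start to peak_rate at the midpoint
   and back to 1.0 at the end of the input. */
static double ramp_rate(double pos, double last, double peak_rate)
{
    double t = pos / last;                                  /* 0..1 through the input */
    double tri = (t < 0.5) ? 2.0 * t : 2.0 * (1.0 - t);     /* 0..1..0 */
    return 1.0 + (peak_rate - 1.0) * tri;
}

/* Mixes a variable-speed copy of in[0..n-1] into out[0..out_cap-1]
   using linear interpolation. Returns the number of samples mixed. */
size_t scratch_mix_ramp(float *out, size_t out_cap,
                        const float *in, size_t n, double peak_rate)
{
    if (n < 2 || peak_rate <= 0.0) return 0;
    double last = (double)(n - 1);
    double index = 0.0;
    size_t written = 0;
    while (index < last && written < out_cap) {
        size_t i = (size_t)index;
        double frac = index - (double)i;                    /* between 0 and 1 */
        out[written++] += (float)(in[i] + (in[i + 1] - in[i]) * frac);
        index += ramp_rate(index, last, peak_rate);         /* rate changes every step */
    }
    return written;
}
```

Because the step varies, the output length is not known up front, so the function reports how many samples it mixed. For a scratch effect, the ramp could be replaced by any function that stays positive.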

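If this creates noisy artifacts when rate > 1, try replacing *outwave++ += s1 + (s2-s1)*frac; with this technique (from this question). Note that it reads inwave[i-1] through inwave[i+2], so it is only valid for 1 <= i <= inputLen-3; fall back to the linear version at the edges.

*outwave++ += InterpolateHermite4pt3oX(inwave+i-1, frac);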

where

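// 4-point, 3rd-order Hermite interpolation.
// x points at four consecutive samples; the result lies between x[1] and x[2].
// t is the fractional position between x[1] and x[2], from 0 to 1.
static float InterpolateHermite4pt3oX(const Sample* x, float t)
{
    float c0 = x[1];
    float c1 = .5F * (x[2] - x[0]);
    float c2 = x[0] - (2.5F * x[1]) + (2 * x[2]) - (.5F * x[3]);
    float c3 = (.5F * (x[3] - x[0])) + (1.5F * (x[1] - x[2]));
    return (((((c3 * t) + c2) * t) + c1) * t) + c0;
}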

Example of using the linear interpolation technique on "Windows Startup.wav" with a factor of 1.1. The original is on top, the sped-up version is on the bottom:

It may not be mathematically perfect, but it sounds as it should, and ought to work fine for the OP's needs.
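For the shape the OP asked for in EDIT 3 (samples plus a pitch factor in, a new array out), something like the following might do. This is only a sketch: it assumes `float` samples, the name `resample_linear` is made up, and it uses the same linear interpolation with no low-pass filtering, so factors well above 1 can alias.

```c
#include <stdlib.h>
#include <stddef.h>

/* Returns a newly allocated array holding in[0..n-1] resampled by
   pitch_factor (2.0 = one octave up and about half as long, 0.5 = one
   octave down and about twice as long). The length goes into *out_len.
   The caller frees the result. Returns NULL on bad input or if
   allocation fails. */
float *resample_linear(const float *in, size_t n, double pitch_factor,
                       size_t *out_len)
{
    *out_len = 0;
    if (in == NULL || n < 2 || pitch_factor <= 0.0) return NULL;

    /* Number of read positions k * pitch_factor that fall inside [0, n-1]. */
    size_t m = (size_t)((double)(n - 1) / pitch_factor) + 1;
    float *out = malloc(m * sizeof *out);
    if (out == NULL) return NULL;

    for (size_t k = 0; k < m; k++) {
        /* Multiplying instead of accumulating avoids rounding drift. */
        double pos = (double)k * pitch_factor;
        size_t i = (size_t)pos;
        if (i >= n - 1) {            /* landed on (or rounded past) the last sample */
            out[k] = in[n - 1];
            continue;
        }
        double frac = pos - (double)i;
        out[k] = (float)(in[i] + (in[i + 1] - in[i]) * frac);
    }
    *out_len = m;
    return out;
}
```

The returned array can then be mixed sample by sample with audio running at the normal rate, since both are still played back at 44100 Hz.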

旧话新听 2024-10-26 18:19:38

Yes, it is possible.

But this is not a small amount of pseudo code. You are asking for a time pitch modification algorithm, which is a fairly large and complicated amount of DSP code for decent results.

Here's a Time Pitch stretching overview from DSP Dimensions. You can also Google for phase vocoder algorithms.

ADDED:

If you want to "scratch", as a DJ might do with an LP on a physical turntable, you don't need time-pitch modification. Scratching changes the pitch and the speed of play by the same amount (not independently as would require time-pitch modification).

And the resulting array won't be of the same length, but will be shorter or longer by the amount of pitch/speed change.

You can change the pitch, and make the sound play faster or slower by the same ratio, by resampling the signal with properly filtered interpolation. Instead of advancing the read position by 1.0 per output sample, advance it by your desired rate change (a floating-point addition), then filter and interpolate the data at that position. Interpolation using a windowed Sinc interpolation kernel, with the low-pass filter transition frequency below the Nyquist frequency (half the sample rate) of the lower of the original and interpolated local sample rates, will work fairly well. Searching for "windowed Sinc interpolation" on the web returns lots of suitable results.

You need an interpolation method that includes a low-pass filter, or else you will hear horrible aliasing noise. (The exception to this might be if your original sound file is already severely low-pass filtered a decade or more below the sample rate.)

锦欢 2024-10-26 18:19:38

If you want this done easily, see AShelly's suggestion [edit: as a matter of fact, try it first anyway]. If you need good quality, you basically need a phase vocoder.

The very basic idea of a phase vocoder is to find the frequencies that the sound consists of, change those frequencies as needed and resynthesize the sound. So a brutal simplification would be:

  1. run FFT
  2. change all frequencies by a factor
  3. run inverse FFT

If you're going to implement this yourself, you definitely should read a thorough explanation of how a phase vocoder works. The algorithm really needs many more considerations than the three-step simplification above.

Of course, ready-made implementations exist, but from the question I gather you want to do this yourself.

流云如水 2024-10-26 18:19:38

To decrease and increase the pitch is as simple as playing the sample back at a lower or higher rate than 44.1kHz. This will produce the slower/faster record sound but you'll need to add the 'scratchiness' of real records.

空‖城人不在 2024-10-26 18:19:38

This helped me with resampling, which is the same thing you need, just viewed from the opposite side.

If you can't find code, ping me, I have a nice C routine for this.
