Programmatically increase the pitch of an array of audio samples

Posted 2024-10-19 18:19:38

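Hello, kind people of the audio computing world,

I have an array of samples that represents a recording. Let's say it is 5 seconds at 44100 Hz. How would I play this back at an increased pitch? And is it possible to increase and decrease the pitch dynamically? For example, have the pitch slowly rise until playback is at double speed and then come back down.

In other words, I want to take a recording and play it back as if it were being "scratched" by a DJ.

Pseudocode is always welcome. I will be writing this in C.

Thanks,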

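Edit 1

Allow me to clarify my intentions. I want to keep playback at 44100 Hz, so I need to manipulate the samples before playback. This is also because I want to mix the pitch-raised audio with audio that is running at the normal rate.

Put another way, maybe I need to somehow shrink the audio over the same number of samples? That way it will sound faster when it is played back?

Edit 2

Also, I would like to do this myself. No libraries please (unless you feel I could pick through the code and find something interesting).

Edit 3

A sample piece of code written in C that takes 2 arguments (an array of samples and a pitch factor) and returns an array of the new audio would be fantastic!

PS: I have started a bounty on this, not because I think the answers already given are invalid. I just thought it would be good to get more feedback on the subject.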

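Bounty award

Honestly, I wish I could split the bounty across several answers, as quite a few of them were very helpful. Special thanks to Daniel for passing me some code, and to AShelly and Hotpaw2 for such detailed responses.

Ultimately, I used an answer from another SO question referenced by datageist, so the award goes to him.

Thanks again, everyone!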


甜柠檬 2024-10-26 18:19:38

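Take a look at the "Elephant" paper in Nosredna's answer to this (very similar) SO question:
How do you do bicubic (or other non-linear) interpolation of re-sampled audio data?

Sample implementations are provided starting on page 37, and for reference, AShelly's answer corresponds to linear interpolation (on that same page). With a little tweaking, any of the other formulas in the paper can be plugged into that framework.

To evaluate the quality of a given interpolation method (and to understand the potential problems with "cheaper" schemes), take a look at this page: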

http://www.discodsp.com/highlife/aliasing/

For more theory than you probably want to deal with (plus source code), this is a good reference as well:

https://ccrma.stanford.edu/~jos/resample/


柠檬心 2024-10-26 18:19:38

One way is to keep a floating point index into the original wave, and mix interpolated samples into the output wave.

// Simulate scratching of `inwave`:
// `rate` is the speedup/slowdown factor.
// The result is mixed into `outwave`.
// "Sample" is a typedef for the raw audio type.
// `inputLen` is the number of samples in `inwave`; `outwave` must have
// room for about inputLen/rate samples.
void ScratchMix(Sample* outwave, const Sample* inwave, int inputLen, float rate)
{
   float index = 0;
   while (index < inputLen - 1)   //stop early so inwave[i+1] stays in bounds
   {
      int i = (int)index;
      float frac = index-i;      //will be between 0 and 1
      Sample s1 = inwave[i];
      Sample s2 = inwave[i+1];
      *outwave++ += s1 + (s2-s1)*frac;   //do clipping here if needed
      index+=rate;
   }
}

If you want to change rate on the fly, you can do that too.
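As a minimal sketch of changing the rate on the fly, the loop below reads the step from a caller-supplied array instead of a constant, one value per output sample. The names `ScratchVariableRate`, `rates`, and `outCap` are hypothetical and do not come from the answer, `Sample` is assumed to be `float`, and `rates` is assumed to hold at least `outCap` entries. Unlike `ScratchMix`, it overwrites the output rather than mixing into it.

```c
#include <stddef.h>

typedef float Sample;  /* assumption: samples are stored as floats */

/* Resample `in` (inLen samples) with a playback rate that may change on
 * every output sample. rates[n] is the step taken after writing output
 * sample n (1.0 = normal speed, 2.0 = double speed, negative = backwards).
 * Writes at most outCap samples and returns the number written. */
size_t ScratchVariableRate(Sample *out, size_t outCap,
                           const Sample *in, size_t inLen,
                           const float *rates)
{
    double index = 0.0;  /* double keeps the fraction precise on long inputs */
    size_t n = 0;

    if (inLen < 2) return 0;

    /* Stop when the read position leaves [0, inLen-1) so in[i+1] is valid. */
    while (n < outCap && index >= 0.0 && index < (double)(inLen - 1)) {
        size_t i = (size_t)index;
        float frac = (float)(index - (double)i);
        out[n] = in[i] + (in[i + 1] - in[i]) * frac;  /* linear interpolation */
        index += rates[n];
        n++;
    }
    return n;
}
```

Filling `rates` with a ramp from 1.0 up to 2.0 and back down would give the "slowly speed up, then slow down" effect described in the question.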

If this creates noisy artifacts when rate > 1, try replacing *outwave++ += s1 + (s2-s1)*frac; with this technique (from this question)

*outwave++ += InterpolateHermite4pt3oX(inwave+i-1,frac);   //needs 1 <= i <= inputLen-3

where

// 4-point, 3rd-order Hermite interpolation. x[1] and x[2] bracket the
// interpolated position; t is the fraction between them (0 to 1).
static float InterpolateHermite4pt3oX(const Sample* x, float t)
{
    float c0 = x[1];
    float c1 = .5F * (x[2] - x[0]);
    float c2 = x[0] - (2.5F * x[1]) + (2 * x[2]) - (.5F * x[3]);
    float c3 = (.5F * (x[3] - x[0])) + (1.5F * (x[1] - x[2]));
    return (((((c3 * t) + c2) * t) + c1) * t) + c0;
}

[Figure: the linear interpolation technique applied to "Windows Startup.wav" with a factor of 1.1; the original waveform is on top, the sped-up version on the bottom.]

It may not be mathematically perfect, but it sounds like it should, and ought to work fine for the OP's needs.
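For the shape the question asks for (samples and a pitch factor in, a new array out), here is one possible sketch built on the same 4-point Hermite formula. The names `Hermite4`, `At`, and `ResampleHermite` are made up for illustration, samples are assumed to be `float`, and repeating the edge samples is just one way of handling the first and last positions, where the four-point window would otherwise read outside the array.

```c
#include <stdlib.h>

/* 4-point, 3rd-order Hermite interpolation between y0 and y1 (ym1 and y2
 * are the outer neighbours); t in [0,1) is the position between y0 and y1. */
static float Hermite4(float ym1, float y0, float y1, float y2, float t)
{
    float c0 = y0;
    float c1 = 0.5f * (y1 - ym1);
    float c2 = ym1 - 2.5f * y0 + 2.0f * y1 - 0.5f * y2;
    float c3 = 0.5f * (y2 - ym1) + 1.5f * (y0 - y1);
    return ((c3 * t + c2) * t + c1) * t + c0;
}

/* Read in[i], clamping i to the valid range so the edge samples repeat. */
static float At(const float *in, size_t inLen, long i)
{
    if (i < 0) i = 0;
    if (i >= (long)inLen) i = (long)inLen - 1;
    return in[i];
}

/* Return a newly allocated array holding `in` played `rate` times faster
 * (rate > 0). The caller frees it; *outLen receives its length.
 * Returns NULL on bad arguments or allocation failure. */
float *ResampleHermite(const float *in, size_t inLen, double rate,
                       size_t *outLen)
{
    if (!in || !outLen || inLen == 0 || rate <= 0.0) return NULL;

    size_t cap = (size_t)((double)inLen / rate) + 1;
    float *out = malloc(cap * sizeof *out);
    if (!out) return NULL;

    size_t n;
    for (n = 0; n < cap; n++) {
        double pos = (double)n * rate;  /* recomputed each time: no drift */
        if (pos >= (double)inLen) break;
        long i = (long)pos;
        float t = (float)(pos - (double)i);
        out[n] = Hermite4(At(in, inLen, i - 1), At(in, inLen, i),
                          At(in, inLen, i + 1), At(in, inLen, i + 2), t);
    }
    *outLen = n;
    return out;
}
```

Like the linear version, this contains no low-pass filter, so large speedups can still alias.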


旧话新听 2024-10-26 18:19:38

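Yes, it is possible.

~~But this is not a small amount of pseudocode. You need a time-pitch modification algorithm, which is a fairly large and complicated amount of DSP code if you want decent results.~~

~~Here is a time-pitch stretching overview from DSP Dimensions. You can also Google for phase vocoder algorithms.~~

Added:

If you want to "scratch", as a DJ might do with an LP on a physical turntable, you don't need time-pitch modification. Scratching changes the pitch and the playback speed by the same amount (not independently, which is what would require time-pitch modification).

And the resulting array won't have the same length; it will be shorter or longer according to the pitch/speed change.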
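To make the length change concrete: reading an input of `inLen` samples at positions 0, rate, 2·rate, ... yields roughly `inLen / rate` output samples. For the 5-second, 44100 Hz recording in the question (220500 samples), a rate of 2.0 gives 110250 samples and a rate of 0.5 gives 441000. A small sketch, where the helper name `ResampledLength` is made up for illustration:

```c
#include <stddef.h>

/* Number of output samples produced when an input of inLen samples is read
 * at positions 0, rate, 2*rate, ... for as long as the position stays
 * below inLen. Returns 0 if rate is not positive. */
size_t ResampledLength(size_t inLen, double rate)
{
    if (rate <= 0.0 || inLen == 0) return 0;

    double exact = (double)inLen / rate;
    size_t n = (size_t)exact;
    if ((double)n < exact) n++;  /* round up: ceil(inLen / rate) */
    return n;
}
```

A helper like this is handy for sizing the output buffer before resampling.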

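You can change the pitch, and make the sound play faster or slower by the same ratio, just by resampling the signal with properly filtered interpolation. Instead of advancing the read position by 1.0 for each output sample, advance it by a floating-point addition of your desired rate change, then filter and interpolate the data at that position. Interpolation with a windowed sinc kernel, with the low-pass transition frequency below the lower of the original and the interpolated local sample rates, will work fairly well. Searching the web for "windowed Sinc interpolation" returns lots of suitable results.

You need an interpolation method that includes a low-pass filter, or else you will hear horrible aliasing noise. (The exception might be if your original sound file is already severely low-pass filtered, a decade (a factor of ten) or more below the sample rate.)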