Downsampling digital audio and applying a low-pass filter

Posted on 2024-07-08 02:51:55

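I've got a 44 kHz audio stream from a CD, represented as an array of 16-bit PCM samples. I'd like to cut it down to an 11 kHz stream. How do I do that? From my engineering classes many years ago, I know that the stream will no longer be able to describe anything above 5500 Hz accurately, so I guess I want to cut out everything above that too. Any ideas? Thanks.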

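Update: There is some code on this page that converts from 48 kHz to 8 kHz using a simple algorithm and a coefficient array that looks like { 1, 4, 12, 12, 4, 1 }. I think that is what I need, but I need it for a factor of 4 rather than 6. Any idea how those constants are calculated? Also, I end up converting the 16-bit samples to floats anyway, so I can do the downsampling with floats rather than shorts, if that helps the quality at all.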

山川志 2024-07-15 02:51:55

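Read up on FIR and IIR filters. These are the filters that use a coefficient array.

If you do a quick Google search for "FIR or IIR filter designer", you will find plenty of software and online applets that do the hard work (computing the coefficients) for you.

EDIT:

This page (http://www-users.cs.york.ac.uk/~fisher/mkfilter/) lets you enter the parameters of your filter and outputs ready-to-use C code...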
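
To give a rough idea of what such a designer computes, here is a minimal sketch of one common approach, the windowed-sinc method. It is not necessarily what the linked tool does, and the function name `design_lowpass`, the 31-tap length, the Hamming window, and the 5 kHz cutoff are all illustrative assumptions rather than recommended values.

```python
import math

def design_lowpass(num_taps, cutoff_hz, sample_rate_hz):
    """Return FIR low-pass coefficients using a Hamming-windowed sinc."""
    fc = cutoff_hz / sample_rate_hz      # normalized cutoff in cycles per sample
    center = (num_taps - 1) / 2.0
    taps = []
    for n in range(num_taps):
        t = n - center
        # Ideal low-pass impulse response sin(2*pi*fc*t) / (pi*t), with its limit at t = 0
        if t == 0:
            ideal = 2.0 * fc
        else:
            ideal = math.sin(2.0 * math.pi * fc * t) / (math.pi * t)
        # The Hamming window tapers the truncated sinc to reduce ripple
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [c / total for c in taps]     # normalize so the gain at DC is 1

# Assumed example values: 31 taps, 5 kHz cutoff, 44.1 kHz input rate
taps = design_lowpass(num_taps=31, cutoff_hz=5000.0, sample_rate_hz=44100.0)
```

More taps give a steeper transition at the cost of more arithmetic per output sample.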

攒眉千度 2024-07-15 02:51:55

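You're right that you need to apply low-pass filtering to your signal. Anything above 5500 Hz will still be present in your downsampled signal, but "aliased" to another frequency, so you have to remove it before downsampling.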
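
As a small illustration of where an unfiltered tone ends up, the following sketch folds a frequency back into the range representable at the new rate. The helper name `aliased_frequency` is made up for this example, and 11000 Hz is the round figure from the question.

```python
def aliased_frequency(freq_hz, sample_rate_hz):
    """Frequency at which a pure tone appears after sampling at sample_rate_hz."""
    folded = freq_hz % sample_rate_hz
    # Frequencies above half the sample rate mirror back below it
    return min(folded, sample_rate_hz - folded)

print(aliased_frequency(3000, 11000))   # 3000 (below 5500 Hz, unchanged)
print(aliased_frequency(7000, 11000))   # 4000 (aliased)
print(aliased_frequency(10000, 11000))  # 1000 (aliased)
```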

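Doing the filtering with floats is a good idea. There are fixed-point filter algorithms too, but they generally involve quality trade-offs to make them work. If you've got floats, use them!

Using a DFT for filtering is generally overkill, and it makes things more complicated because a DFT works on buffers rather than as a continuous process.

Digital filters generally come in two flavors: FIR and IIR. They are broadly the same idea, but IIR filters use feedback loops to achieve a steeper response with far fewer coefficients. That might be a good idea for downsampling, because you need a very steep filter slope there.

Downsampling is something of a special case. Because you are going to throw away 3 out of every 4 samples, there is no need to calculate them. There is a special class of filters for this, called polyphase filters.

Try googling for polyphase IIR or polyphase FIR for more information.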
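
The saving described above can be sketched for the FIR case as follows. This is not a full polyphase structure, only the basic idea of evaluating the filter at the output positions that are kept. The name `decimate_fir` and the 4-tap box filter in the example are assumptions for illustration.

```python
def decimate_fir(samples, taps, factor):
    """Apply FIR taps and keep every factor-th output, computing only those outputs."""
    out = []
    # Step the window by `factor`, so skipped outputs are never evaluated
    for start in range(0, len(samples) - len(taps) + 1, factor):
        acc = 0.0
        # Taps are applied without reversal, which is equivalent for symmetric taps
        for k, c in enumerate(taps):
            acc += c * samples[start + k]
        out.append(acc)
    return out

samples = [float(i) for i in range(16)]
box = [0.25, 0.25, 0.25, 0.25]          # placeholder filter; a real one would be steeper
print(decimate_fir(samples, box, 4))    # [1.5, 5.5, 9.5, 13.5]
```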

枯叶蝶 2024-07-15 02:51:55

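Note (in addition to the other comments) that the simple, intuitive approach of "downsample by a factor of 4 by replacing each group of 4 consecutive samples with their average" is not optimal, but it is not wrong either, practically or conceptually, because averaging is precisely a low-pass filter (a rectangular window, which corresponds to a sinc in frequency). What would be conceptually wrong is to downsample by simply taking one out of every 4 samples: that would definitely introduce aliasing.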
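
A minimal sketch of the difference, using an 11 kHz tone sampled at 44 kHz, which is well above the new 5.5 kHz limit. The function names are made up for this example. This particular frequency falls exactly on a null of the 4-point average, so it is a best case; other frequencies above 5.5 kHz are only partly attenuated by averaging.

```python
def average_by_4(samples):
    """Replace each group of 4 consecutive samples with their mean."""
    return [sum(samples[i:i + 4]) / 4.0 for i in range(0, len(samples) - 3, 4)]

def pick_every_4th(samples):
    """Keep one sample out of every 4 with no filtering at all."""
    return samples[::4]

# cos(2*pi*11000*n/44000) for n = 0..15, written out exactly: 1, 0, -1, 0, ...
tone = [1.0, 0.0, -1.0, 0.0] * 4

print(average_by_4(tone))    # [0.0, 0.0, 0.0, 0.0]  tone removed
print(pick_every_4th(tone))  # [1.0, 1.0, 1.0, 1.0]  tone aliased to a constant offset
```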

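By the way: practically any software that does resampling (audio, image, or whatever; sox is an example for audio) takes this into account, and often lets you choose the underlying low-pass filter.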

嘴硬脾气大 2024-07-15 02:51:55

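You need to apply a low-pass filter before you downsample the signal to avoid "aliasing". The cutoff frequency of the low-pass filter should be below the Nyquist frequency, which is half the sample frequency of the downsampled signal (so below 5.5 kHz for an 11 kHz output).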

幸福还没到 2024-07-15 02:51:55

The "best" solution possible is indeed a DFT, discarding the top 3/4 of the frequencies, and performing an inverse DFT with the domain restricted to the bottom 1/4. Discarding the top 3/4 is the low-pass filter in this case. Padding to a power-of-2 number of samples will probably give you a speed benefit. Be aware of how your FFT package stores its output, though. If it's a complex FFT (which is much easier to analyze, and generally has nicer properties), the frequencies will either go from -22 kHz to 22 kHz, or from 0 to 44 kHz. In the first case, you want the middle 1/4. In the latter, the outermost 1/4 (the first and last eighths).
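
A rough sketch of that route, using a deliberately naive O(N^2) DFT from the standard library so the bin bookkeeping is visible; a real program would use an FFT package. It assumes the "0 to 44" storage layout, a buffer length divisible by the factor, and an even output length. The names `dft` and `downsample_dft` are made up for this example, and the DFT treats the buffer as periodic.

```python
import cmath
import math

def dft(x):
    """Naive forward DFT; bin k holds frequency k * sample_rate / len(x)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def downsample_dft(samples, factor):
    """Keep only the lowest 1/factor of the spectrum and invert at the shorter length."""
    n = len(samples)
    if n % factor != 0:
        raise ValueError("buffer length must be a multiple of factor")
    m = n // factor
    half = m // 2
    spectrum = dft(samples)
    kept = [0j] * m
    for k in range(half):            # non-negative frequencies: first part of the array
        kept[k] = spectrum[k]
    for j in range(1, half):         # negative frequencies: stored at the far end
        kept[m - j] = spectrum[n - j]
    # Inverse DFT of length m; dividing by n rather than m preserves the amplitude
    return [sum(kept[k] * cmath.exp(2j * math.pi * k * t / m) for k in range(m)).real / n
            for t in range(m)]

# 64-sample buffer: a low tone in bin 3 (kept) plus a high tone in bin 20 (discarded)
signal = [math.cos(2.0 * math.pi * 3 * t / 64) + 0.5 * math.cos(2.0 * math.pi * 20 * t / 64)
          for t in range(64)]
result = downsample_dft(signal, 4)   # 16 samples containing only the low tone
```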

You can do an adequate job by averaging sample values together. The naïve way of grabbing samples four by four and doing an equal weighted average works, but isn't too great. Instead you'll want to use a "kernel" function that averages them together in a non-intuitive way.

Mathwise, discarding everything outside the low-frequency band is multiplication by a box function in frequency space. The (inverse) Fourier transform turns pointwise multiplication into a convolution of the (inverse) Fourier transforms of the functions, and vice versa. So, if we want to work in the time domain, we need to perform a convolution with the (inverse) Fourier transform of the box function. This turns out to be proportional to the "sinc" function sin(at)/(at), where a is set by the width of the box in frequency space. So at every 4th location (since you're downsampling by a factor of 4) you can add up the points near it, each multiplied by sin(a dt)/(a dt), where dt is the distance in time to that location. How nearby? Well, that depends on how good you want it to sound. It's common to ignore everything outside the first zero, for instance, or just to take the number of points to be the ratio by which you're downsampling.
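
Written out, the step from the box to the sinc looks like this. The symbols are assumptions introduced here: f_c is the cutoff in Hz, f_s the original sample rate, a = 2π f_c, and n the distance in input samples from the output location.

```latex
H(f) = \begin{cases} 1, & |f| \le f_c \\ 0, & |f| > f_c \end{cases}
\quad\Longrightarrow\quad
h(t) = \int_{-f_c}^{f_c} e^{2\pi i f t}\,df
     = \frac{\sin(2\pi f_c t)}{\pi t}
     = \frac{a}{\pi}\cdot\frac{\sin(at)}{at},
\qquad a = 2\pi f_c

% Downsampling by 4: cutoff at half the new rate, evaluated at the input sample times
f_c = \frac{f_s}{8},\quad t = \frac{n}{f_s}
\quad\Longrightarrow\quad
at = \frac{\pi n}{4},
\qquad
w[n] \propto \frac{\sin(\pi n/4)}{\pi n/4}
```

With these assumptions the first zero of the weights falls at n = ±4, so "everything inside the first zero" means the three input samples on either side of each kept location, plus the sample at that location.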

Finally there's the piss-poor (but fast) way of just discarding the majority of the samples, keeping just the zeroth, the fourth, and so on.

Honestly, if it fits in memory, I'd recommend just going the DFT route. If it doesn't, use one of the software filter packages that others have recommended to construct the filter for you.
