Cleaning up noisy cepstrum results

Posted 2024-10-21 18:46:07

I've been working on a simple frequency detection setup on the iPhone. Frequency-domain analysis using FFT results has been somewhat unreliable in the presence of harmonics. I was hoping to use cepstrum results to help decide which fundamental frequency is playing.

I am working with AudioQueues in the AudioToolbox framework, and do the Fourier transforms using the Accelerate framework.

My process has been exactly what is listed on Wikipedia's Cepstrum article for the Real Power Cepstrum, specifically: signal → FT → abs() → square → log → FT → abs() → square → power cepstrum.
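For reference, the chain listed above can be sketched in plain Python. This is only an illustration, not the vDSP code from the post: the `fft` helper, the `floor` added before the logarithm, the synthetic ten-harmonic test tone, and the 250-1000 Hz search range are all assumptions made for the example. The tone is chosen so that one period is exactly 128 samples, which avoids spectral leakage; a real recording will not be this clean.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def power_cepstrum(signal, floor=1e-12):
    """signal -> FT -> abs -> square -> log -> FT -> abs -> square."""
    power = [abs(c) ** 2 for c in fft(signal)]
    # The floor avoids log(0); its value is an assumption, not from the post.
    log_power = [math.log(p + floor) for p in power]
    return [abs(c) ** 2 for c in fft(log_power)]


# Synthetic tone (assumed values): 10 equal harmonics, period of 128 samples.
FS, PERIOD, N = 44100, 128, 4096
signal = [sum(math.sin(2 * math.pi * h * i / PERIOD) for h in range(1, 11))
          for i in range(N)]
cep = power_cepstrum(signal)

# Search only the quefrency bins that map to an assumed pitch range.
F_MIN, F_MAX = 250.0, 1000.0
q_lo, q_hi = int(FS / F_MAX), int(FS / F_MIN)
peak_bin = max(range(q_lo, q_hi + 1), key=lambda q: cep[q])
estimated_f0 = FS / peak_bin  # quefrency bin -> frequency in Hz
```

Restricting the search to a plausible pitch range keeps the very large low-quefrency bins and the repeated peaks at multiples of the period out of the comparison.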

The problem I have is that the Cepstrum results are extremely noisy. I have to drop the first and last 20 values as they are astronomical compared to the other values. Even after "cleaning" the data, there is still a huge amount of variation - far more than I would expect given the first graph. See the pictures below for the visualizations of the frequency domain and the quefrency domain.
[Figure: FFT (frequency domain)]

[Figure: Cepstrum (quefrency domain)]

When I see such a clear winner in the frequency domain as on that graph, I expect to see a similarly clear result in the quefrency domain. I played A440 and would expect bin 82 or so to have the highest magnitude. The third peak on the graph represents bin 79, which is close enough. As I said, the first 20 or so bins are so astronomical in magnitude as to be unusable, and I had to delete them from the data set in order to see anything. Another odd quality of the cepstrum data is that the even bins seem to be much higher than the odd bins. Here are the cepstrum bins from 77-86:

77: 151150.0313
78:  22385.92773
79: 298753.1875
80:  56532.72656
81: 114177.4766
82:  31222.88281
83:   4620.785156
84:  13382.5332
85:     83.668259
86: 1205.023193

My question is how to clean up the frequency domain so that my cepstrum results are not so wild. Alternatively, help me better understand how to interpret these results if they are what one would expect from a cepstrum analysis. I can post examples of the code I'm using, but it mostly uses vDSP calls and I don't know how helpful that would be.

南薇 2024-10-28 18:46:07

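The cepstrum, or cepstral analysis, is a technique used to try to separate a signal with high overtone content into two parts. The portion near DC represents the spectral envelope of all the overtones or speech formants, which might be useful for speaker or instrument recognition. A subsequent peak in the cepstral results represents the exciter frequency or frequencies, if that frequency produces sufficient harmonic overtone content.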
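The two-part split described above can be illustrated with a small sketch. The function name `split_cepstrum`, the `cutoff` parameter, and the toy values are hypothetical; the only fact relied on is that the cepstrum of a real signal is mirrored around its midpoint, so the "near DC" part sits at both ends of the array.

```python
def split_cepstrum(cepstrum, cutoff):
    """Split a mirrored cepstrum into low- and high-quefrency parts.

    Bins closer than `cutoff` to quefrency zero (counting the mirrored
    bins at the end of the array) go to the envelope part; the rest go
    to the excitation part. `cutoff` is a tuning value assumed here.
    """
    n = len(cepstrum)
    envelope = [0.0] * n
    excitation = [0.0] * n
    for q, value in enumerate(cepstrum):
        distance = min(q, n - q)  # distance from quefrency zero
        if distance < cutoff:
            envelope[q] = value
        else:
            excitation[q] = value
    return envelope, excitation


# Toy mirrored cepstrum: huge values at both ends, a modest peak at bin 5.
toy = [900.0, 500.0, 1.0, 2.0, 1.0, 40.0, 1.0, 2.0,
       1.0, 2.0, 1.0, 40.0, 1.0, 2.0, 1.0, 500.0]
envelope, excitation = split_cepstrum(toy, cutoff=2)

# Look for the excitation peak in the first half only (the rest is a mirror).
peak_bin = max(range(len(toy) // 2 + 1), key=lambda q: excitation[q])
```

Seen this way, very large values in the first and last bins are the envelope part of the result and can be set aside before looking for the pitch peak.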

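Since the cepstrum is usually computed without any (non-rectangular) window, it can produce a sinc response even for a clean overtone series, with the width of the response roughly inversely proportional to the length of the overtone series, that is, the number of overtones. And, of course, any slightly inharmonic overtones (as found in real-world instruments) will mess up the cepstrum result even more. So the cepstrum peak might only be good for giving an approximate location of the fundamental frequency, which is still a useful result for rejecting other candidate frequencies when doing frequency estimation.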
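One way to use an approximate cepstral estimate for rejecting candidates is sketched below. The function `pick_fundamental`, the 10% tolerance, the list of FFT peak frequencies, and the 44.1 kHz sample rate are all assumptions for illustration; none of them come from the question.

```python
def pick_fundamental(candidates_hz, cepstral_estimate_hz, tolerance=0.1):
    """Return the candidate nearest the rough cepstral estimate.

    Returns None when there are no candidates or when even the nearest
    one is more than `tolerance` (as a fraction of the estimate) away.
    """
    if not candidates_hz:
        return None
    best = min(candidates_hz, key=lambda f: abs(f - cepstral_estimate_hz))
    if abs(best - cepstral_estimate_hz) > tolerance * cepstral_estimate_hz:
        return None
    return best


# Hypothetical FFT peaks: a fundamental and two of its harmonics.
fft_peaks = [440.0, 880.0, 1320.0]

# Rough estimate from a cepstral peak at bin 99, assuming a 44.1 kHz rate.
rough_f0 = 44100 / 99

chosen = pick_fundamental(fft_peaks, rough_f0)
```

The cepstral peak only has to land near the right candidate; the finer frequency value then comes from the FFT peak itself.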

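A "clean looking" cepstrum might be the result of a very long sequence of exactly harmonic overtones with a nearly flat frequency response, which is probably not what is found in real-life signals.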