Peak-finding algorithm for Python/SciPy

Posted on 2024-08-10 19:03:04

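I could write something myself by finding zero crossings of the first derivative or similar, but this seems like a common enough function to be included in standard libraries. Does anyone know of one?

My particular application is a 2D array, but usually it would be used for finding peaks in FFTs and the like.

Specifically, in these kinds of problems there are multiple strong peaks, and then many smaller "peaks" that are caused only by noise and should be ignored. These are just examples, not my actual data:

1-dimensional peaks:

2-dimensional peaks:

Radon transform output with circled peak

The peak-finding algorithm would find the location of these peaks (not just their values), and ideally would find the true inter-sample peak, not just the index with the maximum value, probably using quadratic interpolation or something similar.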
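As a rough illustration of the quadratic interpolation mentioned above, here is a minimal sketch that fits a parabola through the largest sample and its two neighbours. The helper name `refine_peak` and the test data are made up for this example; it is not taken from any particular library.

```python
import numpy as np

def refine_peak(y, k):
    """Fit a parabola through samples k-1, k, k+1; return (position, height) of its vertex."""
    a, b, c = y[k - 1], y[k], y[k + 1]
    denom = a - 2.0 * b + c
    if denom == 0:  # the three samples are collinear, nothing to refine
        return float(k), float(b)
    offset = 0.5 * (a - c) / denom  # vertex offset from k, in samples
    return k + offset, b - 0.25 * (a - c) * offset

# Samples of a parabola whose true maximum lies between indices, at x = 2.3
x = np.arange(6)
y = 5.0 - (x - 2.3) ** 2
k = int(np.argmax(y))          # coarse peak: the index of the largest sample
position, height = refine_peak(y, k)
print(position, height)
```

The index `k` must have a neighbour on each side, so a peak on the first or last sample cannot be refined this way.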

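Typically you only care about a few strong peaks, so they would be chosen either because they are above a certain threshold, or because they are the first n peaks of an ordered list, ranked by amplitude.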
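Both selection rules can be sketched with `scipy.signal.find_peaks`, which the answers below discuss. The toy array and the values `2.0` and `n = 2` are arbitrary choices for illustration.

```python
import numpy as np
from scipy.signal import find_peaks

y = np.array([0.0, 1.0, 0.0, 5.0, 0.0, 0.5, 0.0, 3.0, 0.0, 0.2, 0.0])

# Option 1: keep only the peaks at or above an absolute height threshold
strong, _ = find_peaks(y, height=2.0)

# Option 2: find every local maximum, then keep the n largest by amplitude
all_peaks, _ = find_peaks(y)
n = 2
top_n = all_peaks[np.argsort(y[all_peaks])[::-1][:n]]

print(strong, top_n)
```

Note that `top_n` comes back ordered by amplitude, largest first, rather than by position.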

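As I said, I know how to write something like this myself. I'm just asking whether there is a pre-existing function or package that is known to work well.

Update:

I translated a MATLAB script and it works decently for the 1-D case, but it could be better.

Updated update:

sixtenbe created a better version for the 1-D case.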


羁拥 2024-08-17 19:03:04

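The function scipy.signal.find_peaks, as its name suggests, is useful for this. But it is important to understand its parameters width, threshold, distance and, above all, prominence to get a good peak extraction.

According to my tests and the documentation, prominence is "the useful concept" for keeping the good peaks and discarding the noisy ones.

What is (topographic) prominence? It is "the minimum height necessary to descend to get from the summit to any higher terrain", as can be seen here:

The idea is:

The higher the prominence, the more "important" the peak is.

Test:

I used a (noisy) frequency-varying sinusoid on purpose because it shows many of the difficulties. The width parameter is not very useful here: if you set the minimum width too high, it cannot track the closely spaced peaks in the high-frequency part, and if you set it too low, you get many unwanted peaks in the left part of the signal. distance has the same problem. threshold only compares a sample with its direct neighbours, which is not useful here. prominence is the one that gives the best solution. Note that you can combine many of these parameters!

Code: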

import numpy as np
import matplotlib.pyplot as plt 
from scipy.signal import find_peaks

x = np.sin(2*np.pi*(2**np.linspace(2,10,1000))*np.arange(1000)/48000) + np.random.normal(0, 1, 1000) * 0.15
peaks, _ = find_peaks(x, distance=20)
peaks2, _ = find_peaks(x, prominence=1)      # BEST!
peaks3, _ = find_peaks(x, width=20)
peaks4, _ = find_peaks(x, threshold=0.4)     # Required vertical distance to its direct neighbouring samples, pretty useless
plt.subplot(2, 2, 1)
plt.plot(peaks, x[peaks], "xr"); plt.plot(x); plt.legend(['distance'])
plt.subplot(2, 2, 2)
plt.plot(peaks2, x[peaks2], "ob"); plt.plot(x); plt.legend(['prominence'])
plt.subplot(2, 2, 3)
plt.plot(peaks3, x[peaks3], "vg"); plt.plot(x); plt.legend(['width'])
plt.subplot(2, 2, 4)
plt.plot(peaks4, x[peaks4], "xk"); plt.plot(x); plt.legend(['threshold'])
plt.show()
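Since the parameters can be combined, here is a small sketch that applies prominence and distance together on the same kind of noisy chirp and reads the measured prominences back from the returned properties. The fixed seed and the values `min_prominence = 1.0` and `min_distance = 5` are assumptions chosen for illustration, not recommended settings.

```python
import numpy as np
from scipy.signal import find_peaks

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
n = np.arange(1000)
x = np.sin(2 * np.pi * (2 ** np.linspace(2, 10, 1000)) * n / 48000)
x = x + 0.15 * rng.normal(0, 1, 1000)

min_prominence = 1.0
min_distance = 5

# Prominence alone, then prominence and distance together
by_prominence, _ = find_peaks(x, prominence=min_prominence)
combined, props = find_peaks(x, prominence=min_prominence, distance=min_distance)

# props carries the measured prominence of every returned peak
for idx, prom in zip(combined[:5], props["prominences"][:5]):
    print(f"peak at sample {idx}: prominence {prom:.2f}")
```

Every peak in `combined` satisfies both conditions, so adding a condition can only shrink the result.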


胡渣熟男 2024-08-17 19:03:04

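I'm looking at a similar problem, and I've found that some of the best references come from chemistry (from peak finding in mass-spec data). For a thorough review of peak-finding algorithms, read this. It is one of the best and clearest reviews of peak-finding techniques that I've run across. (Wavelets are the best for finding peaks of this sort in noisy data.)

It looks like your peaks are clearly defined and aren't hidden in the noise. That being the case, I'd recommend using smoothed Savitzky-Golay derivatives to find the peaks. (If you just differentiate the data above, you'll get a mess of false positives.) This is a very effective technique and is pretty easy to implement (you do need a matrix class with basic operations). If you simply find the zero crossings of the first S-G derivative, I think you'll be happy.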
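A minimal sketch of that idea, assuming SciPy's `savgol_filter` is an acceptable stand-in for a hand-written Savitzky-Golay implementation. The synthetic two-peak signal, the window length of 51, the polynomial order of 3 and the 0.3 height cut-off are all illustrative guesses that would need tuning on real data.

```python
import numpy as np
from scipy.signal import savgol_filter

rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 1000)
clean = np.exp(-((t - 0.3) / 0.05) ** 2) + 0.6 * np.exp(-((t - 0.7) / 0.05) ** 2)
y = clean + 0.01 * rng.normal(size=t.size)

# Smoothed first derivative, in units of signal per sample
dy = savgol_filter(y, window_length=51, polyorder=3, deriv=1)

# A maximum is where the smoothed derivative goes from positive to non-positive
crossings = np.where((dy[:-1] > 0) & (dy[1:] <= 0))[0]

# Drop the crossings that sit in the noise floor
peaks = crossings[y[crossings] > 0.3]
print(peaks)
```

Without the height cut-off, the flat noisy stretches between the two peaks would also produce zero crossings, which is the false-positive problem the answer warns about.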


筱武穆 2024-08-17 19:03:04

There is a function in scipy named scipy.signal.find_peaks_cwt which sounds like it is suitable for your needs. However, I have no experience with it, so I cannot recommend it.

http://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.find_peaks_cwt.html
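For anyone who wants to try it, a call might look like the sketch below. The synthetic signal and the `widths` range are assumptions: `widths` should roughly bracket the expected peak widths in samples, and the result can be sensitive to that choice.

```python
import numpy as np
from scipy.signal import find_peaks_cwt

rng = np.random.default_rng(2)
x = np.arange(1000)
# Two Gaussian bumps centred at samples 300 and 700, plus a little noise
y = np.exp(-((x - 300) / 20.0) ** 2) + np.exp(-((x - 700) / 20.0) ** 2)
y = y + 0.05 * rng.normal(size=x.size)

# widths: the wavelet widths (in samples) to scan over
peak_indices = find_peaks_cwt(y, widths=np.arange(10, 40))
print(peak_indices)
```

The function returns sample indices only, so the peak heights have to be looked up with `y[peak_indices]`.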


弄潮 2024-08-17 19:03:04

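For those not sure which peak-finding algorithm to use in Python, here is a quick overview of the alternatives: https://github.com/MonsieurV/py-findpeaks

Wanting an equivalent to the MATLAB findpeaks function myself, I've found that the detect_peaks function from Marcos Duarte is a good catch.

Pretty easy to use: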

import numpy as np
from vector import vector, plot_peaks
from libs import detect_peaks
print('Detect peaks with minimum height and distance filters.')
indexes = detect_peaks.detect_peaks(vector, mph=7, mpd=2)
print('Peaks are: %s' % (indexes))

Which will give you:


倾其所爱 2024-08-17 19:03:04

To detect both positive and negative peaks, PeakDetect is helpful.

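import numpy as np
import matplotlib.pyplot as plt
from peakdetect import peakdetect

# Example signal only; replace with your own 1-D data
data = np.sin(np.linspace(0, 20 * np.pi, 2000))

# lookahead is the distance to look ahead from a candidate peak to decide
# whether it is the actual peak. Change lookahead as necessary.
peaks = peakdetect(data, lookahead=20)
higherPeaks = np.array(peaks[0])  # rows of (index, value) for the maxima
lowerPeaks = np.array(peaks[1])   # rows of (index, value) for the minima
plt.plot(data)
plt.plot(higherPeaks[:,0], higherPeaks[:,1], 'ro')
plt.plot(lowerPeaks[:,0], lowerPeaks[:,1], 'ko')
plt.show()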


留蓝 2024-08-17 19:03:04

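There is a lot of research on detecting peaks in a spectrum reliably, for example all the work on sinusoidal modelling of music/audio signals in the 1980s. Look for "Sinusoidal Modeling" in the literature.

If your signals are as clean as the example, a simple "give me something with an amplitude higher than its N neighbours" should work reasonably well. If you have noisy signals, a simple but effective approach is to look at your peaks over time and track them: you then detect spectral lines instead of spectral peaks. In other words, you compute the FFT on a sliding window of your signal to get a set of spectra over time (also called a spectrogram). You then look at how the spectral peak evolves over time (i.e. across consecutive windows).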
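A very reduced sketch of the tracking idea, using `scipy.signal.spectrogram` for the sliding-window FFT. It follows only the single strongest bin per window, which is far simpler than real sinusoidal-modelling trackers; the test tone, sample rate and window length are assumptions for illustration.

```python
import numpy as np
from scipy.signal import spectrogram

fs = 8000.0  # sample rate in Hz
rng = np.random.default_rng(3)
t = np.arange(int(2 * fs)) / fs
# A steady 1 kHz tone buried in white noise
signal = np.sin(2 * np.pi * 1000.0 * t) + 0.5 * rng.normal(size=t.size)

# power has shape (number of frequency bins, number of windows)
freqs, times, power = spectrogram(signal, fs=fs, nperseg=256)

# Frequency of the strongest bin in each window
strongest = freqs[np.argmax(power, axis=0)]

# A spectral line is a peak that stays at (almost) the same frequency
# across consecutive windows
line_freq = np.median(strongest)
bin_width = freqs[1] - freqs[0]
persistence = np.mean(np.abs(strongest - line_freq) <= bin_width)
print(line_freq, persistence)
```

A noise peak would jump between unrelated bins from window to window and score a low persistence, whereas a genuine line stays put.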
