Python频率检测

发布于 2024-08-28 16:25:46 字数 273 浏览 7 评论 0原文

好吧,我想做的是一种音频处理软件,它可以检测流行频率,如果该频率播放足够长的时间(几毫秒),我知道我得到了积极的匹配。我知道我需要使用 FFT 或类似的东西,但在这个数学领域我很糟糕,我确实在互联网上搜索过,但没有找到只能做到这一点的代码。

我试图实现的目标是让自己成为一个自定义协议来通过声音发送数据,每秒需要非常低的比特率(5-10bps),但我在传输端也非常有限,因此接收软件需要能够自定义(无法使用实际的硬件/软件调制解调器)我也希望这只是软件(除了声卡之外没有其他硬件)

非常感谢您的帮助。

Ok what im trying to do is a kind of audio processing software that can detect a prevalent frequency an if the frequency is played for long enough (few ms) i know i got a positive match. i know i would need to use FFT or something simiral but in this field of math i suck, i did search the internet but didn not find a code that could do only this.

the goal im trying to accieve is to make myself a custom protocol to send data trough sound, need very low bitrate per sec (5-10bps) but im also very limited on the transmiting end so the recieving software will need to be able custom (cant use an actual hardware/software modem) also i want this to be software only (no additional hardware except soundcard)

thanks alot for the help.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

孤云独去闲 2024-09-04 16:25:46

aubio 库已用 SWIG 封装,因此可以由 Python 使用。它们的众多功能包括多种音调检测/估计方法,包括 YIN算法和一些谐波梳算法。

但是,如果您想要更简单的东西,我不久前编写了一些用于音调估计的代码,您可以接受或保留它。它不会像使用aubio中的算法那么准确,但它可能足以满足您的需求。我基本上只是将数据的 FFT 乘以一个窗口(在本例中为 Blackman 窗口),对 FFT 值进行平方,找到具有最高值的 bin,并使用最大值的对数在峰值周围进行二次插值及其两个相邻值来找到基频。我从我找到的一些论文中得到了二次插值。

它在测试音上工作得相当好,但它不如上面提到的其他方法那么强大或准确。可以通过增加块大小来提高准确性(或通过减少块大小来降低准确性)。块大小应为 2 的倍数才能充分利用 FFT。另外,我只是确定每个块的基本音高,没有重叠。我使用 PyAudio 播放声音,同时写出估计的音高。

源代码:

# Read in a WAV and find the freq's
import pyaudio
import wave
import numpy as np

chunk = 2048

# open up a wave
wf = wave.open('test-tones/440hz.wav', 'rb')
swidth = wf.getsampwidth()
RATE = wf.getframerate()
# use a Blackman window
window = np.blackman(chunk)
# open stream
p = pyaudio.PyAudio()
stream = p.open(format =
                p.get_format_from_width(wf.getsampwidth()),
                channels = wf.getnchannels(),
                rate = RATE,
                output = True)

# read some data
data = wf.readframes(chunk)
# play stream and find the frequency of each chunk
while len(data) == chunk*swidth:
    # write data out to the audio stream
    stream.write(data)
    # unpack the data and times by the hamming window
    indata = np.array(wave.struct.unpack("%dh"%(len(data)/swidth),\
                                         data))*window
    # Take the fft and square each value
    fftData=abs(np.fft.rfft(indata))**2
    # find the maximum
    which = fftData[1:].argmax() + 1
    # use quadratic interpolation around the max
    if which != len(fftData)-1:
        y0,y1,y2 = np.log(fftData[which-1:which+2:])
        x1 = (y2 - y0) * .5 / (2 * y1 - y2 - y0)
        # find the frequency and output it
        thefreq = (which+x1)*RATE/chunk
        print "The freq is %f Hz." % (thefreq)
    else:
        thefreq = which*RATE/chunk
        print "The freq is %f Hz." % (thefreq)
    # read some more data
    data = wf.readframes(chunk)
if data:
    stream.write(data)
stream.close()
p.terminate()

The aubio libraries have been wrapped with SWIG and can thus be used by Python. Among their many features include several methods for pitch detection/estimation including the YIN algorithm and some harmonic comb algorithms.

However, if you want something simpler, I wrote some code for pitch estimation some time ago and you can take it or leave it. It won't be as accurate as using the algorithms in aubio, but it might be good enough for your needs. I basically just took the FFT of the data times a window (a Blackman window in this case), squared the FFT values, found the bin that had the highest value, and used a quadratic interpolation around the peak using the log of the max value and its two neighboring values to find the fundamental frequency. The quadratic interpolation I took from some paper that I found.

It works fairly well on test tones, but it will not be as robust or as accurate as the other methods mentioned above. The accuracy can be increased by increasing the chunk size (or reduced by decreasing it). The chunk size should be a multiple of 2 to make full use of the FFT. Also, I am only determining the fundamental pitch for each chunk with no overlap. I used PyAudio to play the sound through while writing out the estimated pitch.

Source Code:

# Read in a WAV and find the freq's
import pyaudio
import wave
import numpy as np

chunk = 2048

# open up a wave
wf = wave.open('test-tones/440hz.wav', 'rb')
swidth = wf.getsampwidth()
RATE = wf.getframerate()
# use a Blackman window
window = np.blackman(chunk)
# open stream
p = pyaudio.PyAudio()
stream = p.open(format =
                p.get_format_from_width(wf.getsampwidth()),
                channels = wf.getnchannels(),
                rate = RATE,
                output = True)

# read some data
data = wf.readframes(chunk)
# play stream and find the frequency of each chunk
while len(data) == chunk*swidth:
    # write data out to the audio stream
    stream.write(data)
    # unpack the data and times by the hamming window
    indata = np.array(wave.struct.unpack("%dh"%(len(data)/swidth),\
                                         data))*window
    # Take the fft and square each value
    fftData=abs(np.fft.rfft(indata))**2
    # find the maximum
    which = fftData[1:].argmax() + 1
    # use quadratic interpolation around the max
    if which != len(fftData)-1:
        y0,y1,y2 = np.log(fftData[which-1:which+2:])
        x1 = (y2 - y0) * .5 / (2 * y1 - y2 - y0)
        # find the frequency and output it
        thefreq = (which+x1)*RATE/chunk
        print "The freq is %f Hz." % (thefreq)
    else:
        thefreq = which*RATE/chunk
        print "The freq is %f Hz." % (thefreq)
    # read some more data
    data = wf.readframes(chunk)
if data:
    stream.write(data)
stream.close()
p.terminate()
陈年往事 2024-09-04 16:25:46

如果您要使用FSK(频移键控) 来编码数据,您'使用 Goertzel 算法 可能会更好,这样您就可以只检查您想要的频率,而不是完整的 DFT/FFT。

If you're going to use FSK (frequency shift keying) for encoding data, you're probably better off using the Goertzel algorithm so you can check just the frequencies you want, instead of a full DFT/FFT.

你列表最软的妹 2024-09-04 16:25:46

您可以从此处找到声音上滑动窗口的频谱然后通过查找 此处

import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import auc
np.random.seed(0)

# Sine sample with a frequency of 5hz and add some noise
sr = 32  # sampling rate
y = np.linspace(0, 5 * 2*np.pi, sr)
y = np.tile(np.sin(y), 5)
y += np.random.normal(0, 1, y.shape)
t = np.arange(len(y)) / float(sr)

# Generate frquency spectrum
spectrum, freqs, _ = plt.magnitude_spectrum(y, sr)

# Calculate percentage for a frequency range 
lower_frq, upper_frq = 4, 6
ind_band = np.where((freqs > lower_frq) & (freqs < upper_frq))
plt.fill_between(freqs[ind_band], spectrum[ind_band], color='red', alpha=0.6)
frq_band_perc = auc(freqs[ind_band], spectrum[ind_band]) / auc(freqs, spectrum)
print('{:.1%}'.format(frq_band_perc))
# 19.8%

输入图片此处描述

You can find the frequency spectrum of the sliding windows over your sound from here and then check the presence of the prevalent frequency band via finding the area under the frequency spectrum curve for that band from here.

import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import auc
np.random.seed(0)

# Sine sample with a frequency of 5hz and add some noise
sr = 32  # sampling rate
y = np.linspace(0, 5 * 2*np.pi, sr)
y = np.tile(np.sin(y), 5)
y += np.random.normal(0, 1, y.shape)
t = np.arange(len(y)) / float(sr)

# Generate frquency spectrum
spectrum, freqs, _ = plt.magnitude_spectrum(y, sr)

# Calculate percentage for a frequency range 
lower_frq, upper_frq = 4, 6
ind_band = np.where((freqs > lower_frq) & (freqs < upper_frq))
plt.fill_between(freqs[ind_band], spectrum[ind_band], color='red', alpha=0.6)
frq_band_perc = auc(freqs[ind_band], spectrum[ind_band]) / auc(freqs, spectrum)
print('{:.1%}'.format(frq_band_perc))
# 19.8%

enter image description here

も星光 2024-09-04 16:25:46

虽然我之前没有尝试过使用 Python 进行音频处理,但也许您可以基于 SciPy (或其子项目 NumPy),高效科学/工程数值计算的框架?您可以首先查看 FFT 的 scipy.fftpack

While I haven't tried audio processing with Python before, perhaps you could build something based on SciPy (or its subproject NumPy), a framework for efficient scientific/engineering numerical computation? You might start by looking at scipy.fftpack for your FFT.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文