Trying to change the pitch of an audio file with scikits.samplerate.resample results in garbage audio from pygame

Posted on 2024-12-21 06:28:59

My problem is related to pitch-shifting audio in Python. I have the current modules installed: numpy, scipy, pygame, and the scikits "samplerate" api.

My goal is to take a stereo file and play it back at a different pitch in as few steps as possible. Currently, I load the file into an array using pygame.sndarray, then apply a samplerate conversion using scikits.samplerate.resample, then convert the output back to a sound object for playback using pygame.

The problem is garbage audio comes out of my speakers. Surely I'm missing a few steps (in addition to not knowing anything about math and audio).

import time, numpy, pygame.mixer, pygame.sndarray
from scikits.samplerate import resample

pygame.mixer.init(44100,-16,2,4096)

# choose a file and make a sound object
sound_file = "tone.wav"
sound = pygame.mixer.Sound(sound_file)

# load the sound into an array
snd_array = pygame.sndarray.array(sound)

# resample. args: (target array, ratio, mode), outputs ratio * target array.
# this outputs a bunch of garbage and I don't know why.
snd_resample = resample(snd_array, 1.5, "sinc_fastest")

# take the resampled array, make it an object and stop playing after 2 seconds.
snd_out = pygame.sndarray.make_sound(snd_resample)
snd_out.play()
time.sleep(2)


Comments (4)

寄居人 2024-12-28 06:28:59

Your problem is that pygame works with numpy.int16 arrays, but the call to resample returns a numpy.float32 array:

>>> snd_array.dtype
dtype('int16')
>>> snd_resample.dtype
dtype('float32')

You can convert the result of resample to numpy.int16 using astype:

>>> snd_resample = resample(snd_array, 1.5, "sinc_fastest").astype(snd_array.dtype)

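With this modification, your Python script plays the tone.wav file nicely, at a lower pitch and a lower speed, because the resampled array holds 1.5 times as many samples and is played back at the same sample rate.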
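
A plain astype cast truncates fractional values, and values that fall outside the int16 range after filtering are not guaranteed to convert sensibly. The sketch below shows one way to guard against that by rounding and clipping before the cast. The helper name to_int16 is hypothetical, and the small array is a hand-written stand-in for the resampler output, so only numpy is needed to run it.

```python
import numpy as np

def to_int16(samples):
    """Round float samples, clip them to the int16 range, then cast.

    Hypothetical helper for illustration; not part of pygame or scikits.samplerate.
    """
    info = np.iinfo(np.int16)
    rounded = np.rint(np.asarray(samples, dtype=np.float64))
    return np.clip(rounded, info.min, info.max).astype(np.int16)

# Stand-in for the float32 array a resampler might return (two channels).
# The middle row deliberately overshoots the int16 range.
resampled = np.array(
    [[0.4, -0.6], [32800.0, -32900.0], [1200.7, -1200.7]],
    dtype=np.float32,
)

converted = to_int16(resampled)
print(converted.dtype)  # int16
print(converted)
```

In the script from the question, this would take the place of the astype call, for example snd_out = pygame.sndarray.make_sound(to_int16(snd_resample)), assuming the resampler keeps the sample values on the int16 scale.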

水晶透心 2024-12-28 06:28:59

Your best bet is probably to use Python Audiere.

Here is a link. I used it to do the same sort of thing, and it's very easy: just read through the documentation.

http://audiere.sourceforge.net/home.php

北方。的韩爷 2024-12-28 06:28:59

Most likely scikits.samplerate.resample is "thinking" that your audio is in a format other than 16-bit stereo. Check the scikits.samplerate documentation for where to select the proper audio format for your array. If it resampled 16-bit audio while treating it as 8-bit, garbage is what would come out.

满意归宿 2024-12-28 06:28:59

From the scikits.samplerate.resample documentation:

If input has rank 1, than all data are used, and are assumed to be from a mono signal. If rank is 2, the number columns will be assumed to be the number of channels.

So I think what you need to do is something like this to pass the stereo data to resample in the format it expects:

snd_array = snd_array.reshape((-1, 2))
snd_resample = resample(snd_array, 1.5, "sinc_fastest")
snd_resample = snd_resample.reshape(-1)  # Flatten it out again
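
To see what that reshape does, here is a small numpy-only sketch. It assumes the samples start out as a flat, interleaved array (L0, R0, L1, R1, ...); the values are made up for illustration. If snd_array is already two-dimensional, with one column per channel, the reshape changes nothing, so it is worth printing snd_array.shape first.

```python
import numpy as np

# Hypothetical interleaved stereo samples: L0, R0, L1, R1, L2, R2
interleaved = np.array([10, -10, 20, -20, 30, -30], dtype=np.int16)

# One row per frame, one column per channel (rank 2, two columns = stereo)
frames = interleaved.reshape(-1, 2)
left, right = frames[:, 0], frames[:, 1]

# Flattening restores the original interleaved order
flat_again = frames.reshape(-1)

print(frames.shape)
print(left, right)
```

Whether the flattened array or the two-column one should be handed back to pygame.sndarray.make_sound depends on how the mixer was initialised, so treat the final flatten step as something to check against the mixer's channel count rather than a given.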