How to convert a WAV file to float amplitude

Posted on 2024-12-09 18:33:00

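The title says it all:

I have a WAV file (written by PyAudio from an audio input) and I want to convert it to float data corresponding to the sound level (amplitude), so that I can run Fourier transforms and similar processing on it.

Does anyone have an idea how to convert WAV data to float?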

野生奥特曼 2024-12-16 18:33:00

I have identified two decent ways of doing this.

Method 1: using the wavefile module

Use this method if you don't mind installing some extra libraries. Installation involved a bit of messing around on my Mac but was easy on my Ubuntu server.

https://github.com/vokimon/python-wavefile

import sys
import wavefile

# returns the contents of the wav file as a double precision float array
def wav_to_floats(filename='file1.wav'):
    # w is (samplerate, data); take the first channel of data
    w = wavefile.load(filename)
    return w[1][0]

# read the wav file specified as first command line arg
signal = wav_to_floats(sys.argv[1])
print("read " + str(len(signal)) + " frames")
print("in the range " + str(min(signal)) + " to " + str(max(signal)))

Method 2: using the wave module

Use this method if you want less module install hassles.

This reads a WAV file from the filesystem and converts it into floats in the range -1 to 1. It works with 16-bit files; if a file has more than one channel, the samples stay interleaved in the same order as they appear in the file. For other bit depths, change the 'h' in the argument to struct.unpack according to the format-character table on this page, and adjust the divisor to match the new bit depth:

https://docs.python.org/2/library/struct.html

It will not work for 24-bit files: struct has no 24-bit format character, so there is no way to tell struct.unpack how to read them directly.

import struct
import sys
import wave

def wav_to_floats(wave_file):
    with wave.open(wave_file, 'rb') as w:
        # total samples = frames * channels (channels are interleaved)
        nsamples = w.getnframes() * w.getnchannels()
        astr = w.readframes(w.getnframes())
    # unpack the raw bytes as little-endian signed 16-bit integers
    a = struct.unpack("<%ih" % nsamples, astr)
    # scale to floats in the range -1 to 1
    return [float(val) / pow(2, 15) for val in a]

# read the wav file specified as first command line arg
signal = wav_to_floats(sys.argv[1])
print("read " + str(len(signal)) + " samples")
print("in the range " + str(min(signal)) + " to " + str(max(signal)))
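
The 24-bit limitation can be worked around with the standard library alone. Here is a minimal sketch, assuming the raw bytes come from w.readframes() on a file whose getsampwidth() is 3. The helper name pcm24_to_floats is made up for this example and is not part of wave or struct. It pads each 3-byte sample to 4 bytes so that struct.unpack can read it as a 32-bit integer:

```python
import struct

def pcm24_to_floats(raw):
    """Convert little-endian signed 24-bit PCM bytes to floats in [-1, 1)."""
    if len(raw) % 3 != 0:
        raise ValueError("24-bit PCM data must be a multiple of 3 bytes")
    nsamples = len(raw) // 3
    # Prepend a zero low-order byte to each 3-byte sample so it becomes a
    # little-endian int32 equal to the sample value shifted left by 8 bits.
    padded = b"".join(b"\x00" + raw[i:i + 3] for i in range(0, len(raw), 3))
    ints = struct.unpack("<%ii" % nsamples, padded)
    # The 8-bit shift is absorbed by dividing by 2**31 instead of 2**23.
    return [val / 2.0 ** 31 for val in ints]
```

As with the 16-bit version, multi-channel samples come back interleaved.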

毁梦 2024-12-16 18:33:00

Most wave files are in PCM 16-bit integer format.

What you will want to do:

Parse the header to know which format it is (check the link from Xophmeister)
Read the data, take the integer values and convert them to float

Integer values range from -32768 to 32767, and you need to convert them to floating-point values from -1.0 to 1.0.

I don't have the code in Python, but here is a C++ excerpt that converts 16-bit integer PCM data to 32-bit float:

short* pBuffer = (short*)pReadBuffer;   // treat the raw bytes as 16-bit samples

const float ONEOVERSHORTMAX = 3.0517578125e-5f; // 1/32768
// frames read = bytes read / bytes per frame
unsigned int uFrameCount = dwRead / m_fmt.Format.nBlockAlign;

// one sample per channel per frame
for ( unsigned int i = 0; i < uFrameCount * m_fmt.Format.nChannels; ++i )
{
    short i16In = pBuffer[i];
    out_pBuffer[i] = (float)i16In * ONEOVERSHORTMAX;
}

Be careful with stereo files: the stereo PCM data in wave files is interleaved, meaning the data looks like LRLRLRLRLRLRLRLR (instead of LLLLLLLLRRRRRRRR). You may or may not need to de-interleave, depending on what you do with the data.
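
De-interleaving can be sketched in Python with slicing. This is a minimal example, assuming the samples are already in a list or tuple and the channel count is known (for instance from the WAV header). The function name deinterleave is made up for this example:

```python
def deinterleave(samples, nchannels):
    """Split an interleaved sample sequence into one list per channel."""
    if nchannels < 1:
        raise ValueError("nchannels must be at least 1")
    if len(samples) % nchannels != 0:
        raise ValueError("sample count is not a multiple of nchannels")
    # Channel ch owns every nchannels-th sample, starting at index ch.
    return [list(samples[ch::nchannels]) for ch in range(nchannels)]

# Stereo data laid out as L R L R L R
left, right = deinterleave([0.1, -0.1, 0.2, -0.2, 0.3, -0.3], 2)
```

For a Fourier transform you would normally process each channel list separately, or average the channels into mono first.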

诗化ㄋ丶相逢 2024-12-16 18:33:00

I spent hours trying to find the answer to this. The solution turns out to be really simple: struct.unpack is what you're looking for. The final code will look something like this:

from struct import unpack              # unpack is what does the conversion
rawdata = stream.read()                # The raw PCM data in need of conversion
npts = len(rawdata) // 2               # Number of 16-bit samples (2 bytes each)
formatstr = '<%ih' % npts              # Little-endian signed 16-bit; use '%iB' % len(rawdata) for unsigned 8-bit PCM
int_data = unpack(formatstr, rawdata)  # Convert from raw PCM to a tuple of integers

Most of the credit goes to "Interpreting WAV Data". The only trick is getting the format right for unpack: the count has to match the number of samples in the data, and the format character has to match the sample size and signedness.
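
Choosing the format from the sample width can be sketched as follows. This assumes integer PCM data with a sample width of 1, 2 or 4 bytes, as reported by the WAV header. The names FORMAT_CHARS and unpack_pcm are made up for this example:

```python
import struct

# Sample width in bytes -> struct format character.
# 8-bit WAV PCM is unsigned; 16- and 32-bit integer PCM are signed.
FORMAT_CHARS = {1: "B", 2: "h", 4: "i"}

def unpack_pcm(rawdata, sampwidth):
    """Unpack little-endian integer PCM bytes into a tuple of ints."""
    if sampwidth not in FORMAT_CHARS:
        raise ValueError("unsupported sample width: %r" % sampwidth)
    # The count in the format string is a number of samples, not bytes.
    count = len(rawdata) // sampwidth
    fmt = "<%i%s" % (count, FORMAT_CHARS[sampwidth])
    # Ignore any trailing partial sample.
    return struct.unpack(fmt, rawdata[:count * sampwidth])
```

The "<" prefix forces little-endian byte order and the standard sizes, which matches how WAV files store samples regardless of the machine the code runs on.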

魂ガ小子 2024-12-16 18:33:00

This version reads a WAV file from the filesystem and converts it into floats in the range -1 to 1. It works with files of all sample widths, and it leaves multi-channel samples interleaved in the same order as they appear in the file.

import wave

def read_wav_file(filename):
    def get_int(bytes_obj):
        an_int = int.from_bytes(bytes_obj, 'little',  signed=sampwidth!=1)
        return an_int - 128 * (sampwidth == 1)
    with wave.open(filename, 'rb') as file:
        sampwidth = file.getsampwidth()
        frames = file.readframes(-1)
    bytes_samples = (frames[i : i+sampwidth] for i in range(0, len(frames), sampwidth))
    return [get_int(b) / pow(2, sampwidth * 8 - 1) for b in bytes_samples]

Here is also a link to a function that converts floats back to ints and writes them to a WAV file:

https://gto76.github.io/python-cheatsheet/#writefloatsamplestowavfile
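
For reference, the reverse direction might look roughly like this. It is a minimal sketch of my own, not the linked function, and it assumes 16-bit output. The name write_wav_16bit and its default arguments are made up for this example:

```python
import struct
import wave

def write_wav_16bit(filename, samples, framerate=44100, nchannels=1):
    """Write floats in [-1, 1] as signed 16-bit PCM (interleaved if multi-channel)."""
    # Clip to [-1, 1], then scale to the 16-bit integer range.
    ints = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    with wave.open(filename, "wb") as w:
        w.setnchannels(nchannels)
        w.setsampwidth(2)          # 2 bytes = 16 bits per sample
        w.setframerate(framerate)
        w.writeframes(struct.pack("<%ih" % len(ints), *ints))
```

Scaling by 32767 keeps +1.0 inside the 16-bit range. The cost is that a read/write round trip is not bit-exact when the reader divides by 32768.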
