Improving the joining of two wave files?

Posted on 2024-09-05 11:14:11

I have written code for joining two wave files. It works fine when I join larger segments, but I need to join very small segments, and then the clarity is not good.

I have learned that a signal processing technique such as a windowed join can be used to improve the joining of the files.

y[n] = w[n]s[n]
Multiply the value of the signal at sample number n by the value of a windowing function.
Hamming window: w[n] = 0.54 - 0.46*cos(2*Pi*n/L), with n counted from 0 across a window of length L

I don't understand how to get the value of the signal at sample n, or how to implement this.

The code I am using for joining is:

import wave

# Segments to concatenate, in playback order.
m = ['C:/begpython/S0001_0002.wav', 'C:/begpython/S0001_0001.wav']
i = 1
a = m[i]
infiles = [a, "C:/begpython/S0001_0002.wav", a]
outfile = "C:/begpython/S0001_00367.wav"

# Read the parameters and the raw frame bytes of every input file.
data = []
for infile in infiles:
    w = wave.open(infile, 'rb')
    data.append([w.getparams(), w.readframes(w.getnframes())])
    w.close()

# Write the segments back to back, reusing the first file's parameters.
output = wave.open(outfile, 'wb')
output.setparams(data[0][0])
for params, frames in data:
    output.writeframes(frames)
output.close()

During joining I am manipulating the frames in byte format. I guess I now have to use an integer or float format to perform operations on them. If that is right, how can I do this?

Comments (2)

三生路 2024-09-12 11:14:15

To give you a high-level understanding: a WAV file consists of a header (44 bytes in the common canonical layout) that holds the necessary metadata such as sample rate and number of channels, followed by the payload where the actual audio data lives. Audio is simply a curve of amplitude over time, and that amplitude is conventionally expressed as a floating-point value between +1.0 and -1.0. When a recording is made, the amplitude is measured many times per second, typically 44100 (the sample rate), and the WAV file just stores this series of sample values. An ordinary PCM WAV file does not store floating-point numbers; it maps the range -1 to +1 onto integers. With 16-bit samples these are signed values from -32768 to 32767 (2^16 distinct levels), and each sample takes two bytes of storage, low byte first. In the Java code in the other answer, i>>8 shifts the sample value right by 8 bits to extract its high byte. If you think about these ideas and write your own code to read and write the WAV format, you'll be well on your way to answering your question.
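
As a rough illustration of the layout described above, the following Python sketch reads the header parameters with the standard wave module and converts 16-bit samples to floats. The helper name read_pcm16 is made up for this example, and the sketch assumes 16-bit PCM data; the tiny file is built in memory only so that the example runs without a file on disk.

```python
import io
import struct
import wave

def read_pcm16(source):
    """Return (params, samples), with samples as floats in [-1.0, 1.0).

    Hypothetical helper: assumes 16-bit PCM data.
    """
    with wave.open(source, 'rb') as w:
        params = w.getparams()
        if params.sampwidth != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        raw = w.readframes(params.nframes)
    count = len(raw) // 2
    # '<' = little-endian, 'h' = signed 16-bit integer
    ints = struct.unpack('<%dh' % count, raw)
    return params, [v / 32768.0 for v in ints]

# Build a tiny mono WAV file in memory.
buf = io.BytesIO()
with wave.open(buf, 'wb') as w:
    w.setnchannels(1)
    w.setsampwidth(2)        # two bytes per sample
    w.setframerate(44100)
    w.writeframes(struct.pack('<4h', 0, 16384, -16384, 32767))
buf.seek(0)

params, samples = read_pcm16(buf)
print(params.framerate, params.nframes, samples)
```

For a stereo file the samples of the two channels are interleaved in the returned list, so a real implementation would also need to look at params.nchannels.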

悲喜皆因你 2024-09-12 11:14:14

It's probably not the best solution, but I'm sure it will work. Maybe you can find existing libraries for some of the steps; I don't know about Python. The steps I suggest are:

  1. Load the wave file.
  2. Create the sample values (amplitudes) for each frame (depending on frame size, little/big endian, signed/unsigned).
  3. Divide the resulting array of int values into windows, e.g. samples 0-511, 512-1023, ...
  4. Apply the window function to the windows that you want to join.
  5. Do your joining.
  6. Store the windows back in a byte array, the inverse operation of step 2.
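
Steps 3 to 5 could be sketched in Python roughly as follows. This is only one possible reading of a "windowed join": each segment's first and last few samples are scaled by the rising and falling halves of a Hamming window before the segments are concatenated. The function names and the edge parameter are invented for this sketch, and the window uses the common symmetric form that divides by length - 1, whereas the question writes the denominator as L.

```python
import math

def hamming(length):
    """Hamming window w[n] = 0.54 - 0.46*cos(2*pi*n/(length - 1))."""
    if length < 2:
        return [1.0] * length
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1))
            for n in range(length)]

def taper_edges(samples, edge):
    """Scale the first and last `edge` samples by the rising and falling
    halves of a Hamming window; the middle is left unchanged."""
    edge = min(edge, len(samples) // 2)
    win = hamming(2 * edge)
    out = list(samples)
    for n in range(edge):
        out[n] = samples[n] * win[n]             # fade in
        out[-1 - n] = samples[-1 - n] * win[n]   # fade out (mirrored)
    return out

def windowed_join(segments, edge=64):
    """Taper every segment at both ends, then concatenate them."""
    joined = []
    for seg in segments:
        joined.extend(taper_edges(seg, edge))
    return joined

# Two short made-up segments of decoded sample values.
a = [1000] * 8
b = [-1000] * 8
joined = windowed_join([a, b], edge=3)
print(len(joined))  # 16
```

The tapered values are floats, so they still have to be rounded and packed back into bytes before they can be written to the output file.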

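Old Post:
You have to calculate the sample values. In Java, a function for a sound file with 2 bytes per frame would look like this:

public static int createIntFrom16( byte _8Bit1, byte _8Bit2 ) {
    // _8Bit1 is the high byte, _8Bit2 the low byte; masking the low byte
    // stops its sign extension from overwriting the high bits.
    return ( _8Bit1<<8 ) | ( _8Bit2 &0x00FF );
}

Normally you will have to care about whether the file uses little-endian byte order (WAV files do, so the low byte comes first); I don't know if the Python library takes this into account.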
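
To make the byte-order and signedness point concrete, here is a small Python example using the built-in int.from_bytes. The byte values are arbitrary and chosen only for illustration.

```python
raw = bytes([0x34, 0x12])   # two bytes as they might appear in a frame

# The same two bytes give different values depending on the byte order.
little = int.from_bytes(raw, 'little', signed=True)   # low byte first
big = int.from_bytes(raw, 'big', signed=True)         # high byte first

# Signedness matters as soon as the top bit is set.
top = bytes([0xFF, 0xFF])
as_signed = int.from_bytes(top, 'little', signed=True)
as_unsigned = int.from_bytes(top, 'little', signed=False)

print(little, big)             # 4660 13330
print(as_signed, as_unsigned)  # -1 65535
```

Python's wave module hands back the frames as raw bytes, so for 16-bit WAV data the little-endian, signed interpretation is the one to use.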

Once you have created all sample values, you have to divide your file into windows, e.g. of size 512 samples. Then you can window the values and convert them back to bytes. For 16-bit samples it would look like this:

public static byte[] createBytesFromInt(int i) {
    byte[] bytes = new byte[2];
    bytes[0]=(byte)(i>>8);   // high byte
    bytes[1]=(byte)i;        // low byte
    return bytes;
}
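
A Python counterpart for the way back might look like the sketch below. The name to_pcm16_bytes is hypothetical, and the sketch assumes 16-bit little-endian output, which is what a WAV file expects. Windowing produces fractional values, so they are rounded and clipped to the 16-bit range before packing.

```python
import struct

def to_pcm16_bytes(values):
    """Round, clip to the signed 16-bit range and pack as little-endian bytes."""
    clipped = [max(-32768, min(32767, int(round(v)))) for v in values]
    return struct.pack('<%dh' % len(clipped), *clipped)

# 4660.2 is rounded to 4660; 40000 is out of range and clipped to 32767.
frames = to_pcm16_bytes([0, 4660.2, -1, 40000])
print(frames)
```

The resulting bytes object can be passed straight to writeframes on a wave file opened for writing.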