How do wave files store multiple channels?

Posted on 2024-09-04 03:17:23

I've created two wave files using Audacity. Both have a 44100 Hz sample rate and 32-bit float samples, were saved as WAV (Microsoft) 16-bit signed, and contain 1 s of silence (according to Audacity). The difference is that one file contains one channel, while the other has two (stereo). When reading the one-channel file I got frames like this:

0x00 0x00  
...  ...  

Just as expected, but when reading the second file I got:

0x00 0x00 0x00 0x00  
0x01 0x00 0xff 0xff  
0x00 0x00 0x00 0x00  
0x00 0x00 0x01 0x00  
0xff 0xff 0x01 0x00  
0xfe 0xff 0x03 0x00  

This seems to be a random pattern to me. Does it have something to do with the way channels are stored within the wave file? Shouldn't it be something like this instead?

0x00 0x00 0x00 0x00  
...  ...  ...  ...  

PS: I used Python's built-in 'wave' module to read the files.

Comments (5)

热血少△年 2024-09-11 03:17:23

The data is not random

Looking at it, I see 2 int values per line, each 2 bytes, in little-endian order:

0x00 0x00 0x00 0x00  
0x01 0x00 0xff 0xff  
0x00 0x00 0x00 0x00  
0x00 0x00 0x01 0x00  
0xff 0xff 0x01 0x00  
0xfe 0xff 0x03 0x00  

Decodes as:

 0  0
 1 -1
 0  0
 0  1
-1  1
-2  3

So you see numbers very close to 0 (near silence), which looks like dither, as others suggested.
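The decoding above can be reproduced with a short sketch. It assumes the bytes are exactly the six frames quoted in the question and that each frame holds two little-endian signed 16-bit samples (one per channel); the variable names are only illustrative.

```python
import struct

# The six stereo frames quoted in the question, as raw bytes in file order.
raw = bytes([
    0x00, 0x00, 0x00, 0x00,
    0x01, 0x00, 0xff, 0xff,
    0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x01, 0x00,
    0xff, 0xff, 0x01, 0x00,
    0xfe, 0xff, 0x03, 0x00,
])

# "<hh" = two little-endian signed 16-bit integers: one sample per channel.
frames = list(struct.iter_unpack("<hh", raw))

for first, second in frames:
    print(f"{first:3d} {second:3d}")
```

Each printed row is one frame, with the first channel's sample followed by the second channel's sample.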

怎樣才叫好 2024-09-11 03:17:23

The very low-level signal where silence was expected may have been caused by dither used in the conversion from 32-bit to 16-bit.
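To illustrate how dither can turn exact silence into tiny non-zero samples, here is a minimal sketch. It assumes plain triangular (TPDF) dither of one least significant bit; Audacity's actual dither setting may differ and can produce slightly larger values. The function name `float_to_int16_with_dither` is hypothetical.

```python
import random

def float_to_int16_with_dither(x, rng):
    """Quantize a float sample in [-1.0, 1.0] to 16-bit, adding TPDF dither."""
    # Triangular dither (assumption): sum of two uniform values, each one LSB wide.
    noise = rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)
    value = round(x * 32767 + noise)
    # Clamp to the signed 16-bit range.
    return max(-32768, min(32767, value))

rng = random.Random(0)  # fixed seed so the run is repeatable
quantized = [float_to_int16_with_dither(0.0, rng) for _ in range(1000)]

# Digital silence (0.0) no longer maps to all zeros after dithering.
print(sorted(set(quantized)))
```

With this kind of dither the output for silent input stays within one step of zero, which is the same order of magnitude as the values seen in the stereo file.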

北座城市 2024-09-11 03:17:23

From what I remember the channels should be alternating, so 1 second at 44.1 kHz will be a stream of 88,200 samples, alternating left and right or whatever the spec says.

Also, Audacity should not get the float -> int conversion wrong, only the other way around. Maybe try starting out with integer samples instead of floating point. Or set one channel to a known value (i.e. 0x8f8f) and the other to 0; that might be easier to figure out.
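The "known value in one channel" experiment can be sketched with the standard `wave` module. This is only an illustration: the marker value `0x1234` is a hypothetical choice (picked so that it fits a positive signed 16-bit sample), and the file lives in memory rather than on disk.

```python
import io
import struct
import wave

KNOWN = 0x1234   # hypothetical marker value for the first (left) channel
N_FRAMES = 4

# Build interleaved frames: left sample, then right sample, repeated.
payload = b"".join(struct.pack("<hh", KNOWN, 0) for _ in range(N_FRAMES))

buf = io.BytesIO()  # in-memory file, so nothing is written to disk
with wave.open(buf, "wb") as w:
    w.setnchannels(2)
    w.setsampwidth(2)      # 2 bytes = 16-bit samples
    w.setframerate(44100)
    w.writeframes(payload)

buf.seek(0)
with wave.open(buf, "rb") as r:
    nchannels = r.getnchannels()
    data = r.readframes(r.getnframes())

# De-interleave: every nchannels-th sample belongs to the same channel.
samples = [s for (s,) in struct.iter_unpack("<h", data)]
left = samples[0::nchannels]
right = samples[1::nchannels]
print(left, right)
```

Reading a full second of such a stereo file at 44100 Hz would give 44100 frames, that is 44100 * 2 = 88200 individual samples.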

弥繁 2024-09-11 03:17:23

Deleted the code and my previous post.

Silence: "real" silence must be exactly zero. Anything else is often called "room" silence: a very small noise that is present everywhere in a recording unless you use a noise gate.
Just an idea: remember that signed values use 1 bit as the sign bit. Maybe (I don't know) that is what you are seeing after converting to a signed wave file with Audacity. Sorry, I don't have the time to test this.

Wave files:
I don't know how much you know about sound files, but if you just want to add silence, try it this way:
Each sample is X bits, so you need X/8 bytes for one sample per channel.
You know the sampling rate, so you can copy the original raw byte array into one of size (silence_length_in_samples * bytes_per_frame) + (original) + (silence_length_in_samples * bytes_per_frame) and write it back into a sound file using the Python tools, which I hope can do this.
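A rough sketch of that padding idea with the standard `wave` module follows. The helper name `pad_with_silence` is hypothetical, and it assumes signed PCM (such as 16-bit), where all-zero bytes mean silence; 8-bit WAV data is unsigned and would need a different fill value.

```python
import io
import wave

def pad_with_silence(src, dst, silence_frames):
    """Copy WAV `src` to `dst` with `silence_frames` zero frames on each side."""
    with wave.open(src, "rb") as r:
        params = r.getparams()
        original = r.readframes(r.getnframes())
    # One frame holds one sample for every channel.
    bytes_per_frame = params.sampwidth * params.nchannels
    # Assumption: signed PCM, so zero bytes are silence.
    silence = b"\x00" * (silence_frames * bytes_per_frame)
    with wave.open(dst, "wb") as w:
        w.setparams(params)
        w.writeframes(silence + original + silence)

# Usage with in-memory files: 10 stereo 16-bit frames, padded by 5 on each side.
src = io.BytesIO()
with wave.open(src, "wb") as w:
    w.setnchannels(2)
    w.setsampwidth(2)
    w.setframerate(44100)
    w.writeframes(b"\x01\x00\xff\xff" * 10)
src.seek(0)

dst = io.BytesIO()
pad_with_silence(src, dst, 5)
dst.seek(0)
with wave.open(dst, "rb") as r:
    padded_frames = r.getnframes()
    padded_data = r.readframes(padded_frames)
print(padded_frames)
```

The same function should also accept file names instead of in-memory buffers, since `wave.open` takes either.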

2 Channels:
The raw bytes are organized in:
[sample1(channel1_bytes, channel2_bytes)][sample2(channel1_bytes, channel2_bytes)] ...
I hope it is clear what I mean :)

以酷 2024-09-11 03:17:23

You can see what those numbers are with this code:

import struct
struct.unpack("f", struct.pack("I", 0xfeff0300))
(-1.6948435790786458e+38,)

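Decoded as 16-bit integers they are all very small, arguably silent, numbers; the float reinterpretation above gives a huge magnitude instead, which suggests the data is not 32-bit float. I generated silence and saved it as a 32-bit floating point WAV and did not get small numbers. My file contained zeros, excluding the header.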

0.2 seconds of silent, 2 channel floating point data can be generated like so:

import array
silence = array.array("f", [0] * int(44100 * 2 * 0.2))