Speex decoding goes wrong
I'm using speex to encode some audio data and send it over UDP, and decode it on the other side.
I ran a few tests with speex, and noticed that if I decode a packet straight after I encoded it, the decoded data is in no way close to the original data. Most of the bytes at the start of the buffer are 0.
So when I decode the audio sent over UDP, all I get is noise.
This is how I am encoding the audio:
bool AudioEncoder::encode( float *raw, char *encoded_bits )
{
    for ( size_t i = 0; i < 256; i++ )
        this->_rfdata[i] = raw[i];
    speex_bits_reset(&this->_bits);
    speex_encode(this->_state, this->_rfdata, &this->_bits);
    int bytesWritten = speex_bits_write(&this->_bits, encoded_bits, 512);
    if (bytesWritten)
        return true;
    return false;
}
This is how I am decoding the audio:
float *f = new float[256];
// recvbuf is the buffer I pass to my recv function on the socket
speex_bits_read_from(&this->_bits, recvbuf, 512);
speex_decode(this->state, &this->_bits, f);
I've checked out the docs, and most of my code comes from the example encoding/decoding sample on the speex website.
I'm not sure what I'm missing here.
Comments (3)
I found the reason the data was so different after the encode/decode round trip. Partly it is because Speex is a lossy codec, as Paulo Scardine said, and partly because Speex only works on blocks of 160 frames, so data coming from PortAudio has to be handed to Speex in "packets" of 160 frames.
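The re-chunking described above can be sketched as follows. This is only a minimal illustration, not code from the original poster: `FrameSplitter` is a hypothetical helper, and the frame size of 160 is taken from the comment above. In a real program the value would more safely be queried from the encoder with `speex_encoder_ctl(state, SPEEX_GET_FRAME_SIZE, &frame_size)` instead of being hard-coded.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Speex or PortAudio): collects capture
// blocks of arbitrary length and hands them out as fixed-size frames.
class FrameSplitter {
public:
    explicit FrameSplitter(std::size_t frame_size) : frame_size_(frame_size) {
        assert(frame_size > 0);
    }

    // Append one capture block of any length to the internal buffer.
    void push(const float* samples, std::size_t count) {
        pending_.insert(pending_.end(), samples, samples + count);
    }

    // Copy the next complete frame into `out` and return true, or return
    // false (leaving `out` untouched) if fewer than frame_size samples
    // are currently buffered.
    bool pop(std::vector<float>& out) {
        if (pending_.size() < frame_size_) return false;
        out.assign(pending_.begin(), pending_.begin() + frame_size_);
        pending_.erase(pending_.begin(), pending_.begin() + frame_size_);
        return true;
    }

    // Number of samples waiting for the next frame to fill up.
    std::size_t buffered() const { return pending_.size(); }

private:
    std::size_t frame_size_;
    std::vector<float> pending_;
};
```

With the 256-sample blocks from the question, each `push` would be followed by a loop calling `pop` until it returns false, passing every complete frame to `speex_encode`. The leftover samples stay buffered until the next capture block arrives.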
Actually, Speex introduces an additional delay to the audio data, as I found out by reverse engineering:
Since the lookahead is initialized with zeros, you observe the first few samples to be "close to zero".
To get the timing right, you must skip those samples before you reach the actual audio data you fed into the codec. Why that is, I don't know. Probably the author of Speex never cared about this, since Speex is meant for streaming, not primarily for storing and restoring audio data.
Another workaround (to avoid wasting space) is to feed (framesize - delay) zeros into the codec before feeding your actual audio data, and then drop the entire first Speex frame.
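Both ways of compensating for the delay can be sketched without touching the codec itself. The helper names below are hypothetical, and `delay` is deliberately left as a parameter, because this post does not say how it should be computed. Speex does expose a `SPEEX_GET_LOOKAHEAD` request through `speex_encoder_ctl` and `speex_decoder_ctl`, which is probably the place to start.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Approach 1: drop the first `delay` decoded samples, which only contain
// the zero-initialized lookahead. Returns an empty vector if the decoded
// data is not longer than the delay.
std::vector<float> skip_codec_delay(const std::vector<float>& decoded,
                                    std::size_t delay) {
    if (delay >= decoded.size()) return {};
    return std::vector<float>(decoded.begin() + delay, decoded.end());
}

// Approach 2, before encoding: prepend (frame_size - delay) zeros so that
// padding plus codec delay add up to exactly one frame.
std::vector<float> pad_before_encoding(const std::vector<float>& input,
                                       std::size_t frame_size,
                                       std::size_t delay) {
    assert(delay <= frame_size);
    std::vector<float> padded(frame_size - delay, 0.0f);
    padded.insert(padded.end(), input.begin(), input.end());
    return padded;
}

// Approach 2, after decoding: discard the entire first frame, which now
// holds nothing but padding and lookahead.
std::vector<float> drop_first_frame(const std::vector<float>& decoded,
                                    std::size_t frame_size) {
    return skip_codec_delay(decoded, frame_size);
}
```

The second approach costs one extra encoded frame but keeps the remaining output aligned to frame boundaries. Because Speex is lossy, the recovered samples will only approximate the input even once the timing is right.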
I hope this clarifies everything. If someone familiar with Speex reads this, feel free to correct me if I am wrong.
EDIT: Actually, both the decoder and the encoder have a lookahead time. The actual formula for the delay is:
You may want to have a look here for some simple encoding/decoding:
http://www.speex.org/docs/manual/speex-manual/node13.html#SECTION001310000000000000000
Since you are using UDP you may also work with a jitter buffer to re-order packets and stuff.
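To illustrate the re-ordering idea, here is a minimal sketch of a reorder buffer. It assumes the sender prefixes every packet with an increasing sequence number, which the code in the question does not do. `ReorderBuffer` is a hypothetical class, not the jitter buffer API that ships with Speex, and it ignores sequence number wrap-around and timing.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical reorder buffer: stores packets by sequence number and
// releases them strictly in order.
class ReorderBuffer {
public:
    explicit ReorderBuffer(std::uint32_t first_seq = 0) : next_seq_(first_seq) {}

    // Store a received packet. Packets older than the next expected one
    // arrive too late and are ignored; duplicates do not overwrite the
    // copy that is already stored.
    void put(std::uint32_t seq, std::string payload) {
        if (seq < next_seq_) return;
        packets_.emplace(seq, std::move(payload));
    }

    // Move the payload of the next expected packet into `out` and return
    // true, or return false if that packet has not arrived yet.
    bool get(std::string& out) {
        auto it = packets_.find(next_seq_);
        if (it == packets_.end()) return false;
        out = std::move(it->second);
        packets_.erase(it);
        ++next_seq_;
        return true;
    }

    // Give up on the next expected packet and move on to the one after it.
    void skip() {
        packets_.erase(next_seq_);
        ++next_seq_;
    }

private:
    std::uint32_t next_seq_;
    std::map<std::uint32_t, std::string> packets_;
};
```

When `get` returns false, the receiver can wait a little longer or call `skip` and treat the frame as lost. How long to wait is a latency trade-off that this sketch leaves to the caller.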