VoIP basics - header info for packets?

Posted on 2025-01-01 13:30:36

I am learning how to use VoIP over UDP in a small network. I know there are plenty of libraries that would do everything I need with a few method calls, but as I said, I am learning, so I need to reinvent the wheel to see how it works.

I am currently investigating the DatagramPacket class, and I've noticed that it has no method for setting header information (i.e. the packet sequence number, which I need in order to do interleaving).

A little code to reflect the environment:

byte[] block;
DatagramPacket packet; // UDP packet                

/* x Bytes per block , y blocks per second,
   z ms time block playback duration */

block = recorder.getBlock(); // assume I have a class that handles audio
                              // recording and returns speech as
                              // uncompressed bytes

packet = new DatagramPacket(block, block.length, clientIP, PORT);

Firstly, I assume that because it is UDP, the sender doesn't care about anything beyond the simple fact that it throws packets somewhere. That is why there is no such method.

Secondly, I assume that I need to do it myself: add extra bytes to the byte block to be sent, which would contain the packet's sequence number? However, I am also concerned that if I do that, how do I recognize whether bytes are header bytes rather than audio bytes? I could assume that the first byte represents a number, but a byte can only represent 256 values. I've never really worked at the byte level before. Or maybe there are other techniques?

In short, to do interleaving I need to know how to set a packet sequence number, as I can't order unordered packets :-)

Thank You,

爱，才寂寞 2025-01-08 13:30:36

You'll need to serialize the data types your program uses into a byte array, and deserialize them from it on the receiving side.
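For the simplest case raised in the question, a fixed-size header is enough to tell header bytes from audio bytes: the receiver knows the first N bytes are always the header. Here is a minimal sketch, assuming a two-byte big-endian sequence number in front of the audio. The class and method names (`SeqFraming`, `frame`, `sequenceOf`, `audioOf`) are made up for illustration and are not part of any library.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

final class SeqFraming {
    // Assumption for this sketch: the header is always exactly 2 bytes,
    // holding one unsigned 16-bit sequence number.
    static final int HEADER_BYTES = 2;

    /** Prepends a big-endian 16-bit sequence number to the audio block. */
    static byte[] frame(int sequenceNumber, byte[] audio) {
        // ByteBuffer writes multi-byte values big-endian by default.
        ByteBuffer out = ByteBuffer.allocate(HEADER_BYTES + audio.length);
        out.putShort((short) sequenceNumber); // keeps only the low 16 bits
        out.put(audio);
        return out.array();
    }

    /** Reads the sequence number back as an unsigned value in 0..65535. */
    static int sequenceOf(byte[] data) {
        // "& 0xff" undoes Java's sign extension of bytes.
        return ((data[0] & 0xff) << 8) | (data[1] & 0xff);
    }

    /** Returns the bytes after the header, i.e. the audio. */
    static byte[] audioOf(byte[] data, int length) {
        return Arrays.copyOfRange(data, HEADER_BYTES, length);
    }

    public static void main(String[] args) {
        byte[] wire = frame(40000, new byte[] {10, 20, 30});
        System.out.println(sequenceOf(wire));                            // prints 40000
        System.out.println(Arrays.toString(audioOf(wire, wire.length))); // prints [10, 20, 30]
    }
}
```

On the receiving side, you would pass `packet.getData()` and `packet.getLength()` of the received DatagramPacket to `sequenceOf` and `audioOf`. Two bytes give 65536 distinct sequence numbers before the counter wraps around.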

Let's assume you're talking about RTP, and you want to send a packet with these fields (see chapter 5 of the RTP specification, RFC 3550):

Version = 2
Padding = 0
Extension = 0
CSRC count = 0
Marker = 0
Payload type = 8 (G.711 A-law)
Sequence number = 1234
Timestamp = 1
SSRC = 4321

Let's put these into some variables, using int for ease, or long when we need to deal with an unsigned 32-bit value:

int version = 2;
int padding = 0;
int extension = 0;
int csrcCount = 0;
int marker = 0;
int payloadType = 8;
int sequenceNumber = 1234;
long timestamp = 1;
long ssrc = 4321; // synchronization source identifier of this sender

byte[] buf = ...; // allocate this big enough to hold the RTP header + audio data

// Assemble the first byte according to the RTP spec. Note: the spec labels the version as bits 0 and 1,
// but those are really the two high bits of the first byte.
buf[0] = (byte) ((version & 0x3) << 6 | (padding & 0x1) << 5 | (extension & 0x1) << 4 | (csrcCount & 0xf));

// Second byte: marker bit and 7-bit payload type
buf[1] = (byte) (((marker & 0x1) << 7) | (payloadType & 0x7f));

// Sequence number, 2 bytes, big-endian: MSB first, then LSB
buf[2] = (byte) ((sequenceNumber & 0xff00) >> 8);
buf[3] = (byte) (sequenceNumber & 0x00ff);

// Packet timestamp, 4 bytes, big-endian
buf[4] = (byte) ((timestamp & 0xff000000L) >> 24);
buf[5] = (byte) ((timestamp & 0x00ff0000L) >> 16);
buf[6] = (byte) ((timestamp & 0x0000ff00L) >> 8);
buf[7] = (byte) (timestamp & 0x000000ffL);

// SSRC, 4 bytes, big-endian. Any CSRC entries would follow from buf[12] onward,
// 4 bytes each, and csrcCount must match their number.
buf[ 8] = (byte) ((ssrc & 0xff000000L) >> 24);
buf[ 9] = (byte) ((ssrc & 0x00ff0000L) >> 16);
buf[10] = (byte) ((ssrc & 0x0000ff00L) >> 8);
buf[11] = (byte) (ssrc & 0x000000ffL);

That's the header. Now you can copy the audio bytes into buf, starting at buf[12], and send buf as one packet.
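The same byte layout can also be written with `java.nio.ByteBuffer`, which handles the big-endian shifting for you. This is a minimal sketch, assuming the 12-byte fixed header of RFC 3550 with no CSRC entries, where bytes 8 to 11 hold the sender's 32-bit source identifier (the SSRC field). The class name `RtpPacketSketch` and its methods are hypothetical helpers for illustration only.

```java
import java.nio.ByteBuffer;

final class RtpPacketSketch {
    // Assumption: fixed RTP header only, no CSRC list and no extension header.
    static final int HEADER_BYTES = 12;

    /** Builds header + payload. ByteBuffer writes multi-byte values big-endian by default. */
    static byte[] build(int payloadType, int sequenceNumber, long timestamp, long ssrc, byte[] audio) {
        ByteBuffer buf = ByteBuffer.allocate(HEADER_BYTES + audio.length);
        buf.put((byte) (2 << 6));               // version 2, padding 0, extension 0, CSRC count 0
        buf.put((byte) (payloadType & 0x7f));   // marker 0, 7-bit payload type
        buf.putShort((short) sequenceNumber);   // low 16 bits of the sequence number
        buf.putInt((int) timestamp);            // low 32 bits of the timestamp
        buf.putInt((int) ssrc);                 // low 32 bits of the source identifier
        buf.put(audio);                         // payload starts at offset 12
        return buf.array();
    }

    /** Reads the unsigned 16-bit sequence number at offset 2. */
    static int sequenceNumber(byte[] packet) {
        return ByteBuffer.wrap(packet).getShort(2) & 0xffff;
    }

    /** Reads the unsigned 32-bit timestamp at offset 4. */
    static long timestamp(byte[] packet) {
        return ByteBuffer.wrap(packet).getInt(4) & 0xffffffffL;
    }

    public static void main(String[] args) {
        byte[] packet = build(8, 1234, 1, 4321, new byte[160]);
        System.out.println(packet.length);          // prints 172
        System.out.println(sequenceNumber(packet)); // prints 1234
        System.out.println(timestamp(packet));      // prints 1
    }
}
```

The resulting array can be handed straight to `new DatagramPacket(packet, packet.length, clientIP, PORT)` as in the question.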

Now, the above is of course just to show the principles. An actual serializer for an RTP packet would have to deal with much more, in accordance with the RTP specification. For example, you might need extension headers or CSRC entries, you need the correct payload type for the format of your audio data, and you need to packetize and schedule the audio data correctly: for G.711 A-law, you should fill each RTP packet with 160 bytes of audio data and send one packet every 20 milliseconds.
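To illustrate the scheduling side, here is a rough sketch of how the two counters advance per packet, assuming G.711 at 8000 samples per second (so 160 samples, one byte each, per 20 ms packet). It also includes a wraparound-aware comparison that a receiver could use when reordering packets. `RtpCounters` and `isAfter` are hypothetical names, not part of any RTP library.

```java
final class RtpCounters {
    // Assumption: 8000 Hz sampling, 20 ms per packet => 160 samples per packet.
    static final int SAMPLES_PER_PACKET = 160;

    private int sequenceNumber; // unsigned 16-bit value kept in an int
    private long timestamp;     // unsigned 32-bit value kept in a long

    RtpCounters(int firstSequenceNumber, long firstTimestamp) {
        this.sequenceNumber = firstSequenceNumber & 0xffff;
        this.timestamp = firstTimestamp & 0xffffffffL;
    }

    int sequenceNumber() { return sequenceNumber; }

    long timestamp() { return timestamp; }

    /** Advances both counters after one packet has been sent. */
    void advance() {
        sequenceNumber = (sequenceNumber + 1) & 0xffff;              // wraps 65535 -> 0
        timestamp = (timestamp + SAMPLES_PER_PACKET) & 0xffffffffL;  // counts samples, not packets
    }

    /** True if 16-bit sequence number a comes after b, allowing for wraparound. */
    static boolean isAfter(int a, int b) {
        int diff = (a - b) & 0xffff;
        return diff != 0 && diff < 0x8000;
    }

    public static void main(String[] args) {
        RtpCounters counters = new RtpCounters(65535, 0);
        counters.advance();
        System.out.println(counters.sequenceNumber()); // prints 0
        System.out.println(counters.timestamp());      // prints 160
        System.out.println(isAfter(0, 65535));         // prints true
    }
}
```

A receiver doing interleaving or reordering would compare sequence numbers with something like `isAfter` rather than a plain `>`, since the 16-bit counter wraps after 65536 packets (about 22 minutes at one packet every 20 ms).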
