How to extract data from a network packet in Java
In C, if you have a certain type of packet, what you generally do is define a struct and cast the char * into a pointer to that struct. After this you have direct programmatic access to all data fields in the network packet. Like so:
struct rdp_header {
    int version;
    char serverId[20];
};
When you get a network packet you can quickly do the following:
char *packet;
// receive packet
struct rdp_header *pckt = (struct rdp_header *) packet;
printf("Servername : %20.20s\n", pckt->serverId);
This technique works really well for UDP-based protocols, and allows for very quick and efficient packet parsing and sending using very little code, with trivial error handling (just check the length of the packet). Is there an equivalent, just as quick way to do the same in Java? Or are you forced to use stream-based techniques?
I don't believe this technique can be done in Java, short of using JNI and actually writing the protocol handler in C. The other way to do the technique you describe is with variant records and unions, which Java doesn't have either.
If you had control of the protocol (it's your server and client), you could use serialized objects (including XML) to get automagic (but not so runtime-efficient) parsing of the data, but that's about it.
Otherwise you're stuck with parsing Streams or byte arrays (which can be treated as Streams).
Mind you, the technique you describe is tremendously error-prone and a source of security vulnerabilities for any protocol that is reasonably interesting, so it's not that great a loss.
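For the byte-array route, java.nio.ByteBuffer gets reasonably close to the feel of the C cast, since it allows reads at fixed offsets. What follows is only a minimal sketch, not a definitive implementation: the RdpHeaderView class and its constants are hypothetical names, and it assumes the rdp_header layout from the question (a 4-byte version followed by 20 bytes of serverId) sent in network (big-endian) byte order with a NUL-padded ASCII name.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical Java counterpart of the rdp_header struct from the question.
final class RdpHeaderView {
    static final int VERSION_OFFSET = 0;     // int version
    static final int SERVER_ID_OFFSET = 4;   // char serverId[20]
    static final int SERVER_ID_LENGTH = 20;
    static final int SIZE = SERVER_ID_OFFSET + SERVER_ID_LENGTH;

    private final ByteBuffer buf;

    RdpHeaderView(byte[] packet, int length) {
        // Same trivial error handling as in C: check the packet length first.
        if (length < SIZE) {
            throw new IllegalArgumentException("packet too short: " + length);
        }
        // Assumption: fields are in network (big-endian) byte order.
        this.buf = ByteBuffer.wrap(packet, 0, length).order(ByteOrder.BIG_ENDIAN);
    }

    int version() {
        // Absolute read: does not move the buffer position.
        return buf.getInt(VERSION_OFFSET);
    }

    String serverId() {
        // Copy the fixed-width field and cut it at the first NUL byte.
        int end = 0;
        byte[] raw = new byte[SERVER_ID_LENGTH];
        for (int i = 0; i < SERVER_ID_LENGTH; i++) {
            raw[i] = buf.get(SERVER_ID_OFFSET + i);
        }
        while (end < raw.length && raw[end] != 0) {
            end++;
        }
        return new String(raw, 0, end, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] packet = new byte[SIZE];
        packet[3] = 2; // version 2, big-endian
        byte[] name = "alpha".getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(name, 0, packet, SERVER_ID_OFFSET, name.length);

        RdpHeaderView header = new RdpHeaderView(packet, packet.length);
        System.out.println("Version : " + header.version());
        System.out.println("Servername : " + header.serverId());
    }
}
```

Unlike the C cast, nothing here depends on compiler padding or host endianness, because every offset and the byte order are spelled out explicitly.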
Read your packet into a byte array, and then extract the bits and bytes you want from that.
Here's a sample, sans exception handling:
In practice you'll probably end up with helper functions to extract data fields in network order from the byte array, or, as Tom points out in the comments, you can use a ByteArrayInputStream, from which you can construct a DataInputStream, which has methods to read structured data from the stream:
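A minimal sketch of the ByteArrayInputStream plus DataInputStream variant, offered under assumptions rather than as the original sample: the RdpHeaderReader class and its describe method are hypothetical, the layout is the rdp_header struct from the question (4-byte big-endian version, then 20 bytes of NUL-padded ASCII serverId), and the DatagramPacket in main is built by hand so the example runs without a network.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.net.DatagramPacket;
import java.nio.charset.StandardCharsets;

// Hypothetical helper that reads the question's rdp_header from raw bytes.
final class RdpHeaderReader {
    static final int SERVER_ID_LENGTH = 20;

    // Returns "version:serverId"; exceptions are simply propagated.
    static String describe(byte[] data, int offset, int length) throws IOException {
        DataInputStream in =
                new DataInputStream(new ByteArrayInputStream(data, offset, length));
        int version = in.readInt();           // DataInputStream reads big-endian
        byte[] id = new byte[SERVER_ID_LENGTH];
        in.readFully(id);                     // EOFException if the packet is too short
        // trim() drops the trailing NUL padding (and any surrounding whitespace).
        String serverId = new String(id, StandardCharsets.US_ASCII).trim();
        return version + ":" + serverId;
    }

    public static void main(String[] args) throws IOException {
        byte[] raw = new byte[4 + SERVER_ID_LENGTH];
        raw[3] = 7;                           // version 7
        raw[4] = 'x';                         // serverId "x"
        // With a real DatagramSocket this packet would be filled by receive().
        DatagramPacket packet = new DatagramPacket(raw, raw.length);
        System.out.println(
                describe(packet.getData(), packet.getOffset(), packet.getLength()));
    }
}
```

Passing getOffset() and getLength() along with getData() matters because the backing array of a DatagramPacket is usually larger than the datagram actually received.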
I wrote something to simplify this kind of work. Like most tasks, it was much easier to write a tool than to try to do everything by hand.
It consisted of two classes. Here's an example of how it was used:
I wrote the ByteArrayBuilder to simply accumulate bits. I used a method-chaining pattern (just returning "this" from all methods) to make it easier to write a bunch of statements together.
All the methods in the ByteArrayBuilder were trivial, just one or two lines of code each (I just wrote everything to a data output stream).
This is for building a packet, but tearing one apart shouldn't be any harder.
The only interesting method in BitBuilder is this one:
Again, the logic could be inverted very easily to read a packet instead of building one.
Edit: I had proposed a different approach in this answer; I'm going to post it as a separate answer because it's completely different.
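To give an idea of what such a builder might look like, here is a rough sketch of a method-chaining ByteArrayBuilder that writes everything to a DataOutputStream. This is a guess at the shape described above, not the original class: the method names addInt, addShort and addFixedString are assumptions, and the BitBuilder class is not reconstructed here.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical reconstruction; only the chaining idea comes from the answer.
final class ByteArrayBuilder {
    private final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    private final DataOutputStream out = new DataOutputStream(bytes);

    // Appends a 4-byte big-endian integer.
    ByteArrayBuilder addInt(int value) {
        try {
            out.writeInt(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return this; // returning "this" is what enables chaining
    }

    // Appends the low 16 bits of value as a big-endian short.
    ByteArrayBuilder addShort(int value) {
        try {
            out.writeShort(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return this;
    }

    // Appends an ASCII string padded with NUL bytes up to a fixed width.
    ByteArrayBuilder addFixedString(String text, int width) {
        byte[] raw = text.getBytes(StandardCharsets.US_ASCII);
        if (raw.length > width) {
            throw new IllegalArgumentException("string longer than field: " + text);
        }
        try {
            out.write(raw);
            for (int i = raw.length; i < width; i++) {
                out.writeByte(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return this;
    }

    byte[] toByteArray() {
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        // Builds the rdp_header from the question: version, then serverId[20].
        byte[] packet = new ByteArrayBuilder()
                .addInt(1)
                .addFixedString("srv", 20)
                .toByteArray();
        System.out.println("packet length = " + packet.length);
    }
}
```

Wrapping IOException in UncheckedIOException keeps the chain free of try/catch blocks; writes to a ByteArrayOutputStream do not fail in practice.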
Look at the Javolution library and its Struct classes; they will do just what you are asking for. In fact, the author has this exact example, using the Javolution Struct classes to manipulate UDP packets.
This is an alternate proposal to the answer I left above. I suggest you consider implementing it because it would act pretty much the same as the C solution, in that you could pick fields out of a packet by name.
You might start it out with an external text file something like this:
It could specify the entire structure of a packet, including fields that may repeat. The language could be as simple or as complicated as you need.
You'd create an object like this:
Your constructor would iterate over the PacketStructure.txt file and store each string as the key of a hashtable, with the exact location of its data (both bit offset and size) as the value.
Once you created an object, passing in the bitStructure and a packet, you could randomly access the data with statements as straightforward as:
Also note, this stuff would be much less efficient than a C struct, but not as much as you might think; it's still probably many times more efficient than you'll need. If done right, the specification file would only be parsed once, so you would only take the minor hit of a single hash lookup and a few binary operations for each value you read from the packet. Not bad at all.
The exception is if you are parsing packets from a high-speed continuous stream, and even then I doubt a fast network could flood even a slowish CPU.
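The idea could be sketched roughly as follows. Everything concrete here is an assumption made for illustration: the spec format (one "name:sizeInBits" entry per line, in wire order), the BitStructure and Packet class names, and the choice of big-endian bit order. Repeating fields are left out, and the spec is passed in as a list of lines so the example runs without a PacketStructure.txt file.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical: maps field names to bit locations, parsed once from a spec.
final class BitStructure {
    // name -> {bitOffset, bitSize}
    private final Map<String, int[]> fields = new LinkedHashMap<>();
    private int totalBits = 0;

    // Assumed spec format: "name:sizeInBits" per line; '#' starts a comment line.
    BitStructure(List<String> specLines) {
        for (String line : specLines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue;
            }
            String[] parts = trimmed.split(":");
            if (parts.length != 2) {
                throw new IllegalArgumentException("bad spec line: " + line);
            }
            int size = Integer.parseInt(parts[1].trim());
            if (size < 1 || size > 63) {
                throw new IllegalArgumentException("unsupported field size: " + size);
            }
            fields.put(parts[0].trim(), new int[] {totalBits, size});
            totalBits += size;
        }
    }

    // One hash lookup plus a few bit operations per value read.
    long read(byte[] packet, String name) {
        int[] location = fields.get(name);
        if (location == null) {
            throw new IllegalArgumentException("unknown field: " + name);
        }
        int offset = location[0];
        int size = location[1];
        if (offset + size > packet.length * 8) {
            throw new IllegalArgumentException("packet too short for " + name);
        }
        long value = 0;
        for (int bit = offset; bit < offset + size; bit++) {
            // Most significant bit of each byte comes first.
            int b = (packet[bit >> 3] >> (7 - (bit & 7))) & 1;
            value = (value << 1) | b;
        }
        return value;
    }
}

// Hypothetical pairing of a structure with one received packet.
final class Packet {
    private final BitStructure structure;
    private final byte[] data;

    Packet(BitStructure structure, byte[] data) {
        this.structure = structure;
        this.data = data;
    }

    long get(String name) {
        return structure.read(data, name);
    }

    public static void main(String[] args) {
        BitStructure bitStructure = new BitStructure(
                Arrays.asList("version:32", "flags:4", "type:4", "length:16"));
        byte[] raw = {0, 0, 0, 1, (byte) 0xA5, 1, 2};
        Packet packet = new Packet(bitStructure, raw);
        System.out.println("type = " + packet.get("type"));
    }
}
```

To load the spec from a file instead, the lines could come from Files.readAllLines(Paths.get("PacketStructure.txt")), with the resulting BitStructure shared across all packets so the file is parsed only once.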
Short answer: no, you can't do it that easily.
Longer answer: if you can use Serializable objects, you can hook your InputStream up to an ObjectInputStream and use that to deserialize your objects. However, this requires that you have some control over the protocol. It also works more easily if you use a TCP Socket. If you use a UDP DatagramSocket, you will need to get the data from the packet and then feed that into a ByteArrayInputStream.
If you don't have control over the protocol, you may still be able to use the above deserialization method, but you're probably going to have to implement the readObject() and writeObject() methods rather than using the default implementation given to you. If you need to use someone else's protocol (say, because you need to interop with a native program), this is likely the easiest solution you are going to find.
Also, remember that Java uses UTF-16 internally for strings, but I'm not certain that it serializes them that way. Either way, you need to be very careful when passing strings back and forth to non-Java programs.
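The UDP case described above might look something like the sketch below. It assumes both ends are Java programs that share the same class; the SerializedDatagramDemo and RdpHeader names are made up for the example, and the DatagramPacket is constructed directly instead of being received from a DatagramSocket so that the code runs without a network.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.net.DatagramPacket;

final class SerializedDatagramDemo {
    // Hypothetical message type mirroring the struct in the question.
    static final class RdpHeader implements Serializable {
        private static final long serialVersionUID = 1L;
        final int version;
        final String serverId;

        RdpHeader(int version, String serverId) {
            this.version = version;
            this.serverId = serverId;
        }
    }

    // Sender side: turn the object into the payload of one datagram.
    static byte[] serialize(RdpHeader header) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(header);
        }
        return bytes.toByteArray();
    }

    // Receiver side: feed the packet's bytes into a ByteArrayInputStream.
    static RdpHeader deserialize(DatagramPacket packet)
            throws IOException, ClassNotFoundException {
        ByteArrayInputStream bytes = new ByteArrayInputStream(
                packet.getData(), packet.getOffset(), packet.getLength());
        try (ObjectInputStream in = new ObjectInputStream(bytes)) {
            return (RdpHeader) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] payload = serialize(new RdpHeader(1, "alpha"));
        // Stand-in for a packet filled by DatagramSocket.receive().
        DatagramPacket packet = new DatagramPacket(payload, payload.length);
        RdpHeader header = deserialize(packet);
        System.out.println("Servername : " + header.serverId);
    }
}
```

The serialized form carries Java-specific stream and class metadata, so this only suits Java-to-Java traffic, and deserializing data from untrusted senders is generally considered unsafe.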