What is the best compression library for very small amounts of data (3-4 KiB?)
I am working on a game engine which is loosely descended from Quake 2, adding some things like scripted effects (allowing the server to specify special effects in detail to a client, instead of having only a limited number of hardcoded effects which the client is capable of.) This is a tradeoff of network efficiency for flexibility.
I've hit an interesting barrier. See, the maximum packet size is 2800 bytes, and only one can go out per client per frame.
Here is the script to do a "sparks" effect (could be good for bullet impact sparks, electrical shocks, etc.)
http://pastebin.com/m7acdf519 (If you don't understand it, don't sweat it; it's a custom syntax I made and not relevant to the question I am asking.)
I have done everything possible to shrink the size of that script. I've even reduced the variable names to single letters. But the result is exactly 405 bytes. Meaning you can fit at most 6 of these per frame. I also have in mind a few server-side changes which could shave it down another 12, and a protocol change which might save another 6. Although the savings would vary depending on what script you are working with.
However, of those 387 bytes (what would remain after those changes), I estimate that only 41 would be unique between multiple usages of the effect. In other words, this is a prime candidate for compression.
It just so happens that R1Q2 (a backward-compatible Quake 2 engine with an extended network protocol) has Zlib compression code. I could lift this code, or at least follow it closely as a reference.
But is Zlib necessarily the best choice here? I can think of at least one alternative, LZMA, and there could easily be more.
The requirements:
- Must be very fast (must have very small performance hit if run over 100 times a second.)
- Must cram as much data as possible into 2800 bytes
- Small metadata footprint
- GPL compatible
Zlib is looking good, but is there anything better? Keep in mind, none of this code is being merged yet, so there's plenty of room for experimentation.
Thanks,
-Max
EDIT: Thanks to those who have suggested compiling the scripts into bytecode. I should have made this clear-- yes, I am doing this. If you like you can browse the relevant source code on my website, although it's still not "prettied up."
This is the server-side code:
Lua component: http://meliaserlow.dyndns.tv:8000/alienarena/lua_source/lua/scriptedfx.lua
C component: http://meliaserlow.dyndns.tv:8000/alienarena/lua_source/game/g_scriptedfx.c
For the specific example script I posted, this gets a 1172 byte source down to 405 bytes-- still not small enough. (Keep in mind I want to fit as many of these as possible into 2800 bytes!)
EDIT2: There is no guarantee that any given packet will arrive. Each packet is supposed to contain "the state of the world," without relying on info communicated in previous packets. Generally, these scripts will be used to communicate "eye candy." If there's no room for one, it gets dropped from the packet and that's no big deal. But if too many get dropped, things start to look strange visually and this is undesirable.
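The arithmetic in the question hints at why this payload should compress so well: several near-identical ~405-byte scripts in one packet share almost all of their bytes, so an LZ-based codec collapses the repeats. A rough sketch of that measurement, using Python's `zlib` bindings as a stand-in for the C API (the script text below is an invented placeholder, not the real pastebin script):

```python
import zlib

# Stand-in for a 405-byte compiled effect script; the real one is on pastebin.
script = b"sparks{o=1;c=0xffaa33;n=24;v=90;g=800;t=0.5;}" * 9  # exactly 405 bytes

# A packet carrying six copies that differ only in a few parameter bytes.
packet = b"".join(script.replace(b"n=24", b"n=%02d" % i) for i in range(6))

fast = zlib.compress(packet, 1)   # Z_BEST_SPEED
best = zlib.compress(packet, 9)   # maximum compression

# Near-duplicate scripts collapse dramatically even at the fastest level.
print(len(packet), len(fast), len(best))
```

The exact ratios depend on the real script contents, but the shape of the result (most of each repeat reduced to back-references) is what makes "6 per packet" beatable.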
LZO might be a good candidate for this.
FINAL UPDATE: The two libraries seem about equivalent. Zlib gives about 20% better compression, while LZO's decoding speed is about twice as fast, but the performance hit for either is very small, nearly negligible. That is my final answer. Thanks for all other answers and comments!
UPDATE: After implementing LZO compression and seeing only slightly better performance, it is clear that my own code is to blame for the performance hit (massively increased number of scripted effects possible per packet, thus my effect "interpreter" is getting exercised a lot more.) I would like to humbly apologize for scrambling to shift blame, and I hope there are no hard feelings. I will do some profiling and then maybe I will be able to get some numbers which will be more useful to someone else.
ORIGINAL POST:
OK, I finally got around to writing some code for this. I started out with Zlib, here are the first of my findings.
Zlib's compression is insanely great. It is reliably reducing packets of, say, 8.5 kib down to, say, 750 bytes or less, even when compressing with Z_BEST_SPEED (instead of Z_DEFAULT_COMPRESSION.) The compression time is also pretty good.
However, I had no idea the decompression speed of anything could even possibly be this bad. I don't have actual numbers, but it must be taking 1/8 second per packet at least! (Core2Duo T550 @ 1.83 Ghz.) Totally unacceptable.
From what I've heard, LZMA is a tradeoff of worse performance vs. better compression when compared to Zlib. Since Zlib's compression is already overkill and its performance is already incredibly bad, LZMA is off the table sight unseen for now.
If LZO's decompression time is as good as it's claimed to be, then that is what I will be using. I think in the end the server will still be able to send Zlib packets in extreme cases, but clients can be configured to ignore them and that will be the default.
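Since the final update settles on plain Zlib vs. LZO, one zlib feature worth knowing for exactly this small-payload situation is the preset dictionary: both ends can bake the common script boilerplate into the build, so each transmission effectively compresses only the handful of unique bytes. This is a suggestion, not part of the discussion above; a sketch via Python's bindings (the boilerplate and message strings are invented):

```python
import zlib

# Hypothetical shared boilerplate, baked into both client and server builds.
dictionary = b"sparks{origin;color;count;velocity;gravity;lifetime;}" * 8

# One effect invocation: mostly boilerplate, a few unique parameter bytes.
message = b"sparks{origin;color;count;velocity;gravity;lifetime;}" + b"#42,0xffaa33"

plain = zlib.compress(message, 9)

# Priming deflate with the dictionary lets it back-reference the boilerplate
# without ever transmitting it.
comp = zlib.compressobj(9, zdict=dictionary)
primed = comp.compress(message) + comp.flush()

print(len(plain), len(primed))

dec = zlib.decompressobj(zdict=dictionary)
assert dec.decompress(primed) == message
```

The C API equivalents are `deflateSetDictionary`/`inflateSetDictionary`. Both sides must agree on the dictionary bytes exactly, which is easy here since client and server ship together.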
zlib might be a good candidate - the license is very good, it works fast, and its authors say it has very little overhead, and overhead is what makes compressing small amounts of data problematic.
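On the overhead point: the zlib wrapper adds a 2-byte header and a 4-byte Adler-32 trailer to every stream. For payloads this small, a raw deflate stream (negative `windowBits` in the zlib API) drops those 6 bytes, at the cost of losing the built-in integrity check. A sketch, assuming both ends agree on this framing:

```python
import zlib

payload = b"A" * 41 + bytes(range(32))  # a small, partly compressible message

# Standard zlib stream: 2-byte header + deflate data + 4-byte Adler-32 trailer.
wrapped = zlib.compress(payload)

# Raw deflate: negative wbits omits the header and checksum entirely.
comp = zlib.compressobj(wbits=-15)
raw = comp.compress(payload) + comp.flush()

print(len(wrapped) - len(raw))  # the 6 bytes of wrapper overhead
assert zlib.decompress(raw, wbits=-15) == payload
```

In the C API this corresponds to passing `-15` for `windowBits` in `deflateInit2`/`inflateInit2`. Since game packets usually carry their own checksums anyway, the Adler-32 is redundant here.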
You should look at OpenTNL and adapt some of the techniques they use there, like the concept of Network Strings.
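The Network Strings idea, roughly: both sides maintain a table of strings that have already crossed the wire, so a repeat send costs a small integer ID instead of the full text. A minimal sketch of the concept (this is not OpenTNL's actual wire format):

```python
class StringTable:
    """Interns strings so that repeat sends go over the wire as 2-byte IDs."""

    def __init__(self):
        self.by_string = {}
        self.by_id = []

    def encode(self, s: bytes) -> bytes:
        if s in self.by_string:                        # already known: ID only
            return b"\x01" + self.by_string[s].to_bytes(2, "big")
        sid = len(self.by_id)
        self.by_string[s] = sid
        self.by_id.append(s)
        return b"\x00" + sid.to_bytes(2, "big") + s    # first time: ID + text

sender = StringTable()
first = sender.encode(b"scripts/sparks.fx")
again = sender.encode(b"scripts/sparks.fx")
print(len(first), len(again))  # full cost once, 3 bytes thereafter
```

Note that the constraint in EDIT2 (any packet may be dropped, and each must stand alone) means the sender cannot rely on an ID until the receiver has acknowledged learning it; OpenTNL handles exactly that negotiation over its reliable layer.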
I would be inclined to use the most significant bit of each character, which is currently wasted; by shifting groups of 9 bytes leftwards, you will fit them into 8 bytes.
You could go further and map the characters into a smaller space - can you get them down to 6 bits (i.e. only having 64 valid characters) by, for example, not allowing capital letters and subtracting 0x20 from each character (so that space becomes value 0)?
You could go further by mapping the frequency of each character and using a Huffman-type compression to reduce the average number of bits per character.
I suspect that there are no algorithms that will save data any better than that, in the general case, as there is essentially no redundancy in the message after the changes that you have already made.
How about sending a binary representation of your script?
So I'm thinking along the lines of an Abstract Syntax Tree, with each procedure having an identifier.
This means a performance gain on the clients due to one-time parsing, and a decrease in size due to removing the method names.
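The OP's edit confirms he already compiles to bytecode; for other readers, the size win this answer describes comes from replacing textual procedure names with fixed opcodes and packed operands. A toy sketch (the opcode table and calls are invented for illustration):

```python
import struct

# Hypothetical opcode table: one byte replaces an entire procedure name.
OPCODES = {"spawn_sparks": 0x01, "set_color": 0x02, "set_velocity": 0x03}

def emit(call: str, *args: int) -> bytes:
    # One opcode byte, then each argument as an unsigned big-endian 16-bit value.
    return struct.pack(">B%dH" % len(args), OPCODES[call], *args)

program = emit("spawn_sparks", 24) + emit("set_color", 0xFFAA, 0x3300)
text = b"spawn_sparks(24);set_color(0xffaa,0x3300);"
print(len(text), len(program))  # textual form vs. bytecode form
```

The same instruction stream is also friendlier to a compressor, since the opcode bytes repeat far more predictably than free-form identifiers.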