How to represent an n-byte array in fewer than 2*n characters
Given that an n-byte array can be represented as a 2*n-character string using hex, is there a way to represent the n-byte array in fewer than 2*n characters?
For example, an integer (int32) can typically be considered a 4-byte array of data.
Comments (7)
The advantage of hex is that splitting an 8-bit byte into two equal halves is about the simplest way to map a byte to printable ASCII characters. More efficient methods treat multiple bytes as a block.
Base-64 uses 64 ASCII characters to represent 6 bits at a time. Every 3 bytes (i.e. 24 bits) are split into four 6-bit base-64 digits, where the "digits" are the characters A-Z, a-z, 0-9, "+" and "/" (and if the input is not a multiple of 3 bytes long, a 65th character, "=", is used for padding at the end). Note that some variant forms of base-64 use different characters for the last two "digits".
Ascii85 is another representation, somewhat less well known but commonly used: it is often the way binary data is encoded within PostScript and PDF files. It treats every 4 bytes (big-endian) as an unsigned integer, which is written as a 5-digit number in base 85, with each base-85 digit n encoded as ASCII code 33+n (i.e. "!" for 0, up to "u" for 84). As a special case, the single character "z" may be used instead of "!!!!!" to represent 4 zero bytes. (Why 85? Because 84^5 < 2^32 < 85^5.)
Yes: use binary (in which case it takes n bytes, not surprisingly), or use any base higher than 16. A common one is base 64.
It might depend on the exact numbers you want to represent. For instance, the number 9223372036854775808, which requires 8 bytes to represent in binary, takes only 4 bytes in ASCII if you use the product-of-primes representation (which is "2^63").
How about base-64?
It all depends on what characters you're willing to use in your encoding (i.e. representation).
Base64 fits 6 bits in each character, which means that 3 bytes fit in 4 characters.
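The 6-bits-per-character arithmetic can be sketched by hand. The snippet below is an illustrative sketch only: the helper name encode_3_bytes is hypothetical, it assumes the standard base-64 alphabet, and it deliberately ignores padding for incomplete groups.

```python
# Standard base-64 alphabet: index 0..63 maps to one character.
B64_ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def encode_3_bytes(chunk: bytes) -> str:
    """Encode exactly 3 bytes as 4 base-64 characters (no padding handled)."""
    if len(chunk) != 3:
        raise ValueError("this sketch only handles complete 3-byte groups")
    value = int.from_bytes(chunk, "big")  # the 3 bytes as one 24-bit integer
    # Take the four 6-bit slices, most significant first.
    return "".join(
        B64_ALPHABET[(value >> shift) & 0x3F] for shift in (18, 12, 6, 0)
    )


print(encode_3_bytes(b"Man"))  # prints: TWFu
```

In practice a library routine such as Python's base64.b64encode would be used instead, since it also handles padding and input of any length.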
Using 65536 of the roughly 90000 defined Unicode characters, you can represent a binary string of N bytes in N/2 characters.
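One way this idea might look in code is sketched below. It is a simplification under stated assumptions: the function names pack_to_unicode and unpack_from_unicode are hypothetical, the input must have an even number of bytes, and the mapping uses 65536 valid code points (skipping the surrogate range) rather than a curated set of assigned, printable characters.

```python
def pack_to_unicode(data: bytes) -> str:
    """Map each pair of bytes to one Unicode code point (sketch only)."""
    if len(data) % 2:
        raise ValueError("this sketch expects an even number of bytes")
    chars = []
    for i in range(0, len(data), 2):
        value = int.from_bytes(data[i:i + 2], "big")  # 0..65535
        # Skip U+D800..U+DFFF (surrogates), which are not standalone characters.
        chars.append(chr(value if value < 0xD800 else value + 0x800))
    return "".join(chars)


def unpack_from_unicode(text: str) -> bytes:
    """Inverse of pack_to_unicode."""
    out = bytearray()
    for ch in text:
        code_point = ord(ch)
        value = code_point if code_point < 0xD800 else code_point - 0x800
        out += value.to_bytes(2, "big")
    return bytes(out)


sample = bytes(range(256)) + b"\xd8\x00\xff\xff"  # includes values near the surrogate gap
packed = pack_to_unicode(sample)
print(len(sample), len(packed))  # prints: 260 130
```

The saving is in characters (code points) only. Once the string is stored as UTF-8 or UTF-16, many of these characters take more than one byte or code unit, and a good number of them are control or unassigned code points that will not display.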
Yes. Use more characters than just 0-9 and a-f. A single character (assuming 8-bit) can have 256 values, so you can represent an n-byte number in n characters.
If it needs to be printable, you can just choose some set of characters to represent the various values. A good option in that case is base-64.
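As a small illustration of the one-character-per-byte case, the sketch below uses Python's latin-1 codec, which maps each byte value 0..255 to the code point with the same number. Using latin-1 here is an assumption chosen for illustration; the answer does not name a specific character set.

```python
data = bytes(range(256))           # every possible byte value once

text = data.decode("latin-1")      # one character per byte
restored = text.encode("latin-1")  # lossless round trip

print(len(data), len(text), restored == data)  # prints: 256 256 True

# Many of these characters are control codes, so the string is not printable.
printable_count = sum(ch.isprintable() for ch in text)
```

This is why a restricted alphabet such as base-64 is the usual choice when the result has to survive being printed, pasted or sent through text-only channels.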