Characters to bytes
What's a good estimate, conversion, or formula for working out how many bytes (Y) a given number of characters (X) takes up?
Comments (2)
It depends entirely on the encoding, and potentially on the data.
For UTF-16, if you know that all the characters are in the Basic Multilingual Plane, the answer is bytes = 2 * characters. Characters outside the BMP take 4 bytes each (a surrogate pair).
For UTF-8, if everything is in the ASCII range, then bytes = characters, but if there are lots of Far Eastern characters it could be as much as bytes = 3 * characters (and that's still assuming the Basic Multilingual Plane; characters beyond it take 4 bytes each).
Other encodings have different rules. Could you give more details about your situation (and your platform)? Do you want an exact value calculated from the actual characters? Do you know anything about the text you're going to encode?
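If you do have the actual text, the most reliable approach is probably just to encode it and count. Here is a minimal Python sketch of both ideas: an exact count for a known string, and rough lower/upper bounds when only the character count is known. The names `byte_length`, `estimate_bounds`, and `BYTES_PER_CHAR` are made up for illustration, and "character" here means a Unicode code point (which is what Python's `len` counts on a `str`).

```python
# Per-code-point byte ranges (min, max) for two common Unicode encodings.
# UTF-8 uses 1 to 4 bytes per code point; UTF-16 uses 2 or 4.
BYTES_PER_CHAR = {
    "utf-8": (1, 4),
    "utf-16-le": (2, 4),
}


def byte_length(text, encoding="utf-8"):
    """Exact number of bytes `text` occupies once encoded."""
    return len(text.encode(encoding))


def estimate_bounds(num_chars, encoding="utf-8"):
    """Lower and upper bound on the byte count for `num_chars` code points."""
    low, high = BYTES_PER_CHAR[encoding]
    return (num_chars * low, num_chars * high)


if __name__ == "__main__":
    samples = ["hello", "日本語", "\U0001F600"]  # ASCII, BMP (CJK), outside the BMP
    for sample in samples:
        for enc in ("utf-8", "utf-16-le"):
            print(repr(sample), enc, byte_length(sample, enc))
```

The sketch uses `utf-16-le` rather than `utf-16` because Python's plain `utf-16` codec prepends a 2-byte byte order mark, which would skew the count. If the text is known to stay within the BMP, the upper bound drops to 3 bytes per character for UTF-8 and 2 for UTF-16.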
For ANSI, I would think 1 byte per char, but for Unicode I would think 2 bytes per char, although there are probably multi-byte patterns too.
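To see where those rules of thumb hold and where the multi-byte cases show up, here is a small Python sketch that reports the byte count of each character individually. It assumes "ANSI" means a Windows code page such as cp1252 (single-byte) or cp932 (Shift-JIS, mixed one and two bytes), and that "Unicode" means UTF-16 as on Windows; `bytes_per_char` is a hypothetical helper name.

```python
def bytes_per_char(text, encoding):
    """Byte count of each character in `text` when encoded on its own."""
    return [len(ch.encode(encoding)) for ch in text]


if __name__ == "__main__":
    # Single-byte code page: every character is 1 byte.
    print(bytes_per_char("café", "cp1252"))
    # Japanese code page: ASCII is 1 byte, kanji are 2 bytes.
    print(bytes_per_char("abc日本", "cp932"))
    # UTF-16: 2 bytes inside the BMP, 4 bytes (a surrogate pair) outside it.
    print(bytes_per_char("a日\U0001F600", "utf-16-le"))
```

Encoding character by character only works for stateless encodings like the ones shown; for a total, encoding the whole string once and taking its length is simpler.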