Characters to bytes

Posted 2024-08-30 23:58:25

What's a good estimate, conversion, or formula for working out how many bytes (Y) a given number of characters (X) takes up?

Comments (2)

滿滿的愛 2024-09-06 23:58:25

It entirely depends on the encoding and potentially the data.

For UTF-16, if you know that all the characters are in the Basic Multilingual Plane, the answer will be bytes = 2 * characters.

For UTF-8, if everything is in the ASCII range, then bytes = characters - but if there are lots of Far Eastern characters, it could be as much as bytes = 3 * characters (and that's still assuming the Basic Multilingual Plane).
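To make those ratios concrete, here is a minimal Python sketch that measures the encoded size directly instead of estimating it. The helper name `byte_length` and the sample strings are only illustrative; `utf-16-le` is used so that no byte order mark is added to the count.

```python
def byte_length(text: str, encoding: str) -> int:
    """Return how many bytes `text` occupies once encoded with `encoding`."""
    return len(text.encode(encoding))


# Illustrative samples: pure ASCII, BMP-only CJK text, and one character
# outside the Basic Multilingual Plane.
samples = {
    "ASCII": "hello",
    "CJK (BMP)": "日本語",
    "outside BMP": "😀",
}

for label, text in samples.items():
    utf8 = byte_length(text, "utf-8")
    utf16 = byte_length(text, "utf-16-le")  # little-endian, no BOM
    print(f"{label}: {len(text)} chars, {utf8} bytes UTF-8, {utf16} bytes UTF-16")
```

For the ASCII sample, UTF-8 gives one byte per character; for the CJK sample it gives three. The last sample needs four bytes in both encodings, which is why the "2 * characters" rule for UTF-16 only holds inside the Basic Multilingual Plane.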

Other encodings obviously have different scenarios. Could you give more details about your situation (and your platform)? Do you want an accurate calculated value based on actual characters? Do you know anything about the text you're going to encode?

短暂陪伴 2024-09-06 23:58:25

For ANSI, I would think 1 byte per char, but for Unicode I would think 2 bytes per char, although there are probably multi-byte patterns too.
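A rough way to check this guess, assuming "ANSI" means a Windows code page and "Unicode" means UTF-16: the Python sketch below lists the per-character byte counts. The helper `bytes_per_char` is a made-up name, and `cp1252` and `cp932` are just two example code pages (Western European and Japanese).

```python
def bytes_per_char(text: str, encoding: str) -> list[int]:
    """Encode each character on its own and return the byte count for each."""
    return [len(ch.encode(encoding)) for ch in text]


# Single-byte code page: every character it supports takes one byte.
print(bytes_per_char("café", "cp1252"))

# Double-byte code page: ASCII takes one byte, kana and kanji take two.
print(bytes_per_char("aあ", "cp932"))

# UTF-16 (little-endian, no BOM): two bytes for each of these characters.
print(bytes_per_char("café", "utf-16-le"))
```

So "1 byte per char" holds for single-byte code pages such as cp1252, while code pages such as cp932 are where the multi-byte patterns show up.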
