Do some characters take up more bytes than others?
I'm not very experienced with lower-level things such as how many bytes a character takes. I tried to find out whether one character equals one byte, but without success.
I need to set a delimiter for socket connections between a server and its clients. This delimiter has to be as small (in bytes) as possible, to minimize bandwidth.
The current delimiter is "#". Would switching to another delimiter decrease my bandwidth?
Comments (4)
It depends on what character encoding you use to translate between characters and bytes (which are not at all the same thing):
US-ASCII characters (of which # is one) take only 1 byte in UTF-8, which is the most popular encoding that allows multibyte characters.
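To make that concrete, here is a small sketch that encodes the "#" delimiter under a few encodings and counts the resulting bytes. The helper name `encoded_size` is made up for this example; the `-le` variants are used so that Python does not prepend a byte order mark to the count.

```python
def encoded_size(text: str, encoding: str) -> int:
    """Return how many bytes `text` occupies once encoded (hypothetical helper)."""
    return len(text.encode(encoding))


# The same character can cost a different number of bytes per encoding.
for name in ("ascii", "utf-8", "latin-1", "utf-16-le", "utf-32-le"):
    print(f"{name:10} {encoded_size('#', name)} byte(s)")
```

Under ASCII, UTF-8, and Latin-1 the delimiter is a single byte; it only grows if the connection uses a fixed-width or 16-bit encoding.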
The answer, of course, is that it depends. If you are in a pure ASCII environment, then yes, every character takes 1 byte. In a Unicode environment, a character can take from 1 to 4 bytes depending on the encoding: 1 to 4 bytes in UTF-8, and 2 or 4 bytes in UTF-16 (which Windows uses internally).
If you choose a character from the ASCII set and send it as ASCII or UTF-8, then yes, your delimiter is as small as possible.
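For the socket use case in the question, a rough sketch of framing on a one-byte delimiter might look like the following. The function `split_messages` and the sample buffer are assumptions for illustration only, not part of the asker's code, and the sketch assumes the payload itself never contains the delimiter byte.

```python
def split_messages(buffer: bytes, delimiter: bytes = b"#"):
    """Split received bytes into complete messages plus a trailing partial one.

    Hypothetical helper: everything before the last delimiter is complete,
    whatever follows it is kept for the next read.
    """
    *complete, remainder = buffer.split(delimiter)
    return complete, remainder


delimiter = "#".encode("ascii")
print(len(delimiter))  # size of the delimiter on the wire, in bytes

messages, rest = split_messages(b"hello#world#par")
print(messages)  # complete messages received so far
print(rest)      # bytes of a message that has not finished arriving
```

Since "#" is already one byte, swapping it for another ASCII character would not save any bandwidth; the only thing to watch is that the chosen byte cannot appear inside a message.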
It depends on the encoding. In single-byte character sets, such as the "ANSI" code pages and the various ISO 8859 character sets, it is one byte per character. Some encodings, such as UTF-8, are variable width: the number of bytes needed depends on the character being encoded.
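As an illustration of variable width, this sketch measures a few sample characters in UTF-8. The characters are arbitrary picks, written as escapes so the source file's own encoding does not matter, and `utf8_width` is a name invented for the example.

```python
def utf8_width(ch: str) -> int:
    """Number of bytes UTF-8 needs for a single character (hypothetical helper)."""
    return len(ch.encode("utf-8"))


samples = {
    "#": "number sign, U+0023",
    "\u00e9": "e with acute accent, U+00E9",
    "\u20ac": "euro sign, U+20AC",
    "\U0001F600": "grinning face emoji, U+1F600",
}

for ch, description in samples.items():
    print(f"{description:30} {utf8_width(ch)} byte(s)")
```

The four samples come out at 1, 2, 3, and 4 bytes respectively, which is the full range UTF-8 can produce.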
No, all characters are 1 byte unless you're using Unicode or wide characters (for accents and other symbols, for example).
A byte is 8 bits long, which gives 256 possible values. ASCII itself uses only 7 of those bits, so it defines 128 characters: the standard alphabet, digits, punctuation, and the control codes used when teletypes and typewriters were still common. The 8th bit is always 0 in plain ASCII; single-byte extensions such as ISO 8859-1 use it for accented letters and other symbols.
You can find an ASCII chart and what numbers correspond to what characters here.
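If you'd rather check a candidate delimiter in code than look it up in a chart, something like this should do. The function name `fits_single_byte_ascii` is invented for the example; it relies only on the fact that ASCII code points run from 0 to 127.

```python
def fits_single_byte_ascii(ch: str) -> bool:
    """True if `ch` is one character whose code point fits in 7 bits (hypothetical helper)."""
    return len(ch) == 1 and ord(ch) < 128


print(ord("#"))                  # 35, the ASCII code of the current delimiter
print(format(ord("#"), "08b"))   # 00100011, the top (8th) bit is 0
print(fits_single_byte_ascii("#"))
print(fits_single_byte_ascii("\u00e9"))  # accented e is outside ASCII
```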