C++ bit manipulation
I am trying to extract a character value from UTF-8. Suppose I have two characters, and I extract 5 bits from the first character => 10111 and 6 bits from the other character => 010000
so (in binary)
ch1 = 10111;
ch2 = 010000;
How would I combine them to form 10111010000 and output its hex as 0x5d0? Do I need to shift, or is there an easier way to do this? Checking the documentation, write appears to be able to read characters sequentially; is there a similar function for this? Also, it appears I would need a char buffer, since 10111010000 is 11 bits long. Does anyone know how to go about this?
You need to use shifting, plus the | or |= operator. I'm assuming here that an unsigned int is 16 bits. Your mileage may vary.
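As a minimal sketch of the shift-and-OR idea, assuming a helper named combine (a made-up name for illustration). The inputs are written in hex in the usage note because 10111 and 010000 typed into source code would be read as decimal and octal, not binary:

```cpp
#include <cassert>

// Hypothetical helper: packs a 5-bit value above a 6-bit value.
unsigned int combine(unsigned int ch1, unsigned int ch2) {
    assert(ch1 < 32u && ch2 < 64u);  // 5-bit and 6-bit inputs only
    unsigned int result = ch1;       // start with the 5 high bits
    result <<= 6;                    // make room for the 6 low bits
    result |= ch2;                   // OR the low bits in
    return result;
}
```

With ch1 = 0x17 (binary 10111) and ch2 = 0x10 (binary 010000), combine returns 0x5D0. The result needs only 11 bits, so it fits even where unsigned int is just 16 bits wide.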
You will definitely need to use shift and OR.
First, declare an unsigned integer type of the right size. I like the C99 types defined in stdint.h, but your C++ compiler may not have them. If you don't have uint16_t, then you can use unsigned short. That is at least 16 bits wide and can hold 11 bits.
Then you would figure out which bits go into the high bits. It looks like the 5 bits from the first character should be the high bits, with the 6 bits from the second character below them.
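A sketch of that approach starting from the raw UTF-8 bytes, assuming a two-byte sequence of the form 110xxxxx 10xxxxxx. The function name decode_two_byte is hypothetical, and the sketch does not reject overlong encodings:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: decodes a two-byte UTF-8 sequence (110xxxxx 10xxxxxx).
std::uint16_t decode_two_byte(unsigned char lead, unsigned char trail) {
    assert((lead & 0xE0) == 0xC0);   // lead byte must look like 110xxxxx
    assert((trail & 0xC0) == 0x80);  // trail byte must look like 10xxxxxx
    std::uint16_t high = lead & 0x1F;   // keep the 5 payload bits
    std::uint16_t low  = trail & 0x3F;  // keep the 6 payload bits
    // The lead byte's bits go into the high positions of the 11-bit result.
    return static_cast<std::uint16_t>((high << 6) | low);
}
```

For the bytes 0xD7 0x90, the payloads are 10111 and 010000, and the function returns 0x05D0.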
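1: To combine them, shift the first value left by 6 bits and OR in the second.
2: To print the result as hex, use a hex output format, for example the std::hex stream manipulator.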
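Both steps can be sketched together as follows. The function name combined_hex is an assumption for illustration, and a string stream is used here so the result can be inspected; the same manipulators work on std::cout:

```cpp
#include <cassert>
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: combines a 5-bit and a 6-bit value and formats the result.
std::string combined_hex(unsigned int ch1, unsigned int ch2) {
    assert(ch1 < 32u && ch2 < 64u);            // 5-bit and 6-bit inputs only
    unsigned int combined = (ch1 << 6) | ch2;  // step 1: combine
    std::ostringstream out;
    out << std::showbase << std::hex << combined;  // step 2: hex with 0x prefix
    return out.str();
}
```

Calling combined_hex(0x17, 0x10) yields the string 0x5d0.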
First, from K&R: "Almost everything about bitfields is implementation dependent".
A bit-field version worked for me on MS Visual Studio 2008. However, I could not guarantee that it will work in the same way in all compilers. Note that it used memset to initialise any padding to zero.
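A minimal sketch of what a bit-field approach could look like, assuming a struct named PackedBits and helpers named fill_packed and combined_value (all hypothetical names). Which field lands in the low-order bits of the underlying storage is implementation-defined, so this sketch recombines the fields with a shift rather than reading the raw bytes:

```cpp
#include <cassert>
#include <cstring>

// Hypothetical layout: two bit-fields sized for the 6-bit and 5-bit payloads.
// Their position within the underlying storage is implementation-defined.
struct PackedBits {
    unsigned int low  : 6;
    unsigned int high : 5;
};

// Zeroes the whole object, then stores the two payloads in the fields.
void fill_packed(PackedBits& p, unsigned int ch1, unsigned int ch2) {
    assert(ch1 < 32u && ch2 < 64u);  // 5-bit and 6-bit inputs only
    std::memset(&p, 0, sizeof p);    // clear padding bits as well as the fields
    p.high = ch1;
    p.low = ch2;
}

// Layout-independent way to get the 11-bit value back out of the fields.
unsigned int combined_value(const PackedBits& p) {
    return (static_cast<unsigned int>(p.high) << 6) | p.low;
}
```

After fill_packed(p, 0x17, 0x10), combined_value(p) is 0x5D0 on any conforming compiler, whereas the raw bytes of p may differ between compilers and platforms.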