什么是按位移位(bit-shift)运算符以及它们如何工作?
我一直在业余时间尝试学习 C,其他语言(C#、Java 等)也有相同的概念(通常也有相同的运算符)...
在核心层面,位移位 (< code><<、>>
、>>>
) 能做什么,它能帮助解决什么问题,以及潜伏着什么问题拐弯处? 换句话说,这是一本关于位移位所有优点的绝对初学者指南。
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(11)
移位运算符的作用正如其名称所暗示的那样。 他们移位。 以下是对不同移位运算符的简要(或不那么简要)介绍。
运算符
>>
是算术(或有符号)右移运算符。>>>
是逻辑(或无符号)右移运算符。<<
是左移运算符,可以满足逻辑移位和算术移位的需要。所有这些运算符都可以应用于整数值(
int
、long
,可能是short
和byte
或字符)。 在某些语言中,将移位运算符应用于任何小于 int 的数据类型会自动将操作数的大小调整为 int。
请注意,
<<<
不是运算符,因为它是多余的。另请注意,C 和 C++ 不区分右移运算符。 它们仅提供
>>
运算符,并且右移行为是为有符号类型定义的实现。 答案的其余部分使用 C# / Java 运算符。(在所有主流 C 和 C++ 实现中,包括 GCC 和 Clang/LLVM,有符号类型上的
>>
是算术。有些代码假设了这一点,但这不是标准保证的。它不是 < em>未定义,尽管如此;标准要求实现以一种或另一种方式定义它,但是,负符号数的左移是未定义的行为(有符号整数溢出)。算术右移,使用无符号类型进行位移动通常是一个好主意。)左移 (<<)
整数在内存中作为一系列位存储。 例如,数字 6 存储为 32 位
int
将是:将此位模式向左移动一位 (
6 << 1
) 将导致数字 12:如您所见,数字向左移动了一位,右侧最后一位数字用零填充。 您可能还注意到,左移相当于乘以 2 的幂。因此
6 << 1
相当于6 * 2
,并且6 << 3
相当于6 * 8
。 一个好的优化编译器会尽可能用移位代替乘法。非循环移位
请注意,这些不是循环移位。 将此值向左移动一位 (
3,758,096,384 << 1
):结果为 3,221,225,472:
“偏离末尾”的数字会丢失。 它不会环绕。
逻辑右移 (>>)
逻辑右移与左移相反。 它们不是向左移动位,而是向右移动。 例如,将数字 12:
向右移动一位(
12 >>> 1
)将得到我们原来的 6:所以我们看到向右移动相当于除法2 的幂。
丢失的位消失了。
但是,移位不能收回“丢失”的位。 例如,如果我们将此模式:
向左移动 4 个位置 (
939,524,102 << 4
),我们将得到 2,147,483,744:,然后向后移动 (
(939,524,102 << 4) >> 4
) 我们得到 134,217,734:一旦丢失了位,我们就无法恢复原始值。
算术右移 (>>)
算术右移与逻辑右移完全相同,只不过它不是用零填充,而是用最高有效位填充。 这是因为最高有效位是符号位,或者区分正数和负数的位。 通过用最高有效位填充,算术右移可以保留符号。
例如,如果我们将此位模式解释为负数:
我们得到数字 -2,147,483,552。 通过算术移位 (-2,147,483,552 >> 4) 将其向右移动 4 个位置将得到:
或数字 -134,217,722。
所以我们看到,我们通过使用算术右移而不是逻辑右移保留了负数的符号。 我们再一次看到我们正在执行 2 的幂除法。
The bit shifting operators do exactly what their name implies. They shift bits. Here's a brief (or not-so-brief) introduction to the different shift operators.
The Operators
>>
is the arithmetic (or signed) right shift operator.>>>
is the logical (or unsigned) right shift operator.<<
is the left shift operator, and meets the needs of both logical and arithmetic shifts.All of these operators can be applied to integer values (
int
,long
, possiblyshort
andbyte
orchar
). In some languages, applying the shift operators to any datatype smaller thanint
automatically resizes the operand to be anint
.Note that
<<<
is not an operator, because it would be redundant.Also note that C and C++ do not distinguish between the right shift operators. They provide only the
>>
operator, and the right-shifting behavior is implementation defined for signed types. The rest of the answer uses the C# / Java operators.(In all mainstream C and C++ implementations including GCC and Clang/LLVM,
>>
on signed types is arithmetic. Some code assumes this, but it isn't something the standard guarantees. It's not undefined, though; the standard requires implementations to define it one way or another. However, left shifts of negative signed numbers is undefined behaviour (signed integer overflow). So unless you need arithmetic right shift, it's usually a good idea to do your bit-shifting with unsigned types.)Left shift (<<)
Integers are stored, in memory, as a series of bits. For example, the number 6 stored as a 32-bit
int
would be:Shifting this bit pattern to the left one position (
6 << 1
) would result in the number 12:As you can see, the digits have shifted to the left by one position, and the last digit on the right is filled with a zero. You might also note that shifting left is equivalent to multiplication by powers of 2. So
6 << 1
is equivalent to6 * 2
, and6 << 3
is equivalent to6 * 8
. A good optimizing compiler will replace multiplications with shifts when possible.Non-circular shifting
Please note that these are not circular shifts. Shifting this value to the left by one position (
3,758,096,384 << 1
):results in 3,221,225,472:
The digit that gets shifted "off the end" is lost. It does not wrap around.
Logical right shift (>>>)
A logical right shift is the converse to the left shift. Rather than moving bits to the left, they simply move to the right. For example, shifting the number 12:
to the right by one position (
12 >>> 1
) will get back our original 6:So we see that shifting to the right is equivalent to division by powers of 2.
Lost bits are gone
However, a shift cannot reclaim "lost" bits. For example, if we shift this pattern:
to the left 4 positions (
939,524,102 << 4
), we get 2,147,483,744:and then shifting back (
(939,524,102 << 4) >>> 4
) we get 134,217,734:We cannot get back our original value once we have lost bits.
Arithmetic right shift (>>)
The arithmetic right shift is exactly like the logical right shift, except instead of padding with zero, it pads with the most significant bit. This is because the most significant bit is the sign bit, or the bit that distinguishes positive and negative numbers. By padding with the most significant bit, the arithmetic right shift is sign-preserving.
For example, if we interpret this bit pattern as a negative number:
we have the number -2,147,483,552. Shifting this to the right 4 positions with the arithmetic shift (-2,147,483,552 >> 4) would give us:
or the number -134,217,722.
So we see that we have preserved the sign of our negative numbers by using the arithmetic right shift, rather than the logical right shift. And once again, we see that we are performing division by powers of 2.
位运算(包括移位)是低级硬件或嵌入式编程的基础。 如果您阅读设备甚至某些二进制文件格式的规范,您将看到字节、字和双字,它们被分解为非字节对齐的位字段,其中包含各种感兴趣的值。 访问这些位字段以进行读/写是最常见的用法。
图形编程中一个简单的真实示例是,16 位像素表示如下:
要获取绿色值,您可以执行以下操作:
说明
为了仅获取绿色值,它开始在偏移量 5 处并以 10 结束(即 6 位长)时,您需要使用(位)掩码,当应用于整个 16 位像素时,只会产生我们感兴趣的位。
适当的掩码是0x7E0,二进制为 0000011111100000(十进制为 2016)。
要应用遮罩,请使用 AND 运算符 (&)。
应用掩码后,您将得到一个 16 位数字,它实际上只是一个 11 位数字,因为它的 MSB 位于第 11 位。 绿色实际上只有 6 位长,因此我们需要使用右移 (11 - 6 = 5) 将其缩小,因此使用 5 作为偏移量 (
#define GREEN_OFFSET 5
)。同样常见的是使用位移位来快速乘除 2 的幂:
Bitwise operations, including bit shift, are fundamental to low-level hardware or embedded programming. If you read a specification for a device or even some binary file formats, you will see bytes, words, and dwords, broken up into non-byte aligned bitfields, which contain various values of interest. Accessing these bit-fields for reading/writing is the most common usage.
A simple real example in graphics programming is that a 16-bit pixel is represented as follows:
To get at the green value you would do this:
Explanation
In order to obtain the value of green ONLY, which starts at offset 5 and ends at 10 (i.e. 6-bits long), you need to use a (bit) mask, which when applied against the entire 16-bit pixel, will yield only the bits we are interested in.
The appropriate mask is 0x7E0 which in binary is 0000011111100000 (which is 2016 in decimal).
To apply a mask, you use the AND operator (&).
After applying the mask, you'll end up with a 16-bit number which is really just a 11-bit number since its MSB is in the 11th bit. Green is actually only 6-bits long, so we need to scale it down using a right shift (11 - 6 = 5), hence the use of 5 as offset (
#define GREEN_OFFSET 5
).Also common is using bit shifts for fast multiplication and division by powers of 2:
假设我们有一个字节:
应用单个左移位可以得到:
最左边的零被移出字节,并且一个新的零被附加到字节的右端。
这些位不会翻转; 它们被丢弃。 这意味着如果您左移 1101100 然后右移,您将不会得到相同的结果。
左移 N 相当于乘以 2N。
右移 N 是(如果您使用个数补码)相当于除法2N 并四舍五入到零。
位移位可用于极其快速的乘法和除法,前提是您使用的是 2 的幂。几乎所有低级图形例程都使用位移位。
例如,很久以前,我们在游戏中使用 13h 模式(320x200 256 色)。 在模式 13h 中,视频内存按像素顺序布局。 这意味着要计算像素的位置,您将使用以下数学公式:
现在,在那个时代,速度至关重要,因此我们将使用位移来执行此操作。
然而,320 不是 2 的幂,所以为了解决这个问题,我们必须找出 2 的幂加起来等于 320:
现在我们可以将其转换为左移:
对于最终结果:
现在我们得到与以前相同的偏移量,除了使用两个位移位而不是昂贵的乘法运算...在 x86 中,它会是这样的(注意,自从我完成汇编以来,它已经永远存在了(编者注:纠正了几个错误)并添加了一个 32 位示例)):
总计:在具有这些计时的任何古代 CPU 上有 28 个周期。
有
在同一个古老的 CPU 上
12 个周期。 是的,我们会努力工作以减少 16 个 CPU 周期。
在 32 位或 64 位模式下,两个版本都变得更短、更快。 现代乱序执行 CPU,如 Intel Skylake(请参阅 http://agner.org/optimize/)有非常快的硬件乘法(低延迟和高吞吐量),因此增益要小得多。 AMD Bulldozer 系列有点慢,特别是对于 64 位乘法。 在 Intel CPU 和 AMD Ryzen 上,两次移位延迟稍低,但比乘法执行更多指令(这可能会导致吞吐量降低):
vs.
编译器会为您执行此操作:了解如何 GCC、Clang 和 Microsoft Visual C++在优化
return 320*row + col;
时都使用shift+lea。这里要注意的最有趣的事情是 x86 有一个转变 - and-add 指令 (
LEA
),可以同时进行小幅左移和加法,其性能与add
指令相同。 ARM 更强大:任何指令的一个操作数都可以自由左移或右移。 因此,按已知为 2 的幂的编译时间常数进行缩放甚至比乘法更有效。好吧,回到现代……现在更有用的方法是使用位移位将两个 8 位值存储在 16 位整数中。 例如,在 C# 中:
在 C++ 中,如果您使用具有两个 8 位成员的
struct
,编译器应该为您执行此操作,但实际上并非总是如此。Let's say we have a single byte:
Applying a single left bitshift gets us:
The leftmost zero was shifted out of the byte, and a new zero was appended to the right end of the byte.
The bits don't rollover; they are discarded. That means if you left shift 1101100 and then right shift it, you won't get the same result back.
Shifting left by N is equivalent to multiplying by 2N.
Shifting right by N is (if you are using ones' complement) is the equivalent of dividing by 2N and rounding to zero.
Bitshifting can be used for insanely fast multiplication and division, provided you are working with a power of 2. Almost all low-level graphics routines use bitshifting.
For example, way back in the olden days, we used mode 13h (320x200 256 colors) for games. In Mode 13h, the video memory was laid out sequentially per pixel. That meant to calculate the location for a pixel, you would use the following math:
Now, back in that day and age, speed was critical, so we would use bitshifts to do this operation.
However, 320 is not a power of two, so to get around this we have to find out what is a power of two that added together makes 320:
Now we can convert that into left shifts:
For a final result of:
Now we get the same offset as before, except instead of an expensive multiplication operation, we use the two bitshifts...in x86 it would be something like this (note, it's been forever since I've done assembly (editor's note: corrected a couple mistakes and added a 32-bit example)):
Total: 28 cycles on whatever ancient CPU had these timings.
Vrs
12 cycles on the same ancient CPU.
Yes, we would work this hard to shave off 16 CPU cycles.
In 32 or 64-bit mode, both versions get a lot shorter and faster. Modern out-of-order execution CPUs like Intel Skylake (see http://agner.org/optimize/) have very fast hardware multiply (low latency and high throughput), so the gain is much smaller. AMD Bulldozer-family is a bit slower, especially for 64-bit multiply. On Intel CPUs, and AMD Ryzen, two shifts are slightly lower latency but more instructions than a multiply (which may lead to lower throughput):
vs.
Compilers will do this for you: See how GCC, Clang, and Microsoft Visual C++ all use shift+lea when optimizing
return 320*row + col;
.The most interesting thing to note here is that x86 has a shift-and-add instruction (
LEA
) that can do small left shifts and add at the same time, with the performance as anadd
instruction. ARM is even more powerful: one operand of any instruction can be left or right shifted for free. So scaling by a compile-time-constant that's known to be a power-of-2 can be even more efficient than a multiply.OK, back in the modern days... something more useful now would be to use bitshifting to store two 8-bit values in a 16-bit integer. For example, in C#:
In C++, compilers should do this for you if you used a
struct
with two 8-bit members, but in practice they don't always.位掩码和 移位
位移位通常用于低级图形编程。 例如,以 32 位字编码的给定像素颜色值。
为了更好地理解,相同的二进制值标记了哪些部分代表什么颜色部分。
举例来说,我们想要获取该像素颜色的绿色值。 我们可以通过屏蔽和移位轻松获得该值。
我们的掩码:
逻辑
&
运算符确保仅保留掩码为 1 的值。 我们现在要做的最后一件事是通过将所有这些位向右移动 16 位来获得正确的整数值(逻辑右移)。这样,我们就有了表示像素颜色中绿色数量的整数:
这通常用于编码或解码图像格式,如
jpg
、png
等。Bit Masking & Shifting
Bit shifting is often used in low-level graphics programming. For example, a given pixel color value encoded in a 32-bit word.
For better understanding, the same binary value labeled with what sections represent what color part.
Let's say for example we want to get the green value of this pixel's color. We can easily get that value by masking and shifting.
Our mask:
The logical
&
operator ensures that only the values where the mask is 1 are kept. The last thing we now have to do, is to get the correct integer value by shifting all those bits to the right by 16 places (logical right shift).Et voilà, we have the integer representing the amount of green in the pixel's color:
This is often used for encoding or decoding image formats like
jpg
,png
, etc.请注意,在 Java 实现中,要移位的位数按源的大小进行取模。
例如:
等于 2。您可能期望将位向右移动 65 次会将所有内容清零,但实际上相当于:
对于 <<、>> 和 >>> 都是如此。 。 我还没有用其他语言尝试过。
Note that in the Java implementation, the number of bits to shift is mod'd by the size of the source.
For example:
equals 2. You might expect shifting the bits to the right 65 times would zero everything out, but it's actually the equivalent of:
This is true for <<, >>, and >>>. I have not tried it out in other languages.
位运算符用于执行位级运算或以不同方式操作位。 人们发现按位运算要快得多,并且有时用于提高程序的效率。
基本上,位运算符可应用于整数类型:long、int、short、char 和 >字节。
按位移位运算符
分为左移和右移两类。
输出:6,这里 3 的二进制表示是 0...0011(考虑 32 位系统),因此当它移位一次时,前导零将被忽略/丢失,其余所有 31 位向左移动。 并在末尾添加零。 于是就变成了0...0110,这个数字的十进制表示是6。
输出:-2,在java中负数,用2的补码表示。 SO,-1代表2^32-1,相当于1....11(考虑32位系统)。 当移位一次时,前导位将被忽略/丢失,其余 31 位将向左移位,最后添加零。 所以它变成了 11...10,它的十进制等值是 -2。
所以,我认为您对左移及其工作原理有了足够的了解。
以下代码片段将值 35 向右移动两个位置:
输出:8,在 32 位系统中 35 的二进制表示为 00... 00100011,所以当我们右移两次时,前 30 个前导位被移动/移位到右侧,两个低位被丢失/忽略,并在前导位添加两个零。 所以,它变成了 00....00001000,这个二进制表示的十进制相当于 8。
或者有一个简单的数学技巧来找出以下代码的输出:为了概括这一点,我们可以说,x>>> y = 下限(x/pow(2,y))。 考虑上面的示例,x=35 和 y=2,因此 35/2^2 = 8.75,如果我们采用下限值,则答案为 8。
输出:
但请记住一件事,这个技巧对于较小的 y 值来说是很好的,如果你采用较大的 y 值,它会给出不正确的输出。
由于负数,右移运算符以有符号和无符号两种模式工作。 在有符号右移运算符(>>)中,如果是正数,则用 0 填充前导位。如果是负数,则用 1 填充前导位。以保留符号。 这称为“符号扩展”。
输出:-5,正如我上面所解释的,编译器将负值存储为 2 的补码。 因此,-10 表示为 2^32-10,并以二进制表示形式考虑 32 位系统 11....0110。 当我们移位/移动一次时,前 31 个前导位被移位到右侧,而低位被丢失/忽略。 因此,它变成 11...0011,这个数字的十进制表示形式是 -5(我如何知道数字的符号?因为前导位是 1)。
有趣的是,如果右移 -1,结果始终保持 -1,因为符号扩展不断在高位中引入更多的值。
输出:2147483647,因为-2在32位系统中表示为11...10。 当我们将位移一位时,前 31 位前导位向右移动/右移,低位被丢失/忽略,并且将零添加到前导位。 因此,它变为 011...1111 (2^31-1),其十进制等效值为 2147483647。
The Bitwise operators are used to perform operations a bit-level or to manipulate bits in different ways. The bitwise operations are found to be much faster and are some times used to improve the efficiency of a program.
Basically, Bitwise operators can be applied to the integer types: long, int, short, char and byte.
Bitwise Shift Operators
They are classified into two categories left shift and the right shift.
Output: 6, Here the binary representation of 3 is 0...0011(considering 32-bit system) so when it shifted one time the leading zero is ignored/lost and all the rest 31 bits shifted to left. And zero is added at the end. So it became 0...0110, the decimal representation of this number is 6.
Output: -2, In java negative number, is represented by 2's complement. SO, -1 represent by 2^32-1 which is equivalent to 1....11(Considering 32-bit system). When shifted one time the leading bit is ignored/lost and the rest 31 bits shifted to left and zero is added at the last. So it becomes, 11...10 and its decimal equivalent is -2.
So, I think you get enough knowledge about the left shift and how its work.
The following code fragment shifts the value 35 to the right by two positions:
Output: 8, As a binary representation of 35 in a 32-bit system is 00...00100011, so when we right shift it two times the first 30 leading bits are moved/shifts to the right side and the two low-order bits are lost/ignored and two zeros are added at the leading bits. So, it becomes 00....00001000, the decimal equivalent of this binary representation is 8.
Or there is a simple mathematical trick to find out the output of this following code: To generalize this we can say that, x >> y = floor(x/pow(2,y)). Consider the above example, x=35 and y=2 so, 35/2^2 = 8.75 and if we take the floor value then the answer is 8.
Output:
But remember one thing this trick is fine for small values of y if you take the large values of y it gives you incorrect output.
Because of the negative numbers the Right shift operator works in two modes signed and unsigned. In signed right shift operator (>>), In case of a positive number, it fills the leading bits with 0. And In case of a negative number, it fills leading bits with 1. To keep the sign. This is called 'sign extension'.
Output: -5, As I explained above the compiler stores the negative value as 2's complement. So, -10 is represented as 2^32-10 and in binary representation considering 32-bit system 11....0110. When we shift/ move one time the first 31 leading bits got shifted in the right side and the low-order bit got lost/ignored. So, it becomes 11...0011 and the decimal representation of this number is -5 (How I know the sign of number? because the leading bit is 1).
It is interesting to note that if you shift -1 right, the result always remains -1 since sign extension keeps bringing in more ones in the high-order bits.
Output: 2147483647, Because -2 is represented as 11...10 in a 32-bit system. When we shift the bit by one, the first 31 leading bit is moved/shifts in right and the low-order bit is lost/ignored and the zero is added to the leading bit. So, it becomes 011...1111 (2^31-1) and its decimal equivalent is 2147483647.
一个问题是以下内容取决于实现(根据 ANSI 标准):
x 现在可以是 127 (01111111) 或仍然是 -1 (11111111)。
实际上,通常是后者。
One gotcha is that the following is implementation dependent (according to the ANSI standard):
x can now be 127 (01111111) or still -1 (11111111).
In practice, it's usually the latter.
我只写提示和技巧。 它可能在测试和考试中有用。
n = n*2
:n = n<<1
n = n/2
:n = n>>1
!(n & (n-1))
n
的第 位:n |= (1 << x)
x&1 == 0
(偶数)x ^ (1<
I am writing tips and tricks only. It may be useful in tests and exams.
n = n*2
:n = n<<1
n = n/2
:n = n>>1
!(n & (n-1))
n
:n |= (1 << x)
x&1 == 0
(even)x ^ (1<<n)
希望有人发现我制作的这个十进制到二进制字符串可视化工具有助于理解位操作运算符如何通过算术与基数转换相关:
输入任何
无符号整数
直到2^53 - 1
,它都会生成二进制字符串的代数版本,可以直接送入
bc
、Wolfram Alpha,很可能还有 Excel(尚未测试),以及纯粹的位操作版本只是一系列按位或与移位配对,可以直接输入到
C
/Perl
代码中。-- 注意:在代数字符串中,二进制字符串以大端 MSB 在前表示法显示,而位移字符串则以小端法 LSB 在前表示法显示。
-- 注意:出于视觉清晰度的目的,零已被删除。
Hopefully some find this decimal-to-binary-string visualizer I made to be helpful in understanding how bit manipulation operators relate to base conversion via arithmetic :
Feeding in any
unsigned int
up to2^53 - 1
, it would generate bothThe algebraic version of the binary string, which can be directly fed into
bc
, Wolfram Alpha, most likely Excel too (haven't tested it), as well asThe purely bit manipulation version that's simply a series of bitwise OR paired with bit-shifts, which can be directly fed into
C
/Perl
codes.-- Note : in the algebraic one, the binary string is shown in big-endian MSB first notation, while the bit-shifting one is in little-endian LSB first.
-- Note : for visual clarity purposes, zeros have been blanked out.
Python 中一些有用的位运算/操作。
我实现了 Ravi Prakash 的用 Python 回答。
Some useful bit operations/manipulations in Python.
I implemented Ravi Prakash's answer in Python.
请注意,Windows 平台上仅提供 32 位版本的 PHP。
那么如果你例如转移 << 或>> 超过 31 位,结果是不可预期的。 通常会返回原始数字而不是零,这可能是一个非常棘手的错误。
当然,如果您使用 64 位版本的 PHP (Unix),则应避免移位超过 63 位。 不过,比如MySQL使用的是64位BIGINT,所以应该不会有任何兼容性问题。
更新:从 PHP 7 Windows 开始,PHP 构建终于能够使用完整的 64 位整数:
整数的大小取决于平台,但通常的最大值约为 20 亿(即 32 位有符号)。 64 位平台的最大值通常约为 9E18,但 PHP 7 之前的 Windows 上的最大值始终为 32 位。
Be aware of that only 32 bit version of PHP is available on the Windows platform.
Then if you for instance shift << or >> more than by 31 bits, results are unexpectable. Usually the original number instead of zeros will be returned, and it can be a really tricky bug.
Of course if you use 64 bit version of PHP (Unix), you should avoid shifting by more than 63 bits. However, for instance, MySQL uses the 64-bit BIGINT, so there should not be any compatibility problems.
UPDATE: From PHP 7 Windows, PHP builds are finally able to use full 64 bit integers:
The size of an integer is platform-dependent, although a maximum value of about two billion is the usual value (that's 32 bits signed). 64-bit platforms usually have a maximum value of about 9E18, except on Windows prior to PHP 7, where it was always 32 bit.