Is it safe to use -1 to set all bits to true?
I've seen this pattern used a lot in C and C++:
unsigned int flags = -1; // all bits are true
Is this a good, portable way to accomplish this? Or is using 0xffffffff or ~0 better?

I recommend doing it exactly as you have shown, since it is the most straightforward way. Initializing to -1 always works, independent of the actual sign representation, while ~ sometimes behaves surprisingly because you have to get the operand type right. Only then will you get the highest value of an unsigned type.
For an example of a possible surprise, consider this one:
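The snippet that followed this sentence appears to be missing from this copy. A minimal sketch of the pitfall it describes, reconstructed from the explanation that follows; the function names are illustrative, and the difference only shows on platforms where unsigned long is wider than unsigned int:

```cpp
#include <cassert>
#include <climits>

// ~0u is evaluated as an unsigned int first; only afterwards is the
// result converted to the (possibly wider) destination type.
inline unsigned long from_complement() {
    unsigned long a = ~0u;   // value is UINT_MAX, upper bits stay 0
    return a;
}

// -1 is converted by value, so the result is the maximum of the
// destination type, whatever its width.
inline unsigned long from_minus_one() {
    unsigned long a = -1;    // value is ULONG_MAX
    return a;
}
```

Where unsigned long is wider than unsigned int, from_complement() leaves the upper bits clear, while from_minus_one() sets every bit.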
It won't necessarily store a pattern with all bits 1 into a. Instead, it first creates a pattern with all bits 1 in an unsigned int and then assigns it to a. When unsigned long has more bits, not all of them end up being 1.
And consider this one, which will fail on a non-two's-complement representation:
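The second snippet is also missing here. Judging from the explanation that follows, it initialized an unsigned int from ~0, where 0 is a signed int. A hedged sketch, with illustrative function names, next to the variant that uses an unsigned operand:

```cpp
#include <cassert>
#include <climits>

// 0 is a signed int, so ~0 flips the bits of a signed value. Which value
// that yields depends on the signed representation (-1 on two's
// complement). That value, not the bit pattern, is then converted.
inline unsigned int from_signed_complement() {
    unsigned int a = ~0;
    return a;
}

// With an unsigned operand the complement is computed in unsigned int
// directly, so no signed representation is involved.
inline unsigned int from_unsigned_complement() {
    unsigned int a = ~0u;
    return a;
}
```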
The reason is that ~0 has to invert all bits. Inverting them yields -1 on a two's complement machine (which is the value we need!), but does not yield -1 on another representation. On a one's complement machine, it yields zero. Thus, on a one's complement machine, the above will initialize a to zero.
The thing you should understand is that it's all about values, not bits. The variable is initialized with a value. If you modify the bits of the value used in the initializer, the resulting value is generated from those bits. The value you need to initialize a to the highest possible value is -1 or UINT_MAX. The second depends on the type of a: you will need ULONG_MAX for an unsigned long. The first, however, does not depend on the type, and it's a nice way of getting the highest value.
We are not talking about whether -1 has all bits one (it doesn't always). And we're not talking about whether ~0 has all bits one (it does, of course). What we are talking about is the value of the initialized flags variable. And for that, only -1 works with every type and machine.

unsigned int flags = -1;
is portable.
unsigned int flags = ~0;
isn't portable because it relies on a two's-complement representation.
unsigned int flags = 0xffffffff;
isn't portable because it assumes 32-bit ints.
If you want to set all bits in a way guaranteed by the C standard, use the first one.
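To make the comparison concrete, here is a small sketch that checks initializers against several fixed-width types. The helper name is_all_ones is made up for this illustration, and the fixed-width types from <cstdint> are assumed to be available:

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// True when v is the largest value its unsigned type can represent,
// i.e. when every value bit is 1.
template <typename T>
constexpr bool is_all_ones(T v) {
    return v == std::numeric_limits<T>::max();
}

// -1 converts to the maximum of whatever unsigned type receives it.
static_assert(is_all_ones(static_cast<std::uint8_t>(-1)), "8-bit");
static_assert(is_all_ones(static_cast<std::uint16_t>(-1)), "16-bit");
static_assert(is_all_ones(static_cast<std::uint64_t>(-1)), "64-bit");

// 0xffffffff only covers 32 bits, so a 64-bit type keeps its upper half clear.
static_assert(!is_all_ones(static_cast<std::uint64_t>(0xffffffff)),
              "upper bits stay clear");
```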

Frankly, I think all F's is more readable. As to the comment that it's an antipattern: if you really care that all the bits are set or cleared, I would argue that you are probably in a situation where you care about the size of the variable anyway, which would call for something like boost::uint16_t, etc.

A way which avoids the problems mentioned is to simply do:
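The snippet for this answer did not survive in this copy. One approach that fits the description, offered here as an assumption rather than as the author's code, is to zero-initialize the variable and then complement it in place:

```cpp
#include <cassert>
#include <climits>

// Start from zero in the destination type, then flip every bit.
// For unsigned int no integer promotion takes place, so the complement
// is computed directly in the unsigned type.
inline unsigned int all_bits_by_complement() {
    unsigned int flags = 0;
    flags = ~flags;
    return flags;
}
```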
Portable and to the point.

Portable? Yes.
Good? Debatable, as evidenced by all the confusion shown on this thread. Being clear enough that your fellow programmers can understand the code without confusion should be one of the dimensions we measure for good code.
Also, this method is prone to compiler warnings. To elide the warning without crippling your compiler, you'd need an explicit cast. For example,
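The example is missing from this copy. An explicit cast of the kind described might look like the following sketch (the function name is illustrative):

```cpp
#include <cassert>
#include <climits>

// The cast spells out the signed-to-unsigned conversion, so the compiler
// no longer has an implicit conversion to warn about.
inline unsigned int all_bits_by_cast() {
    unsigned int flags = static_cast<unsigned int>(-1);
    return flags;
}
```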
The explicit cast requires that you pay attention to the target type. If you're paying attention to the target type, then you'll naturally avoid the pitfalls of the other approaches.
My advice would be to pay attention to the target type and make sure there are no implicit conversions. For example:
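The examples themselves are not preserved here. In the spirit of the advice, each initializer in this sketch names the same type as the variable it initializes (the variable names are placeholders):

```cpp
#include <cassert>
#include <climits>

// Limit macros: the macro has to match the declared type.
const unsigned int  flags1 = UINT_MAX;
const unsigned long flags3 = ULONG_MAX;

// Complement of a zero that already has the target type.
const unsigned int  flags2 = ~static_cast<unsigned int>(0);
const unsigned long flags4 = ~static_cast<unsigned long>(0);
```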
All of which are correct and more obvious to your fellow programmers.
And with C++11, we can use auto to make any of these even simpler:
I consider correct and obvious better than simply correct.
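The auto variant is not preserved in this copy either. A sketch with placeholder names; since auto deduces the type from the initializer, the initializer alone determines the width:

```cpp
#include <cassert>
#include <climits>
#include <type_traits>

const auto flags_a = UINT_MAX;                          // unsigned int
const auto flags_b = ~static_cast<unsigned long>(0);    // unsigned long

static_assert(std::is_same<decltype(flags_b), const unsigned long>::value,
              "auto picks up the type of the cast");
```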

I am not sure using an unsigned int for flags is a good idea in the first place in C++. What about std::bitset and the like?
std::numeric_limits<unsigned int>::max() is better because 0xffffffff assumes that unsigned int is a 32-bit integer.

Converting -1 to any unsigned type is guaranteed by the standard to result in all-ones. Use of ~0U is generally bad since 0U has type unsigned int and will not fill all the bits of a larger unsigned type, unless you explicitly write something like ~0ULL. On sane systems, ~0 should be identical to -1, but since the standard allows ones-complement and sign/magnitude representations, strictly speaking it's not portable.
Of course it's always okay to write out 0xffffffff if you know you need exactly 32 bits, but -1 has the advantage that it works in any context, even when you do not know the size of the type, such as in macros that work on multiple types, or when the size of the type varies by implementation. If you do know the type, another safe way to get all-ones is to use the limit macros UINT_MAX, ULONG_MAX, ULLONG_MAX, etc.
Personally, I always use -1. It always works and you don't have to think about it.
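As an illustration of the "macros that work on multiple types" case, here is a hedged sketch. SET_ALL_BITS is a made-up name, not a standard macro:

```cpp
#include <cassert>
#include <cstdint>

// Works for any unsigned integer lvalue: the assignment converts -1 to
// the type of the target, whatever its width.
#define SET_ALL_BITS(lvalue) ((lvalue) = -1)

inline std::uint8_t narrow_all_ones() {
    std::uint8_t x = 0;
    SET_ALL_BITS(x);
    return x;   // 0xFF
}

inline std::uint64_t wide_all_ones() {
    std::uint64_t x = 0;
    SET_ALL_BITS(x);
    return x;   // 0xFFFFFFFFFFFFFFFF
}
```

The same macro body serves both widths, which a hex constant could not do.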

As long as you have #include <limits.h> as one of your includes, you should just use UINT_MAX. If you want a long's worth of bits, you could use ULONG_MAX.
These values are guaranteed to have all the value bits of the result set to 1, regardless of how signed integers are implemented.

Yes. As mentioned in other answers, -1 is the most portable; however, it is not very semantic and triggers compiler warnings.
To solve these issues, try this simple helper:
Usage:
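Neither the helper nor its usage survived in this copy. The following is a hedged sketch of what such a helper could look like, not the author's code; AllBitsSet, kAllBitsSet and make_flags are hypothetical names:

```cpp
#include <cassert>
#include <climits>
#include <type_traits>

// Hypothetical helper: converts implicitly to the all-ones value of any
// unsigned integer type and rejects other targets at compile time.
struct AllBitsSet {
    template <typename T>
    constexpr operator T() const {
        static_assert(std::is_unsigned<T>::value,
                      "AllBitsSet is meant for unsigned integer types");
        return static_cast<T>(-1);
    }
};

constexpr AllBitsSet kAllBitsSet{};

// Usage: the name documents the intent, and the explicit cast inside the
// helper keeps the conversion out of the call site.
inline unsigned int make_flags() {
    unsigned int flags = kAllBitsSet;
    return flags;
}
```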

See litb's answer for a very clear explanation of the issues.
My disagreement is that, very strictly speaking, there are no guarantees for either case. I don't know of any architecture that does not represent an unsigned value of 'one less than two to the power of the number of bits' as all bits set, but here is what the Standard actually says (3.9.1/7 plus note 44):
That leaves the possibility for one of the bits to be anything at all.

On Intel's IA-32 processors it is OK to write 0xFFFFFFFF to a 64-bit register and get the expected results. This is because IA32e (the 64-bit extension to IA32) only supports 32-bit immediates. In 64-bit instructions, 32-bit immediates are sign-extended to 64 bits.
The following is illegal:
The following puts 64 1s in RAX:
Just for completeness, the following puts 32 1s in the lower part of RAX (aka EAX):
And in fact I've had programs fail when I wanted to write 0xffffffff to a 64-bit variable and I got a 0xffffffffffffffff instead. In C this would be:
the result is:
I thought to post this as a comment to all the answers that said that 0xFFFFFFFF assumes 32 bits, but so many people answered it I figured I'd add it as a separate answer.

I would not do the -1 thing. It's rather non-intuitive (to me at least). Assigning signed data to an unsigned variable just seems to be a violation of the natural order of things.
In your situation, I always use 0xFFFF. (Use the right number of Fs for the variable size, of course.)
[BTW, I very rarely see the -1 trick done in real-world code.]
Additionally, if you really care about the individual bits in a variable, it would be a good idea to start using the fixed-width uint8_t, uint16_t, uint32_t types.

Although 0xFFFF (or 0xFFFFFFFF, etc.) may be easier to read, it can break portability in code which would otherwise be portable. Consider, for example, a library routine to count how many items in a data structure have certain bits set (the exact bits being specified by the caller). The routine may be totally agnostic as to what the bits represent, but still need an "all bits set" constant. In such a case, -1 will be vastly better than a hex constant since it will work with any bit size.
The other possibility, if a typedef is used for the bitmask, would be to use ~(bitMaskType)0; if the bitmask happens to be only a 16-bit type, that expression will only have 16 bits set (even if int would otherwise be 32 bits), but since 16 bits are all that are required, things should be fine provided that one actually uses the appropriate type in the typecast.
Incidentally, expressions of the form longvar &= ~[hex_constant] have a nasty gotcha if the hex constant is too large to fit in an int but will fit in an unsigned int. If an int is 16 bits, then longvar &= ~0x4000; or longvar &= ~0x10000; will clear one bit of longvar, but longvar &= ~0x8000; will clear bit 15 and all bits above it. Values which fit in int have the complement operator applied to type int, and the result is sign-extended to long, setting the upper bits. Values which are too big for unsigned int have the complement operator applied to type long. Values between those sizes, however, have the complement operator applied to type unsigned int, and the result is then converted to type long without sign extension.
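The complement gotcha described for 16-bit int can be reproduced on today's common platforms by scaling the constants up. This sketch assumes a 32-bit int and uses a 64-bit variable in place of longvar; the function names are illustrative:

```cpp
#include <cassert>
#include <cstdint>

// Assumes int is 32 bits wide, so 0x40000000 is an int and
// 0x80000000 is an unsigned int.

// ~0x40000000 is a negative int; converting it to 64 bits sign-extends,
// so only bit 30 is cleared.
inline std::uint64_t clear_bit_30(std::uint64_t v) {
    v &= ~0x40000000;
    return v;
}

// ~0x80000000 is the unsigned int 0x7FFFFFFF; converting it to 64 bits
// zero-extends, so bit 31 and every bit above it are cleared.
inline std::uint64_t clear_bit_31_naive(std::uint64_t v) {
    v &= ~0x80000000;
    return v;
}

// Giving the constant the width of the variable before complementing
// avoids the problem.
inline std::uint64_t clear_bit_31(std::uint64_t v) {
    v &= ~UINT64_C(0x80000000);
    return v;
}
```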

As others have mentioned, -1 is the correct way to create an integer that will convert to an unsigned type with all bits set to 1. However, the most important thing in C++ is using correct types. Therefore, the correct answer to your problem (which includes the answer to the question you asked) is this:
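The snippet is missing from this copy. A sketch of what the description suggests, assuming 32 flags are needed (the width and the names kFlagCount and make_all_set are assumptions):

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical width; pick the number of flags the program actually needs.
constexpr std::size_t kFlagCount = 32;

inline std::bitset<kFlagCount> make_all_set() {
    // -1 converts to the largest unsigned long long, so each of the
    // kFlagCount bits is initialized to 1 (for widths up to 64).
    return std::bitset<kFlagCount>(-1);
}
```

For wider bitsets, the set() member function with no arguments sets every bit regardless of width.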
This will always contain the exact number of bits you need. It constructs a std::bitset with all bits set to 1, for the same reasons mentioned in other answers.

It is certainly safe, as -1 will always have all available bits set, but I like ~0 better. -1 just doesn't make much sense for an unsigned int. 0xFF... is not good because it depends on the width of the type.

Practically: Yes
Theoretically: No.
-1 = 0xFFFFFFFF (or whatever size an int is on your platform) is only true with two's complement arithmetic. In practice, it will work, but there are legacy machines out there (IBM mainframes, etc.) where you've got an actual sign bit rather than a two's complement representation. Your proposed ~0 solution should work everywhere.

Leveraging the fact that setting all bits to one for an unsigned type is equivalent to taking the maximum possible value of that type, and extending the scope of the question to all unsigned integer types:
Assigning -1 works for any unsigned integer type (unsigned int, uint8_t, uint16_t, etc.) in both C and C++.
As an alternative, for C++, you can include <limits> and use std::numeric_limits<your_type>::max().
The purpose would be to add clarity, as assigning -1 always needs some explanatory comment.

A way to make the meaning a bit more obvious and yet avoid repeating the type:
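The code for this answer is missing here. One reading of "avoid repeating the type", offered as an assumption, is to name the type once in the initializer and let auto deduce the variable's type:

```cpp
#include <cassert>
#include <limits>
#include <type_traits>

// The type is spelled once, in the initializer; auto deduces the rest.
const auto flags = std::numeric_limits<unsigned int>::max();

static_assert(std::is_same<decltype(flags), const unsigned int>::value,
              "flags is deduced as unsigned int");
```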

An additional effort to emphasize why Adrian McCarthy's approach might be the best solution, at the latest since C++11, as a compromise between standard conformance, type safety/explicitness, and reduction of possible ambiguities:
I'm going to explain my preference in detail below. As Johannes quite correctly mentioned, the fundamental source of irritation here is the question of value semantics versus the corresponding bit representation, and of which types exactly we are talking about (the type of the assigned value versus the type of the compile-time integral constant). Since there's no standard built-in mechanism to explicitly set all bits to 1 for the OP's concrete use case of unsigned integer values, it's impossible to be fully independent of value semantics here (std::bitset is a common container that refers purely to the bit layer, but the question was about unsigned integers in general). But we might be able to reduce the ambiguity.
Comparison of the 'better' standard-compliant approaches:
The OP's way:
PROs:
CONs:
Referring to maximum values via defines:
This circumvents the signed-to-unsigned conversion issue of the -1 approach but introduces several new problems: when in doubt, one has to look twice here again, at the latest when changing the target type to unsigned long, for instance. One also has to be sure that the maximum value leads to all bits being set to 1 by the standard (and padding-bit concerns arise again). The bit semantics are again not obvious from the code alone.
Referring to maximum values more explicitly:
In my opinion, that's the better maximum-value approach since it's free of macros/defines and explicit about the type involved. But all other concerns about this kind of approach remain.
Adrian's approach (and why I think it's the preferred one before C++11 and since):
PROs:
CONs:
If assigned to a member, for instance, there's a small chance of mismatching types before C++11:
Declaration in class:
Initialization in constructor:
But since C++11, decltype and auto can prevent most of these possible issues. And some of these type mismatch scenarios (on interface boundaries, for instance) are also possible with the -1 approach.
Robust final C++11 approach for pre-declared variables:
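The code for this approach is not preserved in this copy. A sketch of how decltype can tie the cast to the declared member type, under the assumption that this is what the author had in mind (FlagHolder and m_flags are hypothetical names):

```cpp
#include <cassert>
#include <climits>

struct FlagHolder {
    // Declaration in class: the type is named only here.
    unsigned long m_flags;

    // Initialization in constructor: decltype repeats whatever type the
    // member has, so changing the declaration cannot cause a mismatch.
    FlagHolder() : m_flags(~static_cast<decltype(m_flags)>(0)) {}
};
```

Changing the member to another unsigned type of at least int width then requires no edit in the constructor.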
So, weighing the PROs and CONs of all approaches here, I recommend this one as the preferred approach, at the latest since C++11.
Update: Thanks to a hint by Andrew Henle, I removed the statement about its readability, since that might be too subjective. But I still think its readability is at least no worse than that of most of the maximum-value approaches, or of those that provide an explicit maximum value via compile-time integrals/literals, since static_cast usage is "established" too and is built in, in contrast to defines/macros and even the standard library.

I say:
This will always give you the desired result.

Yes, the representation shown is quite correct. If we did it the other way round, you would need an operator to invert all the bits, whereas in this case the logic is quite straightforward if we consider the size of the integers on the machine.
For instance, on a machine where an integer is 2 bytes = 16 bits, the maximum value it can hold is 2^16 - 1 = 65535, and 2^16 = 65536:
0 mod 65536 = 0
-1 mod 65536 = 65535, which corresponds to 1111...1, with all bits set to 1 (if we consider residue classes mod 65536).
Hence it is quite straightforward.
So no, if you consider this notion, it is perfectly fine for unsigned ints, and it actually works out. Just check the following program fragment:
#include <stdio.h>
int main()
{
    unsigned int b = -1; /* converted modulo 2^N, where N is the width of unsigned int */
    printf("%u\n", b);
}
The answer for b is 4294967295, which is -1 mod 2^32 on 4-byte integers.
Hence it is perfectly valid for unsigned integers.
In case of any discrepancies, please report.