The importance of declaring a variable as unsigned

Posted on 2024-09-19 05:03:09

Is it important to declare a variable as unsigned if you know it should never be negative? Does it help prevent anything other than negative numbers being fed into a function that shouldn't have them?

Comments (14)

智商已欠费 2024-09-26 05:03:09

Declaring variables for semantically non-negative values as unsigned is a good style and good programming practice.

However, keep in mind that it doesn't prevent you from making errors. It is perfectly legal to assign a negative value to an unsigned integer; the value is implicitly converted to unsigned form in accordance with the rules of unsigned (modulo 2^N) arithmetic. Some compilers might issue a warning in such cases, others will perform the conversion silently.
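
As a small illustration of that implicit conversion, here is a minimal sketch. The helper name `store_in_unsigned` is made up for this example; it only shows that a negative `int` ends up as a large positive value, with no cast required.

```cpp
#include <cassert>
#include <limits>

// Hypothetical helper: takes a signed value and stores it in an unsigned one.
unsigned store_in_unsigned(int value) {
    unsigned u = value;  // legal without a cast; value is reduced modulo 2^N
    return u;
}
```

Calling `store_in_unsigned(-1)` yields the largest value of `unsigned`. With GCC or Clang, a flag such as `-Wsign-conversion` will typically report the assignment, but it is not part of the default warning set.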

It is also worth noting that working with unsigned integers requires knowing some dedicated unsigned techniques. A "classic" example often mentioned in relation to this issue is backward iteration:

for (int i = 99; i >= 0; --i) {
  /* whatever */
}

The above loop looks natural with a signed i, but it cannot be directly converted to unsigned form, meaning that

for (unsigned i = 99; i >= 0; --i) {
  /* whatever */
}

doesn't do what it is intended to do (it is actually an infinite loop, because i >= 0 is always true for an unsigned i). The proper technique in this case is either

for (unsigned i = 100; i > 0; ) {
  --i;
  /* whatever */
}

or

for (unsigned i = 100; i-- > 0; ) {
  /* whatever */
}

This is often used as an argument against unsigned types, i.e. allegedly the above unsigned versions of the loop look "unnatural" and "unreadable". In reality, though, the issue we are dealing with here is the generic issue of working near the left end of a closed-open range. This issue manifests itself in many different ways in C and C++ (like backward iteration over an array using the "sliding pointer" technique, or backward iteration over a standard container using an iterator). That is, regardless of how inelegant the above unsigned loops might look to you, there's no way to avoid them entirely, even if you never use unsigned integer types. So it is better to learn these techniques and incorporate them into your set of established idioms.
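
To make the "same problem, different disguise" point concrete, the following sketch applies the test-then-decrement shape once with an unsigned index and once with a sliding pointer. The function names are invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Backward iteration with an unsigned index: test first, then decrement.
std::vector<int> reversed_by_index(const std::vector<int>& v) {
    std::vector<int> out;
    for (std::size_t i = v.size(); i-- > 0; ) {
        out.push_back(v[i]);
    }
    return out;
}

// The same shape with a "sliding pointer": start one past the end,
// decrement before dereferencing, stop when the pointer reaches the start.
std::vector<int> reversed_by_pointer(const int* first, std::size_t n) {
    std::vector<int> out;
    for (const int* p = first + n; p != first; ) {
        --p;
        out.push_back(*p);
    }
    return out;
}
```

In both functions the loop variable starts one position past the last element and is decremented before use, which is why neither needs a value "below" the left end of the range.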

It doesn't prevent people from misusing your interface, but at least they should get a warning unless they add a C-style cast or static_cast to make it go away (in which case you cannot help them further).

Yes, there is value in this as it properly expresses the semantics you wish.

尾戒 2024-09-26 05:03:09

It does two things:

1) It gives you double the range for your unsigned values. When "signed", the highest bit is used as the sign bit (1 means negative, 0 means positive); when "unsigned", you can use that bit for data. E.g., an 8-bit signed char ranges from -128 to 127, while an unsigned char ranges from 0 to 255 (whether plain char is signed is implementation-defined).

2) It affects how the >> operator acts, specifically when right shifting negative values.
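
Both points can be checked with a short sketch. The function names are hypothetical, and the fixed-width types from `<cstdint>` are used only so the bit widths are unambiguous.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Right shift on an unsigned value always shifts in zero bits.
std::uint32_t shift_unsigned(std::uint32_t value, unsigned count) {
    return value >> count;
}

// Right shift on a signed value: well defined for non-negative inputs.
// For negative inputs the result is implementation-defined before C++20
// and an arithmetic (sign-propagating) shift from C++20 on.
std::int32_t shift_signed(std::int32_t value, unsigned count) {
    return value >> count;
}
```

For example, `shift_unsigned(0x80000000u, 31)` is 1 on every conforming implementation, whereas the result of `shift_signed(-2, 1)` depends on the language version and the compiler.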

回眸一笑 2024-09-26 05:03:09

One minor nicety is that it cuts down on the amount of array bounds checking that might be necessary, e.g. instead of having to write:

int idx = [...];
if ((idx >= 0)&&(idx < arrayLength)) printf("array value is %i\n", array[idx]);

you can just write:

unsigned int idx = [...];
if (idx < arrayLength) printf("array value is %i\n", array[idx]);
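
The same trick can also be applied when the index arrives as a signed value. The sketch below uses a hypothetical helper `in_bounds` and assumes the array length is far below the maximum of `std::size_t`, so that every negative index converts to a value larger than any valid length.

```cpp
#include <cassert>
#include <cstddef>

// One comparison covers both "negative" and "too large":
// a negative int converts to a very large std::size_t.
bool in_bounds(int idx, std::size_t length) {
    return static_cast<std::size_t>(idx) < length;
}
```

Here `in_bounds(-1, 10)` is false for the same reason `in_bounds(10, 10)` is: after conversion, both indices are not less than the length.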
我一直都在从未离去 2024-09-26 05:03:09

Is it important to declare a variable as unsigned if you know it should never be negative?

Certainly it is not important. Some people (Stroustrup and Scott Meyers, see "Unsigned vs signed - Is Bjarne mistaken?") reject the idea that a variable should be unsigned just because it represents an unsigned quantity. If the point of using unsigned is to indicate that a variable can only store non-negative values, you need to check that somehow. Otherwise, all you get is

  • A type that silently hides errors, because a negative value can never show up as negative
  • Twice the positive range of the corresponding signed type
  • Defined overflow/bit-shift/etc. semantics

Certainly it doesn't prevent people from supplying negative values to your function, and the compiler won't be able to warn you about every such case (think of a negative int value, computed at run time, being passed). Why not assert in the function instead?

assert((idx >= 0) && "Index must be greater than or equal to 0!");

The unsigned type introduces many pitfalls too. You have to be careful when you use it in calculations whose intermediate results can temporarily drop below zero (a down-counting loop, for example), and especially with the automatic conversions that happen in C and C++ between unsigned and signed values:

// assume idx is unsigned. What if idx is 0 !?
if(idx - 1 > 3) /* do something */;
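
To answer the question in that comment, here is a minimal sketch with two made-up function names. The first is the check exactly as written above; the second is one possible rearrangement that never forms an intermediate value below zero.

```cpp
#include <cassert>

// The check from the snippet: for idx == 0, idx - 1 wraps around to the
// largest unsigned value, so the condition is true.
bool naive_check(unsigned idx) {
    return idx - 1 > 3;
}

// Algebraically the same condition for idx >= 1, but without the subtraction.
bool rearranged_check(unsigned idx) {
    return idx > 4;
}
```

The two functions agree for every idx except 0, where `naive_check` returns true and `rearranged_check` returns false.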
对你的占有欲 2024-09-26 05:03:09

The other answers are good, but sometimes unsigned types can lead to confusion, which I believe is why some languages have opted not to have unsigned integer types.

For example, suppose you have a structure that looks like this to represent a screen object:

struct T {
    int x;
    int y;
    unsigned int width;
    unsigned int height;
};

The idea is that it is impossible to have a negative width. Well, what data type do you use to store the right edge of the rectangle?

int right = r.x + r.width; // causes a warning on some compilers with certain flags

And certainly it still doesn't protect you from integer overflow. So in this scenario, even though width and height cannot conceptually be negative, there is no real gain in making them unsigned, other than requiring some casts to get rid of warnings about mixing signed and unsigned types. In the end, at least for cases like this, it is better to just make them all ints; after all, odds are you won't have a window wide enough to need it to be unsigned.
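
The effect of the mixed expression can be seen with a rectangle that starts off-screen. In this sketch `MixedRect`, `SignedRect` and both functions are hypothetical names; `MixedRect` mirrors the struct above and `SignedRect` is the all-int alternative.

```cpp
#include <cassert>
#include <limits>

struct MixedRect  { int x; int y; unsigned int width; unsigned int height; };
struct SignedRect { int x; int y; int width; int height; };

// int + unsigned int is computed in unsigned int: x is converted first,
// so a negative x becomes a very large value before the addition.
unsigned int right_edge_mixed(const MixedRect& r) {
    return r.x + r.width;
}

// With all-int members the arithmetic stays signed.
int right_edge_signed(const SignedRect& r) {
    return r.x + r.width;
}
```

For x = -10 and width = 5, `right_edge_signed` returns -5, while `right_edge_mixed` returns a value just below the maximum of `unsigned int`. Assigning that result back to an `int`, as in the snippet above, happens to recover -5 on common platforms, but only through a second implicit conversion.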

朱染 2024-09-26 05:03:09

This has value for the same reason that "const correctness" has value. If you know that a particular value shouldn't change, declare it const and let the compiler help you. If you know a variable should always be non-negative, then declare it as unsigned and the compiler will help you catch inconsistencies.

(That, and you can express numbers twice as big if you use unsigned int rather than int in this context.)

葬﹪忆之殇 2024-09-26 05:03:09

By using unsigned when signed values will not be needed, in addition to ensuring the data type doesn't represent values below the desired lower bound, you increase the maximum upper bound. All the bit combinations that would otherwise be used to represent negative numbers are used to represent a larger set of positive numbers.

生活了然无味 2024-09-26 05:03:09

It also keeps you from having to cast to or from unsigned types when interacting with other interfaces. For example:

for (int i = 0; i < some_vector.size(); ++i)

That will generally annoy the hell out of anyone who needs to compile without warnings.
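
Two warning-free ways to write that loop are sketched below; the function names are invented for the example, and `std::size_t` is used because it is the size type of `std::vector` with the default allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// The index has the same unsigned type that size() returns,
// so the comparison involves no signed/unsigned mix.
int sum_by_index(const std::vector<int>& v) {
    int total = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        total += v[i];
    }
    return total;
}

// When the index itself is not needed, a range-based for loop
// sidesteps the question entirely.
int sum_by_range(const std::vector<int>& v) {
    int total = 0;
    for (int x : v) {
        total += x;
    }
    return total;
}
```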

漫漫岁月 2024-09-26 05:03:09

It won't prevent negative numbers from being fed into a function; instead it will interpret them as large positive numbers. This may be moderately useful if you know an upper bound for error checking, but you need to do the error checking yourself. Some compilers will issue warnings, but if you're using unsigned types a lot there may be too many warnings to deal with easily. These warnings can be covered up with casts, but that's worse than sticking to signed types only.

I wouldn't use an unsigned type if I knew the variable shouldn't be negative, but rather if it couldn't be. size_t is an unsigned type, for example, since a data type simply can't have negative size. If a value could conceivably be negative but shouldn't be, it's easier to express that by having it as a signed type and using something like i < 0 or i >= 0 (these conditions come out as false and true respectively if i is an unsigned type, regardless of its value).

If you're concerned about strict Standard conformance, it may be useful to know that overflows in unsigned arithmetic are fully defined, while in signed arithmetic they're undefined behavior.
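
The last two paragraphs can be illustrated together. The helper names below are hypothetical; the sketch shows that a non-negativity test is meaningful only on a signed type, and that unsigned addition wraps instead of overflowing.

```cpp
#include <cassert>
#include <limits>

// A real check: negative arguments are rejected.
bool is_valid_signed(int i) { return i >= 0; }

// Always true, whatever the caller passed; many compilers warn about this.
bool is_valid_unsigned(unsigned i) { return i >= 0; }

// Unsigned overflow is fully defined: the result wraps modulo 2^N.
unsigned add_one_wrapping(unsigned i) { return i + 1u; }
```

A caller who passes -1 to `is_valid_unsigned` gets true back, because the value has already been converted to a large positive number by the time the test runs.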

怕倦 2024-09-26 05:03:09

Mixing signed and unsigned types can be a major headache. The resulting code will often be bloated, wrong, or both(*). In many cases, unless you need to store values between 2,147,483,648 and 4,294,967,295 within a 32-bit variable, or you need to work with values larger than 9,223,372,036,854,775,807, I'd recommend not bothering with unsigned types at all.

(*)What should happen, for example, if a programmer does:

{ Question would be applicable to C, Pascal, Basic, or any other language }
  If SignedVar + UnsignedVar > OtherSignedVar Then DoSomething;

I believe Borland's old Pascal would handle the above scenario by converting SignedVar and UnsignedVar to a larger signed type (the largest supported type, btw, was signed, so every unsigned type could be converted to a larger signed one). This would produce big code, but it would be correct. In C, if one signed variable is negative the result is likely to be numerically wrong even if UnsignedVar holds zero. Many other bad scenarios exist as well.
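
In C++ terms, the two behaviours described above might look like the following sketch. The function names are made up; the casts in the first function only spell out the conversions the language applies implicitly, and the second assumes `long long` is wider than `int`, which holds on common platforms but is not guaranteed by the standard.

```cpp
#include <cassert>

// What C and C++ effectively compute for
// SignedVar + UnsignedVar > OtherSignedVar: everything becomes unsigned.
bool compare_as_written(int signed_var, unsigned unsigned_var, int other_signed_var) {
    return static_cast<unsigned>(signed_var) + unsigned_var
           > static_cast<unsigned>(other_signed_var);
}

// The "widen to a larger signed type first" approach attributed to
// Borland Pascal above.
bool compare_widened(int signed_var, unsigned unsigned_var, int other_signed_var) {
    return static_cast<long long>(signed_var) + static_cast<long long>(unsigned_var)
           > static_cast<long long>(other_signed_var);
}
```

With signed_var = -1, unsigned_var = 0 and other_signed_var = 0, the first function reports that -1 + 0 is greater than 0, while the widened version gives the numerically correct answer.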

李白 2024-09-26 05:03:09

Using unsigned gives you two main things:

  • it allows you to use the right shift >> operator on any value with a fully defined result, as the value can't be negative. Right-shifting a negative signed value gives an implementation-defined result in C and in C++ before C++20.

  • it gives you wrap-around mod 2^n arithmetic. With signed values, the effect of underflow/overflow is undefined. With unsigned values you get mod 2^n arithmetic, so 0U - 1U will always give you the largest possible unsigned value (which will always be 1 less than a power of 2).
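
One place where the wrap-around arithmetic is put to use is measuring the distance between two readings of a counter that may have wrapped in between. The sketch below assumes a hypothetical 32-bit tick counter; `elapsed_ticks` is an invented name.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Difference of two readings of a free-running 32-bit counter.
// Correct even if the counter wrapped once between start and now.
std::uint32_t elapsed_ticks(std::uint32_t now, std::uint32_t start) {
    return now - start;
}
```

If the counter read 0xFFFFFFFB at the start and 5 now, the subtraction yields 10, which is the number of ticks that actually elapsed across the wrap.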

開玄 2024-09-26 05:03:09

A counterargument to using unsigned is that you may find yourself in very reasonable situations where it gets awkward and unintentional bugs are introduced. Consider a class (for example a list class or some such) with the following method:

unsigned int length() { ... }

Seems very reasonable. But then when you want to iterate over it, you get the following:

for (unsigned int i = my_class.length(); i >= 0; --i) { ... }

Your loop won't terminate and now you're forced to cast or do some other awkwardness.

An alternative to using unsigned is just to assert that your values are non-negative.

Reference.

囍孤女 2024-09-26 05:03:09

As an aside, I seldom use int or unsigned int; rather, I use either {int16_t, int32_t, ...} or {uint16_t, uint32_t, ...}. (You have to include stdint.h to use them, though.) I am not sure what my colleagues think of it, but I try to convey the size of the variable this way. In some places, I try to be more blatant by doing something like: typedef uint32_t Counter32;
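
A minimal sketch of that convention in C++ follows. `Counter32` is the typedef from the text; the `increment` function and the `static_assert` are additions for illustration, and `<cstdint>` is the C++ counterpart of stdint.h.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

typedef std::uint32_t Counter32;

// The width is part of the type's contract, so it can be checked at compile time.
static_assert(sizeof(Counter32) * CHAR_BIT == 32, "Counter32 must be 32 bits wide");

// Wraps to 0 after 0xFFFFFFFF, independent of how wide int is on the platform.
Counter32 increment(Counter32 c) {
    return c + 1u;
}
```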
