Why is int rather than unsigned int used for C and C++ for loops?

Posted on 2024-12-05 20:05:03

This is a rather silly question but why is int commonly used instead of unsigned int when defining a for loop for an array in C or C++?

for (int i = 0; i < arraySize; i++) {}
for (unsigned int i = 0; i < arraySize; i++) {}

I recognize the benefits of using int when doing something other than array indexing and the benefits of an iterator when using C++ containers. Is it just because it does not matter when looping through an array? Or should I avoid it altogether and use a different type such as size_t?

Comments (11)

墨落画卷 2024-12-12 20:05:03

Using int is more correct from a logical point of view for indexing an array.

Unsigned semantics in C and C++ don't really mean "not negative"; they are more like "bitmask" or "modulo integer".

To understand why unsigned is not a good type for a "non-negative" number please consider these totally absurd statements:

  • Adding a possibly negative integer to a non-negative integer you get a non-negative integer
  • The difference of two non-negative integers is always a non-negative integer
  • Multiplying a non-negative integer by a negative integer you get a non-negative result

Obviously none of the above statements makes any sense... but this is indeed how unsigned semantics work in C and C++.
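To make the three statements concrete, here is a minimal sketch. The helper names are made up for this illustration; the behavior relies on the standard rule that an int combined with an unsigned int is converted to unsigned, and that unsigned arithmetic wraps modulo 2^N.

```cpp
#include <limits>

// Hypothetical helpers, named only for this illustration.
constexpr unsigned int kUMax = std::numeric_limits<unsigned int>::max();

// "Non-negative" plus a possibly negative int: b is converted to unsigned first,
// so the result is always an unsigned value.
unsigned int add_mixed(unsigned int a, int b) { return a + b; }

// The difference of two unsigned values wraps around instead of going negative.
unsigned int diff(unsigned int a, unsigned int b) { return a - b; }

// The product of an unsigned value and a negative int is unsigned as well.
unsigned int mul_mixed(unsigned int a, int b) { return a * b; }
```

For example, add_mixed(1, -2) and diff(2, 3) both yield kUMax rather than -1, which is exactly the "absurd" result the list describes.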

Actually, using an unsigned type for the size of containers is a design mistake of C++, and unfortunately we're now doomed to live with this wrong choice forever (for backward compatibility). You may like the name "unsigned" because it's similar to "non-negative", but the name is irrelevant; what counts is the semantics... and unsigned is very far from "non-negative".

For this reason when coding most loops on vectors my personally preferred form is:

for (int i=0,n=v.size(); i<n; i++) {
    ...
}

(of course assuming the size of the vector is not changing during the iteration and that I actually need the index in the body as otherwise the for (auto& x : v)... is better).

Getting away from unsigned as soon as possible and using plain integers has the advantage of avoiding the traps that are a consequence of the unsigned size_t design mistake. For example consider:

// draw lines connecting the dots
for (size_t i=0; i<pts.size()-1; i++) {
    drawLine(pts[i], pts[i+1]);
}

The code above will have problems if the pts vector is empty, because pts.size()-1 is a huge nonsense number in that case. Dealing with expressions where a < b-1 is not the same as a+1 < b, even for commonly used values, is like dancing in a minefield.
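A small sketch of the empty-vector case, under a few assumptions: the function names are invented for this example, the loop only counts iterations instead of calling drawLine, and the unsigned version takes a cap so the demonstration terminates instead of running for about SIZE_MAX iterations.

```cpp
#include <cstddef>
#include <vector>

// Counts how many segments the size_t loop would try to draw.
// The cap exists only so this demonstration terminates for an empty vector.
std::size_t segments_unsigned_bound(const std::vector<int>& pts, std::size_t cap) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < pts.size() - 1 && n < cap; i++) ++n;
    return n;
}

// Same loop with the size converted to int once, up front.
// Assumes the vector has fewer than INT_MAX elements.
std::size_t segments_signed_bound(const std::vector<int>& pts) {
    std::size_t n = 0;
    const int size = static_cast<int>(pts.size());
    for (int i = 0; i < size - 1; i++) ++n;
    return n;
}
```

With an empty vector the unsigned version runs until it hits the cap, while the signed version runs zero times; with three points both report two segments.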

Historically the justification for having size_t unsigned is for being able to use the extra bit for the values, e.g. being able to have 65535 elements in arrays instead of just 32767 on 16-bit platforms. In my opinion even at that time the extra cost of this wrong semantic choice was not worth the gain (and if 32767 elements are not enough now then 65535 won't be enough for long anyway).

Unsigned values are great and very useful, but NOT for representing container sizes or indexes; for sizes and indexes regular signed integers work much better because the semantics are what you would expect.

Unsigned values are the ideal type when you need the modulo arithmetic property or when you want to work at the bit level.

梦境 2024-12-12 20:05:03

This is a more general phenomenon: people often don't use the correct types for their integers. Modern C has semantic typedefs that are much preferable to the primitive integer types. E.g., everything that is a "size" should just be typed as size_t. If you use the semantic types systematically for your application variables, loop variables come much more easily with these types, too.

And I have seen several bugs that were difficult to detect and that came from using int or the like: code that all of a sudden crashed on large matrices and similar data. Just coding correctly with the correct types avoids that.
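One plausible way such a bug arises (an assumption on my part, the answer does not say which bug it was) is a flat index like row * cols + col computed in int, which overflows once the matrix has more elements than INT_MAX. A hedged sketch of a check, with a made-up helper name:

```cpp
#include <cstddef>
#include <limits>

// Hypothetical helper: true if the element count rows * cols is representable as int.
// Written with a division so the check itself cannot overflow.
bool element_count_fits_in_int(std::size_t rows, std::size_t cols) {
    const std::size_t int_max =
        static_cast<std::size_t>(std::numeric_limits<int>::max());
    if (rows == 0 || cols == 0) return true;  // empty matrix: nothing to index
    return rows <= int_max / cols;
}
```

On a platform with 32-bit int, a 50000 x 50000 matrix already fails this check, so indexing it with int would be an error, while size_t handles it without special care.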

欲拥i 2024-12-12 20:05:03

It's purely laziness and ignorance. You should always use the right types for indices, and unless you have further information that restricts the range of possible indices, size_t is the right type.

Of course if the dimension was read from a single-byte field in a file, then you know it's in the range 0-255, and int would be a perfectly reasonable index type. Likewise, int would be okay if you're looping a fixed number of times, like 0 to 99. But there's still another reason not to use int: if you use i%2 in your loop body to treat even/odd indices differently, i%2 is a lot more expensive when i is signed than when i is unsigned...
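The difference behind the i%2 remark is semantic: for a signed operand the remainder takes the sign of the dividend, so the compiler cannot simply mask the low bit. Whether the extra instructions are measurable depends on the compiler and target, so treat the sketch below as an illustration of the semantics, not a benchmark. The function names are made up for this example.

```cpp
// Remainder for a signed operand: truncates toward zero, so it can be -1, 0 or 1.
int rem2_signed(int i) { return i % 2; }

// Remainder for an unsigned operand: always 0 or 1, equivalent to i & 1u.
unsigned int rem2_unsigned(unsigned int i) { return i % 2u; }

// An odd test that stays correct for negative values (i % 2 == 1 would not).
bool is_odd(int i) { return i % 2 != 0; }
```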

ま昔日黯然 2024-12-12 20:05:03

Not much difference. One benefit of int is that it is signed. Thus the test i < 0 makes sense for an int i, while for an unsigned i it is always false.

If indexes are calculated, that may be beneficial (for example, you might get cases where you will never enter a loop if some result is negative).

And yes, it is less to write :-)

你的往事 2024-12-12 20:05:03

Using int to index an array is legacy, but still widely adopted. int is just a generic number type and does not correspond to the addressing capabilities of the platform. If it happens to be narrower or wider than the platform's address range, you may encounter strange results when trying to index a very large array whose size exceeds the range of int.

On modern platforms, off_t, ptrdiff_t and size_t guarantee much more portability.

Another advantage of these types is that they give context to someone who reads the code. When you see the above types you know that the code will do array subscripting or pointer arithmetic, not just any calculation.

So, if you want to write bullet-proof, portable and context-sensible code, you can do it at the expense of a few keystrokes.

GCC even supports a typeof extension which relieves you from typing the same typename all over the place:

typeof(arraySize) i;

for (i = 0; i < arraySize; i++) {
  ...
}

Then, if you change the type of arraySize, the type of i changes automatically.
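In C++ the standard decltype keyword does the same job as GCC's typeof extension. A minimal sketch, with a hypothetical function written as a template only so that the type of arraySize can vary:

```cpp
// Sums the indices 0 .. arraySize-1 using a loop variable whose type
// always follows the type of arraySize.
template <typename Size>
Size sum_indices(Size arraySize) {
    Size total = 0;
    for (decltype(arraySize) i = 0; i < arraySize; i++) {
        total += i;
    }
    return total;
}
```

Calling sum_indices with an int or with a size_t argument compiles the same loop with a matching index type, so there is no signed/unsigned comparison inside it.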

老娘不死你永远是小三 2024-12-12 20:05:03

It really depends on the coder. Some coders prefer type perfectionism, so they'll use whatever type they're comparing against. For example, if they're iterating through a C string, you might see:

size_t sz = strlen("hello");
for (size_t i = 0; i < sz; i++) {
    ...
}

While if they're just doing something 10 times, you'll probably still see int:

for (int i = 0; i < 10; i++) {
    ...
}

紫罗兰の梦幻 2024-12-12 20:05:03

I use int because it requires less physical typing and it doesn't matter: they take up the same amount of space, and unless your array has a few billion elements you won't overflow, provided you're not using a 16-bit compiler, which I'm usually not.

情话难免假 2024-12-12 20:05:03

Because unless you have an array bigger than 2 gigabytes of type char, or 4 gigabytes of type short, or 8 gigabytes of type int, etc., it doesn't really matter whether the variable is signed or not.

So, why type more when you can type less?

左秋 2024-12-12 20:05:03

Aside from the issue that it's shorter to type, the reason is that it allows negative numbers.

Since we can't say in advance whether a value can ever be negative, most functions that take integer arguments take the signed variety. Since most functions use signed integers, it is often less work to use signed integers for things like loops. Otherwise, you have the potential of having to add a bunch of typecasts.

As we move to 64-bit platforms, the non-negative range of a signed integer should be more than enough for most purposes. In these cases, there's not much reason not to use a signed integer.

我纯我任性 2024-12-12 20:05:03

Consider the following simple example:

int max = some_user_input; // or some_calculation_result
for(unsigned int i = 0; i < max; ++i)
    do_something;

If max happens to be a negative value, say -1, the -1 will be regarded as UINT_MAX (when two integers with the same rank but different signedness are compared, the signed one is converted to unsigned). On the other hand, the following code would not have this issue:

int max = some_user_input;
for(int i = 0; i < max; ++i)
    do_something;

Given a negative max input, the loop will be safely skipped.
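The two loops can be compared side by side. In this sketch the function names and the cap parameter are my own additions; the cap only keeps the unsigned version from running UINT_MAX iterations when max is negative.

```cpp
// Counts iterations of the loop with an unsigned index.
// In `i < max` the int max is converted to unsigned, so -1 becomes UINT_MAX.
unsigned long count_with_unsigned_index(int max, unsigned long cap) {
    unsigned long n = 0;
    for (unsigned int i = 0; i < max && n < cap; ++i) ++n;
    return n;
}

// Counts iterations of the loop with a signed index: a negative max skips the loop.
unsigned long count_with_signed_index(int max) {
    unsigned long n = 0;
    for (int i = 0; i < max; ++i) ++n;
    return n;
}
```

Most compilers can flag the mixed comparison in the first function (for example with GCC's or Clang's -Wsign-compare), which is a cheap way to catch this class of bug.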

顾冷 2024-12-12 20:05:03

Using a signed int is - in most cases - a mistake that could easily result in potential bugs as well as undefined behavior.

Using size_t matches the system's word size (64 bits on 64 bit systems and 32 bits on 32 bit systems), always allowing for the correct range for the loop and minimizing the risk of an integer overflow.

The int recommendation is meant to solve an issue where reverse for loops were often written incorrectly by inexperienced programmers (of course, int might not have the correct range for the loop):

/* a correct reverse for loop */
for (size_t i = count; i > 0;) {
   --i; /* note that this is not part of the `for` statement */
   /* code for loop where i is for zero based `index` */
}
/* an incorrect reverse for loop (bug on count == 0) */
for (size_t i = count - 1; i > 0; --i) {
   /* if count == 0, count - 1 wraps to SIZE_MAX and indexing with i is
      out of bounds (undefined behavior); index 0 is also never visited */
}
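To check that the correct reverse loop really visits every index, including 0, and does nothing for an empty range, here is a small sketch that records the visited indices. The function names are hypothetical, and the second function shows a commonly used equivalent spelling of the same idea.

```cpp
#include <cstddef>
#include <vector>

// The reverse loop from above, recording each index it visits.
std::vector<std::size_t> reverse_indices(std::size_t count) {
    std::vector<std::size_t> visited;
    for (std::size_t i = count; i > 0;) {
        --i;  // decrement first, so i runs from count-1 down to 0
        visited.push_back(i);
    }
    return visited;
}

// Equivalent form: test, then decrement, inside the loop condition.
std::vector<std::size_t> reverse_indices_alt(std::size_t count) {
    std::vector<std::size_t> visited;
    for (std::size_t i = count; i-- > 0;) {
        visited.push_back(i);
    }
    return visited;
}
```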

In general, signed and unsigned variables shouldn't be mixed together, so at times using an int is unavoidable. However, the correct type for a for loop is, as a rule, size_t.

There's a nice talk about the misconception that signed variables are better than unsigned variables; you can find it on YouTube (Signed Integers Considered Harmful by Robert Seacord).

TL;DR: Signed variables are more dangerous and require more code than unsigned variables (which should be preferred in almost all cases, and definitely whenever negative values aren't logically expected).

With unsigned variables the only concern is the overflow boundary which has a strictly defined behavior (wrap-around) and uses clearly defined modular mathematics.

This allows a single edge case test to catch an overflow and that test can be performed after the mathematical operation was executed.

However, with signed variables the overflow behavior is undefined (UB) and the negative range is actually larger than the positive range - things that add edge cases that must be tested for and explicitly handled before the mathematical operation can be executed.

For example, what is INT_MIN * -1? (The pre-processor will protect you, but without it you're in a jam.)
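The "test after" versus "test before" distinction can be sketched for addition as follows. Both helpers are hypothetical and only illustrate the pattern; they are not taken from the talk mentioned above.

```cpp
#include <limits>

// Unsigned: perform the addition, then detect wrap-around with one comparison.
// Returns true if the sum wrapped; `sum` always holds the (possibly wrapped) result.
bool add_wrapped(unsigned int a, unsigned int b, unsigned int& sum) {
    sum = a + b;     // well-defined modular arithmetic
    return sum < a;  // a wrapped sum is smaller than either operand
}

// Signed: overflow is undefined behavior, so the test must come BEFORE the addition.
bool add_would_overflow(int a, int b) {
    if (b > 0) return a > std::numeric_limits<int>::max() - b;
    return a < std::numeric_limits<int>::min() - b;
}
```

The signed check needs two branches because both ends of the range have to be guarded, which is the extra edge-case handling the answer refers to.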

P.S.

As for the example offered by @6502 in their answer, the whole thing is again an issue of trying to cut corners and a simple missing if statement.

When a loop assumes at least 2 elements in an array, this assumption should be tested beforehand. i.e.:

// draw lines connecting the dots - forward loop
if(pts.size() > 1) { // first make sure there's enough dots
  for (size_t i=0; i < pts.size()-1; i++) { // then loop
    drawLine(pts[i], pts[i+1]);
  }
}
// or test against i + 1 : which tests the desired pts[i+1]
for (size_t i = 0; i + 1 < pts.size(); i++) { // then loop
  drawLine(pts[i], pts[i+1]);
}
// or start i as 1 : but note that `-` is slower than `+`
for (size_t i = 1; i < pts.size(); i++) { // then loop
  drawLine(pts[i - 1], pts[i]);
}