Array indexes and the signedness of argc

Posted on 2024-08-25 05:57:49

The C standard (5.1.2.2.1 Program startup) says:

The function called at program startup
is named main. [...]
It shall be defined with a
return type of int and with no
parameters:
int main(void) { /* ... */ }

or with two parameters [...] :
int main(int argc, char *argv[]) { /* ... */ }

And later says:

The value of argc shall be nonnegative.

  • Why shouldn't argc be defined as an unsigned int, argc supposedly meaning 'argument count'?
  • Should argc be used as an index for argv?

So I started wondering if the C standard says something about the type of array's index. Is it signed?

6.5.2.1 Array subscripting:

One of the expressions shall have type
‘‘pointer to object type’’, the other
expression shall have integer type,
and the result has type ‘‘type’’.

It doesn't say anything about its signedness (or I didn't find it). It is pretty common to see code using negative array indexes (array[-1]), but isn't that undefined behavior?

  • Should array indexes be unsigned?


Comments (4)

榆西 2024-09-01 05:57:49

The reason for the int in main() is historical - it's always been that way, since long before the language was standardised. The requirement of an array index is that it is within the bounds of the array (or in some circumstances, one past the end) - anything else is undefined, so the signedness is immaterial.
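
As a rough illustration of the "one past the end" allowance, here is a minimal sketch. The function name sum_bytes is made up for this example and is not from the thread. The end pointer may be formed and compared against, but it is never dereferenced.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: sums the n bytes starting at a. */
unsigned long sum_bytes(const unsigned char *a, size_t n)
{
    assert(a != NULL || n == 0);

    const unsigned char *end = a + n;  /* one past the last element: valid to form */
    unsigned long total = 0;

    for (const unsigned char *p = a; p != end; ++p)
        total += *p;                   /* p always points inside the array here */

    return total;
}
```

Whether the loop counter is signed or unsigned plays no role here; what matters is that every pointer that gets dereferenced lies inside the array.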

So要识趣 2024-09-01 05:57:49

1) About the type of argc in main(): IMHO the standard continues a very old tradition (more than 30 years!), and now it's simply too late to change things. (Note: on most systems neither the compiler, nor the linker, nor the CPU will complain if argc is declared unsigned, but you are then outside the standard.)

2) argv[argc] is legal and evaluates to a null pointer; the standard requires this in the same section (5.1.2.2.1). Indeed, an alternative way to find the end of the argument list is to iterate over argv from 0, terminating when argv[i] is NULL.

3) Array/pointer arithmetic with negative numbers is legal as long as the resulting pointer still points into the same array object (or one past its end). For example, you can write:

char array[100];
char *p;

p = &array[50];
p += -30; /* Now p points to array[20]. */

This use of pointer arithmetic is legal because the resulting pointer stays inside the original memory object ("array"). On most systems pointer arithmetic can be used to navigate memory in violation of this rule, but this is NOT portable: the behavior is undefined and completely system-dependent.
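
Point 2 above can be sketched as follows. The helper name count_args is an assumption made for illustration; it walks an argv-style vector to its terminating null pointer instead of relying on argc.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts the entries of an argv-style vector by
   scanning for the terminating null pointer. */
int count_args(char *const argv[])
{
    int n = 0;

    assert(argv != NULL);
    while (argv[n] != NULL)
        ++n;

    return n;
}
```

Called as count_args(argv) inside main in a hosted environment, the result should equal argc.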

累赘 2024-09-01 05:57:49

In general in C, the "principle of least surprise" implies that it is preferable to make a variable signed unless there is a good reason for it to be unsigned. This is because the conversion rules can lead to unexpected results when you mix signed and unsigned values: for example, if argc were unsigned, then this simple comparison would lead to surprising results:

if (argc > -1)

(The -1 is converted to unsigned int, so its value becomes UINT_MAX, and no unsigned argc can be greater than that: the comparison is always false.)
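
A small sketch of that surprise, with function names invented for this example. The same comparison is written once against a signed operand and once against an unsigned one; many compilers warn about the second, which is the point.

```c
#include <assert.h>
#include <limits.h>

/* With a signed operand, -1 stays -1. */
int signed_gt_minus_one(int n)
{
    return n > -1;
}

/* With an unsigned operand, -1 is converted to UINT_MAX before the
   comparison, so the result is 0 for every possible n. */
int unsigned_gt_minus_one(unsigned int n)
{
    return n > -1;
}

/* Hypothetical self-check illustrating the difference. */
void check_surprise(void)
{
    assert(signed_gt_minus_one(0));
    assert(!unsigned_gt_minus_one(0u));
    assert(!unsigned_gt_minus_one(UINT_MAX));
}
```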

最佳男配角 2024-09-01 05:57:49

1) argc is an argument count, but to be quite honest, how could you prepend an argument before the program name, which is argv[0]? Imagine a program called foo: you cannot simply say args1 foo args2, as that is meaningless. Despite argc being a signed int, there is no such thing as an argv[-1] that would get you 'args1'.

2) argc is not really an index into the argument vector (hence 'argv'): the run-time stuffs the executable's name into the zeroth offset, argv[0], so argc is off by one relative to the last argument, which is argv[argc - 1].

3) Array indexes, in terms of pointer manipulation: provided you stay within the boundaries of the block of memory the pointer points into, negative array subscripts are legal, because subscripting is shorthand for pointer arithmetic. Not only that, the operands are commutative. For example:

char v[100];
char *p = &v[0];

You can do this:

p[55] = 'a'; 

Which is the same as

*(p + 55) = 'a';

You can even do this:

p = &v[55];

p[-10] = 'b'; /* This stores 'b' at offset 45, i.e. v[45] */

Which is the same as

*(p - 10) = 'b';

Also, if you use and manipulate arrays in a way that goes outside their boundaries, that is undefined behaviour; what happens depends on the implementation and the run-time, perhaps a segmentation fault or a program crash.

4) In *nix environments, some implementations supply a third parameter to main, char **envp; this is rarely used in the Microsoft world of DOS/Windows. For historical reasons, some *nix run-times pass the environment variables to the program this way.
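
The commutativity mentioned in point 3 is not shown in the snippets above, so here is a minimal sketch of it. The function name subscript_demo is made up for this example. It relies on E1[E2] being defined as *((E1) + (E2)).

```c
#include <assert.h>

/* Hypothetical demo: the operands of [] can be swapped, and the
   integer operand may be negative as long as the result stays
   inside the array. */
char subscript_demo(void)
{
    char v[100] = {0};
    char *p = &v[55];

    p[-10] = 'b';              /* same element as v[45] */
    assert(v[45] == 'b');
    assert(45[v] == v[45]);    /* commuted form: *(45 + v) */
    assert((-10)[p] == 'b');   /* *(-10 + p) */

    return v[45];
}
```

The commuted forms are legal but are shown only to make the definition concrete; ordinary code would write v[45] and p[-10].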
