Where are the argv strings of main's parameters located?

Posted on 2024-10-03 06:52:21

In C/C++, the main function receives its command-line arguments as an array of char*.

int main(int argc, char* argv[]){
  return 0;
}

argv is an array of char*, and its elements point to strings. Where are these strings located? Are they on the heap, on the stack, or somewhere else?

Comments (10)

咽泪装欢 2024-10-10 06:52:21

Here's what the C standard (n1256) says:

5.1.2.2.1 Program startup
...
2 If they are declared, the parameters to the main function shall obey the following
constraints:

  • The value of argc shall be nonnegative.
  • argv[argc] shall be a null pointer.
  • If the value of argc is greater than zero, the array members argv[0] through
    argv[argc-1] inclusive shall contain pointers to strings, which are given
    implementation-defined values by the host environment prior to program startup. The
    intent is to supply to the program information determined prior to program startup
    from elsewhere in the hosted environment. If the host environment is not capable of
    supplying strings with letters in both uppercase and lowercase, the implementation
    shall ensure that the strings are received in lowercase.

  • If the value of argc is greater than zero, the string pointed to by argv[0]
    represents the program name; argv[0][0] shall be the null character if the
    program name is not available from the host environment. If the value of argc is
    greater than one, the strings pointed to by argv[1] through argv[argc-1]
    represent the program parameters.

  • The parameters argc and argv and the strings pointed to by the argv array shall
    be modifiable by the program, and retain their last-stored values between program
    startup and program termination.

The last bullet is the most interesting with respect to where the string values are stored. It doesn't specify heap or stack, but it does require that the strings be writable and have static extent (that is, they persist for the entire execution of the program), which places some limits on where the string contents may be located. As others have said, the exact details depend on the implementation.

£噩梦荏苒 2024-10-10 06:52:21

They are compiler magic, and implementation-dependent.

想你只要分分秒秒 2024-10-10 06:52:21

It's actually a combination of compiler dependence and operating system dependence. main() is a function just like any other C function, so the location of the two parameters argc and argv follows the compiler's standard calling convention on the platform. For example, with most C compilers targeting x86 they will be on the stack just above the return address and the saved base pointer (the stack grows downwards, remember). On x86_64, parameters are passed in registers, so argc will be in %edi and argv will be in %rsi. Code that the compiler generates in main then typically copies them to the stack, and that is where later references point. This frees the registers for function calls made from main.

The block of char* pointers that argv points to, and the actual sequences of characters, could be anywhere. They start in some location defined by the operating system and may be copied to the stack or somewhere else by the startup (preamble) code that the linker adds. You'll have to look at the code for exec() and at the assembler preamble to find out.

帝王念 2024-10-10 06:52:21

The answer to this question is compiler-dependent. The C standard does not specify where the strings live, so an implementation can do it however it likes. This is normal, since operating systems don't have a commonly accepted, standard way to start and finish processes either.

Let's imagine a simple, plausible scenario.

The process receives, by some mechanism, the arguments written on the command line. argc is then just an int that is pushed onto the stack by the bootstrap function the compiler installs as the entry point of the process (part of the runtime). The actual values are obtained from the operating system and can be, say, written into a memory block on the heap. Then the argv vector is built, and the address of its first element is also pushed onto the stack.

Then the function main(), which must be provided by the programmer, is called, and its return value is saved for later (nearly immediate) use. The structures on the heap are freed, and the exit code obtained from main is passed to the operating system. The process finishes.

兲鉂ぱ嘚淚 2024-10-10 06:52:21

These parameters are no different from any other function's parameters.
If the architecture's calling sequence requires parameters to be passed on the stack, they are on the stack. If, as on x86-64, some parameters go in registers, these go in registers too.

冷心人i 2024-10-10 06:52:21

As pmg mentions, when main is called recursively, it's up to the caller where the arguments point to. Basically the answer is the same on the original invocation of main, except that the "caller" is the C implementation/OS.

On UNIX-y systems, the strings that argv points to, the argv pointers themselves, and the process's initial environment variables are almost always stored at the very top of the stack.

颜漓半夏 2024-10-10 06:52:21

As many other answers here point out, the precise mechanism a compiler implementation uses to pass arguments to main is unspecified by the standard (as is the mechanism a compiler uses to pass any arguments to a function). Strictly speaking, the compiler need not even pass anything useful in those parameters, since the values are implementation-defined. But neither of these is a particularly helpful answer.

The typical C (or C++) program is compiled for what's known as a 'hosted' execution environment (using function main() as the starting point of your program is one of the requirements for a hosted environment). The key thing to know is that the compiler arranges things so that when the executable is launched by the operating system, the compiler's runtime gets control initially - not the main() function. The runtime's initialization code performs whatever initialization is necessary, including allocating memory for the arguments to main(), then it transfers control to main().

The memory for the arguments to main() could come from the heap, could be allocated on the stack (possibly using techniques that aren't available to standard C code), or could be statically allocated, though that's a less likely option because it's less flexible. The standard does require that the memory used for the strings pointed to by argv is modifiable and that modifications made to those strings persist throughout the program's lifetime.

Just be aware that before execution reaches main(), quite a bit of code has already been run that's setting up the environment for your program to run in.

樱桃奶球 2024-10-10 06:52:21

The argument list is part of the process environment, similar to (but distinct from) environment variables.

蹲在坟头点根烟 2024-10-10 06:52:21

Usually it is unknown where they are.

#include <stdlib.h>
#include <string.h>

int main(int argc, char *argv[]) {
  char **foo;
  char *bar[] = {"foo", "bar"};

  (void)argv; /* avoid unused argv warning */

  foo = malloc(sizeof *foo);
  foo[0] = malloc(42);
  strcpy(foo[0], "forty two");

  /* where is foo located? stack? heap? somewhere else? */
  if (argc != 42) main(42, foo); else return 0;

  /* where is bar located? stack? heap? somewhere else? */
  if (argc != 43) main(43, bar); else return 0;
  /* except for the fact that bar elements
  ** point to unmodifiable strings
  ** this call to main is perfectly reasonable */

  return 0;
  /* please ignore memory leaks, thank you */
}

御守 2024-10-10 06:52:21

As long as you are able to access the actual parameters, I think their actual location does not matter at all.
