在没有 /proc/self/exe 的情况下查找当前可执行文件的路径

发布于 2024-07-25 21:40:52 字数 257 浏览 4 评论 0原文

在我看来,Linux 使用 /proc/self/exe 很容易。 但我想知道是否有一种方便的方法可以通过跨平台接口在C/C++中找到当前应用程序的目录。 我见过一些项目在使用 argv[0],但它似乎并不完全可靠。

例如,如果您必须支持没有 /proc/ 的 Mac OS X,您会怎么做? 使用 #ifdefs 隔离特定于平台的代码(例如 NSBundle)? 或者尝试从 argv[0]、$PATH 等推导出可执行文件的路径,冒着在边缘情况下发现错误的风险?

It seems to me that Linux has it easy with /proc/self/exe. But I'd like to know if there is a convenient way to find the current application's directory in C/C++ with cross-platform interfaces. I've seen some projects mucking around with argv[0], but it doesn't seem entirely reliable.

If you ever had to support, say, Mac OS X, which doesn't have /proc/, what would you have done? Use #ifdefs to isolate the platform-specific code (NSBundle, for example)? Or try to deduce the executable's path from argv[0], $PATH and whatnot, risking finding bugs in edge cases?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(16

乖乖 2024-08-01 21:40:53

根据QNX Neutrino的版本,可以使用不同的方法来查找用于启动正在运行的进程的可执行文件的完整路径和名称。 我将进程标识符表示为 。 请尝试以下操作:

  1. 如果文件 /proc/self/exefile 存在,则其内容就是请求的信息。
  2. 如果文件 /proc//exefile 存在,则其内容就是所请求的信息。
  3. 如果文件 /proc/self/as 存在,则:
    1. open() 文件。
    2. 分配至少 sizeof(procfs_debuginfo) + _POSIX_PATH_MAX 的缓冲区。
    3. 将该缓冲区作为 devctl(fd, DCMD_PROC_MAPDEBUG_BASE,... 的输入。
    4. 将缓冲区转换为 procfs_debuginfo*
    5. 请求的信息位于 procfs_debuginfo 结构的 path 字段。 警告:出于某种原因,有时,QNX 会省略文件路径的第一个斜杠 /。 需要时在前面加上 /
    6. 清理(关闭文件、释放缓冲区等)。
  4. 尝试使用文件 /proc/中的过程。 PID>/as
  5. 尝试 dladdr(dlsym(RTLD_DEFAULT, "main"), &dlinfo),其中 dlinfoDl_info > 结构,其 dli_fname 可能包含所请求的信息,

我希望这会有所帮助。

Depending on the version of QNX Neutrino, there are different ways to find the full path and name of the executable file that was used to start the running process. I denote the process identifier as <PID>. Try the following:

  1. If the file /proc/self/exefile exists, then its contents are the requested information.
  2. If the file /proc/<PID>/exefile exists, then its contents are the requested information.
  3. If the file /proc/self/as exists, then:
    1. open() the file.
    2. Allocate a buffer of, at least, sizeof(procfs_debuginfo) + _POSIX_PATH_MAX.
    3. Give that buffer as input to devctl(fd, DCMD_PROC_MAPDEBUG_BASE,....
    4. Cast the buffer to a procfs_debuginfo*.
    5. The requested information is at the path field of the procfs_debuginfo structure. Warning: For some reason, sometimes, QNX omits the first slash / of the file path. Prepend that / when needed.
    6. Clean up (close the file, free the buffer, etc.).
  4. Try the procedure in 3. with the file /proc/<PID>/as.
  5. Try dladdr(dlsym(RTLD_DEFAULT, "main"), &dlinfo) where dlinfo is a Dl_info structure whose dli_fname might contain the requested information.

I hope this helps.

oО清风挽发oО 2024-08-01 21:40:53

除了 mark4o 的回答FreeBSD 也有

const char* getprogname(void)

它应该也可以在 macOS 中使用。 它可以通过 libbsd 在 GNU/Linux 中使用。

In addition to mark4o's answer, FreeBSD also has

const char* getprogname(void)

It should also be available in macOS. It's available in GNU/Linux through libbsd.

西瑶 2024-08-01 21:40:53

AFAIK,没有这样的方法。 而且还存在一个歧义:如果同一个可执行文件有多个硬链接“指向”它,您希望得到什么答案? (硬链接实际上并不“指向”,它们是同一个文件,只是位于文件系统层次结构中的另一个位置。)

一旦 execve() 成功执行新的二进制文件,有关原始程序参数的所有信息都会丢失。

AFAIK, no such way. And there is also an ambiguity: what would you like to get as the answer if the same executable has multiple hard-links "pointing" to it? (Hard-links don't actually "point", they are the same file, just at another place in the file system hierarchy.)

Once execve() successfully executes a new binary, all information about the arguments to the original program is lost.

你的心境我的脸 2024-08-01 21:40:53

当然,这并不适用于所有项目。
尽管如此,在 6 年的 C++/Qt 开发过程中,QCoreApplication::applicationFilePath() 从未让我失望过。

当然,在尝试使用它之前应该仔细阅读文档:

警告:在 Linux 上,此函数将尝试从
/proc 文件系统。 如果失败,则假定 argv[0] 包含
可执行文件的绝对文件名。 该函数还假设
当前目录尚未被应用程序更改。

老实说,我认为 #ifdef 和其他类似的解决方案根本不应该在现代代码中使用。

我确信还存在更小的跨平台库。 让他们将所有特定于平台的东西封装在里面。

Well, of course, this doesn't apply to all projects.
Still, QCoreApplication::applicationFilePath() never failed me in 6 years of C++/Qt development.

Of course, one should read documentation thoroughly before attempting to use it:

Warning: On Linux, this function will try to get the path from the
/proc file system. If that fails, it assumes that argv[0] contains the
absolute file name of the executable. The function also assumes that
the current directory has not been changed by the application.

To be honest, I think that #ifdef and other solutions like that shouldn't be used in modern code at all.

I'm sure that smaller cross-platform libraries also exist. Let them encapsulate all that platform-specific stuff inside.

酒与心事 2024-08-01 21:40:53

如果您正在编写 GPLed 代码并使用 GNU 自动工具,那么 gnulib 的 relocatable-prog 模块。

If you're writing GPLed code and using GNU autotools, then a portable way that takes care of the details on many OSes (including Windows and macOS) is gnulib's relocatable-prog module.

清风夜微凉 2024-08-01 21:40:53

您可以使用argv[0]并分析PATH环境变量。
请参阅:示例一个可以找到自己的程序

You can use argv[0] and analyze the PATH environment variable.
Look at : A sample of a program that can find itself

青春有你 2024-08-01 21:40:53

只是我的两分钱。 使用此代码可以通过跨平台接口在C/C++中找到当前应用程序的目录。

void getExecutablePath(char ** path, unsigned int * pathLength)
{
    // Early exit when invalid out-parameters are passed
    if (!checkStringOutParameter(path, pathLength))
    {
        return;
    }

#if defined SYSTEM_LINUX

    // Preallocate PATH_MAX (e.g., 4096) characters and hope the executable path isn't longer (including null byte)
    char exePath[PATH_MAX];

    // Return written bytes, indicating if memory was sufficient
    int len = readlink("/proc/self/exe", exePath, PATH_MAX);

    if (len <= 0 || len == PATH_MAX) // memory not sufficient or general error occured
    {
        invalidateStringOutParameter(path, pathLength);
        return;
    }

    // Copy contents to caller, create caller ownership
    copyToStringOutParameter(exePath, len, path, pathLength);

#elif defined SYSTEM_WINDOWS

    // Preallocate MAX_PATH (e.g., 4095) characters and hope the executable path isn't longer (including null byte)
    char exePath[MAX_PATH];

    // Return written bytes, indicating if memory was sufficient
    unsigned int len = GetModuleFileNameA(GetModuleHandleA(0x0), exePath, MAX_PATH);
    if (len == 0) // memory not sufficient or general error occured
    {
        invalidateStringOutParameter(path, pathLength);
        return;
    }

    // Copy contents to caller, create caller ownership
    copyToStringOutParameter(exePath, len, path, pathLength);

#elif defined SYSTEM_SOLARIS

    // Preallocate PATH_MAX (e.g., 4096) characters and hope the executable path isn't longer (including null byte)
    char exePath[PATH_MAX];

    // Convert executable path to canonical path, return null pointer on error
    if (realpath(getexecname(), exePath) == 0x0)
    {
        invalidateStringOutParameter(path, pathLength);
        return;
    }

    // Copy contents to caller, create caller ownership
    unsigned int len = strlen(exePath);
    copyToStringOutParameter(exePath, len, path, pathLength);

#elif defined SYSTEM_DARWIN

    // Preallocate PATH_MAX (e.g., 4096) characters and hope the executable path isn't longer (including null byte)
    char exePath[PATH_MAX];

    unsigned int len = (unsigned int)PATH_MAX;

    // Obtain executable path to canonical path, return zero on success
    if (_NSGetExecutablePath(exePath, &len) == 0)
    {
        // Convert executable path to canonical path, return null pointer on error
        char * realPath = realpath(exePath, 0x0);

        if (realPath == 0x0)
        {
            invalidateStringOutParameter(path, pathLength);
            return;
        }

        // Copy contents to caller, create caller ownership
        unsigned int len = strlen(realPath);
        copyToStringOutParameter(realPath, len, path, pathLength);

        free(realPath);
    }
    else // len is initialized with the required number of bytes (including zero byte)
    {
        char * intermediatePath = (char *)malloc(sizeof(char) * len);

        // Convert executable path to canonical path, return null pointer on error
        if (_NSGetExecutablePath(intermediatePath, &len) != 0)
        {
            free(intermediatePath);
            invalidateStringOutParameter(path, pathLength);
            return;
        }

        char * realPath = realpath(intermediatePath, 0x0);

        free(intermediatePath);

        // Check if conversion to canonical path succeeded
        if (realPath == 0x0)
        {
            invalidateStringOutParameter(path, pathLength);
            return;
        }

        // Copy contents to caller, create caller ownership
        unsigned int len = strlen(realPath);
        copyToStringOutParameter(realPath, len, path, pathLength);

        free(realPath);
    }

#elif defined SYSTEM_FREEBSD

    // Preallocate characters and hope the executable path isn't longer (including null byte)
    char exePath[2048];

    unsigned int len = 2048;

    int mib[] = { CTL_KERN, KERN_PROC, KERN_PROC_PATHNAME, -1 };

    // Obtain executable path by syscall
    if (sysctl(mib, 4, exePath, &len, 0x0, 0) != 0)
    {
        invalidateStringOutParameter(path, pathLength);
        return;
    }

    // Copy contents to caller, create caller ownership
    copyToStringOutParameter(exePath, len, path, pathLength);

#else

    // If no OS could be detected ... degrade gracefully
    invalidateStringOutParameter(path, pathLength);

#endif
}

您可以在这里详细查看

Just my two cents. You can find the current application's directory in C/C++ with cross-platform interfaces by using this code.

void getExecutablePath(char ** path, unsigned int * pathLength)
{
    // Early exit when invalid out-parameters are passed
    if (!checkStringOutParameter(path, pathLength))
    {
        return;
    }

#if defined SYSTEM_LINUX

    // Preallocate PATH_MAX (e.g., 4096) characters and hope the executable path isn't longer (including null byte)
    char exePath[PATH_MAX];

    // Return written bytes, indicating if memory was sufficient
    int len = readlink("/proc/self/exe", exePath, PATH_MAX);

    if (len <= 0 || len == PATH_MAX) // memory not sufficient or general error occured
    {
        invalidateStringOutParameter(path, pathLength);
        return;
    }

    // Copy contents to caller, create caller ownership
    copyToStringOutParameter(exePath, len, path, pathLength);

#elif defined SYSTEM_WINDOWS

    // Preallocate MAX_PATH (e.g., 4095) characters and hope the executable path isn't longer (including null byte)
    char exePath[MAX_PATH];

    // Return written bytes, indicating if memory was sufficient
    unsigned int len = GetModuleFileNameA(GetModuleHandleA(0x0), exePath, MAX_PATH);
    if (len == 0) // memory not sufficient or general error occured
    {
        invalidateStringOutParameter(path, pathLength);
        return;
    }

    // Copy contents to caller, create caller ownership
    copyToStringOutParameter(exePath, len, path, pathLength);

#elif defined SYSTEM_SOLARIS

    // Preallocate PATH_MAX (e.g., 4096) characters and hope the executable path isn't longer (including null byte)
    char exePath[PATH_MAX];

    // Convert executable path to canonical path, return null pointer on error
    if (realpath(getexecname(), exePath) == 0x0)
    {
        invalidateStringOutParameter(path, pathLength);
        return;
    }

    // Copy contents to caller, create caller ownership
    unsigned int len = strlen(exePath);
    copyToStringOutParameter(exePath, len, path, pathLength);

#elif defined SYSTEM_DARWIN

    // Preallocate PATH_MAX (e.g., 4096) characters and hope the executable path isn't longer (including null byte)
    char exePath[PATH_MAX];

    unsigned int len = (unsigned int)PATH_MAX;

    // Obtain executable path to canonical path, return zero on success
    if (_NSGetExecutablePath(exePath, &len) == 0)
    {
        // Convert executable path to canonical path, return null pointer on error
        char * realPath = realpath(exePath, 0x0);

        if (realPath == 0x0)
        {
            invalidateStringOutParameter(path, pathLength);
            return;
        }

        // Copy contents to caller, create caller ownership
        unsigned int len = strlen(realPath);
        copyToStringOutParameter(realPath, len, path, pathLength);

        free(realPath);
    }
    else // len is initialized with the required number of bytes (including zero byte)
    {
        char * intermediatePath = (char *)malloc(sizeof(char) * len);

        // Convert executable path to canonical path, return null pointer on error
        if (_NSGetExecutablePath(intermediatePath, &len) != 0)
        {
            free(intermediatePath);
            invalidateStringOutParameter(path, pathLength);
            return;
        }

        char * realPath = realpath(intermediatePath, 0x0);

        free(intermediatePath);

        // Check if conversion to canonical path succeeded
        if (realPath == 0x0)
        {
            invalidateStringOutParameter(path, pathLength);
            return;
        }

        // Copy contents to caller, create caller ownership
        unsigned int len = strlen(realPath);
        copyToStringOutParameter(realPath, len, path, pathLength);

        free(realPath);
    }

#elif defined SYSTEM_FREEBSD

    // Preallocate characters and hope the executable path isn't longer (including null byte)
    char exePath[2048];

    unsigned int len = 2048;

    int mib[] = { CTL_KERN, KERN_PROC, KERN_PROC_PATHNAME, -1 };

    // Obtain executable path by syscall
    if (sysctl(mib, 4, exePath, &len, 0x0, 0) != 0)
    {
        invalidateStringOutParameter(path, pathLength);
        return;
    }

    // Copy contents to caller, create caller ownership
    copyToStringOutParameter(exePath, len, path, pathLength);

#else

    // If no OS could be detected ... degrade gracefully
    invalidateStringOutParameter(path, pathLength);

#endif
}

You can take a look in detail here.

海螺姑娘 2024-08-01 21:40:53

但是我想知道是否有一种方便的方法可以通过跨平台接口在C/C++中查找当前应用程序的目录。

你不能这样做(至少在 Linux 上),

因为可执行文件可以在执行 进程< /a> 运行它,重命名(2) 其文件(同一文件系统的)不同目录的路径。 另请参阅 syscalls(2)inode(7)

在 Linux 上,可执行文件甚至可以(原则上) remove(3)< /a> 通过调用 unlink(2) 本身。 Linux 内核 应该保留分配的文件,直到没有进程再引用它。 使用 proc(5) 你可以做一些奇怪的事情(例如重命名(2) /proc/self /exe 文件等...)

换句话说,在 Linux 上,“当前应用程序目录”的概念没有任何意义。

另请阅读高级Linux编程操作系统:三个简单的部分了解更多信息。

还可以在 OSDEV 上查找多个开源操作系统(包括 FreeBSDGNU Hurd) 。 其中一些提供了接近 POSIX 接口 (API)。

考虑使用(经许可)跨平台 C++ 框架,例如 QtPOCO,也许可以通过将它们移植到您最喜欢的操作系统来为它们做出贡献。

But I'd like to know if there is a convenient way to find the current application's directory in C/C++ with cross-platform interfaces.

You cannot do that (at least on Linux)

Since an executable could, during execution of a process running it, rename(2) its file path to a different directory (of the same file system). See also syscalls(2) and inode(7).

On Linux, an executable could even (in principle) remove(3) itself by calling unlink(2). The Linux kernel should then keep the file allocated till no process references it anymore. With proc(5) you could do weird things (e.g. rename(2) that /proc/self/exe file, etc...)

In other words, on Linux, the notion of a "current application's directory" does not make any sense.

Read also Advanced Linux Programming and Operating Systems: Three Easy Pieces for more.

Look also on OSDEV for several open source operating systems (including FreeBSD or GNU Hurd). Several of them provide an interface (API) close to POSIX ones.

Consider using (with permission) cross-platform C++ frameworks like Qt or POCO, perhaps contributing to them by porting them to your favorite OS.

北城孤痞 2024-08-01 21:40:53

问题可能是您想要实现的目标。

例如,如果您编写的游戏访问与其自身相关或附加的游戏文件,并且您仅针对 Windows、Mac 和 Linux,那么使用这些系统提供的 API 就足够了。

如果您也想支持其他 Unices,您可以创建一个小的包装 shell 脚本,它可以执行诸如解析脚本路径和 cd 到其中或设置指向数据的环境变量等操作。在 OpenBSD 等系统上工作。

如果您不打算稍后修改数据,也可以将数据嵌入到源代码中。 使用 xxd -i myfile.bin 创建头文件或使用 Qt 资源编译器。

The question is probably what you are trying to achieve.

For example, if you write a game that accesses game files relative to or attached to itself and you're targeting only Windows, Mac and Linux, then using the APIs provided by these systems should be sufficient.

If you want to support other Unices as well you could create a small wrapper shell script that does stuff like resolving the script path and cd into it or setup environment variables pointing to the data, etc. That should also work on systems like OpenBSD.

You could also embed data into the source code if you don't plan to modify it later. Create a header file with xxd -i myfile.bin or use the Qt resource compiler.

我们的影子 2024-08-01 21:40:53

我没有在编写它的标准中找到它,但在 C++17 及更高版本中,这在我测试的所有平台上都有效:

std::filesystem::canonical(argv[0])

即使 argv[0] 包含不同的内容,具体取决于该平台一旦规范化,就可以正常工作。 它给出了可执行文件的绝对路径,无论 cwd 是什么。

I have not found it in the standard where it is written to work, but in C++17 and onward, this works on all platforms I tested:

std::filesystem::canonical(argv[0])

Even though argv[0] contains different things depending on the platform, when canonicalized, it just works. It gives the absolute path to the executable, no matter the cwd.

阿楠 2024-08-01 21:40:52

一些特定于操作系统的接口:

还有第三方库可用于获取此信息,例如prideout中提到的whereami回答,或者如果您使用的是 Qt,则 QCoreApplication::applicationFilePath() 正如评论中提到的。

可移植(但不太可靠)的方法是使用argv[0]。 尽管调用程序可以将其设置为任何内容,但按照惯例,它被设置为可执行文件的路径名或使用 $PATH 找到的名称。

某些 shell,包括 bash 和 ksh,设置环境变量“_" 到可执行文件执行之前的完整路径。 在这种情况下,您可以使用 getenv("_") 来获取它。 然而,这是不可靠的,因为并非所有 shell 都这样做,并且它可以设置为任何值,或者是父进程留下的,而父进程在执行程序之前没有更改它。

Some OS-specific interfaces:

There are also third party libraries that can be used to get this information, such as whereami as mentioned in prideout's answer, or if you are using Qt, QCoreApplication::applicationFilePath() as mentioned in the comments.

The portable (but less reliable) method is to use argv[0]. Although it could be set to anything by the calling program, by convention it is set to either a path name of the executable or a name that was found using $PATH.

Some shells, including bash and ksh, set the environment variable "_" to the full path of the executable before it is executed. In that case you can use getenv("_") to get it. However this is unreliable because not all shells do this, and it could be set to anything or be left over from a parent process which did not change it before executing your program.

带上头具痛哭 2024-08-01 21:40:52

/proc/self/exe 的使用是不可移植且不可靠的。 在我的 Ubuntu 12.04 系统上,您必须是 root 才能读取/遵循符号链接。 这将使 Boost 示例以及发布的 whereami() 解决方案失败。

这篇文章很长,但讨论了实际问题并提供了实际与测试套件验证一起工作的代码。

找到您的程序的最佳方法是追溯系统使用的相同步骤。 这是通过使用针对文件系统根、密码、路径环境解析的 argv[0] 并考虑符号链接和路径名规范化来完成的。 这是凭记忆写的,但我过去已经成功地做到了这一点,并在各种不同的情况下进行了测试。 它不能保证有效,但如果不能,您可能会遇到更大的问题,并且总体上它比所讨论的任何其他方法都更可靠。 在 Unix 兼容系统上存在这样的情况:正确处理 argv[0] 不会让您进入程序,但您却在一个可证明损坏的环境中执行。 它也相当可移植到 1970 年左右以来的所有 Unix 派生系统,甚至一些非 Unix 派生系统,因为它基本上依赖于 libc() 标准功能和标准命令行功能。 它应该适用于 Linux(所有版本)、Android、Chrome OS、Minix、原始版本贝尔实验室 Unix、FreeBSDNetBSD, OpenBSD, BSD xx、SunOS、Solaris、SYSV、HP-UX、Concentrix、SCO、Darwin、AIX、OS X、NeXTSTEP 等。稍微修改一下可能VMS、VM/CMS、DOS/Windows、ReactOS< /a>、OS/2 等。如果直接启动程序从 GUI 环境中,它应该将 argv[0] 设置为绝对路径。

要知道,几乎所有已发布的 Unix 兼容操作系统上的每个 shell 基本上都以相同的方式查找程序并以几乎相同的方式设置操作环境(带有一些可选的附加功能)。 启动一个程序的任何其他程序都应该为该程序创建相同的环境(argv、环境字符串等),就好像它是从 shell 运行一样,并带有一些可选的附加功能。 程序或用户可以为其启动的其他从属程序设置一个偏离此约定的环境,但如果这样做,则这是一个错误,并且该程序没有合理的期望从属程序或其从属程序将正常运行。

argv[0] 的可能值包括:

  • /path/to/executable — 绝对路径
  • ../bin/executable — 相对于 pwd
  • < code>bin/executable — 相对于 pwd
  • ./foo — 相对于 pwd
  • executable — 基本名称,在路径中查找
  • bin//executable code> — 相对于 pwd,非规范
  • src/../bin/executable — 相对于 pwd,非规范,回溯
  • bin/./echoargc — 相对于 非规范

pwd,您不应该看到的

  • 值: ~/bin/executable — 在程序运行之前重写。
  • ~user/bin/executable — 在程序运行之前重写
  • alias — 在程序运行之前重写
  • $shellvariable — 在程序运行之前重写
  • *foo* — 通配符,在程序运行之前重写,不是很有用
  • ?foo? — 通配符,在程序运行之前重写,不是很有用

另外,这些可能包含非 -规范路径名和多层符号链接。 在某些情况下,同一个程序可能有多个硬链接。 例如,/bin/ls/bin/ps/bin/chmod/bin/rm等可能是到 /bin/busybox 的硬链接。

要找到自己,请按照以下步骤操作:

  • 在程序入口(或库初始化)时保存 pwd、PATH 和 argv[0],因为它们稍后可能会更改。

  • 可选:特别是对于非 Unix 系统,单独但不丢弃路径名主机/用户/驱动器前缀部分(如果存在); 通常位于冒号之前或开头的“//”之后的部分。

  • 如果argv[0]是绝对路径,则使用它作为起点。 绝对路径可能以“/”开头,但在某些非 Unix 系统上,它可能以“”或驱动器号或名称前缀后跟冒号开头。

  • 否则如果argv[0]是相对路径(包含“/”或“”但不以此开头,例如“../../bin/foo”,然后组合 pwd+"/"+argv[0] (使用程序启动时的当前工作目录,而不是当前目录)。

  • 如果 argv[0] 是普通基本名称(无斜杠),则将其与 PATH 环境变量中的每个条目合并。依次尝试这些并使用第一个成功的。

  • 可选:或者尝试特定于平台的 /proc/self/exe/proc/curproc/file (BSD) 和 (char *) getauxval(AT_EXECFN)dlgetname(...)(如果存在)。 您甚至可以在基于 argv[0] 的方法之前尝试这些方法(如果它们可用并且您没有遇到权限问题)。 在不太可能发生的情况下(当您考虑所有系统的所有版本时)它们存在并且不会失败,它们可能会更具权威性。

  • 可选:检查使用命令行参数传入的路径名。

  • 可选:检查包装脚本显式传入的环境中的路径名(如果有)。

  • 可选:作为最后的手段尝试环境变量“_”。 它可能指向完全不同的程序,例如用户 shell。

  • 解析符号链接,可能有多层。 存在无限循环的可能性,但如果它们存在,您的程序可能不会被调用。

  • 通过将“/foo/../bar/”等子字符串解析为“/bar/”来规范化文件名。 请注意,如果您跨越网络安装点,这可能会改变含义,因此规范化并不总是一件好事。 在网络服务器上,符号链接中的“..”可用于遍历服务器上下文中而不是客户端上的另一个文件的路径。 在这种情况下,您可能需要客户端上下文,因此规范化就可以了。 还将“/./”等模式转换为“/”,将“//”转换为“/”。
    在 shell 中,readlink --canonicalize 将解析多个符号链接并规范化名称。 Chase 可能会做类似的事情,但没有安装。 realpath()canonicalize_file_name()(如果存在)可能会有所帮助。

如果 realpath() 在编译时不存在,您可以从许可的库发行版借用一个副本,并自行编译它,而不是重新发明轮子。 如果您将使用小于 PATH_MAX 的缓冲区,请修复潜在的缓冲区溢出(传入 sizeof 输出缓冲区,想想 strncpy() 与 strcpy())。 使用重命名的私有副本可能比测试它是否存在更容易。 来自 android/darwin/bsd 的许可许可证副本:
https://android .googlesource.com/platform/bionic/+/f077784/libc/upstream-freebsd/lib/libc/stdlib/realpath.c

请注意,多次尝试可能会成功或部分成功,并且它们可能并不全部指向相同的可执行文件,因此请考虑验证您的可执行文件; 但是,您可能没有读取权限 - 如果您无法读取它,请不要将其视为失败。 或者验证可执行文件附近的某些内容,例如您尝试查找的“../lib/”目录。 您可能有多个版本,打包版本和本地编译版本,本地版本和网络版本,本地版本和USB驱动便携式版本等,并且有很小的可能性,您可能会从不同的定位方法中得到两个不兼容的结果。 而“_”可能只是指向错误的程序。

使用 execve 的程序可以故意将 argv[0] 设置为与用于加载程序的实际路径不兼容并损坏 PATH、“_”、pwd 等。通常没有太多理由这样做; 但是,如果您的代码存在漏洞,而该代码忽略了执行环境可以通过多种方式更改这一事实,包括但不限于此(chroot、fuse 文件系统、硬链接等),则这可能会产生安全隐患。 shell 命令设置 PATH 但无法导出它。

您不一定需要为非 Unix 系统编写代码,但最好了解一些特殊性,这样您就可以以一种以后移植起来不那么困难的方式编写代码。 请注意,某些系统(DEC VMS、DOS、URL 等)可能具有以冒号结尾的驱动器名称或其他前缀,例如“C:”​​、“sys$drive:[foo]bar”和“file: ///foo/bar/baz”。 旧的 DEC VMS 系统使用“[”和“]”来括起路径的目录部分,但如果您的程序是在 POSIX 环境中编译的,这可能会发生变化。 某些系统(例如 VMS)可能有文件版本(末尾用分号分隔)。 某些系统使用两个连续的斜杠,如“//drive/path/to/file”或“user@host:/path/to/file”(scp 命令)或“file://hostname/path/to/file” (网址)。 在某些情况下(DOS 和 Windows),PATH 可能有不同的分隔符 — “;” 与“:”和“”对比“/”作为路径分隔符。 在 csh/tsh 中,有“path”(用空格分隔)和“PATH”用冒号分隔,但您的程序应该接收 PATH,因此您无需担心路径。 DOS 和其他一些系统可以具有以驱动器前缀开头的相对路径。 C:foo.exe 引用驱动器 C 上当前目录中的 foo.exe,因此您需要查找 C: 上的当前目录并将其用于 pwd。

我的系统上的符号链接和包装器的示例:

/usr/bin/google-chrome is symlink to
/etc/alternatives/google-chrome  which is symlink to
/usr/bin/google-chrome-stable which is symlink to
/opt/google/chrome/google-chrome which is a bash script which runs
/opt/google/chome/chrome

请注意,用户 bill 在上面发布了一个指向 HP 程序的链接,该程序处理 argv[0] 的三种基本情况。 不过,它需要一些更改:

  • 需要重写所有 strcat()strcpy() 才能使用 strncat() 和 <代码>strncpy()。 即使变量声明的长度为 PATHMAX,长度为 PATHMAX-1 的输入值加上连接字符串的长度仍大于 1。 PATHMAX 和长度为 PATHMAX 的输入值将不终止。
  • 它需要被重写为库函数,而不仅仅是打印结果。
  • 它无法规范化名称(使用我上面链接到的真实路径代码)
  • 它无法解析符号链接(使用真实路径代码)

因此,如果您将 HP 代码和真实路径代码结合起来并修复两者以防止缓冲区溢出,那么你应该有一些可以正确解释argv[0]的东西。

下面说明了在 Ubuntu 12.04 上调用同一程序的各种方式的 argv[0] 的实际值。 是的,该程序被意外命名为 echoargc 而不是 echoargv。 这是使用脚本进行干净复制来完成的,但在 shell 中手动执行会得到相同的结果(除非别名在脚本中不起作用,除非您明确启用它们)。

cat ~/src/echoargc.c
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
main(int argc, char **argv)
{
  printf("  argv[0]=\"%s\"\n", argv[0]);
  sleep(1);  /* in case run from desktop */
}
tcc -o ~/bin/echoargc ~/src/echoargc.c
cd ~
/home/whitis/bin/echoargc
  argv[0]="/home/whitis/bin/echoargc"
echoargc
  argv[0]="echoargc"
bin/echoargc
  argv[0]="bin/echoargc"
bin//echoargc
  argv[0]="bin//echoargc"
bin/./echoargc
  argv[0]="bin/./echoargc"
src/../bin/echoargc
  argv[0]="src/../bin/echoargc"
cd ~/bin
*echo*
  argv[0]="echoargc"
e?hoargc
  argv[0]="echoargc"
./echoargc
  argv[0]="./echoargc"
cd ~/src
../bin/echoargc
  argv[0]="../bin/echoargc"
cd ~/junk
~/bin/echoargc
  argv[0]="/home/whitis/bin/echoargc"
~whitis/bin/echoargc
  argv[0]="/home/whitis/bin/echoargc"
alias echoit=~/bin/echoargc
echoit
  argv[0]="/home/whitis/bin/echoargc"
echoarg=~/bin/echoargc
$echoarg
  argv[0]="/home/whitis/bin/echoargc"
ln -s ~/bin/echoargc junk1
./junk1
  argv[0]="./junk1"
ln -s /home/whitis/bin/echoargc junk2
./junk2
  argv[0]="./junk2"
ln -s junk1 junk3
./junk3
  argv[0]="./junk3"


gnome-desktop-item-edit --create-new ~/Desktop
# interactive, create desktop link, then click on it
  argv[0]="/home/whitis/bin/echoargc"
# interactive, right click on gnome application menu, pick edit menus
# add menu item for echoargc, then run it from gnome menu
 argv[0]="/home/whitis/bin/echoargc"

 cat ./testargcscript 2>&1 | sed -e 's/^/    /g'
#!/bin/bash
# echoargc is in ~/bin/echoargc
# bin is in path
shopt -s expand_aliases
set -v
cat ~/src/echoargc.c
tcc -o ~/bin/echoargc ~/src/echoargc.c
cd ~
/home/whitis/bin/echoargc
echoargc
bin/echoargc
bin//echoargc
bin/./echoargc
src/../bin/echoargc
cd ~/bin
*echo*
e?hoargc
./echoargc
cd ~/src
../bin/echoargc
cd ~/junk
~/bin/echoargc
~whitis/bin/echoargc
alias echoit=~/bin/echoargc
echoit
echoarg=~/bin/echoargc
$echoarg
ln -s ~/bin/echoargc junk1
./junk1
ln -s /home/whitis/bin/echoargc junk2
./junk2
ln -s junk1 junk3
./junk3

这些示例说明了本文中描述的技术应该适用于各种情况,以及为什么某些步骤是必要的。

编辑:现在,打印 argv[0] 的程序已更新为实际找到自身。

// Copyright 2015 by Mark Whitis.  License=MIT style
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <limits.h>
#include <assert.h>
#include <string.h>
#include <errno.h>

// "look deep into yourself, Clarice"  -- Hanibal Lector
char findyourself_save_pwd[PATH_MAX];
char findyourself_save_argv0[PATH_MAX];
char findyourself_save_path[PATH_MAX];
char findyourself_path_separator='/';
char findyourself_path_separator_as_string[2]="/";
char findyourself_path_list_separator[8]=":";  // could be ":; "
char findyourself_debug=0;

int findyourself_initialized=0;

void findyourself_init(char *argv0)
{

  getcwd(findyourself_save_pwd, sizeof(findyourself_save_pwd));

  strncpy(findyourself_save_argv0, argv0, sizeof(findyourself_save_argv0));
  findyourself_save_argv0[sizeof(findyourself_save_argv0)-1]=0;

  strncpy(findyourself_save_path, getenv("PATH"), sizeof(findyourself_save_path));
  findyourself_save_path[sizeof(findyourself_save_path)-1]=0;
  findyourself_initialized=1;
}


int find_yourself(char *result, size_t size_of_result)
{
  char newpath[PATH_MAX+256];
  char newpath2[PATH_MAX+256];

  assert(findyourself_initialized);
  result[0]=0;

  if(findyourself_save_argv0[0]==findyourself_path_separator) {
    if(findyourself_debug) printf("  absolute path\n");
     realpath(findyourself_save_argv0, newpath);
     if(findyourself_debug) printf("  newpath=\"%s\"\n", newpath);
     if(!access(newpath, F_OK)) {
        strncpy(result, newpath, size_of_result);
        result[size_of_result-1]=0;
        return(0);
     } else {
    perror("access failed 1");
      }
  } else if( strchr(findyourself_save_argv0, findyourself_path_separator )) {
    if(findyourself_debug) printf("  relative path to pwd\n");
    strncpy(newpath2, findyourself_save_pwd, sizeof(newpath2));
    newpath2[sizeof(newpath2)-1]=0;
    strncat(newpath2, findyourself_path_separator_as_string, sizeof(newpath2));
    newpath2[sizeof(newpath2)-1]=0;
    strncat(newpath2, findyourself_save_argv0, sizeof(newpath2));
    newpath2[sizeof(newpath2)-1]=0;
    realpath(newpath2, newpath);
    if(findyourself_debug) printf("  newpath=\"%s\"\n", newpath);
    if(!access(newpath, F_OK)) {
        strncpy(result, newpath, size_of_result);
        result[size_of_result-1]=0;
        return(0);
     } else {
    perror("access failed 2");
      }
  } else {
    if(findyourself_debug) printf("  searching $PATH\n");
    char *saveptr;
    char *pathitem;
    for(pathitem=strtok_r(findyourself_save_path, findyourself_path_list_separator,  &saveptr); pathitem; pathitem=strtok_r(NULL, findyourself_path_list_separator, &saveptr) ) {
       if(findyourself_debug>=2) printf("pathitem=\"%s\"\n", pathitem);
       strncpy(newpath2, pathitem, sizeof(newpath2));
       newpath2[sizeof(newpath2)-1]=0;
       strncat(newpath2, findyourself_path_separator_as_string, sizeof(newpath2));
       newpath2[sizeof(newpath2)-1]=0;
       strncat(newpath2, findyourself_save_argv0, sizeof(newpath2));
       newpath2[sizeof(newpath2)-1]=0;
       realpath(newpath2, newpath);
       if(findyourself_debug) printf("  newpath=\"%s\"\n", newpath);
      if(!access(newpath, F_OK)) {
          strncpy(result, newpath, size_of_result);
          result[size_of_result-1]=0;
          return(0);
      }
    } // end for
    perror("access failed 3");

  } // end else
  // if we get here, we have tried all three methods on argv[0] and still haven't succeeded.   Include fallback methods here.
  return(1);
}

main(int argc, char **argv)
{
  findyourself_init(argv[0]);

  char newpath[PATH_MAX];
  printf("  argv[0]=\"%s\"\n", argv[0]);
  realpath(argv[0], newpath);
  if(strcmp(argv[0],newpath)) { printf("  realpath=\"%s\"\n", newpath); }
  find_yourself(newpath, sizeof(newpath));
  if(1 || strcmp(argv[0],newpath)) { printf("  findyourself=\"%s\"\n", newpath); }
  sleep(1);  /* in case run from desktop */
}

这是输出,表明在之前的每一项测试中它确实找到了自己。

tcc -o ~/bin/echoargc ~/src/echoargc.c
cd ~
/home/whitis/bin/echoargc
  argv[0]="/home/whitis/bin/echoargc"
  findyourself="/home/whitis/bin/echoargc"
echoargc
  argv[0]="echoargc"
  realpath="/home/whitis/echoargc"
  findyourself="/home/whitis/bin/echoargc"
bin/echoargc
  argv[0]="bin/echoargc"
  realpath="/home/whitis/bin/echoargc"
  findyourself="/home/whitis/bin/echoargc"
bin//echoargc
  argv[0]="bin//echoargc"
  realpath="/home/whitis/bin/echoargc"
  findyourself="/home/whitis/bin/echoargc"
bin/./echoargc
  argv[0]="bin/./echoargc"
  realpath="/home/whitis/bin/echoargc"
  findyourself="/home/whitis/bin/echoargc"
src/../bin/echoargc
  argv[0]="src/../bin/echoargc"
  realpath="/home/whitis/bin/echoargc"
  findyourself="/home/whitis/bin/echoargc"
cd ~/bin
*echo*
  argv[0]="echoargc"
  realpath="/home/whitis/bin/echoargc"
  findyourself="/home/whitis/bin/echoargc"
e?hoargc
  argv[0]="echoargc"
  realpath="/home/whitis/bin/echoargc"
  findyourself="/home/whitis/bin/echoargc"
./echoargc
  argv[0]="./echoargc"
  realpath="/home/whitis/bin/echoargc"
  findyourself="/home/whitis/bin/echoargc"
cd ~/src
../bin/echoargc
  argv[0]="../bin/echoargc"
  realpath="/home/whitis/bin/echoargc"
  findyourself="/home/whitis/bin/echoargc"
cd ~/junk
~/bin/echoargc
  argv[0]="/home/whitis/bin/echoargc"
  findyourself="/home/whitis/bin/echoargc"
~whitis/bin/echoargc
  argv[0]="/home/whitis/bin/echoargc"
  findyourself="/home/whitis/bin/echoargc"
alias echoit=~/bin/echoargc
echoit
  argv[0]="/home/whitis/bin/echoargc"
  findyourself="/home/whitis/bin/echoargc"
echoarg=~/bin/echoargc
$echoarg
  argv[0]="/home/whitis/bin/echoargc"
  findyourself="/home/whitis/bin/echoargc"
rm junk1 junk2 junk3
ln -s ~/bin/echoargc junk1
./junk1
  argv[0]="./junk1"
  realpath="/home/whitis/bin/echoargc"
  findyourself="/home/whitis/bin/echoargc"
ln -s /home/whitis/bin/echoargc junk2
./junk2
  argv[0]="./junk2"
  realpath="/home/whitis/bin/echoargc"
  findyourself="/home/whitis/bin/echoargc"
ln -s junk1 junk3
./junk3
  argv[0]="./junk3"
  realpath="/home/whitis/bin/echoargc"
  findyourself="/home/whitis/bin/echoargc"

上述两个 GUI 启动也可以正确找到该程序。

有一个潜在的陷阱。 如果程序在测试前设置了 setuid,则 access() 函数会删除权限。 如果存在可以以提升用户身份找到该程序但不能以普通用户身份找到该程序的情况,则这些测试可能会失败,尽管在这种情况下该程序实际上不太可能执行。 可以使用 euidaccess() 来代替。 但是,它可能比实际用户更早地在路径上发现无法访问的程序。

The use of /proc/self/exe is non-portable and unreliable. On my Ubuntu 12.04 system, you must be root to read/follow the symlink. This will make the Boost example and probably the whereami() solutions posted fail.

This post is very long but discusses the actual issues and presents code which actually works along with validation against a test suite.

The best way to find your program is to retrace the same steps the system uses. This is done by using argv[0] resolved against file system root, pwd, path environment and considering symlinks, and pathname canonicalization. This is from memory but I have done this in the past successfully and tested it in a variety of different situations. It is not guaranteed to work, but if it doesn't you probably have much bigger problems and it is more reliable overall than any of the other methods discussed. There are situations on a Unix compatible system in which proper handling of argv[0] will not get you to your program but then you are executing in a certifiably broken environment. It is also fairly portable to all Unix derived systems since around 1970 and even some non-Unix derived systems as it basically relies on libc() standard functionality and standard command line functionality. It should work on Linux (all versions), Android, Chrome OS, Minix, original Bell Labs Unix, FreeBSD, NetBSD, OpenBSD, BSD x.x, SunOS, Solaris, SYSV, HP-UX, Concentrix, SCO, Darwin, AIX, OS X, NeXTSTEP, etc. And with a little modification probably VMS, VM/CMS, DOS/Windows, ReactOS, OS/2, etc. If a program was launched directly from a GUI environment, it should have set argv[0] to an absolute path.

Understand that almost every shell on every Unix compatible operating system that has ever been released basically finds programs the same way and sets up the operating environment almost the same way (with some optional extras). And any other program that launches a program is expected to create the same environment (argv, environment strings, etc.) for that program as if it were run from a shell, with some optional extras. A program or user can setup an environment that deviates from this convention for other subordinate programs that it launches but if it does, this is a bug and the program has no reasonable expectation that the subordinate program or its subordinates will function correctly.

Possible values of argv[0] include:

  • /path/to/executable — absolute path
  • ../bin/executable — relative to pwd
  • bin/executable — relative to pwd
  • ./foo — relative to pwd
  • executable — basename, find in path
  • bin//executable — relative to pwd, non-canonical
  • src/../bin/executable — relative to pwd, non-canonical, backtracking
  • bin/./echoargc — relative to pwd, non-canonical

Values you should not see:

  • ~/bin/executable — rewritten before your program runs.
  • ~user/bin/executable — rewritten before your program runs
  • alias — rewritten before your program runs
  • $shellvariable — rewritten before your program runs
  • *foo* — wildcard, rewritten before your program runs, not very useful
  • ?foo? — wildcard, rewritten before your program runs, not very useful

In addition, these may contain non-canonical path names and multiple layers of symbolic links. In some cases, there may be multiple hard links to the same program. For example, /bin/ls, /bin/ps, /bin/chmod, /bin/rm, etc. may be hard links to /bin/busybox.

To find yourself, follow the steps below:

  • Save pwd, PATH, and argv[0] on entry to your program (or initialization of your library) as they may change later.

  • Optional: particularly for non-Unix systems, separate out but don't discard the pathname host/user/drive prefix part, if present; the part which often precedes a colon or follows an initial "//".

  • If argv[0] is an absolute path, use that as a starting point. An absolute path probably starts with "/" but on some non-Unix systems it might start with "" or a drive letter or name prefix followed by a colon.

  • Else if argv[0] is a relative path (contains "/" or "" but doesn't start with it, such as "../../bin/foo", then combine pwd+"/"+argv[0] (use present working directory from when program started, not current).

  • Else if argv[0] is a plain basename (no slashes), then combine it with each entry in PATH environment variable in turn and try those and use the first one which succeeds.

  • Optional: Else try the very platform specific /proc/self/exe, /proc/curproc/file (BSD), and (char *)getauxval(AT_EXECFN), and dlgetname(...) if present. You might even try these before argv[0]-based methods, if they are available and you don't encounter permission issues. In the somewhat unlikely event (when you consider all versions of all systems) that they are present and don't fail, they might be more authoritative.

  • Optional: check for a path name passed in using a command line parameter.

  • Optional: check for a pathname in the environment explicitly passed in by your wrapper script, if any.

  • Optional: As a last resort try environment variable "_". It might point to a different program entirely, such as the users shell.

  • Resolve symlinks, there may be multiple layers. There is the possibility of infinite loops, though if they exist your program probably won't get invoked.

  • Canonicalize filename by resolving substrings like "/foo/../bar/" to "/bar/". Note this may potentially change the meaning if you cross a network mount point, so canonization is not always a good thing. On a network server, ".." in symlink may be used to traverse a path to another file in the server context instead of on the client. In this case, you probably want the client context so canonicalization is ok. Also convert patterns like "/./" to "/" and "//" to "/".
    In shell, readlink --canonicalize will resolve multiple symlinks and canonicalize name. Chase may do similar but isn't installed. realpath() or canonicalize_file_name(), if present, may help.

If realpath() doesn't exist at compile time, you might borrow a copy from a permissively licensed library distribution, and compile it in yourself rather than reinventing the wheel. Fix the potential buffer overflow (pass in sizeof output buffer, think strncpy() vs strcpy()) if you will be using a buffer less than PATH_MAX. It may be easier just to use a renamed private copy rather than testing if it exists. Permissive license copy from android/darwin/bsd:
https://android.googlesource.com/platform/bionic/+/f077784/libc/upstream-freebsd/lib/libc/stdlib/realpath.c

Be aware that multiple attempts may be successful or partially successful and they might not all point to the same executable, so consider verifying your executable; however, you may not have read permission — if you can't read it, don't treat that as a failure. Or verify something in proximity to your executable such as the "../lib/" directory you are trying to find. You may have multiple versions, packaged and locally compiled versions, local and network versions, and local and USB-drive portable versions, etc. and there is a small possibility that you might get two incompatible results from different methods of locating. And "_" may simply point to the wrong program.

A program using execve can deliberately set argv[0] to be incompatible with the actual path used to load the program and corrupt PATH, "_", pwd, etc. though there isn't generally much reason to do so; but this could have security implications if you have vulnerable code that ignores the fact that your execution environment can be changed in variety of ways including, but not limited, to this one (chroot, fuse filesystem, hard links, etc.) It is possible for shell commands to set PATH but fail to export it.

You don't necessarily need to code for non-Unix systems but it would be a good idea to be aware of some of the peculiarities so you can write the code in such a way that it isn't as hard for someone to port later. Be aware that some systems (DEC VMS, DOS, URLs, etc.) might have drive names or other prefixes which end with a colon such as "C:", "sys$drive:[foo]bar", and "file:///foo/bar/baz". Old DEC VMS systems use "[" and "]" to enclose the directory portion of the path though this may have changed if your program is compiled in a POSIX environment. Some systems, such as VMS, may have a file version (separated by a semicolon at the end). Some systems use two consecutive slashes as in "//drive/path/to/file" or "user@host:/path/to/file" (scp command) or "file://hostname/path/to/file" (URL). In some cases (DOS and Windows), PATH might have different separator characters — ";" vs ":" and "" vs "/" for a path separator. In csh/tsh there is "path" (delimited with spaces) and "PATH" delimited with colons but your program should receive PATH so you don't need to worry about path. DOS and some other systems can have relative paths that start with a drive prefix. C:foo.exe refers to foo.exe in the current directory on drive C, so you do need to lookup current directory on C: and use that for pwd.

An example of symlinks and wrappers on my system:

/usr/bin/google-chrome is symlink to
/etc/alternatives/google-chrome  which is symlink to
/usr/bin/google-chrome-stable which is symlink to
/opt/google/chrome/google-chrome which is a bash script which runs
/opt/google/chome/chrome

Note that user bill posted a link above to a program at HP that handles the three basic cases of argv[0]. It needs some changes, though:

  • It will be necessary to rewrite all the strcat() and strcpy() to use strncat() and strncpy(). Even though the variables are declared of length PATHMAX, an input value of length PATHMAX-1 plus the length of concatenated strings is > PATHMAX and an input value of length PATHMAX would be unterminated.
  • It needs to be rewritten as a library function, rather than just to print out results.
  • It fails to canonicalize names (use the realpath code I linked to above)
  • It fails to resolve symbolic links (use the realpath code)

So, if you combine both the HP code and the realpath code and fix both to be resistant to buffer overflows, then you should have something which can properly interpret argv[0].

The following illustrates actual values of argv[0] for various ways of invoking the same program on Ubuntu 12.04. And yes, the program was accidentally named echoargc instead of echoargv. This was done using a script for clean copying but doing it manually in shell gets same results (except aliases don't work in script unless you explicitly enable them).

cat ~/src/echoargc.c
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
main(int argc, char **argv)
{
  printf("  argv[0]=\"%s\"\n", argv[0]);
  sleep(1);  /* in case run from desktop */
}
tcc -o ~/bin/echoargc ~/src/echoargc.c
cd ~
/home/whitis/bin/echoargc
  argv[0]="/home/whitis/bin/echoargc"
echoargc
  argv[0]="echoargc"
bin/echoargc
  argv[0]="bin/echoargc"
bin//echoargc
  argv[0]="bin//echoargc"
bin/./echoargc
  argv[0]="bin/./echoargc"
src/../bin/echoargc
  argv[0]="src/../bin/echoargc"
cd ~/bin
*echo*
  argv[0]="echoargc"
e?hoargc
  argv[0]="echoargc"
./echoargc
  argv[0]="./echoargc"
cd ~/src
../bin/echoargc
  argv[0]="../bin/echoargc"
cd ~/junk
~/bin/echoargc
  argv[0]="/home/whitis/bin/echoargc"
~whitis/bin/echoargc
  argv[0]="/home/whitis/bin/echoargc"
alias echoit=~/bin/echoargc
echoit
  argv[0]="/home/whitis/bin/echoargc"
echoarg=~/bin/echoargc
$echoarg
  argv[0]="/home/whitis/bin/echoargc"
ln -s ~/bin/echoargc junk1
./junk1
  argv[0]="./junk1"
ln -s /home/whitis/bin/echoargc junk2
./junk2
  argv[0]="./junk2"
ln -s junk1 junk3
./junk3
  argv[0]="./junk3"


gnome-desktop-item-edit --create-new ~/Desktop
# interactive, create desktop link, then click on it
  argv[0]="/home/whitis/bin/echoargc"
# interactive, right click on gnome application menu, pick edit menus
# add menu item for echoargc, then run it from gnome menu
 argv[0]="/home/whitis/bin/echoargc"

 cat ./testargcscript 2>&1 | sed -e 's/^/    /g'
#!/bin/bash
# echoargc is in ~/bin/echoargc
# bin is in path
shopt -s expand_aliases
set -v
cat ~/src/echoargc.c
tcc -o ~/bin/echoargc ~/src/echoargc.c
cd ~
/home/whitis/bin/echoargc
echoargc
bin/echoargc
bin//echoargc
bin/./echoargc
src/../bin/echoargc
cd ~/bin
*echo*
e?hoargc
./echoargc
cd ~/src
../bin/echoargc
cd ~/junk
~/bin/echoargc
~whitis/bin/echoargc
alias echoit=~/bin/echoargc
echoit
echoarg=~/bin/echoargc
$echoarg
ln -s ~/bin/echoargc junk1
./junk1
ln -s /home/whitis/bin/echoargc junk2
./junk2
ln -s junk1 junk3
./junk3

These examples illustrate that the techniques described in this post should work in a wide range of circumstances and why some of the steps are necessary.

EDIT: Now, the program that prints argv[0] has been updated to actually find itself.

// Copyright 2015 by Mark Whitis.  License=MIT style
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <limits.h>
#include <assert.h>
#include <string.h>
#include <errno.h>

// "look deep into yourself, Clarice"  -- Hanibal Lector
char findyourself_save_pwd[PATH_MAX];
char findyourself_save_argv0[PATH_MAX];
char findyourself_save_path[PATH_MAX];
char findyourself_path_separator='/';
char findyourself_path_separator_as_string[2]="/";
char findyourself_path_list_separator[8]=":";  // could be ":; "
char findyourself_debug=0;

int findyourself_initialized=0;

void findyourself_init(char *argv0)
{

  getcwd(findyourself_save_pwd, sizeof(findyourself_save_pwd));

  strncpy(findyourself_save_argv0, argv0, sizeof(findyourself_save_argv0));
  findyourself_save_argv0[sizeof(findyourself_save_argv0)-1]=0;

  strncpy(findyourself_save_path, getenv("PATH"), sizeof(findyourself_save_path));
  findyourself_save_path[sizeof(findyourself_save_path)-1]=0;
  findyourself_initialized=1;
}


int find_yourself(char *result, size_t size_of_result)
{
  char newpath[PATH_MAX+256];
  char newpath2[PATH_MAX+256];

  assert(findyourself_initialized);
  result[0]=0;

  if(findyourself_save_argv0[0]==findyourself_path_separator) {
    if(findyourself_debug) printf("  absolute path\n");
     realpath(findyourself_save_argv0, newpath);
     if(findyourself_debug) printf("  newpath=\"%s\"\n", newpath);
     if(!access(newpath, F_OK)) {
        strncpy(result, newpath, size_of_result);
        result[size_of_result-1]=0;
        return(0);
     } else {
    perror("access failed 1");
      }
  } else if( strchr(findyourself_save_argv0, findyourself_path_separator )) {
    if(findyourself_debug) printf("  relative path to pwd\n");
    strncpy(newpath2, findyourself_save_pwd, sizeof(newpath2));
    newpath2[sizeof(newpath2)-1]=0;
    strncat(newpath2, findyourself_path_separator_as_string, sizeof(newpath2));
    newpath2[sizeof(newpath2)-1]=0;
    strncat(newpath2, findyourself_save_argv0, sizeof(newpath2));
    newpath2[sizeof(newpath2)-1]=0;
    realpath(newpath2, newpath);
    if(findyourself_debug) printf("  newpath=\"%s\"\n", newpath);
    if(!access(newpath, F_OK)) {
        strncpy(result, newpath, size_of_result);
        result[size_of_result-1]=0;
        return(0);
     } else {
    perror("access failed 2");
      }
  } else {
    if(findyourself_debug) printf("  searching $PATH\n");
    char *saveptr;
    char *pathitem;
    for(pathitem=strtok_r(findyourself_save_path, findyourself_path_list_separator,  &saveptr); pathitem; pathitem=strtok_r(NULL, findyourself_path_list_separator, &saveptr) ) {
       if(findyourself_debug>=2) printf("pathitem=\"%s\"\n", pathitem);
       strncpy(newpath2, pathitem, sizeof(newpath2));
       newpath2[sizeof(newpath2)-1]=0;
       strncat(newpath2, findyourself_path_separator_as_string, sizeof(newpath2));
       newpath2[sizeof(newpath2)-1]=0;
       strncat(newpath2, findyourself_save_argv0, sizeof(newpath2));
       newpath2[sizeof(newpath2)-1]=0;
       realpath(newpath2, newpath);
       if(findyourself_debug) printf("  newpath=\"%s\"\n", newpath);
      if(!access(newpath, F_OK)) {
          strncpy(result, newpath, size_of_result);
          result[size_of_result-1]=0;
          return(0);
      }
    } // end for
    perror("access failed 3");

  } // end else
  // if we get here, we have tried all three methods on argv[0] and still haven't succeeded.   Include fallback methods here.
  return(1);
}

main(int argc, char **argv)
{
  findyourself_init(argv[0]);

  char newpath[PATH_MAX];
  printf("  argv[0]=\"%s\"\n", argv[0]);
  realpath(argv[0], newpath);
  if(strcmp(argv[0],newpath)) { printf("  realpath=\"%s\"\n", newpath); }
  find_yourself(newpath, sizeof(newpath));
  if(1 || strcmp(argv[0],newpath)) { printf("  findyourself=\"%s\"\n", newpath); }
  sleep(1);  /* in case run from desktop */
}

And here is the output which demonstrates that in every one of the previous tests it actually did find itself.

tcc -o ~/bin/echoargc ~/src/echoargc.c
cd ~
/home/whitis/bin/echoargc
  argv[0]="/home/whitis/bin/echoargc"
  findyourself="/home/whitis/bin/echoargc"
echoargc
  argv[0]="echoargc"
  realpath="/home/whitis/echoargc"
  findyourself="/home/whitis/bin/echoargc"
bin/echoargc
  argv[0]="bin/echoargc"
  realpath="/home/whitis/bin/echoargc"
  findyourself="/home/whitis/bin/echoargc"
bin//echoargc
  argv[0]="bin//echoargc"
  realpath="/home/whitis/bin/echoargc"
  findyourself="/home/whitis/bin/echoargc"
bin/./echoargc
  argv[0]="bin/./echoargc"
  realpath="/home/whitis/bin/echoargc"
  findyourself="/home/whitis/bin/echoargc"
src/../bin/echoargc
  argv[0]="src/../bin/echoargc"
  realpath="/home/whitis/bin/echoargc"
  findyourself="/home/whitis/bin/echoargc"
cd ~/bin
*echo*
  argv[0]="echoargc"
  realpath="/home/whitis/bin/echoargc"
  findyourself="/home/whitis/bin/echoargc"
e?hoargc
  argv[0]="echoargc"
  realpath="/home/whitis/bin/echoargc"
  findyourself="/home/whitis/bin/echoargc"
./echoargc
  argv[0]="./echoargc"
  realpath="/home/whitis/bin/echoargc"
  findyourself="/home/whitis/bin/echoargc"
cd ~/src
../bin/echoargc
  argv[0]="../bin/echoargc"
  realpath="/home/whitis/bin/echoargc"
  findyourself="/home/whitis/bin/echoargc"
cd ~/junk
~/bin/echoargc
  argv[0]="/home/whitis/bin/echoargc"
  findyourself="/home/whitis/bin/echoargc"
~whitis/bin/echoargc
  argv[0]="/home/whitis/bin/echoargc"
  findyourself="/home/whitis/bin/echoargc"
alias echoit=~/bin/echoargc
echoit
  argv[0]="/home/whitis/bin/echoargc"
  findyourself="/home/whitis/bin/echoargc"
echoarg=~/bin/echoargc
$echoarg
  argv[0]="/home/whitis/bin/echoargc"
  findyourself="/home/whitis/bin/echoargc"
rm junk1 junk2 junk3
ln -s ~/bin/echoargc junk1
./junk1
  argv[0]="./junk1"
  realpath="/home/whitis/bin/echoargc"
  findyourself="/home/whitis/bin/echoargc"
ln -s /home/whitis/bin/echoargc junk2
./junk2
  argv[0]="./junk2"
  realpath="/home/whitis/bin/echoargc"
  findyourself="/home/whitis/bin/echoargc"
ln -s junk1 junk3
./junk3
  argv[0]="./junk3"
  realpath="/home/whitis/bin/echoargc"
  findyourself="/home/whitis/bin/echoargc"

The two GUI launches described above also correctly find the program.

There is one potential pitfall. The access() function drops permissions if the program is setuid before testing. If there is a situation where the program can be found as an elevated user but not as a regular user, then there might be a situation where these tests would fail, although it is unlikely the program could actually be executed under those circumstances. One could use euidaccess() instead. It is possible, however, that it might find an inaccessable program earlier on path than the actual user could.

时光暖心i 2024-08-01 21:40:52

Gregory Pakosz 的 whereami在各种平台上实现了这一点,使用mark4o 的帖子中提到的 API。 如果您“只是”需要一个适用于便携式项目的解决方案并且对各种平台的特性不感兴趣,那么这是最有趣的。

在撰写本文时,支持的平台有:

  • Windows
  • Linux
  • Mac
  • iOS
  • Android
  • QNX Neutrino
  • FreeBSD
  • NetBSD
  • DragonFly BSD
  • SunOS

该库由 whereami.cwhereami.h 组成,并已获得许可在 MIT 和 WTFPL2 下。 将文件放入您的项目中,包含标头并使用它:

#include "whereami.h"

int main() {
  int length = wai_getExecutablePath(NULL, 0, NULL);
  char* path = (char*)malloc(length + 1);
  wai_getExecutablePath(path, length, &dirname_length);
  path[length] = '\0';

  printf("My path: %s", path);

  free(path);
  return 0;
}

The whereami library by Gregory Pakosz implements this for a variety of platforms, using the APIs mentioned in mark4o's post. This is most interesting if you "just" need a solution that works for a portable project and are not interested in the peculiarities of the various platforms.

At the time of writing, supported platforms are:

  • Windows
  • Linux
  • Mac
  • iOS
  • Android
  • QNX Neutrino
  • FreeBSD
  • NetBSD
  • DragonFly BSD
  • SunOS

The library consists of whereami.c and whereami.h and is licensed under MIT and WTFPL2. Drop the files into your project, include the header and use it:

#include "whereami.h"

int main() {
  int length = wai_getExecutablePath(NULL, 0, NULL);
  char* path = (char*)malloc(length + 1);
  wai_getExecutablePath(path, length, &dirname_length);
  path[length] = '\0';

  printf("My path: %s", path);

  free(path);
  return 0;
}
橙幽之幻 2024-08-01 21:40:52

Linux 上使用 /proc/self/exeargv[0] 的替代方法是使用 ELF 解释器传递的信息,由 glibc 提供,如下所示

#include <stdio.h>
#include <sys/auxv.h>

int main(int argc, char **argv)
{
    printf("%s\n", (char *)getauxval(AT_EXECFN));
    return(0);
}

getauxval 是一个 glibc 扩展,为了保持鲁棒性,您应该检查它是否返回 NULL (表明 ELF 解释器尚未提供 AT_EXECFN 参数),但我认为这在 Linux 上实际上并不是一个问题。

An alternative on Linux to using either /proc/self/exe or argv[0] is using the information passed by the ELF interpreter, made available by glibc as such:

#include <stdio.h>
#include <sys/auxv.h>

int main(int argc, char **argv)
{
    printf("%s\n", (char *)getauxval(AT_EXECFN));
    return(0);
}

Note that getauxval is a glibc extension, and to be robust you should check so that it doesn't return NULL (indicating that the ELF interpreter hasn't provided the AT_EXECFN parameter), but I don't think this is ever actually a problem on Linux.

七度光 2024-08-01 21:40:52

要使其跨平台可靠地工作,需要使用#ifdef 语句。

下面的代码在 Windows、Linux、MacOS、Solaris 或 FreeBSD 中查找可执行文件的路径(尽管 FreeBSD 未经测试)。 它用
Boost 1.55.0(或更高版本)以简化代码,但如果您愿意,删除它也很容易。 只需根据操作系统和编译器的要求使用 _MSC_VER 和 __linux 等定义即可。

#include <string>
#include <boost/predef/os.h>

#if (BOOST_OS_WINDOWS)
#  include <stdlib.h>
#elif (BOOST_OS_SOLARIS)
#  include <stdlib.h>
#  include <limits.h>
#elif (BOOST_OS_LINUX)
#  include <unistd.h>
#  include <limits.h>
#elif (BOOST_OS_MACOS)
#  include <mach-o/dyld.h>
#elif (BOOST_OS_BSD_FREE)
#  include <sys/types.h>
#  include <sys/sysctl.h>
#endif

/*
 * Returns the full path to the currently running executable,
 * or an empty string in case of failure.
 */
std::string getExecutablePath() {
    #if (BOOST_OS_WINDOWS)
        char *exePath;
        if (_get_pgmptr(&exePath) != 0)
            exePath = "";
    #elif (BOOST_OS_SOLARIS)
        char exePath[PATH_MAX];
        if (realpath(getexecname(), exePath) == NULL)
            exePath[0] = '\0';
    #elif (BOOST_OS_LINUX)
        char exePath[PATH_MAX];
        ssize_t len = ::readlink("/proc/self/exe", exePath, sizeof(exePath));
        if (len == -1 || len == sizeof(exePath))
            len = 0;
        exePath[len] = '\0';
    #elif (BOOST_OS_MACOS)
        char exePath[PATH_MAX];
        uint32_t len = sizeof(exePath);
        if (_NSGetExecutablePath(exePath, &len) != 0) {
            exePath[0] = '\0'; // buffer too small (!)
        } else {
            // resolve symlinks, ., .. if possible
            char *canonicalPath = realpath(exePath, NULL);
            if (canonicalPath != NULL) {
                strncpy(exePath,canonicalPath,len);
                free(canonicalPath);
            }
        }
    #elif (BOOST_OS_BSD_FREE)
        char exePath[2048];
        int mib[4];  mib[0] = CTL_KERN;  mib[1] = KERN_PROC;  mib[2] = KERN_PROC_PATHNAME;  mib[3] = -1;
        size_t len = sizeof(exePath);
        if (sysctl(mib, 4, exePath, &len, NULL, 0) != 0)
            exePath[0] = '\0';
    #endif
        return std::string(exePath);
}

上述版本返回包括可执行文件名称的完整路径。 如果您想要不带可执行文件名称的路径,请 #include boost/filesystem.hpp> 并将 return 语句更改为:

return strlen(exePath)>0 ? boost::filesystem::path(exePath).remove_filename().make_preferred().string() : std::string();

Making this work reliably across platforms requires using #ifdef statements.

The below code finds the executable's path in Windows, Linux, MacOS, Solaris or FreeBSD (although FreeBSD is untested). It uses
Boost 1.55.0 (or later) to simplify the code, but it's easy enough to remove if you want. Just use defines like _MSC_VER and __linux as the OS and compiler require.

#include <string>
#include <boost/predef/os.h>

#if (BOOST_OS_WINDOWS)
#  include <stdlib.h>
#elif (BOOST_OS_SOLARIS)
#  include <stdlib.h>
#  include <limits.h>
#elif (BOOST_OS_LINUX)
#  include <unistd.h>
#  include <limits.h>
#elif (BOOST_OS_MACOS)
#  include <mach-o/dyld.h>
#elif (BOOST_OS_BSD_FREE)
#  include <sys/types.h>
#  include <sys/sysctl.h>
#endif

/*
 * Returns the full path to the currently running executable,
 * or an empty string in case of failure.
 */
std::string getExecutablePath() {
    #if (BOOST_OS_WINDOWS)
        char *exePath;
        if (_get_pgmptr(&exePath) != 0)
            exePath = "";
    #elif (BOOST_OS_SOLARIS)
        char exePath[PATH_MAX];
        if (realpath(getexecname(), exePath) == NULL)
            exePath[0] = '\0';
    #elif (BOOST_OS_LINUX)
        char exePath[PATH_MAX];
        ssize_t len = ::readlink("/proc/self/exe", exePath, sizeof(exePath));
        if (len == -1 || len == sizeof(exePath))
            len = 0;
        exePath[len] = '\0';
    #elif (BOOST_OS_MACOS)
        char exePath[PATH_MAX];
        uint32_t len = sizeof(exePath);
        if (_NSGetExecutablePath(exePath, &len) != 0) {
            exePath[0] = '\0'; // buffer too small (!)
        } else {
            // resolve symlinks, ., .. if possible
            char *canonicalPath = realpath(exePath, NULL);
            if (canonicalPath != NULL) {
                strncpy(exePath,canonicalPath,len);
                free(canonicalPath);
            }
        }
    #elif (BOOST_OS_BSD_FREE)
        char exePath[2048];
        int mib[4];  mib[0] = CTL_KERN;  mib[1] = KERN_PROC;  mib[2] = KERN_PROC_PATHNAME;  mib[3] = -1;
        size_t len = sizeof(exePath);
        if (sysctl(mib, 4, exePath, &len, NULL, 0) != 0)
            exePath[0] = '\0';
    #endif
        return std::string(exePath);
}

The above version returns full paths including the executable name. If instead you want the path without the executable name, #include boost/filesystem.hpp> and change the return statement to:

return strlen(exePath)>0 ? boost::filesystem::path(exePath).remove_filename().make_preferred().string() : std::string();
风启觞 2024-08-01 21:40:52

如果您曾经需要支持 Mac
OS X 没有 /proc/,什么
你会这么做吗? 使用 #ifdefs 来
隔离特定于平台的代码
(例如 NSBundle)?

是的,使用 #ifdefs 隔离特定于平台的代码是完成此操作的传统方法。

另一种方法是使用干净的#ifdef-less标头,其中包含函数声明并将实现放在特定于平台的源文件中。

例如,查看POCO(可移植组件)C++ 库如何为其环境类。

If you ever had to support, say, Mac
OS X, which doesn't have /proc/, what
would you have done? Use #ifdefs to
isolate the platform-specific code
(NSBundle, for example)?

Yes, isolating platform-specific code with #ifdefs is the conventional way this is done.

Another approach would be to have a have clean #ifdef-less header which contains function declarations and put the implementations in platform specific source files.

For example, check out how POCO (Portable Components) C++ library does something similar for their Environment class.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文