Retrieve the filename from a file descriptor in C

Posted 2024-07-29 03:55:52 · 35 words · 11 views · 0 comments

Is it possible to get the filename of a file descriptor (Linux) in C?

Comments (8)

情归归情 2024-08-05 03:55:53

There is no official API for this on OpenBSD, but it can still be done with a rather convoluted workaround, shown in the following code. Note that you need to link with -lkvm and -lc. The code that uses FTS to traverse the filesystem is from this answer.

#include <string>
#include <vector>

#include <cstdio>
#include <cstdlib>
#include <cstring>

#include <sys/stat.h>
#include <fts.h>

#include <sys/sysctl.h>
#include <kvm.h>

using std::string;
using std::vector;

string pidfd2path(int pid, int fd) {
  string path; char errbuf[_POSIX2_LINE_MAX];
  static kvm_t *kd = nullptr; kinfo_file *kif = nullptr; int cntp = 0;
  kd = kvm_openfiles(nullptr, nullptr, nullptr, KVM_NO_FILES, errbuf); if (!kd) return "";
  if ((kif = kvm_getfiles(kd, KERN_FILE_BYPID, pid, sizeof(struct kinfo_file), &cntp))) {
    for (int i = 0; i < cntp; i++) {
      if (kif[i].fd_fd == fd) {
        FTS *file_system = nullptr; FTSENT *child = nullptr; FTSENT *parent = nullptr;
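        // fts_open() expects a NULL-terminated, argv-style array of root paths.
        vector<char *> root; char buffer[2]; strcpy(buffer, "/"); root.push_back(buffer); root.push_back(nullptr);
        // The fts(3) manual requires either FTS_PHYSICAL or FTS_LOGICAL to be specified.
        file_system = fts_open(&root[0], FTS_PHYSICAL | FTS_COMFOLLOW | FTS_NOCHDIR, nullptr);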
        if (file_system) {
          while ((parent = fts_read(file_system))) {
            child = fts_children(file_system, 0);
            // Visit every child, including the first entry of the list.
            for (; child; child = child->fts_link) {
              if (!S_ISSOCK(child->fts_statp->st_mode)) {
                if (child->fts_statp->st_dev == kif[i].va_fsid) {
                  if (child->fts_statp->st_ino == kif[i].va_fileid) {
                    path = child->fts_path + string(child->fts_name);
                    goto finish;
                  }
                }
              }
            }
          }
          finish:
          fts_close(file_system); 
        }
      }
    }
  }
  kvm_close(kd);
  return path;
}

int main(int argc, char **argv) {
  if (argc == 3) {
    printf("%s\n", pidfd2path((int)strtoul(argv[1], nullptr, 10), 
      (int)strtoul(argv[2], nullptr, 10)).c_str());
  } else {
    printf("usage: \"%s\" <pid> <fd>\n", argv[0]);
  }
  return 0;
}

If the function fails to find the file (for example, because it no longer exists), it returns an empty string. If the file was moved (in my experience, for example, when it was moved to the trash), the new location is returned instead, provided FTS had not already searched that location. The more files a filesystem holds, the slower the search will be.

The deeper the search goes into the directory tree of your entire filesystem without finding the file, the more likely you are to hit a race condition (the file may be moved or removed mid-search), though this is still very unlikely because the traversal is fast. I'm aware that my OpenBSD solution is C++ and not C. Feel free to port it to C; most of the logic will stay the same. If I have time, I'll try to rewrite it in C.

Like the macOS solution, this one returns a hard link at random (citation needed), for portability with Windows and other platforms that can only report one hard link. If you don't care about being cross-platform and want all the hard links, you could remove the early exit (the goto finish) from the inner loop and return a vector instead.

DragonFly BSD and NetBSD can use the same solution (the exact same code) as the macOS solution on the current question, which I verified manually. macOS users who want the path of a file descriptor opened by any process (by plugging in a process ID) rather than just the calling one, and potentially all hard links rather than a random one, should see this answer. It should be a lot more performant than traversing your entire filesystem, comparable to the speed of the more direct solutions available on Linux and elsewhere. FreeBSD users can find what they are looking for in this question, because the OS-level bug mentioned there has since been fixed in newer OS versions.

Here's a more generic solution. It can only retrieve the path of a file descriptor opened by the calling process, but it should work on most Unix-likes out of the box. It has all the same concerns as the former solution regarding hard links and race conditions, although it runs slightly faster because it has fewer branches and loops:

#include <string>
#include <vector>

#include <cstring>

#include <sys/stat.h>
#include <fts.h>

using std::string;
using std::vector;

string fd2path(int fd) {
  string path;
  FTS *file_system = nullptr; FTSENT *child = nullptr; FTSENT *parent = nullptr;
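  // fts_open() expects a NULL-terminated, argv-style array of root paths.
  vector<char *> root; char buffer[2]; strcpy(buffer, "/"); root.push_back(buffer); root.push_back(nullptr);
  // The fts(3) manual requires either FTS_PHYSICAL or FTS_LOGICAL to be specified.
  file_system = fts_open(&root[0], FTS_PHYSICAL | FTS_COMFOLLOW | FTS_NOCHDIR, nullptr);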
  if (file_system) {
    while ((parent = fts_read(file_system))) {
      child = fts_children(file_system, 0);
      // Visit every child, including the first entry of the list.
      for (; child; child = child->fts_link) {
        struct stat info = { 0 };
        if (!S_ISSOCK(child->fts_statp->st_mode)) {
          if (!fstat(fd, &info) && !S_ISSOCK(info.st_mode)) {
            if (child->fts_statp->st_dev == info.st_dev) {
              if (child->fts_statp->st_ino == info.st_ino) {
                path = child->fts_path + string(child->fts_name);
                goto finish;
              }
            }
          }
        }
      }
    }
    finish: 
    fts_close(file_system); 
  }
  return path;
}

An even quicker solution, also limited to the calling process, is to wrap all your calls to fopen() and open() in a helper function that records each file descriptor together with the absolute form of the path passed to the wrapper, in whatever C equivalent of a std::unordered_map you have available. (On Windows you would also wrap the Windows-only equivalents, such as _wopen_s(), which won't work on UWP, along with the rest of the machinery needed to support UTF-8.) The absolute path can be obtained with realpath() on Unix-likes or GetFullPathNameW() on Windows (the *W variant takes wide-character strings for Unicode support). realpath() also resolves symbolic links (which aren't nearly as commonly used on Windows), and both functions convert the path of the existing file you opened from relative, if it is relative, to absolute.

You will likely have to write the map yourself, using malloc()'d and eventually free()'d arrays of ints and C strings. This will again be faster than any solution that searches your filesystem dynamically, but it has a different and unappealing limitation: it will not notice files that were moved around on your filesystem after you opened them. You can at least check whether the file was deleted by testing for its existence in your own code, but you cannot tell whether the file has been replaced since you opened it and stored its path, so the result may be outdated. Let me know if you would like to see a code example of this, though because files can change location I do not recommend this solution.
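As a rough illustration of the wrapper idea described above, here is a minimal sketch in C. It is not the answer author's code: the names tracked_open(), tracked_path(), tracked_close() and the TRACKED_MAX limit are hypothetical, a fixed array indexed by descriptor stands in for the hash map, and only open() on existing files is covered (no O_CREAT mode argument, no fopen()).

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

#define TRACKED_MAX 1024 /* hypothetical cap on tracked descriptors */

/* Index = file descriptor, value = malloc()'d absolute path, or NULL. */
static char *tracked_paths[TRACKED_MAX];

/* Hypothetical wrapper: open an existing file and remember its resolved path. */
int tracked_open(const char *path, int flags)
{
    int fd = open(path, flags);
    if (fd < 0)
        return -1;
    if (fd < TRACKED_MAX) {
        free(tracked_paths[fd]);
        /* realpath(path, NULL) returns a malloc()'d absolute path
         * with symbolic links resolved (POSIX.1-2008), or NULL. */
        tracked_paths[fd] = realpath(path, NULL);
    }
    return fd;
}

/* Returns the path recorded at open time, or NULL if none was recorded. */
const char *tracked_path(int fd)
{
    return (fd >= 0 && fd < TRACKED_MAX) ? tracked_paths[fd] : NULL;
}

/* Hypothetical wrapper: forget the recorded path, then close the descriptor. */
int tracked_close(int fd)
{
    if (fd >= 0 && fd < TRACKED_MAX) {
        free(tracked_paths[fd]);
        tracked_paths[fd] = NULL;
    }
    return close(fd);
}
```

A caller would use tracked_open("data.txt", O_RDONLY) in place of open() and later ask tracked_path(fd) for the stored name. As the answer warns, the stored path reflects the moment of opening only, so it goes stale if the file is renamed or replaced afterwards.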

自控 2024-08-05 03:55:52

You can use readlink on /proc/self/fd/NNN where NNN is the file descriptor. This will give you the name of the file as it was when it was opened — however, if the file was moved or deleted since then, it may no longer be accurate (although Linux can track renames in some cases). To verify, stat the filename given and fstat the fd you have, and make sure st_dev and st_ino are the same.

Of course, not all file descriptors refer to files, and for those you'll see some odd text strings, such as pipe:[1538488]. Since all of the real filenames will be absolute paths, you can determine which these are easily enough. Further, as others have noted, files can have multiple hardlinks pointing to them - this will only report the one it was opened with. If you want to find all names for a given file, you'll just have to traverse the entire filesystem.
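The steps in this answer (readlink on /proc/self/fd/NNN, reject non-path strings such as pipe:[...], then compare st_dev and st_ino) might look roughly like the following C sketch. The function name fd_to_path() and its return convention are assumptions made for illustration, and it is Linux-only because it relies on /proc.

```c
#define _XOPEN_SOURCE 700
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: writes the current path of fd into buf.
 * Returns 0 on success, -1 if the path cannot be resolved or verified. */
int fd_to_path(int fd, char *buf, size_t bufsize)
{
    char link[64];
    struct stat by_fd, by_name;
    ssize_t len;

    if (bufsize < 2)
        return -1;
    snprintf(link, sizeof link, "/proc/self/fd/%d", fd);

    len = readlink(link, buf, bufsize - 1);
    if (len < 0 || (size_t)len >= bufsize - 1)
        return -1;              /* error, or result possibly truncated */
    buf[len] = '\0';            /* readlink() does not NUL-terminate */

    if (buf[0] != '/')
        return -1;              /* not a file, e.g. "pipe:[1538488]" */

    /* Verify that the name still refers to the file behind the descriptor. */
    if (fstat(fd, &by_fd) != 0 || stat(buf, &by_name) != 0)
        return -1;
    if (by_fd.st_dev != by_name.st_dev || by_fd.st_ino != by_name.st_ino)
        return -1;
    return 0;
}
```

With a buffer of PATH_MAX bytes, fd_to_path(fd, buf, sizeof buf) returning 0 means buf names the same file the descriptor refers to at the time of the check; the file can of course still be renamed afterwards.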

雨的味道风的声音 2024-08-05 03:55:52

I had this problem on Mac OS X. We don't have a /proc virtual file system, so the accepted solution cannot work.

We do, instead, have a F_GETPATH command for fcntl:

 F_GETPATH          Get the path of the file descriptor Fildes.  The argument must be a buffer of size MAXPATHLEN or greater.

So to get the file associated to a file descriptor, you can use this snippet:

#include <sys/syslimits.h>
#include <fcntl.h>

char filePath[PATH_MAX];
if (fcntl(fd, F_GETPATH, filePath) != -1)
{
    // do something with the file path
}

Since I never remember where MAXPATHLEN is defined, I thought PATH_MAX from syslimits would be fine.

甜心 2024-08-05 03:55:52

In Windows, with GetFileInformationByHandleEx, passing FileNameInfo, you can retrieve the file name.

山川志 2024-08-05 03:55:52

As Tyler points out, there's no way to do what you require "directly and reliably", since a given FD may correspond to 0 filenames (in various cases) or > 1 (multiple "hard links" is how the latter situation is generally described). If you do still need the functionality with all the limitations (on speed AND on the possibility of getting 0, 2, ... results rather than 1), here's how you can do it: first, fstat the FD -- this tells you, in the resulting struct stat, what device the file lives on, how many hard links it has, whether it's a special file, etc. This may already answer your question -- e.g. if 0 hard links you will KNOW there is in fact no corresponding filename on disk.

If the stats give you hope, then you have to "walk the tree" of directories on the relevant device until you find all the hard links (or just the first one, if you don't need more than one and any one will do). For that purpose, you use readdir (and opendir &c of course) recursively opening subdirectories until you find in a struct dirent thus received the same inode number you had in the original struct stat (at which time if you want the whole path, rather than just the name, you'll need to walk the chain of directories backwards to reconstruct it).

If this general approach is acceptable, but you need more detailed C code, let us know, it won't be hard to write (though I'd rather not write it if it's useless, i.e. you cannot withstand the inevitably slow performance or the possibility of getting != 1 result for the purposes of your application;-).
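For readers who want to see the shape of that approach, here is a minimal C sketch, not the answer author's code. The names fd_find_path() and search_tree() are hypothetical. It stops at the first matching hard link, uses lstat() so that symbolic links are not followed, and builds the full path on the way down instead of walking back up the directory chain.

```c
#define _XOPEN_SOURCE 700
#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical helper: depth-first search below dir for an entry whose
 * (st_dev, st_ino) pair equals the target's. Returns 1 if found. */
static int search_tree(const char *dir, const struct stat *target,
                       char *out, size_t outsize)
{
    DIR *d = opendir(dir);
    struct dirent *e;
    int found = 0;

    if (d == NULL)
        return 0;                       /* unreadable directory: skip it */
    while (!found && (e = readdir(d)) != NULL) {
        char path[PATH_MAX];
        struct stat st;
        int n;

        if (strcmp(e->d_name, ".") == 0 || strcmp(e->d_name, "..") == 0)
            continue;
        n = snprintf(path, sizeof path, "%s/%s",
                     strcmp(dir, "/") == 0 ? "" : dir, e->d_name);
        if (n < 0 || (size_t)n >= sizeof path)
            continue;                   /* path too long: skip it */
        if (lstat(path, &st) != 0)
            continue;
        if (st.st_dev == target->st_dev && st.st_ino == target->st_ino) {
            if ((size_t)n < outsize) {
                memcpy(out, path, (size_t)n + 1);
                found = 1;
            }
        } else if (S_ISDIR(st.st_mode)) {
            found = search_tree(path, target, out, outsize);
        }
    }
    closedir(d);
    return found;
}

/* Returns 1 and fills out if a name was found under root, 0 if none
 * was found (or the file has no links left), -1 if fstat() failed. */
int fd_find_path(int fd, const char *root, char *out, size_t outsize)
{
    struct stat target;

    if (fstat(fd, &target) != 0)
        return -1;
    if (target.st_nlink == 0)
        return 0;                       /* no name left on disk */
    return search_tree(root, &target, out, outsize);
}
```

Passing the mount point of the file's device as root keeps the walk as small as possible; starting from "/" works too but, as the answer says, is inevitably slow.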

凉城凉梦凉人心 2024-08-05 03:55:52

Before writing this off as impossible I suggest you look at the source code of the lsof command.

There may be restrictions but lsof seems capable of determining the file descriptor and file name. This information exists in the /proc filesystem so it should be possible to get at from your program.
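To give a feel for what reading that information from /proc involves, the sketch below lists every descriptor of the calling process together with its link target, a little like a tiny lsof for the current process. It is an illustration only, not how lsof itself is implemented; the function name list_open_fds() is hypothetical and the code is Linux-specific.

```c
#define _XOPEN_SOURCE 700
#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: prints "fd -> target" for each open descriptor of
 * the calling process. Returns the number printed, or -1 without /proc. */
int list_open_fds(FILE *out)
{
    DIR *d = opendir("/proc/self/fd");
    struct dirent *e;
    int count = 0;

    if (d == NULL)
        return -1;
    while ((e = readdir(d)) != NULL) {
        char target[PATH_MAX];
        ssize_t len;

        if (e->d_name[0] == '.')
            continue;                   /* skip "." and ".." */
        /* Each entry is a symbolic link named after the descriptor number. */
        len = readlinkat(dirfd(d), e->d_name, target, sizeof target - 1);
        if (len < 0)
            continue;
        target[len] = '\0';
        fprintf(out, "%s -> %s\n", e->d_name, target);
        count++;
    }
    closedir(d);
    return count;
}
```

The listing includes the descriptor that opendir() itself holds on /proc/self/fd, so the count is always at least one when /proc is available.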

蓝天 2024-08-05 03:55:52

You can use fstat() to get the file's inode by struct stat. Then, using readdir() you can compare the inode you found with those that exist (struct dirent) in a directory (assuming that you know the directory, otherwise you'll have to search the whole filesystem) and find the corresponding file name.
Nasty?

思念绕指尖 2024-08-05 03:55:52

Impossible. A file descriptor may have multiple names in the filesystem, or it may have no name at all.

Edit: Assuming you are talking about a plain old POSIX system, without any OS-specific APIs, since you didn't specify an OS.
