Why is "." a hard link in Unix?

Posted 2024-12-10 06:36:41 · 1459 words · 0 views · 0 comments

Comments (2)

记忆消瘦 2024-12-17 06:36:41

Why not just have shells or the system calls that take paths know how
to interpret it?

For transparency. If the filesystem does it, the applications (and the myriad of system calls) don't have to do anything special with ".", such as deciding "Oh, the user wants the current directory!". The notion of cwd and whatever it means is stored neatly out of the way at the FS level.

It also seems like it leads to an ugly special case in the
implementation -- you would think you could only free the space used
by inodes that have a link count less than 1, but if they're
directories, you actually need to check for a link count less than 2.

It's not a special case. Every file in Unix has a link count. Whenever you unlink a file, the system asks: "Is this the last link?" If it is, the file gets the chop. If not, it lingers around.
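
As a rough illustration of the "is this the last link?" check, here is a minimal Python sketch. It assumes a filesystem that supports hard links; the file names `first` and `second` are made up for the example. It gives one file two names, removes one of them, and shows that the data is still reachable through the other.

```python
import os
import tempfile


def link_counts_during_unlink():
    """Return (link count with two names, link count after one unlink, surviving data)."""
    with tempfile.TemporaryDirectory() as tmp:
        first = os.path.join(tmp, "first")    # hypothetical file names
        second = os.path.join(tmp, "second")

        with open(first, "w") as f:
            f.write("still here")

        os.link(first, second)                # second directory entry for the same inode
        before = os.stat(first).st_nlink

        os.unlink(first)                      # not the last link, so the inode survives
        after = os.stat(second).st_nlink

        with open(second) as f:
            data = f.read()

        return before, after, data


if __name__ == "__main__":
    print(link_counts_during_unlink())
```

Only when the remaining name is unlinked as well (and no process still has the file open) can the inode's space be reclaimed.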

李不 2024-12-17 06:36:41

(Hmm: the following is now a bit of an epic...)

The design of the directory on unix filesystems (which, to be pedantic, are typically but not necessarily attached to unix OSs) represents a wonderful insight, which actually reduces the number of special cases required.

A 'directory' is really just a file in the filesystem. All the actual content of files in the filesystem is in inodes (from your question, I can see that you're already aware of some of this stuff). There's no structure to the inodes on the disk -- they're just a big bunch of numbered blobs of bytes, spread like peanut butter over the disk. On its own this is not useful, and indeed is repellent to anyone with a shred of tidy-mindedness.

The only special inode is inode number 2 (not 0 or 1, for reasons of Tradition), at least on traditional filesystems such as UFS and the ext family; inode 2 is a directory file: the root directory. When the system mounts the filesystem, it 'knows' it has to readdir inode 2 to get itself started.

A directory file is just a file, with an internal structure which is intended to be read by opendir(3) and friends. You can see its internal structure documented in dir(5) (depending on your OS); if you look at that, you'll see that the directory file entry contains almost no information about the file -- that's all in the file inode. One of the few things that's special about this file is that the open(2) function will give an error if you try to open a directory file with a mode which permits writing. Various other commands (to pick just one example, hexdump) will refuse to act in the normal way with directory files, just because that's probably not what you want to do (but that's their special case, not the filesystem's).
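
The open(2) restriction is easy to observe. The following is a small sketch, not a definitive test: the helper name `open_directory_for_writing` is invented for this example, and it simply reports which errno the operating system returns when a directory is opened write-only.

```python
import errno
import os
import tempfile


def open_directory_for_writing(path):
    """Try to open `path` write-only; return the errno name, or None on success."""
    try:
        fd = os.open(path, os.O_WRONLY)
    except OSError as exc:
        return errno.errorcode.get(exc.errno)
    os.close(fd)
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # POSIX specifies EISDIR for this case
        print(open_directory_for_writing(tmp))
```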

A hard link is nothing more nor less than an entry in a directory file's map. You can have two (or more) entries in such a map which both map to the same inode number: that inode therefore has two (or more) hard links. This also explains why every file has at least one 'hard link'. The inode has a reference count, which records how many times that inode is mentioned in a directory file somewhere in the filesystem (this is the number which you see when you do ls -l).

OK: we're getting to the point now.

The directory file is a map of strings ('filenames') to numbers (inode numbers). Those inode numbers are the numbers of the inodes of the files which are 'in' that directory. The files which are 'in' that directory might include other directory files, so their inode numbers will be amongst those listed in the directory. Thus, if you have a file /tmp/foo/bar, then the directory file foo includes an entry for bar, mapping that string to the inode for that file. There's also an entry in the directory file /tmp, for the directory file foo which is 'in' the directory /tmp.
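
To make the "map of names to inode numbers" picture concrete, here is a minimal Python sketch. It assumes a filesystem that supports hard links, and the names `foo`, `bar` and `bar-alias` are only illustrative. Note that `os.scandir()` leaves out the '.' and '..' entries, so they do not show up in the result.

```python
import os
import tempfile


def directory_map(path):
    """Return {name: inode number} for the entries of a directory.

    os.scandir() omits '.' and '..', so only the ordinary names appear.
    """
    with os.scandir(path) as entries:
        return {entry.name: entry.inode() for entry in entries}


def build_example(root):
    """Create root/foo/bar plus a second hard link to bar (hypothetical names)."""
    foo = os.path.join(root, "foo")
    os.mkdir(foo)
    bar = os.path.join(foo, "bar")
    with open(bar, "w") as f:
        f.write("hello")
    os.link(bar, os.path.join(foo, "bar-alias"))   # second name, same inode
    return foo


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        foo = build_example(tmp)
        print(directory_map(tmp))   # a single entry named foo
        print(directory_map(foo))   # bar and bar-alias map to the same inode number
```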

When you create a directory with mkdir(2), that function

  1. creates a directory file (with some inode number) with the correct internal structure,
  2. adds an entry to the parent directory, mapping the new directory's name to this new inode (that accounts for one of the links),
  3. adds an entry to the new directory, mapping the string '.' to the same inode (this accounts for the other link), and
  4. adds another entry to the new directory, mapping the string '..' to the inode of the directory file it modified in step (2) (this accounts for the larger number of hard links you'll see on directory files which contain subdirectories).
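
The effect of these steps can be checked from user space. The sketch below is only an illustration: the function name and the directory name `child` are made up, and the link counts it reports are filesystem dependent (some filesystems, btrfs for example, do not count '.' and '..' in a directory's link count), so it reports them rather than relying on them.

```python
import os
import tempfile


def inspect_new_directory(parent):
    """Create parent/child (hypothetical name) and report what mkdir set up."""
    parent_links_before = os.stat(parent).st_nlink

    child = os.path.join(parent, "child")
    os.mkdir(child)

    return {
        # '.' inside the new directory names the new directory itself
        "dot_is_child": os.path.samefile(os.path.join(child, "."), child),
        # '..' inside the new directory names the parent
        "dotdot_is_parent": os.path.samefile(os.path.join(child, ".."), parent),
        # typically 2 (the name in the parent plus '.'), but filesystem dependent
        "child_links": os.stat(child).st_nlink,
        # typically 1 (the child's '..' entry), but filesystem dependent
        "parent_links_gained": os.stat(parent).st_nlink - parent_links_before,
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(inspect_new_directory(tmp))
```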

The end result is that (almost) the only special cases are:

  • The open(2) function tries to make it harder to shoot yourself in the foot, by preventing you from opening directory files for writing.
  • The mkdir(2) function makes things nice and easy by adding a couple of extra entries ('.' and '..') to the new directory file, purely to make it convenient to move around the filesystem. I suspect that the filesystem would work perfectly well without '.' and '..', but would be a pain to use.
  • The directory file is one of the few types of files which are flagged as 'special' -- this is really what tells things like open(2) to behave slightly differently. See st_mode in stat(2).
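
For reference, this is roughly how a program reads that 'special' flag. The sketch uses Python's `stat` module to inspect the file-type bits of `st_mode`; the helper name `file_kind` and its return strings are just choices made for this example.

```python
import os
import stat
import tempfile


def file_kind(path):
    """Classify a path by the file-type bits of its st_mode."""
    mode = os.lstat(path).st_mode    # lstat: do not follow symbolic links
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISREG(mode):
        return "regular file"
    if stat.S_ISLNK(mode):
        return "symbolic link"
    return "other"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        plain = os.path.join(tmp, "plain")   # hypothetical file name
        open(plain, "w").close()
        print(file_kind(tmp), "/", file_kind(plain))
```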