Historical perspective on Linux filesystems
Jonathan Leffler's comment on the question "How can I find the Size of some specified files?" is thought-provoking. I will break it into parts for analysis.
-- files are stored on pages; you normally end up with more space being used than that calculation gives because a 1 byte file (often) occupies one page (of maybe 512 bytes). The exact values vary - it was easier in the days of the 7th Edition Unix file system (though not trivial even then if you wanted to take account of indirect blocks referenced by the inode as well as the raw data blocks).
Questions about the parts
- What is the definition of "page"?
- Why is the word "maybe" in the after-thought "one page (of maybe 512 bytes)"?
- Why was it easier to measure exact sizes in the "7th Edition Unix file system"?
- What is the definition of "indirect block"?
- How can you have references by two things: "the inode" and "the raw data blocks"?
Historical questions that emerged
I. What is the historical context Leffler is speaking about?
II. Have the definitions changed over time?
Comments (2)
I think he means block instead of page, a block being the minimum addressable unit on the filesystem.
Block sizes can vary.
Not sure why, but perhaps the filesystem interface exposed APIs allowing a more exact measurement.
An indirect block is a block referenced by a pointer in the inode; it holds pointers to further blocks rather than file data.
The inode occupies space (blocks) just as the raw data does. This is what the author meant.
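For anyone who wants to see the difference between a file's length and the space it takes up, here is a minimal sketch using Python's `os.stat`. It assumes a Unix-like system where `st_blocks` is reported in 512-byte units (true on Linux, not guaranteed everywhere), and the helper names `size_report` and `demo` are made up for this example.

```python
import os
import tempfile


def size_report(path):
    """Return apparent size and allocated size for one file, in bytes."""
    st = os.stat(path)
    return {
        # Length of the file's contents, as `ls -l` would show it.
        "apparent": st.st_size,
        # Assumption: st_blocks counts 512-byte units (the Linux convention).
        "allocated": st.st_blocks * 512,
        # Preferred I/O size hint; often equal to the filesystem block size.
        "io_block": st.st_blksize,
    }


def demo():
    """Create a 1-byte file, report its sizes, then delete it."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"x")
        f.flush()
        os.fsync(f.fileno())  # make sure the data has reached the filesystem
        path = f.name
    try:
        return size_report(path)
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

On many filesystems the allocated figure for the 1-byte file will be a whole block (for example 4096), but the exact number depends on the filesystem, so treat whatever it prints as specific to your machine.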
As usual for Wikipedia pages, Block (data storage) is informative despite being far too exuberant about linking all keywords.
There's also a reasonable overview of the classical Unix File System.
Traditionally, hard disk geometry (the layout of blocks on the disk itself) has been CHS (cylinder-head-sector).
CHS isn't used much these days, but for historical reasons it has affected block sizes: because sector sizes were almost always 512B, filesystem block sizes have always been multiples of 512B. (There is a movement afoot to introduce drives with 1kB and 4kB sector sizes, but compatibility looks rather painful.)
Generally speaking, smaller filesystem block sizes result in less wasted space when storing many small files (unless advanced techniques like tail merging are in use), while larger block sizes reduce external fragmentation and have lower overhead on large disks. The filesystem block size is usually a power of 2, is limited below by the block device's sector size, and is often limited above by the OS's page size.
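The small-file trade-off is just rounding up to a whole number of blocks, which can be sketched in a few lines. The function names and the sample file sizes below are invented for illustration, and the sketch ignores tail merging and any metadata.

```python
def allocated_bytes(file_size, block_size):
    """Space taken by the data blocks of a file of `file_size` bytes."""
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size > 0")
    blocks = -(-file_size // block_size)  # ceiling division
    return blocks * block_size


def wasted_bytes(file_sizes, block_size):
    """Total slack (allocated minus used) across a collection of files."""
    return sum(allocated_bytes(s, block_size) - s for s in file_sizes)


if __name__ == "__main__":
    sizes = [1, 100, 600]  # hypothetical small files, in bytes
    for bs in (512, 4096):
        print(bs, wasted_bytes(sizes, bs))
```

For these three sample files the slack is 1347 bytes with 512-byte blocks and 11587 bytes with 4096-byte blocks, which is the "wasted space" effect described above.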
The page size varies by OS and platform (and, in the case of Linux, can vary by configuration as well). As with block sizes, smaller page sizes reduce internal fragmentation but require more administrative overhead. 4kB page sizes are common on 32-bit platforms.
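If you want to check both numbers on your own machine, something like the following should work on a Unix-like system. It is only a sketch: `os.statvfs` is not available on Windows, `f_bsize` is the filesystem's preferred block size as reported by the OS (which may differ from the fundamental size in `f_frsize`), and `page_and_block_size` is a name made up here.

```python
import mmap
import os


def is_power_of_two(n):
    """True if n is a positive power of two."""
    return n > 0 and n & (n - 1) == 0


def page_and_block_size(path="."):
    """Return (memory page size, filesystem block size) in bytes."""
    page = mmap.PAGESIZE               # page size of the running system
    block = os.statvfs(path).f_bsize   # block size of the filesystem holding `path`
    return page, block


if __name__ == "__main__":
    page, block = page_and_block_size()
    print("page size:", page, "filesystem block size:", block)
```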
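Now, on to describe indirect blocks. In the UFS design, an inode describes a file and can hold only a small, fixed number of pointers to data blocks. For small files, those pointers point directly at the data blocks. For larger files, the inode also uses indirect pointers, each of which points to a block that contains nothing but more block pointers (and, for very large files, those may themselves be indirect).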
Thus the amount of storage required for a file may be greater than just the blocks containing its data, when indirect pointers are in use.
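To put rough numbers on that overhead, here is a sketch that counts the pointer-only blocks a file would need under a UFS-like layout with single, double and triple indirection. The defaults (4096-byte blocks, 12 direct pointers, 4-byte block pointers) are illustrative assumptions in the style of classic Unix filesystems, not values taken from any particular implementation, and the function name is made up.

```python
def ufs_style_block_count(file_size, block_size=4096, n_direct=12, ptr_size=4):
    """Return (data_blocks, indirect_blocks) for a file of `file_size` bytes.

    Assumes a UFS-like inode: `n_direct` direct pointers, then one single,
    one double and one triple indirect pointer. All defaults are assumptions.
    """
    ptrs_per_block = block_size // ptr_size
    data_blocks = -(-file_size // block_size)  # ceiling division

    remaining = max(0, data_blocks - n_direct)  # blocks not covered by direct pointers
    indirect_blocks = 0
    capacity = ptrs_per_block  # data blocks reachable through a level-1 tree

    for level in (1, 2, 3):  # single, double, triple indirection
        if remaining == 0:
            break
        covered = min(remaining, capacity)
        # Count the pointer blocks at each depth of this tree, leaves first.
        n = covered
        for _ in range(level):
            n = -(-n // ptrs_per_block)
            indirect_blocks += n
        remaining -= covered
        capacity *= ptrs_per_block

    if remaining:
        raise ValueError("file too large for triple indirection")
    return data_blocks, indirect_blocks


if __name__ == "__main__":
    for blocks in (12, 13, 12 + 1024, 12 + 1024 + 1):
        print(blocks, ufs_style_block_count(blocks * 4096))
```

With these assumed parameters, a file of 13 blocks needs one extra indirect block, and a file one block past the single-indirect limit needs three (the single indirect block, the double indirect block, and one leaf pointer block under it).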
Not all filesystems use this method for keeping track of the data blocks belonging to a file. FAT simply uses a single file allocation table, which is effectively a gigantic series of linked lists, and many modern filesystems use extents.
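The "series of linked lists" idea can be shown with a toy model: each table entry holds the number of the next cluster in the file, and a sentinel marks the end. This is a simplified sketch, not the real on-disk FAT format; `END_OF_CHAIN`, `cluster_chain` and the sample table are all invented for the example.

```python
END_OF_CHAIN = -1  # hypothetical sentinel; real FAT variants use reserved values


def cluster_chain(fat, first_cluster):
    """Follow the linked list in `fat` starting at `first_cluster`.

    `fat` maps a cluster number to the next cluster of the same file,
    or to END_OF_CHAIN for the file's last cluster.
    """
    chain = []
    seen = set()
    cluster = first_cluster
    while cluster != END_OF_CHAIN:
        if cluster in seen:
            raise ValueError("loop detected in allocation table")
        seen.add(cluster)
        chain.append(cluster)
        cluster = fat[cluster]
    return chain


if __name__ == "__main__":
    # A file whose data lives in clusters 2 -> 5 -> 3 (made-up layout).
    toy_fat = {2: 5, 5: 3, 3: END_OF_CHAIN}
    print(cluster_chain(toy_fat, 2))
```

Note that there are no separate indirect blocks here: the bookkeeping cost is the table itself, which exists whether or not a given file is large.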