Does using fseek and ftell to determine the size of a file have a vulnerability?

Posted 2024-11-06 00:17:29

I've read posts that show how to use fseek and ftell to determine the size of a file.

FILE *fp;
long file_size;
char *buffer;

fp = fopen("foo.bin", "r");
if (NULL == fp) {
 /* Handle Error */
}

if (fseek(fp, 0 , SEEK_END) != 0) {
  /* Handle Error */
}

file_size = ftell(fp);
buffer = (char*)malloc(file_size);
if (NULL == buffer){
  /* handle error */
}

I was about to use this technique but then I ran into this link that describes a potential vulnerability.

The link recommends using fstat instead. Can anyone comment on this?

少年亿悲伤 2024-11-13 00:17:29

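That link is one of many nonsensical pieces of C coding advice from CERT. Their justification rests on liberties the C standard grants to implementations, but POSIX does not permit those liberties, so they are irrelevant in every case where fstat is available as an alternative.

POSIX requires:

That the "b" modifier to fopen has no effect, i.e. text mode and binary mode behave identically. This means their concern about invoking UB on text files is nonsense.
That files have a byte-resolution size, set by write operations and truncate operations. This means their concern about a random number of null bytes at the end of the file is nonsense.

Sadly, with all the nonsense like this that they publish, it is hard to know which CERT publications to take seriously. That is a shame, because many of them are serious.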

孤檠 2024-11-13 00:17:29

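If your goal is to find the size of a file, you should definitely use fstat() or one of its friends. It is a more direct and expressive approach: you are literally asking the system for the file's statistics, rather than taking the more roundabout fseek/ftell route.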
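As a rough illustration of the "friends" mentioned above, here is a minimal sketch using stat() on a path. It assumes a POSIX system; the helper name path_size is made up for this example and is not from the thread.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper (name invented for illustration): stores the size in
 * bytes of the regular file at `path` in *size_out.
 * Returns 0 on success, -1 on failure or if `path` is not a regular file.
 * Assumes a POSIX system. */
int path_size(const char *path, off_t *size_out)
{
    struct stat st;

    if (stat(path, &st) != 0) {
        return -1;                  /* errno is set by stat() */
    }
    if (!S_ISREG(st.st_mode)) {
        return -1;                  /* st_size is not a useful byte count for
                                       pipes, terminals, and the like */
    }
    *size_out = st.st_size;
    return 0;
}
```

A caller would check the return value before using the size, for example `if (path_size("foo.bin", &n) == 0) { ... }`.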

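Bonus tip: if you only want to know whether a file is available, use access() rather than opening the file or even stat'ing it. It is a simpler operation that many programmers don't know about.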
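The access() tip might look something like this. It is a sketch for a POSIX system, and the helper name is_readable is invented for illustration.

```c
#include <unistd.h>

/* Hypothetical helper (name invented for illustration): returns 1 if the
 * calling process may read `path`, 0 otherwise (missing file, permission
 * denied, and so on). Assumes a POSIX system. */
int is_readable(const char *path)
{
    return access(path, R_OK) == 0;
}
```

access() checks against the real user and group IDs, and the answer can be out of date by the time the file is actually opened, so the result of a later fopen still needs checking.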

寄居人 2024-11-13 00:17:29

I'd tend to agree with their basic conclusion that you generally shouldn't use the fseek/ftell code directly in the mainstream of your code -- but you probably shouldn't use fstat either. If you want the size of a file, most of your code should use something with a clear, direct name like filesize.

Now, it probably is better to implement that using fstat where available, and (for example) FindFirstFile on Windows (the most obvious platform where fstat usually won't be available).
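To make that concrete, here is one way such a wrapper could be sketched. The name filesize comes from the text above, but the signature, the feature macro FILESIZE_HAVE_FSTAT, and the fallback are my own assumptions. The Windows branch (FindFirstFile or similar) is deliberately left out, so non-POSIX builds fall back to fseek/ftell here.

```c
#include <stdio.h>

#if defined(__unix__) || defined(__APPLE__)
#include <sys/types.h>
#include <sys/stat.h>
#define FILESIZE_HAVE_FSTAT 1   /* macro name invented for this sketch */
#endif

/* Hypothetical wrapper: returns the size in bytes of the file behind `fp`,
 * or -1 on failure. For a stream with unwritten buffered output, call
 * fflush(fp) first. */
long long filesize(FILE *fp)
{
#ifdef FILESIZE_HAVE_FSTAT
    struct stat st;
    int fd = fileno(fp);

    if (fd < 0 || fstat(fd, &st) != 0 || !S_ISREG(st.st_mode)) {
        return -1;
    }
    return (long long)st.st_size;
#else
    /* Fallback for platforms without fstat: seek to the end, then restore
     * the original position. Open the stream in binary mode for this. */
    long saved = ftell(fp);
    long end;

    if (saved < 0 || fseek(fp, 0L, SEEK_END) != 0) {
        return -1;
    }
    end = ftell(fp);                /* -1L on failure */
    if (fseek(fp, saved, SEEK_SET) != 0) {
        return -1;
    }
    return end;
#endif
}
```

The rest of the program then only ever calls filesize(fp), and the platform-specific choice stays in one place.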

The other side of the story is that many (most?) of the limitations on fseek with respect to binary files actually originated with CP/M, which didn't explicitly store the size of a file anywhere. The end of a text file was signaled by a control-Z. For a binary file, however, all you really knew was what sectors were used to store the file. In the last sector, you had some amount of unused data that was often (but not always) zero-filled. Unfortunately, there might be zeros that were significant, and/or non-zero values that weren't significant.

If the entire C standard had been written just before being approved (e.g., if it had been started in 1988 and finished in 1989) they'd probably have ignored CP/M completely. For better or worse, however, they started work on the C standard in something like 1982 or so, when CP/M was still in wide enough use that it couldn't be ignored. By the time CP/M was gone, many of the decisions had already been made and I doubt anybody wanted to revisit them.

For most people today, however, there's just no point -- most code won't port to CP/M without massive work; this is one of the relatively minor problems to deal with. Making a modern program run in only 48K (or so) of memory for both the code and data is a much more serious problem (having a maximum of a megabyte or so for mass storage would be another serious problem).

CERT does have one good point though: you probably should not (as is often done) find the size of a file, allocate that much space, and then assume the contents of the file will fit there. Even though the fseek/ftell will give you the correct size with modern systems, that data could be stale by the time you actually read the data, so you could overrun your buffer anyway.
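One way to avoid trusting a possibly stale size is to treat it only as a starting capacity and let the read loop decide how much data there really is. A minimal sketch follows; the helper name read_all and its signature are invented for illustration.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads everything remaining on `fp` into a malloc'd
 * buffer. `hint` is only an initial capacity guess (for example a size
 * obtained earlier); it is never trusted as the real length.
 * Returns the buffer and stores the byte count in *len_out, or returns
 * NULL on read error or allocation failure. The caller frees the buffer. */
char *read_all(FILE *fp, size_t hint, size_t *len_out)
{
    /* One spare byte lets a file of exactly `hint` bytes finish in one read. */
    size_t cap = (hint > 0 && hint < SIZE_MAX) ? hint + 1 : 4096;
    size_t len = 0;
    char *buf = malloc(cap);

    if (buf == NULL) {
        return NULL;
    }
    for (;;) {
        len += fread(buf + len, 1, cap - len, fp);
        if (len < cap) {
            break;                  /* short read: end-of-file or error */
        }
        if (cap > SIZE_MAX / 2) {   /* doubling would overflow */
            free(buf);
            return NULL;
        }
        char *tmp = realloc(buf, cap * 2);
        if (tmp == NULL) {
            free(buf);
            return NULL;
        }
        buf = tmp;
        cap *= 2;
    }
    if (ferror(fp)) {
        free(buf);
        return NULL;
    }
    *len_out = len;
    return buf;
}
```

If the file grows after the size was taken, the buffer grows with it. If it shrinks, *len_out reports the shorter length instead of leaving uninitialized bytes that look like data.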

叶落知秋 2024-11-13 00:17:29

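The reason not to use fstat is that fstat is POSIX, whereas fopen, ftell, and fseek are part of the C standard.

There may be systems that implement the C standard but not POSIX. On such a system, fstat would not work at all.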

烟花易冷人易散 2024-11-13 00:17:29

According to the C standard, §7.21.3:

Setting the file position indicator to end-of-file, as with fseek(file, 0, SEEK_END), has undefined behavior for a binary stream (because of possible trailing null characters) or for any stream with state-dependent encoding that does not assuredly end in the initial shift state.

A letter-of-the-law kind of guy might think this UB can be avoided by calculating file size with:

fseek(file, -1, SEEK_END);
size = ftell(file) + 1;

But the C standard also says this (§7.21.9.2):

A binary stream need not meaningfully support fseek calls with a
whence value of SEEK_END.

As a result, there is nothing we can do to fix this with regard to fseek / SEEK_END. Still, I would prefer fseek / ftell instead of OS-specific API calls.
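For completeness, one approach that stays within standard C and avoids SEEK_END altogether is to read the stream and count the bytes. This is only a sketch, the helper name count_bytes is invented for illustration, and it costs a full pass over the file.

```c
#include <stdio.h>

/* Hypothetical helper: counts the bytes in a stream by reading from the
 * current position to end-of-file. Open the file in binary mode ("rb").
 * Returns the count, or -1 on a read error. */
long long count_bytes(FILE *fp)
{
    unsigned char chunk[4096];
    long long total = 0;
    size_t n;

    while ((n = fread(chunk, 1, sizeof chunk, fp)) > 0) {
        total += (long long)n;
    }
    return ferror(fp) ? -1 : total;
}
```

On an implementation that pads binary files with trailing null characters, the count would include that padding, so this sidesteps the fseek wording rather than the underlying limitation.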
