What is the rationale for fread/fwrite taking size and count as arguments?

Posted on 2024-09-24 15:53:55

We had a discussion here at work about why fread() and fwrite() take a size per member and a count, and return the number of members read/written, rather than just taking a buffer and a size. The only use we could come up with is reading/writing an array of structures whose size isn't evenly divisible by the platform alignment and which have therefore been padded, but that can't be common enough to warrant this design choice.

From fread(3):

The function fread() reads nmemb elements of data, each size bytes long,
from the stream pointed to by stream, storing them at the location given
by ptr.

The function fwrite() writes nmemb elements of data, each size bytes
long, to the stream pointed to by stream, obtaining them from the location
given by ptr.

fread() and fwrite() return the number of items successfully read or written
(i.e., not the number of characters). If an error occurs, or the
end-of-file is reached, the return value is a short item count (or zero).
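As a minimal sketch of the return-value rule quoted above: a short item count alone does not say whether end-of-file or an error occurred, so callers typically check ferror() or feof() afterwards. The helper name read_items and its hit_error parameter are hypothetical, introduced only for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads up to `count` items of `size` bytes each.
 * Returns the number of complete items read and sets *hit_error to 1
 * if a short count was caused by a stream error rather than end-of-file. */
size_t read_items(void *dst, size_t size, size_t count, FILE *stream, int *hit_error)
{
    assert(dst != NULL && stream != NULL && hit_error != NULL);
    size_t got = fread(dst, size, count, stream);   /* counts items, not bytes */
    *hit_error = (got < count && ferror(stream)) ? 1 : 0;
    return got;
}
```

Asking for 5 ints from a stream that holds only 3 returns 3 with *hit_error left at 0, and feof() on the stream is then true.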

能否归途做我良人 2024-10-01 15:53:55

The difference between fread(buf, 1000, 1, stream) and fread(buf, 1, 1000, stream) is that in the first case you get either one whole chunk of 1000 bytes or nothing (a return value of 0) if the file is smaller, whereas in the second case you get whatever is in the file, up to 1000 bytes.
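The two call shapes can be compared with a small experiment. This is only a sketch: the helper fread_on_short_file is hypothetical, and it uses tmpfile() as a stand-in for a real input file.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes `file_len` bytes to a temporary file, rewinds,
 * and performs one fread with the given size/count. Returns fread's result,
 * or (size_t)-1 if the temporary file could not be set up. */
size_t fread_on_short_file(size_t file_len, size_t size, size_t count)
{
    static char buf[1000];
    assert(size * count <= sizeof buf);

    FILE *f = tmpfile();
    if (f == NULL)
        return (size_t)-1;

    for (size_t i = 0; i < file_len; i++) {
        if (fputc('x', f) == EOF) {
            fclose(f);
            return (size_t)-1;
        }
    }
    rewind(f);

    size_t got = fread(buf, size, count, f);
    fclose(f);
    return got;
}
```

With a 300-byte file, fread_on_short_file(300, 1000, 1) returns 0 (no complete 1000-byte item), while fread_on_short_file(300, 1, 1000) returns 300.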

慕巷 2024-10-01 15:53:55

It's based on how fread is implemented.

The Single UNIX Specification says

For each object, size calls shall be
made to the fgetc() function and the
results stored, in the order read, in
an array of unsigned char exactly
overlaying the object.

fgetc also has this note:

Since fgetc() operates on bytes,
reading a character consisting of
multiple bytes (or "a multi-byte
character") may require multiple calls
to fgetc().

Of course, this predates fancy variable-byte character encodings like UTF-8.

The SUS notes that this is actually taken from the ISO C documents.
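The wording quoted from the specification can be modeled roughly as follows. This is an illustrative model only, not how any real C library implements fread; the name model_fread is made up for the sketch.

```c
#include <assert.h>
#include <stdio.h>

/* Illustrative model only: for each object, make `size` calls to fgetc()
 * and store the results, in the order read, in an array of unsigned char
 * overlaying the object. */
size_t model_fread(void *ptr, size_t size, size_t nmemb, FILE *stream)
{
    assert(ptr != NULL && stream != NULL);
    unsigned char *out = ptr;

    if (size == 0)
        return 0;

    for (size_t item = 0; item < nmemb; item++) {
        for (size_t byte = 0; byte < size; byte++) {
            int c = fgetc(stream);
            if (c == EOF)
                return item;  /* bytes of the partial item were already consumed */
            out[item * size + byte] = (unsigned char)c;
        }
    }
    return nmemb;
}
```

Under this model, asking for three 4-byte objects from a 10-byte stream returns 2, yet all 10 bytes have been consumed from the stream; the last 2 bytes belong to an object that is never reported.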

碍人泪离人颜 2024-10-01 15:53:55

This is pure speculation, but back in the day many file systems were not simple byte streams on a hard drive (and some of those are still around).

Many file systems were record-based, so to serve them efficiently you have to specify the number of items ("records"), which lets fwrite/fread operate on the storage as records rather than as a plain byte stream.

淡淡绿茶香 2024-10-01 15:53:55

Here, let me fix those functions:

size_t fread_buf( void* ptr, size_t size, FILE* stream)
{
    return fread( ptr, 1, size, stream);
}


size_t fwrite_buf( void const* ptr, size_t size, FILE* stream)
{
    return fwrite( ptr, 1, size, stream);
}

As for a rationale for the parameters to fread()/fwrite(), I lost my copy of K&R long ago, so I can only guess. I think a likely answer is that Kernighan and Ritchie simply thought that binary I/O would most naturally be done on arrays of objects. They may also have thought that block I/O would be faster or easier to implement on some architectures.

Even though the C standard describes fread() and fwrite() in terms of calls to fgetc() and fputc(), remember that the standard came into existence long after C was defined by K&R, and that things specified in the standard might not have been among the original designers' ideas. It's even possible that what K&R's "The C Programming Language" says differs from how things stood when the language was first being designed.

Finally, here's what P.J. Plauger has to say about fread() in "The Standard C Library":

If the size (second) argument is greater than one, you cannot determine
whether the function also read up to size - 1 additional characters beyond what it reports.
As a rule, you are better off calling the function as fread(buf, 1, size * n, stream); instead of
fread(buf, size, n, stream);

Basically, he's saying that fread()'s interface is broken. For fwrite() he notes that "Write errors are generally rare, so this is not a major shortcoming", a statement I wouldn't agree with.
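A minimal sketch of the calling convention Plauger recommends: read with an element size of 1, then derive the number of complete records and the size of any trailing partial record from the byte count. The helper read_records and its leftover parameter are hypothetical, and the multiplication is not checked for overflow here.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper following the fread(buf, 1, size * n, stream) form.
 * Returns the number of complete records read and stores the number of
 * trailing bytes that belong to an incomplete record in *leftover.
 * Overflow of rec_size * max_recs is not checked in this sketch. */
size_t read_records(void *buf, size_t rec_size, size_t max_recs,
                    FILE *stream, size_t *leftover)
{
    assert(buf != NULL && stream != NULL && leftover != NULL);
    assert(rec_size > 0);

    size_t bytes = fread(buf, 1, rec_size * max_recs, stream);
    *leftover = bytes % rec_size;   /* bytes of a partial record, now visible */
    return bytes / rec_size;
}
```

Unlike fread(buf, size, n, stream), the caller learns exactly how many bytes of a partial record arrived: reading up to three 4-byte records from a 10-byte stream gives 2 records and a leftover of 2.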

长安忆 2024-10-01 15:53:55

It likely goes back to the way file I/O was implemented. Back in the day, it might have been faster to read and write files in blocks than to transfer everything at once.

一抹淡然 2024-10-01 15:53:55

Having separate arguments for size and count could be advantageous on an implementation that can avoid reading any partial records. If one were to use single-byte reads from something like a pipe, then even with fixed-format data one would have to allow for the possibility of a record being split over two reads. If one could instead request, e.g., a non-blocking read of up to 40 records of 10 bytes each when 293 bytes are available, and have the system return 290 bytes (29 whole records) while leaving 3 bytes ready for the next read, that would be much more convenient.

I don't know to what extent implementations of fread can handle such semantics, but they could certainly be handy on implementations that could promise to support them.
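Without such a guarantee, the caller has to do the bookkeeping. The sketch below only emulates the "whole records now, partial tail later" behaviour in user code; it says nothing about non-blocking I/O, and every name in it (rec_reader, rec_reader_fill, rec_reader_consume, REC_SIZE, MAX_RECS) is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define REC_SIZE 10
#define MAX_RECS 40

/* Hypothetical reader state: `pending` bytes are held at the start of `buf`. */
struct rec_reader {
    unsigned char buf[REC_SIZE * MAX_RECS];
    size_t pending;
};

/* Performs one byte-wise fread and returns the number of complete records
 * now at the start of r->buf. The caller must consume those records with
 * rec_reader_consume() before filling again. */
size_t rec_reader_fill(struct rec_reader *r, FILE *stream)
{
    assert(r != NULL && stream != NULL);
    assert(r->pending < REC_SIZE);   /* only a partial record may remain */

    size_t got = fread(r->buf + r->pending, 1,
                       sizeof r->buf - r->pending, stream);
    r->pending += got;
    return r->pending / REC_SIZE;
}

/* Drops `nrecs` complete records and moves the partial tail to the front. */
void rec_reader_consume(struct rec_reader *r, size_t nrecs)
{
    size_t used = nrecs * REC_SIZE;
    assert(r != NULL && used <= r->pending);
    memmove(r->buf, r->buf + used, r->pending - used);
    r->pending -= used;
}
```

With 293 bytes in the stream, the first fill reports 29 records; after consuming them, 3 bytes remain pending for the next read.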

甜味超标? 2024-10-01 15:53:55

I think it is because C lacks function overloading. If it had overloading, the size argument would be redundant. But in C the function can't determine the size of an array element; you have to specify it.

Consider this:

int intArray[10];
fwrite(intArray, sizeof(int), 10, fd);

If fwrite accepted number of bytes, you could write the following:

int intArray[10];
fwrite(intArray, sizeof(int)*10, fd);

But it is just inefficient. You will have sizeof(int) times more system calls.

Another point that should be taken into consideration is that you usually don't want part of an array element to be written to a file. You want the whole integer or nothing. fwrite returns the number of elements successfully written. So if you discovered that only the 2 low bytes of an element had been written, what would you do?

On some systems (due to alignment) you can't access one byte of an integer without creating a copy and shifting.
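The element-count convention described above can be seen in a short round trip. This is a sketch under the assumption that tmpfile() is available; the helper name roundtrip_ints is made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes `n` ints to a temporary file, reads them back
 * into `dst`, and returns how many whole ints came back (0 on setup failure). */
size_t roundtrip_ints(const int *src, int *dst, size_t n)
{
    assert(src != NULL && dst != NULL);

    FILE *f = tmpfile();
    if (f == NULL)
        return 0;

    /* Both calls count elements, not bytes. */
    size_t written = fwrite(src, sizeof *src, n, f);
    rewind(f);
    size_t got = fread(dst, sizeof *dst, written, f);

    fclose(f);
    return got;
}
```

A return value equal to n means every element made it through whole; a smaller value tells the caller how many complete ints are valid in dst.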
