Why use unsigned chars for writing to binary files? And why shouldn't stream operators be used to write to binary files?

Posted on 2024-12-04 22:04:23

My first question is, why is it customary to use unsigned chars for writing to files in binary mode? In all of the examples I have seen, any other numerical value is cast to unsigned char before being written to the binary file.

My second question is, what's so bad about using stream operators to write to binary files? I've heard that the read() and write() functions are best for binary files, but I don't really understand why that's the case. Using stream operators to write to binary files works fine for me IF I first cast the value to unsigned char.

float num = 500.5f;
std::ofstream file("file.txt", std::ios::binary);  // needs #include <fstream>

file << num;  // results in gibberish when I try to read the file later
file << (unsigned char)num;  // no problems reading the file with stream operators

Thanks in advance.

Comments (3)

书信已泛黄 2024-12-11 22:04:23

char is the smallest type in C/C++ (by definition, sizeof(char) == 1). It's the usual way to view an object as a sequence of bytes. unsigned is used to keep signed arithmetic from getting in the way, and because it best represents binary contents (a value between 0 and 255, assuming 8-bit bytes).

To operate on binary files, streams provide the read and write functions. The insertion and extraction operators are formatted. It's working for you just by chance: for instance, if you output an integer with <<, it actually outputs the textual representation of the integer value, not its binary representation. In the example you provided, you cast a float to an unsigned char before outputting it, which converts the real value to a small integer. What do you get when you try to read the float back from the file?
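As a rough illustration of the read/write approach described above, here is a minimal sketch that stores a float as its raw bytes and reads it back. The helper names write_float_binary and read_float_binary and the path parameter are hypothetical, chosen only for this example; the sketch also assumes the file is read back on the same platform that wrote it (same float layout and byte order).

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: writes the object representation of `value`
// (sizeof(float) raw bytes) with the unformatted write() function.
bool write_float_binary(const std::string& path, float value) {
    std::ofstream out(path, std::ios::binary);
    out.write(reinterpret_cast<const char*>(&value), sizeof value);
    return static_cast<bool>(out);
}

// Hypothetical helper: reads sizeof(float) raw bytes back into `value`
// with the unformatted read() function.
bool read_float_binary(const std::string& path, float& value) {
    std::ifstream in(path, std::ios::binary);
    in.read(reinterpret_cast<char*>(&value), sizeof value);
    return static_cast<bool>(in);
}
```

Calling write_float_binary("file.bin", 500.5f) and then read_float_binary("file.bin", x) should leave x equal to 500.5f, since no text formatting or narrowing conversion happens along the way.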

撧情箌佬 2024-12-11 22:04:23

Because all the overloads of operator<< are formatted output functions: they format the data before writing it to the output file. In other words, they are the wrong tool if you want to write binary data to a file. Binary data can be written with unformatted functions, which do not format the data.

std::ostream provides one unformatted output function called write(), with the following signature:

ostream& write ( const char* s , streamsize n );

which also answers the other question:

why is it customary to use unsigned chars for writing to files in binary mode?

No, that premise is wrong. The function write() accepts const char*, not const unsigned char*.

--

The online documentation says this about operator<<:

This operator (<<) applied to an output stream is known as insertion operator. It performs an output operation on a stream generally involving some sort of formatting of the data (like for example writing a numerical value as a sequence of characters).

and this about write():

This is an unformatted output function and what is written is not necessarily a c-string, therefore any null-character found in the array s is copied to the destination and does not end the writing process.
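To make the difference concrete, here is a small sketch that sends the same buffer through write() and through operator<<. The function names via_write and via_insertion are made up for this example, and an in-memory std::ostringstream stands in for a file. Note that a buffer of unsigned char has to be cast to const char* before it can be passed to write().

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: copies exactly n bytes with the unformatted write(),
// including any zero bytes inside the buffer.
std::string via_write(const unsigned char* data, std::size_t n) {
    std::ostringstream out;
    out.write(reinterpret_cast<const char*>(data),
              static_cast<std::streamsize>(n));
    return out.str();
}

// Hypothetical helper: operator<< treats a const char* as a C string,
// so output stops at the first zero byte.
std::string via_insertion(const char* data) {
    std::ostringstream out;
    out << data;
    return out.str();
}
```

For a buffer such as {0x41, 0x00, 0x42, 0xFF}, via_write keeps all four bytes, while via_insertion stops after the first one.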

回心转意 2024-12-11 22:04:23

The reason to use unsigned char is that it is guaranteed to be unsigned, which is very much desirable when it comes to bitwise operations -- and those come in handy when manipulating binary data. Keep in mind that char (also known as plain char) is a separate type from unsigned char, and it is not specified whether it is a signed or an unsigned type.
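A short sketch of why signedness matters for bitwise work: when a byte with its high bit set is held in a signed char, it is negative, and widening it fills the upper bits with ones. The two function names below are hypothetical and exist only for this illustration.

```cpp
// Hypothetical helper: combine two bytes into one value, high byte first.
// With unsigned char, each byte widens to a value in 0..255.
unsigned int combine_unsigned(unsigned char hi, unsigned char lo) {
    return (static_cast<unsigned int>(hi) << 8) | lo;
}

// Same operation with signed char: a negative low byte converts to a large
// unsigned value whose upper bits are all set, clobbering the high byte.
unsigned int combine_signed(signed char hi, signed char lo) {
    return (static_cast<unsigned int>(hi) << 8) | static_cast<unsigned int>(lo);
}
```

combine_unsigned(0x12, 0xF0) yields 0x12F0, whereas combine_signed(0x12, -16), which carries the same low-byte bit pattern on a two's-complement machine, does not.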

Finally, the formatted functions of streams are designed to output/parse a textual, human-readable representation of data, where for instance 123456789 could [1] be represented as the nine characters "123456789", which can fit in nine bytes. For comparison, a possible binary representation as 0x75BCD15 can fit in four bytes, which is more than twice as compact.
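The size comparison above can be checked with a small sketch. The function names formatted_size and unformatted_size are invented for this example, an in-memory std::ostringstream stands in for a file, and the default "C" locale is assumed so that no digit grouping is inserted.

```cpp
#include <cstddef>
#include <cstdint>
#include <sstream>

// Hypothetical helper: number of bytes produced by formatted output.
std::size_t formatted_size(std::uint32_t v) {
    std::ostringstream out;
    out << v;  // writes the decimal digits as text
    return out.str().size();
}

// Hypothetical helper: number of bytes produced by unformatted output.
std::size_t unformatted_size(std::uint32_t v) {
    std::ostringstream out;
    out.write(reinterpret_cast<const char*>(&v), sizeof v);  // raw bytes
    return out.str().size();
}
```

For 123456789, formatted_size gives nine bytes and unformatted_size gives four, matching the figures in the paragraph above.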

It is not entirely unexpected that what you're doing succeeds, since whether something is a binary file or not is simply determined by what you're doing with it. If you're writing text to the file, it is normal to retrieve that text back later on.

[1] Depending on e.g. locales, which is another feature specific to formatted functions.
