Question about EOF and EOL
I am trying to understand EOF and EOL, and how C++ iostream actually works.
While taking input through getchar() or getche() into a char variable, I found that if I write lines like:
char a;
a = getche(); // it returns char '\r' if pressed enter
a = getchar(); // it returns char '\n' if pressed enter
Why these values?
What actually makes C++ think that we have run out of input (i.e. is it always '\n' that makes C++ think that it's at the end of its input)? While reading/writing a file that has some string sentences ending with '\n', what happens if lines end with a NULL character, which also represents an end-of-line?
Could you explain all these briefly with examples?
Comments (3)
Firstly, getche is a non-standard function from conio.h (a DOS/Windows compiler extension, not part of POSIX or ISO C). MSVC deprecates the name in favour of _getche, and most other toolchains do not provide it at all. It's an unbuffered, non-formatted read straight from the console, so when you press Enter you get the carriage return '\r' that the key produces. When you then perform getchar(), you are reading from the buffered stdin stream, where a line ending (\r\n on Windows) arrives as the single character '\n'. This is a C function, too. The rest of my answer will be about C++.
The buffered I/O functions tend to delimit reads by \n, yes. std::getline takes an optional third parameter that lets you change this delimiter. But this is just a delimiter: you may consider that it signifies "End Of Line", but it's certainly not "End Of File".
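As a rough illustration of the delimiter point, here is a minimal sketch that splits a stream on an arbitrary character. The helper name split_stream is hypothetical (it is not part of the standard library); only std::getline and its three-argument overload are real.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: read fields from `in`, treating `delim` as the
// separator instead of the default '\n'.
std::vector<std::string> split_stream(std::istream& in, char delim) {
    std::vector<std::string> fields;
    std::string field;
    // The third argument changes what std::getline stops at. The loop ends
    // when the stream runs out of input, not when it sees a particular byte.
    while (std::getline(in, field, delim)) {
        fields.push_back(field);
    }
    return fields;
}
```

For example, calling split_stream on a std::istringstream holding "a;b\nc;d" with ';' as the delimiter should yield three fields, and the '\n' stays inside the second one because it is no longer special.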
Null characters don't matter.
The only time null characters are a problem is in C-style char buffer strings with no accompanying length information. The only way to determine such a string's length is to search for the terminating null character (see: strlen), which is problematic if there are other null characters scattered throughout the useful part of the data. If you're passing around a pointer to a char array together with its size as an int, then it can contain as many null characters as you like. When reading characters from a stream, in C or C++, the function you use tells you how many characters were read. So, even if some of them were null characters, it doesn't matter. You can handle them as you see fit.
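To make the "the function tells you how many characters were read" point concrete, here is a small sketch using std::istream::read together with gcount. The helper name read_block is an assumption made for this example.

```cpp
#include <cstddef>
#include <cstring>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: read up to `max_bytes` raw bytes from `in` and keep
// exactly the bytes that arrived, embedded null characters included.
std::string read_block(std::istream& in, std::size_t max_bytes) {
    std::string buffer(max_bytes, '\0');
    in.read(&buffer[0], static_cast<std::streamsize>(max_bytes));
    // gcount() reports how many characters the last unformatted read
    // extracted, so the length never depends on searching for '\0'.
    buffer.resize(static_cast<std::size_t>(in.gcount()));
    return buffer;
}
```

If the stream holds the six bytes 'a', 'b', '\0', 'c', 'd', '\n', the returned std::string has size 6, while std::strlen on its c_str() reports only 2 because it stops at the first null character.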
I didn't quite understand this question, but I'll wrap up my answer by briefly describing End Of File.
Historically, some systems stored a physical control character at the end of a file's contents to represent End Of File (CP/M and DOS text files used ^Z, \032; ^D, \004, is the ASCII End Of Transmission character). Nowadays no physical character is used in this manner; instead, the internals of the OS and file system use varying mechanisms to inform your application that there is no more input. The C functions will return the macro EOF, and the C++ stream objects have a state flag that you can check. The detail of precisely how this works is abstracted away from you; you shouldn't have to care about it.
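The state flags mentioned above can be inspected roughly like this. This is only a sketch: the ReadSummary struct and read_ints function are names invented for the example, and the "bad input" test is a rough distinction rather than a complete error-handling scheme.

```cpp
#include <istream>
#include <sstream>

// Hypothetical summary of why a read loop stopped.
struct ReadSummary {
    int values_read;
    bool hit_eof;    // the stream reported end of input
    bool bad_input;  // extraction failed before end of input was seen
};

ReadSummary read_ints(std::istream& in) {
    ReadSummary summary{0, false, false};
    int value;
    // operator>> returns the stream, which converts to false once
    // failbit or badbit is set.
    while (in >> value) {
        ++summary.values_read;
    }
    summary.hit_eof = in.eof();
    summary.bad_input = in.fail() && !in.eof();
    return summary;
}
```

With the input "1 2 3" the loop should stop because the input is exhausted (eof() is true); with "1 x" it stops on the unparsable 'x' and eof() stays false.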
Interestingly, to end input in a Linux console, it's still ^D that you press on the keyboard.
I hope that this has helped you somewhat. Your question wasn't particularly clear, but the above is intended to be a brief description of EOL and EOF in C++.
I can recommend these books and resources for further reading.
You're mixing C and C++.
The C++ way is to read whole lines with std::getline(input_stream, line), using the call itself as the loop condition. std::getline returns input_stream, which converts to boolean false when the input runs out or something else fails. Here, "runs out" means "encounters EOF" or some analogous condition.
You can also pass a delimiter character as a third argument if the default '\n' is not the right line terminator.
EOF is not an ASCII character: in C it is a macro for a negative int that functions such as getchar return when there is no more input. In-band end-of-file characters are historical, relating to early printer protocols and terminal hacks, and the macro itself now matters only when you use getchar or other such antiques.
'\n' is the UNIX standard end-of-line character. Microsoft uses "\r\n", which is two instructions to a printer: move the head to the beginning of the line, and move the paper up a row. UNIX decided that there was no reason that this has to continue into the world of non-printed files and dropped the '\r'.
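A minimal sketch of the line-reading loop described in this answer follows. The function name read_lines is hypothetical, and stripping a trailing '\r' is an extra step added here to cope with "\r\n" files read on a system that does not translate them; it is not something std::getline does for you.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: collect every line of `input_stream`.
std::vector<std::string> read_lines(std::istream& input_stream) {
    std::vector<std::string> lines;
    std::string line;
    // std::getline returns the stream; the loop ends when the stream
    // converts to false (end of input or another failure).
    while (std::getline(input_stream, line)) {
        // Assumption for this sketch: drop a trailing '\r' left over from
        // a "\r\n" line ending.
        if (!line.empty() && line.back() == '\r') {
            line.pop_back();
        }
        lines.push_back(line);
    }
    return lines;
}
```

Note that the last line is still returned even when the input does not end with '\n'; the loop only stops on the following call, when nothing at all could be read.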
getchar and getche should get one character at a time. There should be no concept of the "end of a line." If you end your line with a NULL character, you should get that back as the character you read.
When the end of the file is reached, you will get the special EOF value as your return value. Compare against it to detect the end of the file, and store the result in an int rather than a char so that EOF stays distinguishable from real characters. If you get a '\n' or a NULL, you can parse that as is appropriate to your file (i.e. handle it as the end of a line of text).
http://www.cplusplus.com/reference/clibrary/cstdio/getchar/
(not sure I've ever used this one)
http://msdn.microsoft.com/en-us/library/kswce429(v=vs.80).aspx
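The "match the return value against EOF" advice can be sketched as below. It uses std::fgetc on an arbitrary FILE* so that it can be tried on any stream; getchar() behaves like the same read on stdin. The CharCounts struct and count_chars function are names made up for this example.

```cpp
#include <cstdio>

// Hypothetical tally of what a character-by-character read encountered.
struct CharCounts {
    long total;
    long newlines;
    long nulls;
};

CharCounts count_chars(std::FILE* stream) {
    CharCounts counts{0, 0, 0};
    int ch;  // int, not char: EOF must not collide with a valid character
    while ((ch = std::fgetc(stream)) != EOF) {
        ++counts.total;
        if (ch == '\n') {
            ++counts.newlines;  // an ordinary character, not end of input
        } else if (ch == '\0') {
            ++counts.nulls;     // likewise just another byte
        }
    }
    return counts;
}
```

Calling count_chars(stdin) would count characters typed at the console until end of input is signalled; neither '\n' nor a null character stops the loop.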