Using BufferedReader's ready() method in a while loop to detect EOF?
I have a large file (the English Wikipedia articles-only database, as an XML file). I am reading it one character at a time using a BufferedReader. The pseudocode is:
file = BufferedReader...
while (file.ready())
    character = file.read()
Is this actually valid? Or will ready() just return false while it is waiting for the HDD to return data, rather than only when EOF has been reached? I tried to use if (file.read() == -1) instead, but seemed to run into an infinite loop that I literally could not find.
I am just wondering whether it is reading the whole file, because my statistics say 444,380 Wikipedia pages have been read, but I thought there were many more articles.
Comments (2)
The Reader.ready() method is not intended to be used to test for end of file. Rather, it is a way to test whether calling read() will block.
The correct way to detect that you have reached EOF is to examine the result of a read call.
For example, if you are reading a character at a time, the read() method returns an int which will either be a valid character or -1 if you've reached the end of file. Thus, your loop should keep calling read() until it returns -1, rather than testing ready().
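The code from this answer did not survive on this page, so what follows is only a minimal sketch of the read-until--1 loop described above, not the answerer's original snippet. The class name ReadUntilEof, the method countOccurrences, and the StringReader stand-in for the real file are all assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class ReadUntilEof {

    // Hypothetical helper: counts how often `target` appears in the input.
    // The loop ends only when read() returns -1, which signals end of stream.
    static long countOccurrences(Reader source, char target) throws IOException {
        long count = 0;
        try (BufferedReader reader = new BufferedReader(source)) {
            int value;
            while ((value = reader.read()) != -1) {
                // Cast only after the -1 check, otherwise EOF is lost.
                char character = (char) value;
                if (character == target) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // A StringReader stands in for the real XML file here (assumption).
        Reader sample = new StringReader("<page>example</page>");
        System.out.println(countOccurrences(sample, '<'));
    }
}
```

Note that read() is called exactly once per iteration and its result is stored in an int. Calling read() a second time inside the loop body would silently skip characters.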
This is not guaranteed to read the whole input. ready() just tells you whether the underlying stream has some content ready. If it is abstracting over a network socket or a file, for example, a false result could simply mean that there isn't any buffered data available yet.
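To make the difference concrete, here is a small sketch in which ready() returns false even though the stream has not ended. It uses a PipedWriter/PipedReader pair as a stand-in for a source that has no data available yet. That choice, the class name ReadyIsNotEof, and the method names are assumptions for illustration, not something from the question.

```java
import java.io.IOException;
import java.io.PipedReader;
import java.io.PipedWriter;

public class ReadyIsNotEof {

    // ready() on an open pipe that has not received any data yet.
    static boolean readyBeforeAnyData() throws IOException {
        try (PipedWriter writer = new PipedWriter();
             PipedReader reader = new PipedReader(writer)) {
            return reader.ready();
        }
    }

    // Writes a little text, drains it with a ready()-driven loop, then
    // reports ready() again while the writer is still open (no EOF yet).
    static boolean readyAfterDraining() throws IOException {
        try (PipedWriter writer = new PipedWriter();
             PipedReader reader = new PipedReader(writer)) {
            writer.write("abc");
            while (reader.ready()) {
                reader.read();
            }
            return reader.ready();
        }
    }

    // Only after the writer is closed does read() report end of stream.
    static int readAfterWriterClosed() throws IOException {
        try (PipedWriter writer = new PipedWriter();
             PipedReader reader = new PipedReader(writer)) {
            writer.close();
            return reader.read();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("ready before data: " + readyBeforeAnyData());
        System.out.println("ready after draining: " + readyAfterDraining());
        System.out.println("read after close: " + readAfterWriterClosed());
    }
}
```

In both of the first two cases the writer could still send more data, so a while (ready()) loop would stop early. Only the return value of read() distinguishes "nothing available right now" from "end of stream".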