Reading a specific line from a text file in Java
Is there a method to read a specific line from a text file, either in the Java API or in Apache Commons?
Something like:
String readLine(File file, int lineNumber)
I agree it's trivial to implement, but it's not very efficient, especially if the file is very big.
Comments (9)
would do, but it still has the efficiency problem.
Alternatively, you can use:
This will be slightly more efficient due to the buffer.
Take a look at
Scanner.skip(..)
and try skipping whole lines (with a regex). I can't tell whether it will be more efficient; benchmark it. P.S. By efficiency I mean memory efficiency.
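For illustration, here is a minimal sketch of the Scanner.skip(..) idea. It assumes 0-based line numbers and uses the pattern `.*\R` (available from Java 8) to skip one line at a time. The class and method names are made up for this example, and whether this beats a plain read loop is untested, so benchmark it as suggested above.

```java
import java.io.Reader;
import java.util.Scanner;
import java.util.regex.Pattern;

class ScannerSkipSketch {
    // Matches one whole line including its terminator (\R covers \n, \r\n, \r, ...).
    private static final Pattern ONE_LINE = Pattern.compile(".*\\R");

    /**
     * Returns the line at the given 0-based index (an assumption of this sketch).
     * Throws java.util.NoSuchElementException if the input has too few lines.
     */
    static String readLine(Reader source, int lineNumber) {
        try (Scanner scanner = new Scanner(source)) {
            for (int i = 0; i < lineNumber; i++) {
                scanner.skip(ONE_LINE); // discard the line without returning it as a String
            }
            return scanner.nextLine();
        }
    }
}
```

Note that the Scanner still has to read every character before the target line; the only possible saving is in not materialising the skipped lines as String objects.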
Not that I'm aware of.
Be aware that files carry no index of where each line starts, so any utility method would be exactly as efficient as:
(with appropriate error-handling and resource-closing logic, of course).
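The snippet this answer referred to is not shown here. As a rough sketch of what such a straightforward implementation could look like (not necessarily the answerer's code), the version below reads sequentially with a BufferedReader. It assumes UTF-8 text and 0-based line numbers, and returns null when the file is too short; all three are choices made for this example.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

class SequentialLineReader {
    /**
     * Reads the line at the given 0-based index, or returns null if the
     * file has fewer lines. Assumes the file is UTF-8 encoded.
     */
    static String readLine(File file, int lineNumber) throws IOException {
        if (lineNumber < 0) {
            throw new IllegalArgumentException("lineNumber must be >= 0");
        }
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader reader =
                 Files.newBufferedReader(file.toPath(), StandardCharsets.UTF_8)) {
            for (int i = 0; i < lineNumber; i++) {
                if (reader.readLine() == null) {
                    return null; // ran out of lines before reaching the target
                }
            }
            return reader.readLine();
        }
    }
}
```

This stops reading as soon as the target line is found, but everything before it is still read and decoded.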
If the lines are all the same length, you can calculate the byte offset of the target line and seek straight to it.
But when the lines have different lengths, I don't think there's an alternative to reading them one at a time until the line count is correct.
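To make the fixed-length case concrete, here is a small sketch of the offset calculation using RandomAccessFile. It assumes every line holds exactly lineLength bytes followed by a single '\n', that the content is UTF-8, and that line numbers are 0-based. The class name and parameters are invented for the example.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

class FixedLengthLineReader {
    /**
     * Reads line lineNumber (0-based) from a file in which every line is
     * exactly lineLength bytes followed by one '\n' byte (an assumption).
     * Throws java.io.EOFException if the file is too short.
     */
    static String readLine(File file, int lineNumber, int lineLength) throws IOException {
        // Each record occupies lineLength bytes plus one terminator byte.
        long offset = (long) lineNumber * (lineLength + 1);
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            raf.seek(offset);               // jump directly, no scanning
            byte[] buffer = new byte[lineLength];
            raf.readFully(buffer);          // read exactly one line's worth of bytes
            return new String(buffer, StandardCharsets.UTF_8);
        }
    }
}
```

With this layout the cost of a lookup does not depend on the line number, since nothing before the target line is read.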
Unfortunately, unless you can guarantee that every line in the file is the exact same length, you're going to have to read through the whole file, or at least up to the line you're after.
The only way you can count the lines is to look for the new line characters in the file, and this means you're going to have to read each byte.
It will be possible to optimise your code to make it neat and readable, but underneath you'll always be reading the whole file.
If you're going to be reading the same file over and over again, you could parse the file once and create an index storing the byte offsets of certain line numbers, for example where lines 100, 200 and so on start.
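A sketch of such a sparse index is below. It assumes UTF-8 text with '\n' or "\r\n" line endings, 0-based line numbers, and a file that does not change after indexing. LineIndex, stride and the method names are hypothetical, chosen only for this illustration.

```java
import java.io.BufferedInputStream;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

class LineIndex {
    private final Path file;
    private final int stride;
    // offsets.get(k) is the byte offset at which line k * stride starts
    private final List<Long> offsets = new ArrayList<>();

    /** Scans the file once, remembering where every stride-th line starts. */
    LineIndex(Path file, int stride) throws IOException {
        this.file = file;
        this.stride = stride;
        offsets.add(0L); // line 0 starts at byte 0
        try (InputStream in = new BufferedInputStream(Files.newInputStream(file))) {
            long position = 0;
            long linesSeen = 0;
            int b;
            while ((b = in.read()) != -1) {
                position++;
                if (b == '\n' && ++linesSeen % stride == 0) {
                    offsets.add(position); // the next line starts right after this '\n'
                }
            }
        }
    }

    /** Returns the line at the 0-based index, or null if the file is too short. */
    String readLine(int lineNumber) throws IOException {
        if (lineNumber < 0) {
            throw new IllegalArgumentException("lineNumber must be >= 0");
        }
        int slot = lineNumber / stride;
        if (slot >= offsets.size()) {
            return null;
        }
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            channel.position(offsets.get(slot)); // jump to the nearest indexed line
            BufferedReader reader = new BufferedReader(Channels.newReader(channel, "UTF-8"));
            String line = null;
            // Read at most stride lines forward from the indexed position.
            for (int i = 0; i <= lineNumber % stride; i++) {
                line = reader.readLine();
                if (line == null) {
                    return null;
                }
            }
            return line;
        }
    }
}
```

Building the index still costs one full pass over the file, but each later lookup reads at most stride lines. A smaller stride means faster lookups and a larger index.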
Because files are byte-oriented, not line-oriented, any general solution's complexity will be O(n) at best, where n is the file's size in bytes. You have to scan the file and count the line delimiters until you know which part of the file you want to read.
Guava has something similar:
So you can do:
Using File Utils:
If you are going to work with the same file in the same way (looking up the text at a certain line), you can index the file: line number -> offset.
According to this answer, Java 8 enables us to extract specific lines from a file. Examples are provided in that answer.
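The linked answer is not reproduced here, so the following is only a guess at the kind of approach it describes: a minimal Java 8 sketch using Files.lines with skip and findFirst. It assumes UTF-8 text and 0-based line numbers; the class name is made up.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;
import java.util.stream.Stream;

class StreamLineReader {
    /**
     * Returns the line at the given 0-based index, or Optional.empty()
     * if the file has fewer lines. Assumes UTF-8 content.
     */
    static Optional<String> readLine(Path file, long lineNumber) throws IOException {
        // Files.lines is lazy; the stream must be closed to release the file handle.
        try (Stream<String> lines = Files.lines(file, StandardCharsets.UTF_8)) {
            return lines.skip(lineNumber).findFirst();
        }
    }
}
```

This is shorter to write but not faster: the stream still reads the file sequentially up to the requested line.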