Java - Is it possible to read a file line by line, stop, and then immediately start reading bytes from where I stopped?

Posted on 2024-08-03 13:44:34

I'm having trouble parsing the ASCII part of a file and then, as soon as I hit the end tag, IMMEDIATELY reading the bytes from that point on. Everything I know of in Java that reads a line or a whole word creates a buffer, which ruins any chance of getting the bytes immediately following my stop point. Is the only way to do this to read byte by byte, find the newlines, reconstruct everything before each newline, check whether it's my end tag, and go from there?

Comments (5)

总攻大人 2024-08-10 13:44:35

It is possible, but as far as I know no class in the standard API does it for you directly.

You can do it manually: open the file as a BufferedInputStream, which supports mark/reset. Call mark before you read, then read block by block (byte[]) and parse each block as ASCII, accumulating the text in a buffer until you hit the marker. The readlimit you pass to mark must be at least the number of bytes you read before calling reset.
Once you have read everything you need from the ASCII part, call reset and then call read to discard exactly the bytes of the ASCII part. You now have a BufferedInputStream (which is an InputStream) positioned at the start of the binary part of the file.
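
A minimal sketch of the mark/reset approach described above, assuming the end tag appears within the first maxHeaderBytes bytes of the stream. The names MarkResetSplit, readHeader and maxHeaderBytes, and the "END\n" tag used in main, are illustrative and not from the original answer; readNBytes needs Java 9 or later.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

final class MarkResetSplit {

    /**
     * Returns the ASCII text before endTag and leaves the stream positioned on
     * the first byte after the tag. Assumption: the tag occurs within the
     * first maxHeaderBytes bytes (a hypothetical, caller-chosen limit).
     */
    static String readHeader(BufferedInputStream in, byte[] endTag, int maxHeaderBytes)
            throws IOException {
        in.mark(maxHeaderBytes);                       // remember the current position
        byte[] block = new byte[maxHeaderBytes];
        int n = in.readNBytes(block, 0, block.length); // Java 9+; n is smaller at EOF
        int tagAt = indexOf(block, n, endTag);
        if (tagAt < 0) {
            throw new IOException("end tag not found in the first " + n + " bytes");
        }
        in.reset();                                    // rewind to the marked position
        int asciiLength = tagAt + endTag.length;
        // Discard the ASCII part, tag included, by reading exactly that many bytes.
        in.readNBytes(new byte[asciiLength], 0, asciiLength);
        return new String(block, 0, tagAt, StandardCharsets.US_ASCII);
    }

    /** Naive search for pattern in data[0..len); returns its index or -1. */
    static int indexOf(byte[] data, int len, byte[] pattern) {
        outer:
        for (int i = 0; i + pattern.length <= len; i++) {
            for (int j = 0; j < pattern.length; j++) {
                if (data[i + j] != pattern[j]) {
                    continue outer;
                }
            }
            return i;
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        byte[] tag = "END\n".getBytes(StandardCharsets.US_ASCII); // hypothetical end tag
        byte[] file = {'h', 'i', '\n', 'E', 'N', 'D', '\n', 42};  // in-memory stand-in for a file
        BufferedInputStream in = new BufferedInputStream(new ByteArrayInputStream(file));
        String header = readHeader(in, tag, 64);
        System.out.println("header length: " + header.length());
        System.out.println("first binary byte: " + in.read());
    }
}
```

If the text part can be longer than any limit you are willing to buffer, this single-mark version does not fit, and a byte-by-byte scan is the simpler choice.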

要走就滚别墨迹 2024-08-10 13:44:35

I think the best idea would be to abandon the concept of "lines". To find the end tag, create a ring buffer that's just big enough to contain the end tag, read into it byte-by-byte, and after each byte check if it contains the tag.

There are more sophisticated and efficient search algorithms, but the difference is only relevant with longer search terms (presumably your end tag is short).
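
A rough sketch of the ring-buffer idea, assuming a short, non-empty tag. RingBufferScan, skipPast and the "<END>" tag in main are made-up names for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class RingBufferScan {

    /**
     * Reads one byte at a time until the most recent tag.length bytes equal
     * tag. Returns the number of bytes consumed (tag included), or -1 if the
     * stream ends first. On success the next read() yields the first byte
     * after the tag.
     */
    static long skipPast(InputStream in, byte[] tag) throws IOException {
        if (tag.length == 0) {
            throw new IllegalArgumentException("tag must not be empty");
        }
        byte[] ring = new byte[tag.length];
        long consumed = 0;
        int b;
        while ((b = in.read()) != -1) {
            ring[(int) (consumed % ring.length)] = (byte) b; // overwrite the oldest byte
            consumed++;
            if (consumed >= ring.length && ringEquals(ring, consumed, tag)) {
                return consumed;
            }
        }
        return -1;
    }

    /** Compares the ring contents, oldest byte first, against tag. */
    private static boolean ringEquals(byte[] ring, long consumed, byte[] tag) {
        int oldest = (int) (consumed % ring.length);
        for (int i = 0; i < tag.length; i++) {
            if (ring[(oldest + i) % ring.length] != tag[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        byte[] tag = "<END>".getBytes(StandardCharsets.US_ASCII); // hypothetical end tag
        InputStream in = new ByteArrayInputStream(new byte[] {'a', '<', 'E', 'N', 'D', '>', 7});
        System.out.println("consumed: " + skipPast(in, tag));
        System.out.println("next byte: " + in.read());
    }
}
```

Wrapping a FileInputStream in a BufferedInputStream keeps the per-byte read() calls cheap. The read-ahead happens inside the stream object, so it does not lose your position as long as you keep reading the binary part from that same object.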

闻呓 2024-08-10 13:44:35

How big is this file? My first thought is to read the whole thing into a ByteBuffer or a ByteArrayOutputStream without trying to process it, then locate the tag by comparing byte values. Once you know where the text part ends and the binary part begins, you process each part as appropriate.
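
One way this could look, assuming the file fits comfortably in memory. WholeFileSplit and its helper methods are hypothetical names, and the "END\n" tag and args[0] path in main are placeholders.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

final class WholeFileSplit {

    /** Returns the index of the first occurrence of tag in data, or -1. */
    static int indexOf(byte[] data, byte[] tag) {
        outer:
        for (int i = 0; i + tag.length <= data.length; i++) {
            for (int j = 0; j < tag.length; j++) {
                if (data[i + j] != tag[j]) {
                    continue outer;
                }
            }
            return i;
        }
        return -1;
    }

    /** Decodes everything before the tag as ASCII text. */
    static String textPart(byte[] data, int tagAt) {
        return new String(data, 0, tagAt, StandardCharsets.US_ASCII);
    }

    /** Copies everything after the tag as raw bytes. */
    static byte[] binaryPart(byte[] data, int tagAt, int tagLength) {
        return Arrays.copyOfRange(data, tagAt + tagLength, data.length);
    }

    public static void main(String[] args) throws IOException {
        byte[] tag = "END\n".getBytes(StandardCharsets.US_ASCII); // hypothetical end tag
        byte[] data = Files.readAllBytes(Paths.get(args[0]));     // whole file in memory
        int tagAt = indexOf(data, tag);
        if (tagAt < 0) {
            System.err.println("end tag not found");
            return;
        }
        System.out.println("text bytes: " + textPart(data, tagAt).length());
        System.out.println("binary bytes: " + binaryPart(data, tagAt, tag.length).length);
    }
}
```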

盛夏已如深秋| 2024-08-10 13:44:35

Is the file growing, or is it static?

If it's static, see http://java.sun.com/javase/6/docs/api/java/nio/MappedByteBuffer.html
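
For a static file, a sketch along these lines might work. MappedSplit and mapPastTag are invented names, the "END\n" tag is a placeholder, and a single mapping is limited to Integer.MAX_VALUE bytes, so this assumes a file under 2 GB.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

final class MappedSplit {

    /**
     * Maps the file read-only and returns the buffer positioned on the first
     * byte after tag, or null if the tag does not occur.
     */
    static MappedByteBuffer mapPastTag(Path file, byte[] tag) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            int tagAt = indexOf(map, tag);
            if (tagAt < 0) {
                return null;
            }
            map.position(tagAt + tag.length);
            return map; // the mapping stays valid after the channel is closed
        }
    }

    /** Naive search using absolute gets, so the buffer position is untouched. */
    static int indexOf(ByteBuffer buf, byte[] tag) {
        outer:
        for (int i = 0; i + tag.length <= buf.limit(); i++) {
            for (int j = 0; j < tag.length; j++) {
                if (buf.get(i + j) != tag[j]) {
                    continue outer;
                }
            }
            return i;
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        byte[] tag = "END\n".getBytes(StandardCharsets.US_ASCII); // hypothetical end tag
        MappedByteBuffer binary = mapPastTag(Paths.get(args[0]), tag);
        System.out.println(binary == null
                ? "end tag not found"
                : "binary bytes: " + binary.remaining());
    }
}
```

Everything before the returned position is the text part, and it can be decoded from the same buffer with absolute gets if needed.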

待"谢繁草 2024-08-10 13:44:35

Yup, you're right about the byte-by-byte. Abstraction has its disadvantages.
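
The byte-by-byte approach from the question could be sketched roughly as follows, assuming the end tag sits alone on a line terminated by "\n" or "\r\n". ByteLineReader, readLine and skipToBinary are illustrative names, and "END" is a placeholder tag.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class ByteLineReader {

    /**
     * Reads up to and including the next '\n' and returns the line without
     * its terminator, or null at end of stream. Nothing past the '\n' is
     * consumed.
     */
    static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\n') {
                return decode(line);
            }
            line.write(b);
        }
        return line.size() == 0 ? null : decode(line); // last line without '\n'
    }

    /** Decodes the collected bytes as ASCII and drops a trailing '\r'. */
    private static String decode(ByteArrayOutputStream line) {
        String s = new String(line.toByteArray(), StandardCharsets.US_ASCII);
        return s.endsWith("\r") ? s.substring(0, s.length() - 1) : s;
    }

    /** Consumes lines until one equals endTag; returns false if none does. */
    static boolean skipToBinary(InputStream in, String endTag) throws IOException {
        String line;
        while ((line = readLine(in)) != null) {
            if (line.equals(endTag)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(
                new byte[] {'k', '=', 'v', '\n', 'E', 'N', 'D', '\n', 5});
        if (skipToBinary(in, "END")) { // "END" is a hypothetical tag
            System.out.println("first binary byte: " + in.read());
        }
    }
}
```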
