Java NIO MappedByteBuffer and map: reading parts of a file in chunks

Posted 2024-11-19 05:37:54 · 723 words · 4 views · 0 comments

I want to read parts of a large file in a loop. I originally tried to read the entire file at once, but that didn't work: I got an exception saying the file is too large. I changed my code to the listing below, but it only reads the first chunk. What do I need to change to move on to the next chunk?

    final FileInputStream fis = new FileInputStream(f);
    final FileChannel fc = fis.getChannel();
    final long sizeRead = fc.size() < defaultReadBufferSize ? fc.size() : defaultReadBufferSize;
    final MappedByteBuffer bb = fc.map(FileChannel.MapMode.READ_ONLY, 0, sizeRead);
    while (bb.hasRemaining()) {
        final CharBuffer cb = decoder.decode(bb);
        this.search(f, cb);
        System.out.println("============>" + cb.length());
        System.out.println("============>" + bb.hasRemaining());
    }
    fc.close();


Comments (1)

心病无药医 2024-11-26 05:37:54

The problem is that character-encoded data cannot be accessed at arbitrary byte offsets; you need to know where the boundaries between characters are.

The cost of accessing the file and decoding its characters is likely to far outweigh any difference between the ways of reading it, so I would use a BufferedReader, which is also much simpler.
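As a rough sketch of the BufferedReader approach: the reader does the decoding, so each chunk handed to the search always holds whole characters. The class name `ReaderChunks`, the method `forEachChunk`, and the chunk size are made up for illustration, and the demo reads from a `StringReader` so that it runs without a file. A match that straddles two chunks would still be missed unless the search keeps some overlap.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.nio.CharBuffer;
import java.util.function.Consumer;

class ReaderChunks {
    /**
     * Reads everything from the reader in chunks of at most chunkChars characters
     * and passes each chunk to the handler. Returns the total number of chars read.
     * The CharBuffer wraps a reused array, so the handler must not keep a reference to it.
     */
    static long forEachChunk(Reader in, int chunkChars, Consumer<CharBuffer> handler)
            throws IOException {
        BufferedReader reader =
                (in instanceof BufferedReader) ? (BufferedReader) in : new BufferedReader(in);
        char[] buf = new char[chunkChars];
        long total = 0;
        int n;
        // read() returns -1 at end of stream, otherwise the number of chars placed in buf
        while ((n = reader.read(buf, 0, buf.length)) != -1) {
            handler.accept(CharBuffer.wrap(buf, 0, n));
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // For a real file, a reader could come from
        // Files.newBufferedReader(path, StandardCharsets.UTF_8) instead.
        try (Reader in = new StringReader("h\u00e9llo w\u00f6rld")) {
            long total = forEachChunk(in, 4, cb -> System.out.println("chunk: " + cb));
            System.out.println("total chars: " + total);
        }
    }
}
```

In the question's code, the `handler` would be the place to call `search(f, cb)`.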

For example, say you want to start reading at the 1000th byte. You can do that, but you won't know whether that byte falls in the middle of a multi-byte character.
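To make the boundary problem concrete, here is a small in-memory sketch, assuming UTF-8 input. `BoundaryDemo` and its two methods are hypothetical names. The first method decodes the two halves of a byte array independently; the second reuses a single `CharsetDecoder` and carries undecoded trailing bytes into the next step, which is one way to decode chunk by chunk without corrupting a split character.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class BoundaryDemo {
    /** Decodes each half on its own: a character split by the cut is corrupted. */
    static String decodeNaively(byte[] data, int cut) {
        String first = new String(data, 0, cut, StandardCharsets.UTF_8);
        String second = new String(data, cut, data.length - cut, StandardCharsets.UTF_8);
        return first + second;
    }

    /** Decodes in two steps with one decoder, carrying leftover bytes across the cut. */
    static String decodeCarryingOver(byte[] data, int cut) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        // UTF-8 never produces more chars than it has bytes, so this is large enough.
        CharBuffer out = CharBuffer.allocate(data.length);
        ByteBuffer in = ByteBuffer.allocate(data.length);

        in.put(data, 0, cut);
        in.flip();
        // endOfInput=false: an incomplete trailing sequence is left unread in `in`.
        decoder.decode(in, out, false);
        in.compact(); // move any leftover bytes to the front, ready for more input

        in.put(data, cut, data.length - cut);
        in.flip();
        decoder.decode(in, out, true); // endOfInput=true: this is the last piece
        decoder.flush(out);

        out.flip();
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "a\u20acb"; // 'a', the 3-byte euro sign, 'b'
        byte[] data = text.getBytes(StandardCharsets.UTF_8);
        // Cutting after byte 2 lands inside the euro sign.
        System.out.println(decodeNaively(data, 2).equals(text));      // false
        System.out.println(decodeCarryingOver(data, 2).equals(text)); // true
    }
}
```

The same carry-over idea would apply to successive mapped regions of a file, with each region's leftover bytes prepended to the next.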

If you can assume every character is a single byte (e.g. ASCII or ISO-8859-1), the whole issue is much simpler: you don't need a CharBuffer and can access the ByteBuffer directly, which would be much faster.
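For the single-byte case, moving to the next chunk comes down to calling `map` again with a new position instead of always mapping from offset 0. The sketch below counts occurrences of one byte as a stand-in for the real search; `MappedChunks`, `countByte`, and the chunk size are illustrative names, not part of the question's code.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MappedChunks {
    /**
     * Counts occurrences of target by mapping the file one chunk at a time.
     * chunkSize must be positive and at most Integer.MAX_VALUE, the limit
     * FileChannel.map places on a single mapping.
     */
    static long countByte(Path file, byte target, long chunkSize) throws IOException {
        long count = 0;
        try (FileChannel fc = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = fc.size();
            // Advance the mapping position by one chunk per iteration.
            for (long pos = 0; pos < size; pos += chunkSize) {
                long len = Math.min(chunkSize, size - pos); // last chunk may be shorter
                MappedByteBuffer bb = fc.map(FileChannel.MapMode.READ_ONLY, pos, len);
                while (bb.hasRemaining()) {
                    if (bb.get() == target) {
                        count++;
                    }
                }
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("mapped-chunks", ".txt");
        tmp.toFile().deleteOnExit();
        Files.write(tmp, "one\ntwo\nthree\n".getBytes(StandardCharsets.US_ASCII));
        // A deliberately tiny chunk size to force several mappings.
        System.out.println("newlines: " + countByte(tmp, (byte) '\n', 4));
    }
}
```

A search for a multi-byte pattern would additionally need to handle matches that span two chunks, for example by overlapping consecutive mappings by the pattern length minus one.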
