在 Java 中使用 FileChannels 连接大文件的方法是什么？

发布于 2024-11-08 13:00:53 字数 2497 浏览 0 评论 0原文

我想找出两种方法中哪一种更好，用于在 Java 中连接文本文件。如果有人有一些关于内核级别发生的事情的见解，可以解释这些写入 FileChannel 的方法之间的差异，我将不胜感激。

根据我从文档和其他 Stack Overflow 对话中了解到的情况， allocateDirect 直接在驱动器上分配空间，并且主要避免使用 RAM。我担心如果文件输入很大（例如 1GB），使用 allocateDirect 创建的 ByteBuffer 可能会溢出或无法分配。在我们的软件开发过程中，我保证该文件不会大于 2 GB；但未来它可能会达到 10 或 20GB。

我观察到，transferFrom 循环从来不会多次执行循环...所以它似乎成功地一次写入整个 infile；但我还没有用大于 60MB 的文件对其进行测试。不过，我循环了，因为文档指定不能保证一次会写入多少内容。在我的系统上，transferFrom 只能接受 int32 作为其计数参数，因此我将无法指定一次传输超过 2GB 的数据...同样，内核专业知识将帮助我理解。

预先感谢您的帮助！

使用 ByteBuffer：

boolean concatFiles(StringBuffer sb, File infile, File outfile) {

    FileChannel inChan = null, outChan = null;

    try {

        ByteBuffer buff = ByteBuffer.allocateDirect((int)(infile.length() + sb.length()));
        //write the stringBuffer so it goes in the output file first:
        buff.put(sb.toString().getBytes());

        //create the FileChannels:
        inChan  = new RandomAccessFile(infile,  "r" ).getChannel();
        outChan = new RandomAccessFile(outfile, "rw").getChannel();

        //read the infile in to the buffer:
        inChan.read(buff);

        // prep the buffer:
        buff.flip();

        // write the buffer out to the file via the FileChannel:
        outChan.write(buff);
        inChan.close();
        outChan.close();
     } catch...etc

}

使用 trasferTo（或 transferFrom）：

boolean concatFiles(StringBuffer sb, File infile, File outfile) {

    FileChannel inChan = null, outChan = null;

    try {

        //write the stringBuffer so it goes in the output file first:    
        PrintWriter  fw = new PrintWriter(outfile);
        fw.write(sb.toString());
        fw.flush();
        fw.close();

        // create the channels appropriate for appending:
        outChan = new FileOutputStream(outfile, true).getChannel();
        inChan  = new RandomAccessFile(infile, "r").getChannel();

        long startSize = outfile.length();
        long inFileSize = infile.length();
        long bytesWritten = 0;

        //set the position where we should start appending the data:
        outChan.position(startSize);
        Byte startByte = outChan.position();

        while(bytesWritten < length){ 
            bytesWritten += outChan.transferFrom(inChan, startByte, (int) inFileSize);
            startByte = bytesWritten + 1;
        }

        inChan.close();
        outChan.close();
    } catch ... etc

原文

I want to find out what method is better of two that I have come up with for concatenating my text files in Java. If someone has some insight they can share about what goes on at the kernel level that explains the difference between these methods of writing to a FileChannel, I would greatly appreciate it.

From what I understand from documentation and other Stack Overflow conversations, the allocateDirect allocates space right on the drive, and mostly avoids using RAM. I have a concern that the ByteBuffer created with allocateDirect might have a potential to overflow or not be allocated if the File infile is large, say 1GB. I am guaranteed at this point in the development of our software that the File will be no larger than 2 GB; but there is potential in the future that it might be as big as 10 or 20GB.

I have observed that the transferFrom loop never goes through the loop more than once... so it seems to succeed in writing the entire infile at once; but I haven't tested it with files bigger than 60MB. I looped though, because the documentation specifies that there is no guarantee of how much will be written at once. With transferFrom only able to accept, on my system, an int32 as its count parameter, I won't be able to specify more than 2GB at a time be transferred... Again, kernel expertise would help me understand.

Thanks in advance for your help!!

Using a ByteBuffer:

boolean concatFiles(StringBuffer sb, File infile, File outfile) {

    FileChannel inChan = null, outChan = null;

    try {

        ByteBuffer buff = ByteBuffer.allocateDirect((int)(infile.length() + sb.length()));
        //write the stringBuffer so it goes in the output file first:
        buff.put(sb.toString().getBytes());

        //create the FileChannels:
        inChan  = new RandomAccessFile(infile,  "r" ).getChannel();
        outChan = new RandomAccessFile(outfile, "rw").getChannel();

        //read the infile in to the buffer:
        inChan.read(buff);

        // prep the buffer:
        buff.flip();

        // write the buffer out to the file via the FileChannel:
        outChan.write(buff);
        inChan.close();
        outChan.close();
     } catch...etc

}

Using trasferTo (or transferFrom):

boolean concatFiles(StringBuffer sb, File infile, File outfile) {

    FileChannel inChan = null, outChan = null;

    try {

        //write the stringBuffer so it goes in the output file first:    
        PrintWriter  fw = new PrintWriter(outfile);
        fw.write(sb.toString());
        fw.flush();
        fw.close();

        // create the channels appropriate for appending:
        outChan = new FileOutputStream(outfile, true).getChannel();
        inChan  = new RandomAccessFile(infile, "r").getChannel();

        long startSize = outfile.length();
        long inFileSize = infile.length();
        long bytesWritten = 0;

        //set the position where we should start appending the data:
        outChan.position(startSize);
        Byte startByte = outChan.position();

        while(bytesWritten < length){ 
            bytesWritten += outChan.transferFrom(inChan, startByte, (int) inFileSize);
            startByte = bytesWritten + 1;
        }

        inChan.close();
        outChan.close();
    } catch ... etc

分享到QQ

分享到微博