Reading a binary file in Java vs. C++

Posted on 2024-11-28 08:53:31

I have a binary file (about 100 MB) that I need to read in quickly. In C++ I could just load the file into a char pointer and march through it by incrementing the pointer. This of course would be very fast.

Is there a comparably fast way to do this in Java?

Comments (6)

撩人痒 2024-12-05 08:53:31

If you use a memory-mapped file or a regular buffer, you will be able to read the data as fast as your hardware allows.

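import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.LongBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

File tmp = File.createTempFile("deleteme", "bin");
tmp.deleteOnExit();
int size = 1024 * 1024 * 1024;

long start0 = System.nanoTime();
FileChannel fc0 = new FileOutputStream(tmp).getChannel();
ByteBuffer bb = ByteBuffer.allocateDirect(32 * 1024).order(ByteOrder.nativeOrder());

// Write the zero-filled 32 KB buffer repeatedly until the file is 1 GB.
for (int i = 0; i < size; i += bb.capacity()) {
    fc0.write(bb);
    bb.clear();
}
fc0.close();
long time0 = System.nanoTime() - start0;
System.out.printf("Took %.3f ms to write %,d MB using ByteBuffer%n", time0 / 1e6, size / 1024 / 1024);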

long start = System.nanoTime();
FileChannel fc = new FileInputStream(tmp).getChannel();
MappedByteBuffer buffer = fc.map(FileChannel.MapMode.READ_ONLY, 0, size);
LongBuffer longBuffer = buffer.order(ByteOrder.nativeOrder()).asLongBuffer();
long total = 0; // used to prevent a micro-optimisation.
while (longBuffer.remaining() > 0)
    total += longBuffer.get();
fc.close();
long time = System.nanoTime() - start;
System.out.printf("Took %.3f ms to read %,d MB MemoryMappedFile%n", time / 1e6, size / 1024 / 1024);

long start2 = System.nanoTime();
FileChannel fc2 = new FileInputStream(tmp).getChannel();
bb.clear();
while (fc2.read(bb) > 0) {
    // Switch the buffer from filling to draining; without this, remaining() is 0
    // after a full read and the inner loop never runs.
    bb.flip();
    while (bb.remaining() > 0)
        total += bb.get();
    bb.clear();
}
fc2.close();
long time2 = System.nanoTime() - start2;
System.out.printf("Took %.3f ms to read %,d MB File via NIO%n", time2 / 1e6, size / 1024 / 1024);

prints

Took 305.243 ms to write 1,024 MB using ByteBuffer
Took 286.404 ms to read 1,024 MB MemoryMappedFile
Took 155.598 ms to read 1,024 MB File via NIO

This is for a file 10x larger than what you want. It's this fast because the data is being cached in memory (and I have an SSD drive). If you have fast hardware, the data can be read pretty fast. The NIO figure comes from a run whose loop lacked the flip() call, so it covers only the channel reads and not the per-byte pass; treat it as a lower bound.

︶葆Ⅱㄣ 2024-12-05 08:53:31

Sure, you could use a memory-mapped file.

Here are two good links with sample code:

  • Thinking in Java: Memory-Mapped Files
  • Java Tips: How to create a memory-mapped file (http://www.java-tips.org/java-se-tips/java.nio/how-to-create-a-memory-mapped-file-19.html)

If you don't want to go this route, just use an ordinary InputStream (such as a DataInputStream) layered over a BufferedInputStream.
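
For readers who cannot reach the linked samples, here is a minimal sketch of the memory-mapped approach. It is not taken from either link; the class name `MappedRead`, the method `sumBytes`, and the temp-file demo are made up for illustration. It maps the whole file read-only and walks it byte by byte, which is roughly the Java counterpart of incrementing a char pointer.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MappedRead {
    // Hypothetical helper: sums every byte of the file as an unsigned value (0..255).
    static long sumBytes(Path path) throws IOException {
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            // One mapping can cover at most Integer.MAX_VALUE bytes; a 100 MB file fits easily.
            MappedByteBuffer buffer =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            long total = 0;
            while (buffer.hasRemaining()) {
                total += buffer.get() & 0xFF; // get() advances the position by one byte
            }
            return total;
        }
    }

    public static void main(String[] args) throws IOException {
        // Demo on a small temporary file; replace with the real path.
        Path path = Files.createTempFile("mapped-demo", ".bin");
        path.toFile().deleteOnExit();
        Files.write(path, new byte[] {1, 2, 3, (byte) 0xFF});
        System.out.println(sumBytes(path));
    }
}
```

The mapping stays valid after the channel is closed, so the buffer could also be returned to the caller; the sketch keeps everything inside one method for simplicity.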

故事和酒 2024-12-05 08:53:31

Most files will not need memory mapping but can simply be read by the standard Java I/O, especially since your file is so small. A reasonable way to read said files is by using a BufferedInputStream.

InputStream in = new BufferedInputStream(new FileInputStream("somefile.ext"));

Buffering is already optimized in Java for most computers. If you had a larger file, say 100MB, then you would look at optimizing it further.
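
As a rough illustration of how the buffered stream above would be consumed, here is a small sketch. The class `BufferedRead` and the method `countZeroBytes` are hypothetical names, and counting zero bytes is just a stand-in for whatever per-byte work is needed.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class BufferedRead {
    // Hypothetical helper: counts the zero bytes in a file, one byte at a time.
    static long countZeroBytes(String fileName) throws IOException {
        // try-with-resources closes both streams when the block exits.
        try (InputStream in = new BufferedInputStream(new FileInputStream(fileName))) {
            long zeros = 0;
            int b;
            // read() returns 0..255, or -1 at end of stream; most calls are served
            // from the in-memory buffer rather than hitting the file.
            while ((b = in.read()) != -1) {
                if (b == 0) {
                    zeros++;
                }
            }
            return zeros;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(countZeroBytes(args[0]));
    }
}
```

If the default buffer turns out to be too small, the two-argument `BufferedInputStream(InputStream, int)` constructor accepts an explicit buffer size.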

岁月静好 2024-12-05 08:53:31

Reading the file from the disk is going to be the slowest part by miles, so the choice of language is likely to make no difference whatsoever. That applies to this individual operation, of course; the JVM still takes ages to start up, so add that time in.

眼眸印温柔 2024-12-05 08:53:31

Take a look at this blog post here on how to read a binary file into a byte array in Java:

http://www.spartanjava.com/2008/read-a-file-into-a-byte-array/

Copied from link:

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

File file = new File("/somepath/myfile.ext");

// Get the size of the file
long length = file.length();

// Arrays are indexed by int, so a larger file cannot fit in a single byte[]
if (length > Integer.MAX_VALUE) {
    throw new IOException("The file is too big");
}

// Create the byte array to hold the data
byte[] bytes = new byte[(int) length];

// try-with-resources closes the stream even if a read fails
try (InputStream is = new FileInputStream(file)) {
    // read() may return fewer bytes than requested, so loop until the array is full
    int offset = 0;
    int numRead;
    while (offset < bytes.length
           && (numRead = is.read(bytes, offset, bytes.length - offset)) >= 0) {
        offset += numRead;
    }

    // Ensure all the bytes have been read in
    if (offset < bytes.length) {
        throw new IOException("The file was not completely read: " + file.getName());
    }
}
// All file contents are now in the bytes variable

笑看君怀她人 2024-12-05 08:53:31

Using the DataInputStream of the Java SDK can be helpful here. DataInputStream provides methods such as readByte() or readChar(), if that's what is needed.
A simple example:

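import java.io.*;

// The BufferedInputStream keeps each readByte() from hitting the file directly;
// try-with-resources closes the streams when done.
try (DataInputStream dis = new DataInputStream(
        new BufferedInputStream(new FileInputStream("file.dat")))) {
    while (true) {
        byte b = dis.readByte();
        // Do something with the byte
    }
} catch (EOFException eofe) {
    // Stream ended: readByte() signals end of file by throwing EOFException
} catch (IOException ioe) {
    // Input exception
}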

Hope it helps. You can, of course, read the entire stream to a byte array and iterate through it as well...
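
The byte-array variant mentioned above could look something like this sketch, assuming Java 7 or later for `Files.readAllBytes`. The class `ReadAll` and the method `xorChecksum` are illustrative names only, and the XOR is a placeholder for the real per-byte work.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ReadAll {
    // Hypothetical helper: XORs all bytes of the file together.
    static int xorChecksum(Path path) throws IOException {
        // Loads the whole file into memory; fine for ~100 MB given enough heap.
        byte[] data = Files.readAllBytes(path);
        int checksum = 0;
        // Indexing through the array plays the role of the incrementing pointer in C++.
        for (int i = 0; i < data.length; i++) {
            checksum ^= data[i] & 0xFF;
        }
        return checksum;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(xorChecksum(Paths.get(args[0])));
    }
}
```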
