How to combine images in Java without loading them into RAM
I have a very large (around a gigapixel) image I am trying to generate, and so far I can only create images up to around 40 megapixels in a BufferedImage before I get an out of memory error. I want to construct the image piece by piece, then combine the pieces without loading the images into memory. I could also do this by writing each piece to a file, but ImageIO does not support this.
Comments (4)
I think JAI can help you build what you want. I would suggest looking at the data structures and streams offered by JAI. Also, have a look at these questions; they might help you with ideas.
You basically want to reverse 2 there.
Good luck with your project ;)
Not a proper solution, just a sketch.
Unpacking a piece of an image is not easy when the image is compressed. You can use an external tool to decompress the image into some trivial format (xpm, uncompressed tiff). Because such a format is so straightforward, you could then load pieces of the image as byte arrays and create Image instances out of the raw data.
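As a rough illustration of the idea above, here is a minimal sketch that reads one rectangular window out of an uncompressed file and turns it into a BufferedImage. The file layout (no header, packed R, G, B bytes in row-major order) and the name `RawRegionReader` are assumptions made for the example; a real xpm or tiff file would need its header handled first.

```java
import java.awt.image.BufferedImage;
import java.io.IOException;
import java.io.RandomAccessFile;

class RawRegionReader {
    /**
     * Reads a w x h window whose top-left corner is (x0, y0) from a headerless
     * file of packed R, G, B bytes (an assumed layout, not a standard format).
     * imageWidth is the width in pixels of the full image stored in the file.
     */
    static BufferedImage readRegion(String path, int imageWidth,
                                    int x0, int y0, int w, int h) throws IOException {
        BufferedImage region = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        byte[] row = new byte[w * 3];
        int[] pixels = new int[w];
        try (RandomAccessFile file = new RandomAccessFile(path, "r")) {
            for (int y = 0; y < h; y++) {
                // Jump straight to the first byte of this row of the window.
                file.seek(((long) (y0 + y) * imageWidth + x0) * 3);
                file.readFully(row);
                for (int x = 0; x < w; x++) {
                    pixels[x] = (row[3 * x] & 0xFF) << 16
                              | (row[3 * x + 1] & 0xFF) << 8
                              | (row[3 * x + 2] & 0xFF);
                }
                region.setRGB(0, y, w, 1, pixels, 0, w);
            }
        }
        return region;
    }
}
```

Only one row of the window is held as raw bytes at a time, so memory use depends on the window size rather than on the size of the full image.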
I see two easy solutions. Create a custom binary format for your image. For saving, just generate one part at a time, seek() to the appropriate spot in the file, then offload your data. For loading, seek() to the appropriate spot in the file, then load your data.
The other solution is to learn an image format yourself. bmp is uncompressed, and it is about the only format that is easy to learn. Once learned, the above steps work quite well.
Remember to convert your image to a byte array for easy storage.
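A minimal sketch of the custom-format idea, assuming a made-up layout: an 8-byte header (width and height as ints) followed by 3 bytes (R, G, B) per pixel in row-major order. The class name `RawCanvas` and its methods are hypothetical; the only library calls used are from `java.io.RandomAccessFile`.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

/** Hypothetical raw image file: 8-byte header, then packed RGB bytes, row-major. */
class RawCanvas implements AutoCloseable {
    static final int HEADER_BYTES = 8;
    static final int BYTES_PER_PIXEL = 3;

    private final RandomAccessFile file;
    private final int width;

    RawCanvas(String path, int width, int height) throws IOException {
        this.width = width;
        file = new RandomAccessFile(path, "rw");
        // Reserve the full size up front; pixels are filled in tile by tile.
        file.setLength(HEADER_BYTES + (long) width * height * BYTES_PER_PIXEL);
        file.seek(0);
        file.writeInt(width);
        file.writeInt(height);
    }

    private long offsetOf(int x, int y) {
        return HEADER_BYTES + ((long) y * width + x) * BYTES_PER_PIXEL;
    }

    /** Saves one tile; rgb holds tileWidth * tileHeight * 3 bytes, row-major. */
    void writeTile(int x0, int y0, int tileWidth, int tileHeight, byte[] rgb) throws IOException {
        int rowBytes = tileWidth * BYTES_PER_PIXEL;
        for (int row = 0; row < tileHeight; row++) {
            file.seek(offsetOf(x0, y0 + row));
            file.write(rgb, row * rowBytes, rowBytes);
        }
    }

    /** Loads one tile back as packed RGB bytes. */
    byte[] readTile(int x0, int y0, int tileWidth, int tileHeight) throws IOException {
        int rowBytes = tileWidth * BYTES_PER_PIXEL;
        byte[] rgb = new byte[rowBytes * tileHeight];
        for (int row = 0; row < tileHeight; row++) {
            file.seek(offsetOf(x0, y0 + row));
            file.readFully(rgb, row * rowBytes, rowBytes);
        }
        return rgb;
    }

    @Override
    public void close() throws IOException {
        file.close();
    }
}
```

Offsets are computed as `long` because a gigapixel image at 3 bytes per pixel is roughly 3 GB, which does not fit in an `int`.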
If there is no way to do it built into Java (for your sake I hope this is not the case and that someone answers saying so), then you will need to implement an algorithm yourself, as others have commented here.
You do not necessarily need to understand the entire algorithm yourself. If you take a pre-existing algorithm, you could just modify it to load the file as a byte stream, create a byte buffer to keep reading chunks of the file, and modify the algorithm to accept this data a chunk at a time.
Some formats, such as jpg, might not be possible to handle with a linear stream of file chunks in this manner. As @warren suggested, bmp is probably the easiest to handle this way, since the file is just a header of a known size followed by the raw pixel data in binary (with each row padded to a multiple of 4 bytes). So you would load the sub-images that need to be combined logically 1 at a time, read the next line of data, save that out to your binary output stream, and so on. (You could actually multithread this and load the next data concurrently to speed it up, as this process is going to take a long time.)
You might even need to load the sub-images multiple times. For example, imagine an image being saved which is made up of 4 sub-images in a 2x2 grid. You might need to load image 1, read its first line of data, save that to your new file, release image 1, load image 2, read its first line of data, save, release 2, load 1 to read its 2nd line of data, and so on. You would be more likely to need to do this if you use a compressed image format for saving in.
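The interleaving order described above can be sketched as follows. The `TileRows` callback is a hypothetical stand-in for "load sub-image, read one line, release it"; how a line is actually fetched from each sub-image file is left open. The output here is simply packed ints written sequentially, not a real image format.

```java
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;

class RowInterleaver {
    /** Hypothetical callback: returns scanline y of the tile at (tileCol, tileRow). */
    interface TileRows {
        int[] readRow(int tileCol, int tileRow, int y) throws IOException;
    }

    /**
     * Writes the combined image strictly top to bottom, left to right.
     * For each output scanline, every tile in the current grid row is
     * visited once, so each tile is revisited tileHeight times in total.
     */
    static void stitch(TileRows tiles, int cols, int rows, int tileHeight,
                       OutputStream out) throws IOException {
        DataOutputStream data = new DataOutputStream(out);
        for (int tileRow = 0; tileRow < rows; tileRow++) {
            for (int y = 0; y < tileHeight; y++) {
                for (int tileCol = 0; tileCol < cols; tileCol++) {
                    for (int pixel : tiles.readRow(tileCol, tileRow, y)) {
                        data.writeInt(pixel);
                    }
                }
            }
        }
        data.flush();
    }
}
```

Because the output is written in one forward pass, this ordering also works when the destination does not allow seeking, at the cost of revisiting each sub-image many times.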
To suggest bmp again: because bmp is not compressed, every pixel sits at a predictable offset in the file. Assuming the output file is opened in a manner which provides random access, you could skip around in the file you're saving, so that you can completely read 1 sub-image and save all of its data before moving on to the next one. That might save run time, but the uncompressed output file might also be terribly large.
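A minimal sketch of writing a bmp row by row with random access, assuming the common 24-bit uncompressed variant: a 54-byte header, pixels stored as B, G, R bytes, rows padded to a multiple of 4 bytes and stored bottom-up. The class name `BmpRowWriter` is hypothetical. Note that the header stores the file size in 32 bits, so this sketch rejects images larger than about 4 GB.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class BmpRowWriter implements AutoCloseable {
    private static final int HEADER_BYTES = 54; // 14-byte file header + 40-byte info header
    private final RandomAccessFile file;
    private final int width, height, stride;
    private final byte[] rowBuffer;

    BmpRowWriter(String path, int width, int height) throws IOException {
        this.width = width;
        this.height = height;
        stride = (width * 3 + 3) / 4 * 4; // each row is padded to a multiple of 4 bytes
        long pixelBytes = (long) stride * height;
        if (HEADER_BYTES + pixelBytes > 0xFFFFFFFFL) {
            throw new IllegalArgumentException("too large for a 32-bit size field");
        }
        rowBuffer = new byte[stride];
        ByteBuffer h = ByteBuffer.allocate(HEADER_BYTES).order(ByteOrder.LITTLE_ENDIAN);
        h.put((byte) 'B').put((byte) 'M');
        h.putInt((int) (HEADER_BYTES + pixelBytes)); // total file size
        h.putInt(0);                                 // reserved
        h.putInt(HEADER_BYTES);                      // offset of the pixel data
        h.putInt(40);                                // info header size
        h.putInt(width);
        h.putInt(height);                            // positive height: rows stored bottom-up
        h.putShort((short) 1);                       // colour planes
        h.putShort((short) 24);                      // bits per pixel
        h.putInt(0);                                 // no compression
        h.putInt((int) pixelBytes);
        h.putInt(2835).putInt(2835);                 // about 72 DPI
        h.putInt(0).putInt(0);                       // no palette
        file = new RandomAccessFile(path, "rw");
        file.setLength(HEADER_BYTES + pixelBytes);
        file.seek(0);
        file.write(h.array());
    }

    /** Writes scanline y (0 = top of the image); rgb holds packed 0xRRGGBB values. */
    void writeRow(int y, int[] rgb) throws IOException {
        for (int x = 0; x < width; x++) {
            rowBuffer[3 * x] = (byte) rgb[x];             // blue
            rowBuffer[3 * x + 1] = (byte) (rgb[x] >> 8);  // green
            rowBuffer[3 * x + 2] = (byte) (rgb[x] >> 16); // red
        }
        file.seek(HEADER_BYTES + (long) (height - 1 - y) * stride);
        file.write(rowBuffer);
    }

    @Override
    public void close() throws IOException {
        file.close();
    }
}
```

Rows can be written in any order, so one sub-image can be finished completely before the next is opened. Writing a sub-image that covers only part of a row would need an extra x offset in the `seek` call, which this sketch leaves out.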
And I could go on. There are likely to be multiple pitfalls, optimizations, and so on.
Instead of saving 1 huge file which is the result of combining other files, what if you created a new image file format which was merely made up of meta-data allowing it to reference other files in a way which combined them logically without actually creating 1 massive file? Whether or not creating a new image file format is an option depends on your software; if you are expecting people to take these images to use in other software, then this would not work - at least, not unless you could get your new image file format to catch on and become standard.
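To make the metadata-only idea a little more concrete, here is a minimal sketch of what such a manifest could hold: the tile size, the number of tile columns, and the tile file names in row-major order. Everything here (the class `TileManifest`, its fields, the assumption that all tiles have the same size) is hypothetical and only illustrates how a pixel coordinate could be mapped to a file without building the combined image.

```java
import java.util.List;

/** Hypothetical manifest that describes a large image as a grid of equally sized tile files. */
class TileManifest {
    final int tileWidth;
    final int tileHeight;
    final int cols;
    final List<String> tileFiles; // row-major: index = tileRow * cols + tileCol

    TileManifest(int tileWidth, int tileHeight, int cols, List<String> tileFiles) {
        this.tileWidth = tileWidth;
        this.tileHeight = tileHeight;
        this.cols = cols;
        this.tileFiles = tileFiles;
    }

    /** Name of the tile file that contains global pixel (x, y). */
    String fileFor(int x, int y) {
        return tileFiles.get((y / tileHeight) * cols + (x / tileWidth));
    }

    /** Coordinates of global pixel (x, y) inside its own tile. */
    int localX(int x) { return x % tileWidth; }
    int localY(int y) { return y % tileHeight; }
}
```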