How do I blend many textures/buffers into one texture/buffer in OpenGL?

Posted 2025-01-29 07:22:21

I have one big buffer (object) containing the MNIST dataset: many (tens of thousands) small (28x28) grayscale images, stored one-by-one in row-wise order as floats indicating pixel intensity. I would like to efficiently (i.e. somewhat interactively) blend these many images into one "average" image, where each pixel in the blended image is the average of all the pixels at that same position. Is this possible?

The options I considered are:

  1. Using a compute shader directly on the buffer object. I would spawn imgWidth * imgHeight compute shader invocations/threads, with each invocation looping over all images. This doesn't seem very efficient, as each invocation has to loop over all images, but doing it the other way (i.e. spawning numImages invocations and walking over the pixels) still has invocations waiting on each other.

  2. Using the graphics pipeline to draw the textures one-by-one to a framebuffer, blending them all over each other. This would still result in linear time, as each image has to be rendered to the framebuffer in turn. I'm not very familiar with framebuffers, though.

  3. Doing it all linearly in the CPU, which seems easier and not much slower than doing it on the GPU. I would only be missing out on the parallel processing of the pixels.

Are there other possibilities I'm missing? Is there an optimal way? And if not, what do you think would be the easiest?

Comments (2)

安静 2025-02-05 07:22:21

Most of the time we want to parallelize at the pixel level, because that is where the big numbers are.

However, in your case there are not that many pixels (28x28 = 784).

The biggest number you have seems to be the number of images (thousands of images). So we would like to leverage that.

Using a compute shader, instead of iterating through all the images, you could blend the images in pairs. After each pass you would halve the number of images. Once the number of images gets very small, you might want to change the strategy, but that's something you need to experiment with to see what works best.

You know compute shaders can have 3 dimensions. You could have X and Y index the pixel of the image, and Z index the pair of images in a texture array. So for index Z, you would blend textures 2*Z and 2*Z+1.

Some implementation details you need to take into account:

  • Most likely, the number of images won't be a power of two. So at some point the number of images will be odd.
  • Since you are working with lots of images, you could run into float precision issues. You might need to use float textures, or adapt the strategy so this is not a problem.
  • Usually compute shaders work best when the threads process tiles of 2x2 pixels instead of individual pixels.
戈亓 2025-02-05 07:22:21

This is how I do it.

Render all the textures to the framebuffer, which can also be the default framebuffer.

Once rendering is completed, read the data from the framebuffer:

glReadBuffer(GL_COLOR_ATTACHMENT0);
glBindBuffer(GL_PIXEL_PACK_BUFFER, w_pbo[w_writeIndex]);
// copy from framebuffer to PBO asynchronously. it will be ready in the NEXT frame
glReadPixels(0, 0, SCR_WIDTH, SCR_HEIGHT, GL_RGBA, GL_UNSIGNED_BYTE, nullptr);
// now read other PBO which should be already in CPU memory
glBindBuffer(GL_PIXEL_PACK_BUFFER, w_pbo[w_readIndex]);
unsigned char* Data = (unsigned char*)glMapBuffer(GL_PIXEL_PACK_BUFFER, GL_READ_ONLY);