Random-access buffer optimization

Posted on 2024-12-10 17:57:06

I have a colorBuffer of type Color[width*height] (most likely 800*600), and during rasterization I call:

void setPixel(int x, int y, const Color& color)
{
    colorBuffer[y * width + x] = color;
}

It turns out that this random access to the color buffer is really inefficient and slows my application down.

I think it is caused by the way I use it. I calculate a pixel (with a rasterization algorithm) and call setPixel.
So I think my buffer is not in the cache, and this is the main problem. Writing the whole buffer at once is much, much faster.

Is there any way to optimize this?

Edit

I do not use it to fill the buffer with two nested for loops.
I use it to paint "random" pixels.
E.g. when rasterizing a line, I use it like this:

setPixel(10, 10, color);
// calculate next point
setPixel(10, 11, color);
// calculate next point
setPixel(nextX, nextY, color);
...

Comments (3)

失退 2024-12-17 17:57:06

The way I see it, the access pattern to the buffer depends on the order in which your algorithm processes the pixels. Can you not simply change that order so that it produces a sequential access pattern to your buffer?
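
One way to get a sequential access pattern without touching the rasterization algorithm itself is to queue the writes and apply them in buffer order. The following is only a sketch under assumptions: the Color layout, the PendingPixel struct and the PixelBatch class are hypothetical names that do not appear in the original post, and whether the sort pays for itself depends on the workload, so it should be measured first.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical pixel type; the original post does not show Color's definition.
struct Color {
    std::uint8_t r, g, b, a;
};

// One deferred write: the linear buffer index and the color to store there.
struct PendingPixel {
    std::size_t index;
    Color color;
};

// Hypothetical helper that queues writes instead of touching the buffer at once.
class PixelBatch {
public:
    explicit PixelBatch(int width) : width_(width) {}

    void setPixel(int x, int y, const Color& color) {
        assert(x >= 0 && x < width_ && y >= 0);
        pending_.push_back({static_cast<std::size_t>(y) * width_ + x, color});
    }

    // Sort by buffer index so the writes walk memory front to back.
    // stable_sort keeps submission order for equal indices, so the last
    // write to a given pixel still wins.
    void flush(std::vector<Color>& colorBuffer) {
        std::stable_sort(pending_.begin(), pending_.end(),
                         [](const PendingPixel& a, const PendingPixel& b) {
                             return a.index < b.index;
                         });
        for (const PendingPixel& p : pending_) {
            assert(p.index < colorBuffer.size());
            colorBuffer[p.index] = p.color;
        }
        pending_.clear();
    }

private:
    int width_;
    std::vector<PendingPixel> pending_;
};
```

A caller would create one PixelBatch per frame (or per primitive), call its setPixel wherever the rasterizer currently calls the global one, and call flush once at the end.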

我的奇迹 2024-12-17 17:57:06

Yes, you should try to be cache-friendly,
but the first thing I would do is find out what's taking time.

It's simple enough. Just pause it several times and see what it's doing.

If it's mostly in calculate next point, you should see what it's doing in there, because that's where the time is going.
(I assume you understand that by "in" I mean "on the stack".)

If it's mostly in setPixel when you pause it, look at the disassembly window.

If it's spending much time in the prologue/epilogue of the routine, it should be inlined.

If it's spending much time in the actual move instruction into colorBuffer, then you're hitting the cache issue.

If it's spending much time in the code for the index calculation y * width + x, then you might want to see if you could somehow use an initialized pointer that you step along.

If you fix anything, you should do it all again, because you may have uncovered another opportunity to speed it up further.
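
As a rough illustration of the last two suggestions (inlining, and stepping a pointer instead of recomputing y * width + x), here is a minimal sketch. It assumes a hypothetical Color layout and passes the buffer and width explicitly; drawVerticalRun is an illustrative name, not something from the question. Optimizing compilers often perform both transformations on their own, so check the disassembly before rewriting anything by hand.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical pixel type; the original post does not show Color's definition.
struct Color {
    std::uint8_t r, g, b, a;
};

// Defined inline (e.g. in a header) so the compiler can drop the call
// prologue/epilogue at each call site.
inline void setPixel(std::vector<Color>& colorBuffer, int width,
                     int x, int y, const Color& color) {
    colorBuffer[static_cast<std::size_t>(y) * width + x] = color;
}

// Draws the pixels (x, y0) .. (x, y1), inclusive, in one column.
// The index is computed once; after that the pointer is stepped by one
// row (width elements) per pixel instead of recomputing y * width + x.
inline void drawVerticalRun(std::vector<Color>& colorBuffer, int width,
                            int x, int y0, int y1, const Color& color) {
    assert(width > 0 && x >= 0 && x < width && y0 >= 0 && y0 <= y1);
    assert(static_cast<std::size_t>(y1) * width + x < colorBuffer.size());

    Color* p = colorBuffer.data() + static_cast<std::size_t>(y0) * width + x;
    *p = color;
    for (int y = y0; y < y1; ++y) {
        p += width;  // same column, next row
        *p = color;
    }
}
```

The same idea carries over to a general line rasterizer: keep a running pointer and add 1 or width to it as the algorithm steps in x or y.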

十级心震 2024-12-17 17:57:06

The first thing to notice is that the way you process your pixels makes a huge difference to speed. If you do

for (int x = 0; x < width; ++x)
{
  for (int y = 0; y < height; ++y)
  {
    setPixel(x, y, Color());
  }
}

this will be really bad for performance, because every step of the inner loop jumps a full row (width elements) ahead in memory (note that the index is y*width + x).

If you simply change the order of processing to

for (int y = 0; y < height; ++y)
{
  for (int x = 0; x < width; ++x)
  {
    setPixel(x, y, Color());
  }
}

you should already notice a performance gain, because consecutive writes now land on adjacent memory and the processor gets a chance to cache the accesses (which it didn't before).

Furthermore, before actually writing to memory, you should check whether you can determine that entire blocks of pixels will have the same color value. If so, you can copy that constant color into your image array block by block, which can also save a good deal of time.
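
A minimal sketch of such a block-wise write, assuming the block is a horizontal span or an axis-aligned rectangle of one color. The Color layout and the helper names fillSpan and fillRect are hypothetical; std::fill_n is the standard library algorithm used for the contiguous write.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical pixel type; the original post does not show Color's definition.
struct Color {
    std::uint8_t r, g, b, a;
};

// Fills the pixels x0 <= x < x1 of row y with a single color.
// The span is contiguous in memory, so one std::fill_n call covers it.
inline void fillSpan(std::vector<Color>& colorBuffer, int width,
                     int y, int x0, int x1, const Color& color) {
    assert(y >= 0 && x0 >= 0 && x0 <= x1 && x1 <= width);
    const std::size_t start = static_cast<std::size_t>(y) * width + x0;
    assert(start + static_cast<std::size_t>(x1 - x0) <= colorBuffer.size());
    std::fill_n(colorBuffer.begin() + static_cast<std::ptrdiff_t>(start),
                x1 - x0, color);
}

// Fills the rectangle x0 <= x < x1, y0 <= y < y1 row by row.
inline void fillRect(std::vector<Color>& colorBuffer, int width,
                     int x0, int y0, int x1, int y1, const Color& color) {
    for (int y = y0; y < y1; ++y) {
        fillSpan(colorBuffer, width, y, x0, x1, color);
    }
}
```

For example, a scanline triangle fill could call fillSpan once per row with that row's left and right edge, rather than calling setPixel once per pixel.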
