I have a simple box blur function in a graphics library (for JavaScript/canvas, using ImageData) I'm writing.
I've done a few optimisations to avoid piles of redundant code, such as looping through `[0..3]` for the channels instead of copying the code, and having each surrounding pixel implemented with a single, uncopied line of code, averaging values at the end.
Those were optimisations to cut down on redundant lines of code. Are there any further optimisations I can do of that kind, or, better still, any things I can change that may improve performance of the function itself?
Running this function on a 200x150 image area, with a Core 2 Duo, takes about 450ms on Firefox 3.6, 45ms on Firefox 4 and about 55ms on Chromium 10.
Various notes
- `expressive.data.get` returns an `ImageData` object
- `expressive.data.put` writes the contents of an `ImageData` back to a canvas
- an `ImageData` is an object with:
  - `unsigned long width`
  - `unsigned long height`
  - `Array data`, a single-dimensional array in the format `r, g, b, a, r, g, b, a ...`
The code
    expressive.boxBlur = function(canvas, x, y, w, h) {
        // averaging r, g, b, a for now
        var data = expressive.data.get(canvas, x, y, w, h);
        for (var i = 0; i < w; i++)
            for (var j = 0; j < h; j++)
                for (var k = 0; k < 4; k++) {
                    var total = 0, values = 0, temp = 0;
                    if (!(i == 0 && j == 0)) {
                        temp = data.data[4 * w * (j - 1) + 4 * (i - 1) + k];
                        if (temp !== undefined) values++, total += temp;
                    }
                    if (!(i == w - 1 && j == 0)) {
                        temp = data.data[4 * w * (j - 1) + 4 * (i + 1) + k];
                        if (temp !== undefined) values++, total += temp;
                    }
                    if (!(i == 0 && j == h - 1)) {
                        temp = data.data[4 * w * (j + 1) + 4 * (i - 1) + k];
                        if (temp !== undefined) values++, total += temp;
                    }
                    if (!(i == w - 1 && j == h - 1)) {
                        temp = data.data[4 * w * (j + 1) + 4 * (i + 1) + k];
                        if (temp !== undefined) values++, total += temp;
                    }
                    if (!(j == 0)) {
                        temp = data.data[4 * w * (j - 1) + 4 * (i + 0) + k];
                        if (temp !== undefined) values++, total += temp;
                    }
                    if (!(j == h - 1)) {
                        temp = data.data[4 * w * (j + 1) + 4 * (i + 0) + k];
                        if (temp !== undefined) values++, total += temp;
                    }
                    if (!(i == 0)) {
                        temp = data.data[4 * w * (j + 0) + 4 * (i - 1) + k];
                        if (temp !== undefined) values++, total += temp;
                    }
                    if (!(i == w - 1)) {
                        temp = data.data[4 * w * (j + 0) + 4 * (i + 1) + k];
                        if (temp !== undefined) values++, total += temp;
                    }
                    values++, total += data.data[4 * w * j + 4 * i + k];
                    total /= values;
                    data.data[4 * w * j + 4 * i + k] = total;
                }
        expressive.data.put(canvas, data, x, y);
    };
Comments (5)
A minor optimization:
You could also perform some minor optimizations in your indices, for example:
Also, put {} after every for-statement you have in your code. The 2 additional characters won't make a big difference. The potential bugs will.
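As an illustration of the kind of index optimisation meant here, the following is a minimal sketch, not the answerer's own snippet. It assumes a flat RGBA array and an interior pixel, and the function name `sum3x3` is hypothetical. The base index is computed once and every neighbour is reached by adding or subtracting a fixed offset, instead of repeating the `4 * w * (j ± 1) + 4 * (i ± 1) + k` multiplication for each access.

```javascript
// Sum of channel k over the 3x3 neighbourhood of an interior pixel (i, j).
// Assumes 0 < i < w - 1 and 0 < j < h - 1, so no bounds checks are needed.
function sum3x3(pixels, w, i, j, k) {
    var stride = 4 * w;                    // array entries per image row
    var base = stride * j + 4 * i + k;     // index of channel k at (i, j)
    var up = base - stride;                // same column, row above
    var down = base + stride;              // same column, row below
    return pixels[up - 4] + pixels[up] + pixels[up + 4] +
           pixels[base - 4] + pixels[base] + pixels[base + 4] +
           pixels[down - 4] + pixels[down] + pixels[down + 4];
}
```

Dividing the result by 9 gives the blurred value for an interior pixel; edge pixels still need separate handling.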
You could pull out some common expressions:
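A hedged sketch of what pulling out common expressions could look like. The meaning of `t` is an assumption here (the base index `4 * w * j + 4 * i` of the current pixel), and `boxBlurHoisted` is a hypothetical name. Unlike the code in the question, this version writes into a copy so that already-blurred values are not read back, and it loops over rows in the outer loop.

```javascript
// Box blur over a flat RGBA array, with shared index arithmetic hoisted.
// Reads from `src` and writes to a copy, so blurred values are never re-read.
function boxBlurHoisted(src, w, h) {
    var out = src.slice();
    var stride = 4 * w;                       // computed once, not per access
    for (var j = 0; j < h; j++) {
        for (var i = 0; i < w; i++) {
            var t = stride * j + 4 * i;       // assumed meaning of `t`
            for (var k = 0; k < 4; k++) {
                var total = 0, values = 0;
                for (var dj = -1; dj <= 1; dj++) {
                    if (j + dj < 0 || j + dj >= h) { continue; }
                    for (var di = -1; di <= 1; di++) {
                        if (i + di < 0 || i + di >= w) { continue; }
                        total += src[t + dj * stride + 4 * di + k];
                        values++;
                    }
                }
                out[t + k] = total / values;
            }
        }
    }
    return out;
}
```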
You could try moving the loop over `k` so it's outermost, and then fold the `+k` into the definition of `t`, saving a bit more repeated calculation. (That might turn out to be bad for memory-locality reasons.)

You could try moving the loop over `j` to be outside the loop over `i`, which will give you better memory locality. This will matter more for large images; it may not matter at all for the size you're using.

Rather painful but possibly very effective: you could lose lots of conditional operations by splitting your loops up into {0,1..w-2,w-1} and {0,1..h-2,h-1}.

You could get rid of all those `undefined` tests. Do you really need them, given that you're doing all those range checks?

Another way to avoid the range checks: you could pad your image (with zeros) by one pixel along each edge. Note that the obvious way to do this will give different results from your existing code at the edges; this may be a good or a bad thing. If it's a bad thing, you can work out the appropriate value to divide by.
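The padding idea can be sketched as follows. This is one possible reading, not the answerer's code; `padWithZeros` and `boxBlurPadded` are hypothetical names. It assumes a flat RGBA array, and it divides by the number of real pixels under the window so that edge results match an average over existing neighbours only.

```javascript
// Copy a w x h RGBA array into a (w + 2) x (h + 2) array with a zero border.
function padWithZeros(src, w, h) {
    var pw = w + 2, ph = h + 2;
    var padded = new Array(4 * pw * ph).fill(0);
    for (var j = 0; j < h; j++) {
        for (var n = 0; n < 4 * w; n++) {
            // shift down one row and right one pixel (4 entries)
            padded[4 * pw * (j + 1) + 4 + n] = src[4 * w * j + n];
        }
    }
    return padded;
}

// 3x3 box blur with no range checks in the inner loop.
function boxBlurPadded(src, w, h) {
    var ps = 4 * (w + 2);                     // stride of the padded image
    var padded = padWithZeros(src, w, h);
    var out = new Array(src.length);
    for (var j = 0; j < h; j++) {
        // number of real rows under the window at this j
        var rows = 3 - (j === 0 ? 1 : 0) - (j === h - 1 ? 1 : 0);
        for (var i = 0; i < w; i++) {
            // number of real columns under the window at this i
            var cols = 3 - (i === 0 ? 1 : 0) - (i === w - 1 ? 1 : 0);
            var count = rows * cols;          // the value to divide by
            for (var k = 0; k < 4; k++) {
                var p = ps * (j + 1) + 4 * (i + 1) + k;
                var total =
                    padded[p - ps - 4] + padded[p - ps] + padded[p - ps + 4] +
                    padded[p - 4] + padded[p] + padded[p + 4] +
                    padded[p + ps - 4] + padded[p + ps] + padded[p + ps + 4];
                out[4 * w * j + 4 * i + k] = total / count;
            }
        }
    }
    return out;
}
```

Dividing by 9 everywhere instead of by `count` would be the "obvious" variant that darkens the edges.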
The initialisation `temp = 0` is not necessary; just write `var total = 0, values = 0, temp;`.

The next thing is to loop backwards: a loop that counts up is slower than one that counts down.

The third tip is to use Duff's Device for huge for loops.
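The two loop forms being compared are not preserved in this copy. A typical pair, given here as an assumption with hypothetical function names, looks like the sketch below. Whether the backward form is actually faster depends on the JavaScript engine, so it is worth timing both on the target browsers.

```javascript
// Forward loop: re-reads arr.length and compares against it each iteration.
function sumForward(arr) {
    var total = 0;
    for (var i = 0; i < arr.length; i++) {
        total += arr[i];
    }
    return total;
}

// Backward loop: the decrement doubles as the loop condition,
// and arr.length is read only once.
function sumBackward(arr) {
    var total = 0;
    for (var i = arr.length; i--; ) {
        total += arr[i];
    }
    return total;
}
```

Both functions visit every element exactly once; only the order differs, which is safe here because addition does not depend on it.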
Maybe (just maybe) moving the `if` checks out as far as possible would be an advantage. Let me present some pseudo-code:

(I'll just call the code looping over `k` the "inner loop" for simplicity.)

This would drastically reduce the number of `if` checks, since the majority of operations would need none of them.
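One way the idea could look in JavaScript, as a sketch rather than the answerer's own pseudo-code: interior pixels are handled by a loop with no checks at all, and only the border rows and columns go through a bounds-checked helper. The names `blurChecked` and `boxBlurSplit` are hypothetical, and the result is written to a copy of the input.

```javascript
// Bounds-checked average of channel k around (i, j); used for edges only.
function blurChecked(src, w, h, i, j, k) {
    var total = 0, values = 0;
    for (var y = Math.max(0, j - 1); y <= Math.min(h - 1, j + 1); y++) {
        for (var x = Math.max(0, i - 1); x <= Math.min(w - 1, i + 1); x++) {
            total += src[4 * w * y + 4 * x + k];
            values++;
        }
    }
    return total / values;
}

function boxBlurSplit(src, w, h) {
    var out = src.slice(), stride = 4 * w, i, j, k, p;
    // Interior pixels: all eight neighbours exist, so no if checks.
    for (j = 1; j < h - 1; j++) {
        for (i = 1; i < w - 1; i++) {
            for (k = 0; k < 4; k++) {
                p = stride * j + 4 * i + k;
                out[p] = (src[p - stride - 4] + src[p - stride] + src[p - stride + 4] +
                          src[p - 4] + src[p] + src[p + 4] +
                          src[p + stride - 4] + src[p + stride] + src[p + stride + 4]) / 9;
            }
        }
    }
    // Top and bottom rows.
    for (i = 0; i < w; i++) {
        for (k = 0; k < 4; k++) {
            out[4 * i + k] = blurChecked(src, w, h, i, 0, k);
            out[stride * (h - 1) + 4 * i + k] = blurChecked(src, w, h, i, h - 1, k);
        }
    }
    // Left and right columns (corners were covered by the row pass).
    for (j = 1; j < h - 1; j++) {
        for (k = 0; k < 4; k++) {
            out[stride * j + k] = blurChecked(src, w, h, 0, j, k);
            out[stride * j + 4 * (w - 1) + k] = blurChecked(src, w, h, w - 1, j, k);
        }
    }
    return out;
}
```

For a 200x150 area, the checked helper runs for fewer than 700 of the 30,000 pixels.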
If the only way you use var data is as data.data then you can change:
to:
and change every line like:
to:
and you will save some name lookups.
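A minimal sketch of the caching idea, with the caveat that the snippets from this answer are not preserved here. `imageData` stands in for the object returned by `expressive.data.get`, and `channelMean` is a hypothetical function used only to show the pattern: the `.data` property is read once into a local variable and every access in the loop goes through that variable.

```javascript
// `imageData` is any object with width, height and a flat RGBA `data` array.
function channelMean(imageData, k) {
    var pixels = imageData.data;          // looked up once, not per access
    var count = imageData.width * imageData.height;
    var total = 0;
    for (var p = k; p < pixels.length; p += 4) {
        total += pixels[p];               // was: imageData.data[p]
    }
    return total / count;
}
```

In the question's function, the original `ImageData` object would still have to be kept around, since `expressive.data.put` is passed that object and not the bare array.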
There may be better ways to optimize it but this is just what I've noticed first.
Update:
Also, `if (i != 0 || j != 0)` can be faster than `if (!(i == 0 && j == 0))`, not only because of the negation but also because it can short-circuit.

(Make your own experiments with `==` vs. `===` and `!=` vs. `!==`, because my quick tests showed results that seem counter-intuitive to me.)

Also, some of the tests are done many times, and some of the ifs are mutually exclusive but tested anyway without an else. You can try to refactor it with more nested ifs and more else ifs.