Why does the following code drag down fragment shader performance (OpenGL ES 2.0)?
I have the following code in my fragment shader:
precision lowp float;
varying vec2 v_texCoord;
uniform sampler2D s_texture;
uniform bool color_tint;
uniform float color_tint_amount;
uniform vec4 color_tint_color;
void main(){
    float gradDistance;
    vec4 texColor, gradColor;
    texColor = texture2D(s_texture, v_texCoord);
    if (color_tint){
        gradColor = color_tint_color;
        gradColor.a = texColor.a;
        texColor = gradColor * color_tint_amount + texColor * (1.0 - color_tint_amount);
    }
    gl_FragColor = texColor;
}
The code works fine, but interestingly, even when every color_tint value I pass in is false, the code above still causes a serious drop in performance. Compare it with:
void main(){
    float gradDistance;
    vec4 texColor, gradColor;
    texColor = texture2D(s_texture, v_texCoord);
    if (false){
        gradColor = color_tint_color;
        gradColor.a = texColor.a;
        texColor = gradColor * color_tint_amount + texColor * (1.0 - color_tint_amount);
    }
    gl_FragColor = texColor;
}
The latter achieves 40+ fps, while the first runs at about 18 fps. I double-checked that every color_tint value passed to the first shader is false, so the block should never execute.
BTW, I am programming this on Android 2.2 using GLES20.
Does anyone know what's wrong with the shader?
Comments (2)
I am not an expert in fragment shaders, but I assume the second one is faster because the entire if statement can be removed at compile time, since its condition is never true. In the first one, the compiler can't tell that color_tint is always false until runtime, so the shader has to check it and branch every time. Branches can be expensive, especially on graphics hardware, which is generally designed for predictable, straight-line execution. I suggest you try rewriting it to be branchless. Darren's answer has some good suggestions in that direction.
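A minimal sketch of what a branchless rewrite could look like, keeping the uniforms from the question. It assumes the application sets color_tint_amount to 0.0 whenever no tint is wanted, so the color_tint flag is no longer needed. This is an illustration of the idea, not a measured fix for the 18 fps case.

```glsl
precision lowp float;
varying vec2 v_texCoord;
uniform sampler2D s_texture;
uniform float color_tint_amount;   // assumption: 0.0 means "no tint"
uniform vec4 color_tint_color;

void main(){
    vec4 texColor = texture2D(s_texture, v_texCoord);
    // mix(a, b, t) computes a * (1.0 - t) + b * t, the same blend as the
    // body of the original if block, applied to the RGB channels only.
    texColor.rgb = mix(texColor.rgb, color_tint_color.rgb, color_tint_amount);
    // Alpha is left untouched, matching gradColor.a = texColor.a in the question.
    gl_FragColor = texColor;
}
```

With color_tint_amount at 0.0 the mix returns the texture color unchanged, so the untinted case produces the same output as before without a runtime check.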
Branches are very slow in fragment shaders; avoid them if possible. Use a color_tint_amount of 0 for no tint. Premultiply color_tint_color and save a multiply per pixel. Make color_tint_amount = 1.0 - color_tint_amount (so now 1.0 means no gradColor). These shaders run millions upon millions of times a second, so you have to save every cycle you can.
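The suggestions in this answer could be combined roughly as follows. The uniform names u_tintPremult and u_texWeight are hypothetical and do not appear in the question; the sketch assumes the application computes both values once on the CPU and uploads them before drawing.

```glsl
precision lowp float;
varying vec2 v_texCoord;
uniform sampler2D s_texture;
// Hypothetical uniform: color_tint_color.rgb * color_tint_amount, computed on the CPU.
uniform vec3 u_tintPremult;
// Hypothetical uniform: 1.0 - color_tint_amount, so 1.0 means "no tint".
uniform float u_texWeight;

void main(){
    vec4 texColor = texture2D(s_texture, v_texCoord);
    // One multiply and one add per fragment, with no branch.
    // Alpha passes through unchanged, as in the original shader.
    gl_FragColor = vec4(u_tintPremult + texColor.rgb * u_texWeight, texColor.a);
}
```

On the Android side, these would presumably be set with GLES20.glUniform3f and GLES20.glUniform1f. For the untinted case, uploading (0, 0, 0) and 1.0 reproduces the plain texture color.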