Why does the following drag down fragment shader performance (OpenGL ES 2.0)?

Posted 2024-11-04 19:07:15

I have the following code in the Fragment Shader:

precision lowp float;

varying vec2 v_texCoord;
uniform sampler2D s_texture;

uniform bool color_tint;
uniform float color_tint_amount;
uniform vec4 color_tint_color;

void main(){
    float gradDistance;
    vec4 texColor, gradColor;
    texColor = texture2D(s_texture, v_texCoord);
    if (color_tint){
        gradColor = color_tint_color;
        gradColor.a = texColor.a;
        texColor = gradColor * color_tint_amount + texColor * (1.0 - color_tint_amount);
    }
    gl_FragColor = texColor;
}

The code works fine, but interestingly, even when every color_tint value I pass in is false, the shader above still drags performance down badly compared to:

void main(){
    float gradDistance;
    vec4 texColor, gradColor;
    texColor = texture2D(s_texture, v_texCoord);
    if (false){
        gradColor = color_tint_color;
        gradColor.a = texColor.a;
        texColor = gradColor * color_tint_amount + texColor * (1.0 - color_tint_amount);
    }
    gl_FragColor = texColor;
}

The latter reaches 40+ fps, while the first manages only about 18 fps. I double-checked that every color_tint value passed to the first one is false, so the block should never be executed.

BTW, I am programming the above on Android 2.2 using GLES20.

Does any expert know what's wrong with the shader?


Comments (2)

怪我闹别瞎闹 2024-11-11 19:07:15

I am not an expert in fragment shaders, but I assume the second one is faster because the entire if statement can be removed at compile time, since its condition is never true. In the first one, the compiler cannot know that color_tint is always false until runtime, so the shader has to check it and branch for every fragment. Branches can be expensive, especially on graphics hardware, which is generally designed to run predictable, straight-line code.

I suggest you try rewriting it to be branchless; Darren's answer has some good suggestions in that direction.
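
If the tint still has to be optional, one way to keep the compile-time removal described above is to build two variants of the same shader and pick one per draw call. This is only a sketch: it assumes the application prepends the line `#define COLOR_TINT` to the source string when compiling the tinted program, and the macro name is hypothetical, not something from the question.

```glsl
// Hypothetical setup: the application prepends "#define COLOR_TINT\n" to this
// source for the tinted program and compiles it unchanged for the plain one.
precision lowp float;

varying vec2 v_texCoord;
uniform sampler2D s_texture;

#ifdef COLOR_TINT
uniform float color_tint_amount;
uniform vec4 color_tint_color;
#endif

void main() {
    vec4 texColor = texture2D(s_texture, v_texCoord);
#ifdef COLOR_TINT
    // Same blend as the original shader; alpha is taken from the texture.
    vec4 gradColor = vec4(color_tint_color.rgb, texColor.a);
    texColor = gradColor * color_tint_amount + texColor * (1.0 - color_tint_amount);
#endif
    gl_FragColor = texColor;
}
```

Neither variant contains a runtime branch, at the cost of keeping two program objects and switching between them.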

静谧 2024-11-11 19:07:15

Branches are very slow in fragment shaders; avoid them if possible. Use a color_tint_amount of 0 for no tint. Premultiply color_tint_color by the tint amount before uploading it, which saves a multiply per pixel. Make color_tint_amount = 1.0 - color_tint_amount (so now 1.0 means no gradColor). These shaders run millions upon millions of times a second, so you have to save every cycle you can.
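
Put together, those suggestions might look roughly like the shader below. It is a sketch rather than a tested drop-in: it assumes the application now uploads the premultiplied tint and the inverted amount under the original uniform names, as noted in the comments, and that the color_tint uniform is dropped entirely.

```glsl
precision lowp float;

varying vec2 v_texCoord;
uniform sampler2D s_texture;

// Assumed to be set by the application, following the answer above:
//   color_tint_color  = original tint rgb * original tint amount (premultiplied)
//   color_tint_amount = 1.0 - original tint amount (1.0 means no tint)
uniform vec4 color_tint_color;
uniform float color_tint_amount;

void main() {
    vec4 texColor = texture2D(s_texture, v_texCoord);
    // No branch: with no tint the uniforms are (0, 0, 0) and 1.0,
    // so rgb passes through unchanged. Alpha always comes from the texture.
    texColor.rgb = color_tint_color.rgb + texColor.rgb * color_tint_amount;
    gl_FragColor = texColor;
}
```

Alpha is left untouched because in the original blend it works out to texColor.a * amount + texColor.a * (1.0 - amount), which is just texColor.a.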
