Shader compiler on Alderlake GT1: SIMD32 shader inefficient
When I compile and link my GLSL shader on an Alderlake GT1 integrated GPU, I get the warning:
SIMD32 shader inefficient
This warning is reported via the glDebugMessageCallbackARB mechanism.
I would like to investigate if I can avoid this inefficiency, but I am not sure how to get more information on this warning.
The full output from the driver, for this shader:
WRN [Shader Compiler][Other]{Notification}: VS SIMD8 shader: 11 inst, 0 loops, 40 cycles, 0:0 spills:fills, 1 sends, scheduled with mode top-down, Promoted 0 constants, compacted 176 to 112 bytes.
WRN [API][Performance]{Notification}: SIMD32 shader inefficient
WRN [Shader Compiler][Other]{Notification}: FS SIMD8 shader: 5 inst, 0 loops, 20 cycles, 0:0 spills:fills, 1 sends, scheduled with mode top-down, Promoted 0 constants, compacted 80 to 48 bytes.
WRN [Shader Compiler][Other]{Notification}: FS SIMD16 shader: 5 inst, 0 loops, 28 cycles, 0:0 spills:fills, 1 sends, scheduled with mode top-down, Promoted 0 constants, compacted 80 to 48 bytes.
The messages are emitted while the fragment shader is being compiled, by the way.
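To compare the variants the driver reports, the statistics lines can be pulled apart with a small script. The following is a minimal sketch in Python; the function names `parse_stats` and `cycles_per_lane` are hypothetical helpers, and the regular expression only assumes the message layout shown in the log above. The cycle counts are the compiler's own estimates, so "cycles per lane" is a rough figure for comparison, not a measurement.

```python
import re

# Driver messages copied verbatim from the log above.
LOG = """\
WRN [Shader Compiler][Other]{Notification}: VS SIMD8 shader: 11 inst, 0 loops, 40 cycles, 0:0 spills:fills, 1 sends, scheduled with mode top-down, Promoted 0 constants, compacted 176 to 112 bytes.
WRN [API][Performance]{Notification}: SIMD32 shader inefficient
WRN [Shader Compiler][Other]{Notification}: FS SIMD8 shader: 5 inst, 0 loops, 20 cycles, 0:0 spills:fills, 1 sends, scheduled with mode top-down, Promoted 0 constants, compacted 80 to 48 bytes.
WRN [Shader Compiler][Other]{Notification}: FS SIMD16 shader: 5 inst, 0 loops, 28 cycles, 0:0 spills:fills, 1 sends, scheduled with mode top-down, Promoted 0 constants, compacted 80 to 48 bytes.
"""

# Matches e.g. "FS SIMD16 shader: 5 inst, 0 loops, 28 cycles".
STATS = re.compile(
    r"(?P<stage>[A-Z]+) SIMD(?P<width>\d+) shader: "
    r"(?P<inst>\d+) inst, (?P<loops>\d+) loops, (?P<cycles>\d+) cycles"
)


def parse_stats(log):
    """Return one dict per statistics line; other messages are skipped."""
    rows = []
    for line in log.splitlines():
        m = STATS.search(line)
        if m:
            rows.append({
                "stage": m["stage"],
                "width": int(m["width"]),
                "inst": int(m["inst"]),
                "cycles": int(m["cycles"]),
            })
    return rows


def cycles_per_lane(row):
    """Estimated cycles divided by the number of lanes in the variant."""
    return row["cycles"] / row["width"]


if __name__ == "__main__":
    for row in parse_stats(LOG):
        print(f'{row["stage"]} SIMD{row["width"]}: {row["cycles"]} cycles, '
              f'{cycles_per_lane(row):.2f} cycles per lane')
```

The "SIMD32 shader inefficient" line carries no statistics, so the parser skips it; only the three compiled variants show up in the result.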
My vertex shader:
#version 150
in mediump vec2 position;
out lowp vec4 clr;
uniform mediump vec2 rotx;
uniform mediump vec2 roty;
uniform mediump vec2 translation;
uniform lowp vec4 colour;
void main()
{
    gl_Position.x = dot( position, rotx ) + translation.x;
    gl_Position.y = dot( position, roty ) + translation.y;
    gl_Position.z = 1.0;
    gl_Position.w = 1.0;
    clr = colour;
}
My fragment shader:
#version 150
in lowp vec4 clr;
out lowp vec4 fragColor;
void main()
{
    fragColor = clr;
}
That said, I doubt it is shader-specific, because the driver seems to report this for every shader I use on this platform.
GL RENDERER: Mesa Intel(R) Graphics (ADL-S GT1)
OS: Ubuntu 22.04
GPU: AlderLake-S GT1
API: OpenGL 3.2 Core Profile
GLSL Version: 150

This seems to come from the Intel fragment shader compiler that is part of Mesa.
Looking at this code, it seems that the compiler has three options: SIMD8, SIMD16 or SIMD32. These numbers refer to widths, not to bits, so SIMD32 is 32-wide SIMD.
The compiler uses a heuristic to estimate whether the SIMD32 version will be efficient, and if not, it skips that option.
Of course, this heuristic can get it wrong, so there is an option to force the BRW compiler to try SIMD32 regardless.
The environment variable setting
INTEL_DEBUG=do32
will tell the compiler to try SIMD32 as well.
When I tested this on my system, I indeed observed that the driver now reports three different results, including one for SIMD32.
Observe that in this case the heuristic definitely got it right: the SIMD32 version takes almost 50 times more cycles than SIMD8.
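A quick bit of arithmetic shows why a wider variant is not automatically faster. The sketch below assumes a deliberately simplified model, throughput = lanes / estimated cycles; this is an illustration only and not the heuristic Mesa actually implements. The SIMD8 and SIMD16 figures are the fragment shader numbers from the question's log. The SIMD32 cycle count is not given in this thread, so the value used for it is a hypothetical stand-in derived from "almost 50 times more cycles than SIMD8".

```python
def lanes_per_cycle(width, cycles):
    """Simplified throughput model: invocations per estimated cycle."""
    return width / cycles


def break_even_cycles(width, ref_width, ref_cycles):
    """Cycle count at which `width` matches the reference variant's throughput."""
    return ref_cycles * width / ref_width


if __name__ == "__main__":
    # Fragment shader figures reported by the driver in the question.
    simd8 = lanes_per_cycle(8, 20)
    simd16 = lanes_per_cycle(16, 28)

    # Under this model, SIMD32 only beats SIMD16 below this many cycles.
    limit = break_even_cycles(32, 16, 28)

    # Hypothetical stand-in: 50 times the SIMD8 cycle count.
    simd32_guess = lanes_per_cycle(32, 50 * 20)

    print(f"SIMD8:  {simd8:.3f} lanes/cycle")
    print(f"SIMD16: {simd16:.3f} lanes/cycle")
    print(f"SIMD32 break-even vs SIMD16: {limit:.0f} cycles")
    print(f"SIMD32 at 50x SIMD8 cycles: {simd32_guess:.3f} lanes/cycle")
```

Under this model the SIMD32 variant would need to stay under 56 estimated cycles to be worth dispatching, which a shader costing roughly fifty times the SIMD8 figure misses by a wide margin.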
Fun fact: BRW stands for Broadwater, gen4 graphics. But gen12 Intel GPUs still use this compiler.
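If you would rather not export the variable in your shell, the do32 setting can be applied to a single process from a launcher script. This is a minimal sketch, assuming the application is started from Python; `run_with_do32` is a hypothetical helper, and the child command here is a stand-in that only echoes the variable so the example runs without a GPU.

```python
import os
import subprocess
import sys


def run_with_do32(command):
    """Run `command` with INTEL_DEBUG=do32 set in a copy of the environment."""
    env = dict(os.environ)
    # Note: this overwrites any INTEL_DEBUG value that was already set.
    env["INTEL_DEBUG"] = "do32"
    return subprocess.run(command, env=env, capture_output=True, text=True)


if __name__ == "__main__":
    # Stand-in child process; replace with the real OpenGL application's
    # command line.
    child = [sys.executable, "-c",
             "import os; print(os.environ['INTEL_DEBUG'])"]
    result = run_with_do32(child)
    print(result.stdout.strip())
```

Copying `os.environ` rather than modifying it keeps the setting local to the child process, so other programs started from the same script are unaffected.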