Problems with pixel scaling algorithms in GPU shaders
So I'm working on some pixel shaders for emulators of good old consoles like the Super Nintendo. You have the classic algorithms like HQnx, 2xSaI, etc.; they were definitely written to run on CPUs and to scale the image to exactly twice its size before blitting it to the screen.
Moving to GPU fragment shaders, these algorithms become essentially free. I'm working with OpenGL and Cg/GLSL, but this question should apply to Direct3D/HLSL coders as well.
The major problem is that these algorithms blend against neighboring pixels, using some rule to decide on the final color. However, I found this concept quite hard to express in shader languages. Generally, a fragment shader receives a floating-point texture coordinate, which you can use for texture lookups, usually with GL_LINEAR as the texture filter. Most pixel shaders of this kind use GL_NEAREST instead and do the smoothing themselves.
The problem occurs when I want to find, say, the exact neighboring pixel. I've seen some implementations, but they occasionally cause artifacts on the screen, probably due to floating-point inaccuracies. I've found that most of the artifacts simply disappear when using power-of-two-sized textures, which further strengthens my belief that floating-point inaccuracy is to blame. Here is a sample fragment shader in Cg that shows the issue:
struct output
{
    float4 color : COLOR;
};

struct input
{
    float2 video_size;
    float2 texture_size;
    float2 output_size;
};

struct deltas
{
    float2 UL, UR, DL, DR;
};

output main_fragment(float2 tex : TEXCOORD0, uniform input IN, uniform sampler2D s_p : TEXUNIT0)
{
    float2 texsize = IN.texture_size;

    // Quarter-texel offsets toward the diagonal neighbors.
    float dx = pow(texsize.x, -1.0) * 0.25;
    float dy = pow(texsize.y, -1.0) * 0.25;
    float3 dt = float3(1.0, 1.0, 1.0);

    // Sample coordinates of the four diagonal neighbors:
    // upper-left, upper-right, lower-left, lower-right.
    deltas VAR = {
        tex + float2(-dx, -dy),
        tex + float2( dx, -dy),
        tex + float2(-dx,  dy),
        tex + float2( dx,  dy)
    };

    float3 c00 = tex2D(s_p, VAR.UL).xyz;
    float3 c20 = tex2D(s_p, VAR.UR).xyz;
    float3 c02 = tex2D(s_p, VAR.DL).xyz;
    float3 c22 = tex2D(s_p, VAR.DR).xyz;

    // m1/m2 measure the color contrast along each diagonal; the small
    // bias avoids division by zero below.
    float m1 = dot(abs(c00 - c22), dt) + 0.001;
    float m2 = dot(abs(c02 - c20), dt) + 0.001;

    output OUT;
    // Each diagonal pair is weighted by the other pair's contrast, so the
    // smoother diagonal dominates the blend.
    OUT.color = float4((m1 * (c02 + c20) + m2 * (c22 + c00)) / (2.0 * (m1 + m2)), 1.0);
    return OUT;
}
Is there some way to make sure that we grab the color data from the pixel we expect and not from a different one? I believe this problem occurs because we might be querying a pixel from a coordinate that lies right between two pixels (if that makes sense). Hopefully there is some built-in function in these shader languages that I'm overlooking.
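For illustration, here is a minimal GLSL sketch of the usual snapping workaround: round the coordinate to the center of its texel before sampling, so the lookup can never land on a texel boundary. The texture_size uniform and tex varying are assumed equivalents of the Cg inputs above, not names from the original post.

#version 120
uniform sampler2D s_p;
uniform vec2 texture_size;   // texture dimensions in texels, like IN.texture_size
varying vec2 tex;            // normalized texture coordinate

void main()
{
    // floor() picks the texel index; + 0.5 moves to that texel's center,
    // so GL_NEAREST cannot round into a neighboring texel.
    vec2 snapped = (floor(tex * texture_size) + 0.5) / texture_size;
    vec2 d = 1.0 / texture_size;   // exactly one texel of offset

    // Neighbors are now reached from a well-defined texel center.
    vec3 c00 = texture2D(s_p, snapped + vec2(-d.x, -d.y)).rgb;
    gl_FragColor = vec4(c00, 1.0);
}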
Sure, in OpenGL there are several ways:
Unnormalized coordinates are what you want; they are in the [0, 0] x [w, h] range.
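A minimal sketch of that approach, assuming the texture is bound as GL_TEXTURE_RECTANGLE so the sampler takes coordinates in texels (the texel input and frag_color output are placeholder names, not from the original answer):

#version 140
uniform sampler2DRect s_p;   // rectangle samplers take unnormalized coordinates
in vec2 texel;               // coordinate in texels, e.g. tex * texture_size
out vec4 frag_color;

void main()
{
    // (x + 0.5, y + 0.5) addresses the center of texel (x, y), so a
    // step of exactly 1.0 lands precisely on a neighboring texel.
    vec3 c   = texture(s_p, texel).rgb;
    vec3 c00 = texture(s_p, texel + vec2(-1.0, -1.0)).rgb;   // upper-left neighbor
    frag_color = vec4(0.5 * (c + c00), 1.0);
}

With a plain sampler2D, GLSL 1.30's texelFetch(s_p, ivec2(x, y), 0) likewise addresses texels by integer index, bypassing filtering entirely.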