Problems with pixel scaling algorithms in GPU shaders
So I'm working on some pixel shaders for emulators of good old consoles like the Super Nintendo. You have the classic algorithms like HQnx, 2xSaI, etc.; they were definitely written to run on CPUs and to scale the image to exactly twice its size before blitting it to the screen.
Moving to GPU fragment shaders, these algorithms become essentially free. I'm working with OpenGL and Cg/GLSL, but this question should apply to Direct3D/HLSL coders as well.
The major problem is that these algorithms blend against neighboring pixels, using some rule to decide on the final color. However, I found this concept quite hard to express in shader languages. Generally, a fragment shader receives a floating-point texture coordinate, which you can use for texture lookups, usually with GL_LINEAR as the texture filter. Most pixel shaders of this kind use GL_NEAREST instead and do the smoothing themselves.
The problem occurs when I want to find, say, the exact neighboring pixel. I've seen some implementations, but they occasionally cause artifacts on the screen, probably due to floating-point inaccuracies. I've found that most of the artifacts simply disappear when using power-of-two-sized textures, which further strengthens my belief that floating-point inaccuracy is to blame. Here is a sample fragment shader in Cg that shows the issue:
struct output
{
    float4 color : COLOR;
};

struct input
{
    float2 video_size;
    float2 texture_size;
    float2 output_size;
};

struct deltas
{
    float2 UL, UR, DL, DR;
};

output main_fragment(float2 tex : TEXCOORD0, uniform input IN, uniform sampler2D s_p : TEXUNIT0)
{
    float2 texsize = IN.texture_size;

    // Quarter-texel offsets toward the diagonal neighbors.
    float dx = pow(texsize.x, -1.0) * 0.25;
    float dy = pow(texsize.y, -1.0) * 0.25;
    float3 dt = float3(1.0, 1.0, 1.0);

    // Sample coordinates of the four diagonal neighbors:
    // upper-left, upper-right, lower-left, lower-right.
    deltas VAR = {
        tex + float2(-dx, -dy),
        tex + float2( dx, -dy),
        tex + float2(-dx,  dy),
        tex + float2( dx,  dy)
    };

    float3 c00 = tex2D(s_p, VAR.UL).xyz;
    float3 c20 = tex2D(s_p, VAR.UR).xyz;
    float3 c02 = tex2D(s_p, VAR.DL).xyz;
    float3 c22 = tex2D(s_p, VAR.DR).xyz;

    // m1/m2 measure the color contrast along each diagonal; the small
    // bias avoids division by zero below.
    float m1 = dot(abs(c00 - c22), dt) + 0.001;
    float m2 = dot(abs(c02 - c20), dt) + 0.001;

    output OUT;
    // Each diagonal pair is weighted by the other pair's contrast, so the
    // smoother diagonal dominates the blend.
    OUT.color = float4((m1 * (c02 + c20) + m2 * (c22 + c00)) / (2.0 * (m1 + m2)), 1.0);
    return OUT;
}
Is there some way to make sure that we grab the color data from the pixel we expect and not from a different one? I believe this problem occurs because we might be querying a pixel from a coordinate that lies right between two pixels (if that makes sense). Hopefully there is some built-in function in these shader languages that I'm overlooking.
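For illustration, here is a minimal GLSL sketch of the usual snapping workaround: round the coordinate to the center of its texel before sampling, so the lookup can never land on a texel boundary. The texture_size uniform and tex varying are assumed equivalents of the Cg inputs above, not names from the original post.

#version 120
uniform sampler2D s_p;
uniform vec2 texture_size;   // texture dimensions in texels, like IN.texture_size
varying vec2 tex;            // normalized texture coordinate

void main()
{
    // floor() picks the texel index; + 0.5 moves to that texel's center,
    // so GL_NEAREST cannot round into a neighboring texel.
    vec2 snapped = (floor(tex * texture_size) + 0.5) / texture_size;
    vec2 d = 1.0 / texture_size;   // exactly one texel of offset

    // Neighbors are now reached from a well-defined texel center.
    vec3 c00 = texture2D(s_p, snapped + vec2(-d.x, -d.y)).rgb;
    gl_FragColor = vec4(c00, 1.0);
}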
Sure, in OpenGL there are several ways:
Unnormalized coordinates are what you want; they are in the [0, 0] x [w, h] range.
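A minimal sketch of that approach, assuming the texture is bound as GL_TEXTURE_RECTANGLE so the sampler takes coordinates in texels (the texel input and frag_color output are placeholder names, not from the original answer):

#version 140
uniform sampler2DRect s_p;   // rectangle samplers take unnormalized coordinates
in vec2 texel;               // coordinate in texels, e.g. tex * texture_size
out vec4 frag_color;

void main()
{
    // (x + 0.5, y + 0.5) addresses the center of texel (x, y), so a
    // step of exactly 1.0 lands precisely on a neighboring texel.
    vec3 c   = texture(s_p, texel).rgb;
    vec3 c00 = texture(s_p, texel + vec2(-1.0, -1.0)).rgb;   // upper-left neighbor
    frag_color = vec4(0.5 * (c + c00), 1.0);
}

With a plain sampler2D, GLSL 1.30's texelFetch(s_p, ivec2(x, y), 0) likewise addresses texels by integer index, bypassing filtering entirely.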