Dispatching more than 65535 threads

Posted on 2024-11-16 12:59:20

I'm attempting to skin vertices using DirectCompute. The skinning method allows a variable number of weights to influence each vertex (e.g. MD5 meshes are defined this way).

The inputs to the compute shader are:

JointsBuffer { float4 orientation, float4 position } Structured buffer SRV
WeightsBuffer { float3 normal, float4 position, float bias, uint jointIndex } Structured buffer SRV
VerticesBuffer { float2 texcoords, uint weightIndex, uint numWeights } Structured buffer SRV

and the output is

SkinnedVerticesBuffer { float3 normal, float4 position, float2 texcoord } Structured buffer UAV

The compute shader should run once per element in VerticesBuffer. Using SV_DispatchThreadID, each thread populates the SkinnedVertex in SkinnedVerticesBuffer that corresponds to its Vertex in VerticesBuffer (1:1 correspondence).

The problem is that many meshes have more than 65535 vertices, and the Dispatch call only allows dispatching that many threads per dimension. In theory I could factor many vertex counts into three factors, each no greater than 65535, but that is impossible for prime numbers.

For example, when a mesh with 71993 vertices (a prime number) comes up, I can't think of a way to handle it.

I can't over-dispatch, say, 72000 threads with context->Dispatch( 36000, 2, 1 ), because then SV_DispatchThreadID would run past the end of my buffers.

Right now I'm leaning towards a constant buffer holding the number of vertices, then over-dispatching to the nearest power of 2 and simply doing

if( SV_DispatchThreadID >= numVertices ) return;
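
As a rough illustration of the guard described above, here is a CPU-side sketch in C++ that imitates an over-dispatch. It is not shader code: `skinOneVertex`, `simulateDispatch`, and the `skinned` vector are hypothetical names invented for this example, and `threadId` merely stands in for SV_DispatchThreadID.x. Because thread IDs start at zero, the sketch rejects an ID equal to the vertex count as well as anything larger.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// CPU-side stand-in for the shader body: one call per dispatched thread.
// `threadId` plays the role of SV_DispatchThreadID.x (hypothetical name).
inline bool skinOneVertex(std::uint32_t threadId, std::uint32_t numVertices,
                          std::vector<int>& skinned) {
    if (threadId >= numVertices) return false;  // guard: surplus thread, do nothing
    skinned[threadId] += 1;                     // placeholder for the real skinning work
    return true;
}

// Runs `dispatchedThreads` simulated threads and returns how many wrote output.
inline std::uint32_t simulateDispatch(std::uint32_t dispatchedThreads,
                                      std::uint32_t numVertices,
                                      std::vector<int>& skinned) {
    assert(skinned.size() >= numVertices);  // output buffer must cover every vertex
    std::uint32_t writes = 0;
    for (std::uint32_t id = 0; id < dispatchedThreads; ++id) {
        if (skinOneVertex(id, numVertices, skinned)) ++writes;
    }
    return writes;
}
```

Calling `simulateDispatch(72000, 71993, skinned)` on a 71993-element vector performs 71993 writes and skips the 7 surplus threads, so nothing is written outside the buffer.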

Is this my only option? Has anyone else run into this snag?

Comments (1)

流年已逝 2024-11-23 12:59:20

I've never run into this, but 65,000 threads seems like an awful lot.

Then, when I looked up the documentation, it seems that the values you pass are not threads but thread groups. Someone on gamedev seems to have had performance issues when passing a number as large as 768, so it seems to me that you will have to reduce that huge number.

I'm not sure, but I get the feeling you're misinterpreting these parameters. Try reading again what these values actually mean. (Just a layman's gut feeling, though.)
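
To make the thread-group point concrete, here is a minimal host-side sketch in C++ of how a group count could be derived from a vertex count. It assumes the shader declares `[numthreads(64, 1, 1)]`; that group size is an assumption for illustration, not something stated in the question. `groupCountFor` and `kMaxGroupsPerDimension` are hypothetical names; the value 65535 matches Direct3D 11's `D3D11_CS_DISPATCH_MAX_THREAD_GROUPS_PER_DIMENSION`.

```cpp
#include <cassert>
#include <cstdint>

// Per-dimension limit on Dispatch arguments in Direct3D 11
// (D3D11_CS_DISPATCH_MAX_THREAD_GROUPS_PER_DIMENSION).
constexpr std::uint32_t kMaxGroupsPerDimension = 65535;

// Number of thread groups needed to cover `numVertices`, rounding up.
// `threadsPerGroup` is assumed to match the shader's [numthreads(N, 1, 1)].
inline std::uint32_t groupCountFor(std::uint32_t numVertices,
                                   std::uint32_t threadsPerGroup) {
    assert(threadsPerGroup > 0);
    // Whole groups, plus one extra group if there is a remainder.
    return numVertices / threadsPerGroup +
           (numVertices % threadsPerGroup != 0 ? 1u : 0u);
}
```

Under that assumption, 71993 vertices need 1125 groups, i.e. something like `context->Dispatch( 1125, 1, 1 )`, far below the per-dimension limit. That launches 72000 threads, so the 7 surplus threads still need a bounds check against the vertex count in the shader.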
