Total memory and texture memory accessible from DirectX/CUDA/OpenGL

Posted on 2024-09-12 10:35:56

Can someone explain how texture memory in the context of CUDA differs from texture memory in the context of DirectX? Suppose a graphics card has 512 MB of advertised memory: how is it divided into constant memory, texture memory, and global memory?

For example, I have a Tesla card for which cudaGetDeviceProperties reports totalConstMem as 64 KB and totalGlobalMem as 4 GB, but there is no field that tells me how much texture memory there is.

Also, how much "texture memory" is there when it is accessed through graphics APIs such as DirectX? I have no experience programming with these APIs, so I don't know how they access memory or what kind of memory they can access. AFAIK, though, all memory that is accessed is hardware-cached. Please correct me if I'm wrong.

After KoppeKTop's answer: So does shared memory act as an automatic cache for texture memory in both CUDA and DirectX? I don't suppose having another hardware cache would make sense anyway. Does it also mean that if I use all of the shared memory in a kernel, texture memory won't get cached?

Thanks.

Comments (2)

不甘平庸 2024-09-19 10:35:56

Actually, I have never dealt with DirectX, but I can explain the situation for CUDA textures. A texture is simply an array (a cudaArray or a pitched array) that is stored in global memory and read through a cached, read-only path. So the maximum size of one big texture on a 512 MB card is 512 MB (actually a little less, but the difference is not significant). Texture access is optimized for data laid out in 2D space (it is cached as 2D slices). Coordinates and values can also be transformed on access (see the CUDA Programming Guide for details).
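One way to see that texture storage comes out of ordinary global memory is to watch free device memory before and after allocating a cudaArray. This is only a rough sketch: device 0, the 4096 x 4096 float array (64 MiB of payload), and the helper name `bytesConsumedByArray` are my own assumptions for illustration, and the driver may round the allocation up.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: allocates a width x height float cudaArray and returns
// how many bytes of free device memory disappeared. Returns 0 on any error.
size_t bytesConsumedByArray(size_t width, size_t height) {
    size_t freeBefore = 0, freeAfter = 0, total = 0;
    if (cudaMemGetInfo(&freeBefore, &total) != cudaSuccess) return 0;

    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    cudaArray_t arr = nullptr;
    if (cudaMallocArray(&arr, &desc, width, height) != cudaSuccess) return 0;

    cudaError_t err = cudaMemGetInfo(&freeAfter, &total);
    cudaFreeArray(arr);
    if (err != cudaSuccess || freeAfter > freeBefore) return 0;
    return freeBefore - freeAfter;
}

int main() {
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) {
        std::printf("no usable CUDA device\n");
        return 1;
    }
    // There is no "totalTextureMem" field: only global and constant sizes,
    // plus limits on texture dimensions.
    std::printf("totalGlobalMem: %zu MiB\n", prop.totalGlobalMem >> 20);
    std::printf("totalConstMem:  %zu KiB\n", prop.totalConstMem >> 10);
    std::printf("max 2D texture: %d x %d texels\n",
                prop.maxTexture2D[0], prop.maxTexture2D[1]);

    size_t used = bytesConsumedByArray(4096, 4096);
    std::printf("cudaArray allocation consumed about %zu MiB of global memory\n",
                used >> 20);
    return 0;
}
```

The drop in free memory should be at least the size of the array payload, which is consistent with textures not having a separate pool of their own.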

And no, not all memory is cached on access on CUDA devices with compute capability 1.x: only constant and texture memory are. Devices with compute capability >= 2.0 (Fermi) cache all memory accesses in the L1 and L2 caches (or in L2 only; this is configurable).
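If you want to check which case applies to your card, the compute capability and the L2 cache size are both reported by cudaGetDeviceProperties. A minimal sketch follows; the helper `cachesGlobalMemory` is a hypothetical name that just encodes the rule of thumb stated above (1.x: no, 2.0 and later: yes), not something the CUDA API provides.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper encoding the rule from this answer: devices with
// compute capability 2.0 or higher cache global memory accesses in hardware.
bool cachesGlobalMemory(int major) {
    return major >= 2;
}

int main() {
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) {
        std::printf("no CUDA device found\n");
        return 1;
    }
    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, dev) != cudaSuccess) continue;
        // l2CacheSize is reported in bytes; it is 0 on devices without an L2.
        std::printf("device %d (%s): compute capability %d.%d, L2 cache %d KiB, "
                    "global accesses cached: %s\n",
                    dev, prop.name, prop.major, prop.minor,
                    prop.l2CacheSize / 1024,
                    cachesGlobalMemory(prop.major) ? "yes" : "no");
    }
    return 0;
}
```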

听,心雨的声音 2024-09-19 10:35:56

"After KoppeKTop's answer: So does shared memory act as an automatic cache for texture memory in both CUDA and DirectX? I don't suppose having another hardware cache would make sense anyway. Does it also mean that if I use all of the shared memory in a kernel, texture memory won't get cached?"

In the pre-GF100 generation (G80), GPUs have dedicated constant and texture caches, both read-only. Shared memory has its own dedicated memory banks.

In the GF100 generation, there is still a dedicated texture cache, but the same on-chip memory is now split between shared memory and the L1 cache (which caches global memory). With CUDA you can configure how this memory is split. For DirectX/OpenGL, the graphics driver uses a configuration of 48 KB shared memory and 16 KB L1 cache.

In any case, shared memory is always software-managed (except for the part dedicated to the L1 cache on GF100) and does not eat into the texture caches.
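On the CUDA side, the shared memory/L1 split mentioned above can be requested per kernel with cudaFuncSetCacheConfig. The sketch below is only an illustration under my own assumptions: the kernel `reverseTiles`, the helper `runReverse`, and the 256-element tile size are made up for the example, and on newer GPUs the driver may treat the setting as a hint only.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

constexpr int kBlock = 256;  // tile size, chosen arbitrarily for this sketch

// Reverses each kBlock-element tile in place. The tile is staged in shared
// memory, which the kernel manages explicitly (it is not an automatic cache).
__global__ void reverseTiles(float* data) {
    __shared__ float tile[kBlock];
    int base = blockIdx.x * kBlock;
    tile[threadIdx.x] = data[base + threadIdx.x];
    __syncthreads();
    data[base + threadIdx.x] = tile[kBlock - 1 - threadIdx.x];
}

// Hypothetical helper: runs reverseTiles on `host` with the requested
// shared/L1 preference. host.size() must be a multiple of kBlock.
bool runReverse(std::vector<float>& host, cudaFuncCache preference) {
    if (host.empty() || host.size() % kBlock != 0) return false;
    if (cudaFuncSetCacheConfig(reverseTiles, preference) != cudaSuccess) return false;

    size_t bytes = host.size() * sizeof(float);
    float* dev = nullptr;
    if (cudaMalloc(&dev, bytes) != cudaSuccess) return false;

    bool ok = cudaMemcpy(dev, host.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        unsigned int blocks = static_cast<unsigned int>(host.size() / kBlock);
        reverseTiles<<<blocks, kBlock>>>(dev);
        ok = cudaGetLastError() == cudaSuccess &&
             cudaMemcpy(host.data(), dev, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(dev);
    return ok;
}

int main() {
    std::vector<float> values(1024);
    for (size_t i = 0; i < values.size(); ++i) values[i] = static_cast<float>(i);

    // Ask for the larger shared-memory partition (48 KB shared / 16 KB L1 on GF100).
    bool ok = runReverse(values, cudaFuncCachePreferShared);
    std::printf("%s, values[0] = %.0f\n", ok ? "ok" : "failed", values[0]);
    return ok ? 0 : 1;
}
```

Passing cudaFuncCachePreferL1 instead requests the opposite split. Neither choice affects the texture cache, which stays separate.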
