Many CUDA samples fail
After installing fresh CUDA 4.0 drivers and the SDK, many SDK tests fail (e.g. fastWalshTransform, matrixMul, reduction). This is the output of ./deviceQuery:
Device 0: "GeForce GTX 570"
CUDA Driver Version / Runtime Version 4.0 / 4.0
CUDA Capability Major/Minor version number: 2.0
Total amount of global memory: 1279 MBytes (1341325312 bytes)
(15) Multiprocessors x (32) CUDA Cores/MP: 480 CUDA Cores
GPU Clock Speed: 1.57 GHz
Memory Clock rate: 2100.00 Mhz
Memory Bus Width: 320-bit
L2 Cache Size: 655360 bytes
Max Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536,65535), 3D=(2048,2048,2048)
Max Layered Texture Size (dim) x layers 1D=(16384) x 2048, 2D=(16384,16384) x 2048
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 32768
Warp size: 32
Maximum number of threads per block: 1024
Maximum sizes of each dimension of a block: 1024 x 1024 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 65535
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and execution: Yes with 1 copy engine(s)
Run time limit on kernels: No
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Concurrent kernel execution: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support enabled: No
Device is using TCC driver mode: No
Device supports Unified Addressing (UVA): Yes
Device PCI Bus ID / PCI location ID: 4 / 0
For example, the output of reduction is:
- GPU result = 2135772699
- CPU result = 2139353471
=> FAILED
Solution: This was (and still is) a hardware problem; driver updates do not fix it. It may be some kind of memory issue, and it seems to be quite common: we have several NVIDIA cards showing it (even Tesla cards). The only remedies we have found so far are to restart the machine or to increase the voltage a little.
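Since a restart can make the problem go away for a while, it may help to check whether the mismatch comes back over repeated runs. Below is a minimal Python sketch of such a check. It assumes the sample prints lines of the form "GPU result = ..." and "CPU result = ..." as quoted above; the helper names `parse_results` and `count_failures` are made up for this illustration and are not part of the CUDA SDK.

```python
import re
import subprocess

# Assumed output format, taken from the lines quoted in the post:
#   "GPU result = 2135772699" / "CPU result = 2139353471"
RESULT_RE = re.compile(r"(GPU|CPU) result = (-?\d+)")


def parse_results(output):
    """Return (gpu, cpu) integers found in a sample's output, or None."""
    found = {kind: int(value) for kind, value in RESULT_RE.findall(output)}
    if "GPU" in found and "CPU" in found:
        return found["GPU"], found["CPU"]
    return None


def count_failures(cmd, runs=10):
    """Run cmd repeatedly and count runs whose GPU and CPU results differ."""
    failures = 0
    for _ in range(runs):
        proc = subprocess.run(cmd, capture_output=True, text=True)
        results = parse_results(proc.stdout)
        # A run whose output cannot be parsed is counted as a failure too.
        if results is None or results[0] != results[1]:
            failures += 1
    return failures
```

Calling `count_failures(["./reduction"], runs=20)` from the directory that holds the SDK binary (the path is an assumption) returns the number of runs that disagreed. A count that changes between sessions or after a reboot would be consistent with an intermittent hardware fault, though it does not prove one.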