Thread indexing in CUDA



I'm running the following code on my 8000 series device (which supports CUDA):

#include <stdio.h>
__global__ void testSet(int * MyBlock)
{
   unsigned int ThreadIDX= threadIdx.x+blockDim.x*blockIdx.x;
   MyBlock[ThreadIDX]=ThreadIDX;
}

int main()
{
  int * MyInts;
  int Result[1024];
  cudaMalloc( (void**) &MyInts,sizeof(int)*1024);
  testSet<<<2,512>>>(MyInts);
  cudaMemcpy(Result,MyInts,sizeof(int)*1024,cudaMemcpyDeviceToHost);
  for(unsigned int t=0; t<1024/8;t++) {
        printf("Results: %d %d %d %d %d %d %d %d\n",
               Result[t], Result[t+1],Result[t+2],
               Result[t+3],Result[t+4],Result[t+5],
               Result[t+6],Result[t+7]);
  }
  return 0;
}

And I get out...

Results: 0 1 2 3 4 5 6 7
Results: 1 2 3 4 5 6 7 8
Results: 2 3 4 5 6 7 8 9
Results: 3 4 5 6 7 8 9 10
Results: 4 5 6 7 8 9 10 11
Results: 5 6 7 8 9 10 11 12
Results: 6 7 8 9 10 11 12 13
Results: 7 8 9 10 11 12 13 14
Results: 8 9 10 11 12 13 14 15
Results: 9 10 11 12 13 14 15 16
Results: 10 11 12 13 14 15 16 17
Results: 11 12 13 14 15 16 17 18
Results: 12 13 14 15 16 17 18 19
Results: 13 14 15 16 17 18 19 20
Results: 14 15 16 17 18 19 20 21
Results: 15 16 17 18 19 20 21 22
Results: 16 17 18 19 20 21 22 23
Results: 17 18 19 20 21 22 23 24
Results: 18 19 20 21 22 23 24 25
Results: 19 20 21 22 23 24 25 26
Results: 20 21 22 23 24 25 26 27
Results: 21 22 23 24 25 26 27 28
Results: 22 23 24 25 26 27 28 29
Results: 23 24 25 26 27 28 29 30
Results: 24 25 26 27 28 29 30 31
Results: 25 26 27 28 29 30 31 32
Results: 26 27 28 29 30 31 32 33
Results: 27 28 29 30 31 32 33 34
Results: 28 29 30 31 32 33 34 35
Results: 29 30 31 32 33 34 35 36
Results: 30 31 32 33 34 35 36 37
Results: 31 32 33 34 35 36 37 38
Results: 32 33 34 35 36 37 38 39
Results: 33 34 35 36 37 38 39 40
Results: 34 35 36 37 38 39 40 41
Results: 35 36 37 38 39 40 41 42
Results: 36 37 38 39 40 41 42 43
Results: 37 38 39 40 41 42 43 44
Results: 38 39 40 41 42 43 44 45
Results: 39 40 41 42 43 44 45 46
Results: 40 41 42 43 44 45 46 47
Results: 41 42 43 44 45 46 47 48
Results: 42 43 44 45 46 47 48 49
Results: 43 44 45 46 47 48 49 50
Results: 44 45 46 47 48 49 50 51
Results: 45 46 47 48 49 50 51 52
Results: 46 47 48 49 50 51 52 53
Results: 47 48 49 50 51 52 53 54
Results: 48 49 50 51 52 53 54 55
Results: 49 50 51 52 53 54 55 56
Results: 50 51 52 53 54 55 56 57
Results: 51 52 53 54 55 56 57 58
Results: 52 53 54 55 56 57 58 59
Results: 53 54 55 56 57 58 59 60
Results: 54 55 56 57 58 59 60 61
Results: 55 56 57 58 59 60 61 62
Results: 56 57 58 59 60 61 62 63
Results: 57 58 59 60 61 62 63 64
Results: 58 59 60 61 62 63 64 65
Results: 59 60 61 62 63 64 65 66
Results: 60 61 62 63 64 65 66 67
Results: 61 62 63 64 65 66 67 68
Results: 62 63 64 65 66 67 68 69
Results: 63 64 65 66 67 68 69 70
Results: 64 65 66 67 68 69 70 71
Results: 65 66 67 68 69 70 71 72
Results: 66 67 68 69 70 71 72 73
Results: 67 68 69 70 71 72 73 74
Results: 68 69 70 71 72 73 74 75
Results: 69 70 71 72 73 74 75 76
Results: 70 71 72 73 74 75 76 77
Results: 71 72 73 74 75 76 77 78
Results: 72 73 74 75 76 77 78 79
Results: 73 74 75 76 77 78 79 80
Results: 74 75 76 77 78 79 80 81
Results: 75 76 77 78 79 80 81 82
Results: 76 77 78 79 80 81 82 83
Results: 77 78 79 80 81 82 83 84
Results: 78 79 80 81 82 83 84 85
Results: 79 80 81 82 83 84 85 86
Results: 80 81 82 83 84 85 86 87
Results: 81 82 83 84 85 86 87 88
Results: 82 83 84 85 86 87 88 89
Results: 83 84 85 86 87 88 89 90
Results: 84 85 86 87 88 89 90 91
Results: 85 86 87 88 89 90 91 92
Results: 86 87 88 89 90 91 92 93
Results: 87 88 89 90 91 92 93 94
Results: 88 89 90 91 92 93 94 95
Results: 89 90 91 92 93 94 95 96
Results: 90 91 92 93 94 95 96 97
Results: 91 92 93 94 95 96 97 98
Results: 92 93 94 95 96 97 98 99
Results: 93 94 95 96 97 98 99 100
Results: 94 95 96 97 98 99 100 101
Results: 95 96 97 98 99 100 101 102
Results: 96 97 98 99 100 101 102 103
Results: 97 98 99 100 101 102 103 104
Results: 98 99 100 101 102 103 104 105
Results: 99 100 101 102 103 104 105 106
Results: 100 101 102 103 104 105 106 107
Results: 101 102 103 104 105 106 107 108
Results: 102 103 104 105 106 107 108 109
Results: 103 104 105 106 107 108 109 110
Results: 104 105 106 107 108 109 110 111
Results: 105 106 107 108 109 110 111 112
Results: 106 107 108 109 110 111 112 113
Results: 107 108 109 110 111 112 113 114
Results: 108 109 110 111 112 113 114 115
Results: 109 110 111 112 113 114 115 116
Results: 110 111 112 113 114 115 116 117
Results: 111 112 113 114 115 116 117 118
Results: 112 113 114 115 116 117 118 119
Results: 113 114 115 116 117 118 119 120
Results: 114 115 116 117 118 119 120 121
Results: 115 116 117 118 119 120 121 122
Results: 116 117 118 119 120 121 122 123
Results: 117 118 119 120 121 122 123 124
Results: 118 119 120 121 122 123 124 125
Results: 119 120 121 122 123 124 125 126
Results: 120 121 122 123 124 125 126 127
Results: 121 122 123 124 125 126 127 128
Results: 122 123 124 125 126 127 128 129
Results: 123 124 125 126 127 128 129 130
Results: 124 125 126 127 128 129 130 131
Results: 125 126 127 128 129 130 131 132
Results: 126 127 128 129 130 131 132 133
Results: 127 128 129 130 131 132 133 134

Wouldn't I expect 0..1024 to be printed??

Am I misunderstanding something? I read the intro sections of the NVIDIA CUDA programming guide, and I thought this is how things worked.

Of course I've run into plenty of annoying bugs/design limitations thus far (for example, the 8000 series' lack of double-precision support), and errors that CUDA causes with iomanip commands (setw, setprecision) if you use "std::..." instead of a general "using namespace std;".

So I guess I expect some whackiness...

But I'm just desperate to figure out what the heck is going on here...


Comments (1)

节枝 2024-10-21 22:11:08


Change:

for(unsigned int t=0; t<1024/8;t++) {

to:

for(unsigned int t=0; t<1024; t+=8) {

You have 2 x 512 = 1024 threads, whose indices range from 0 to 1023. Each thread writes its own index to the corresponding location in MyBlock, so you should see an array whose values equal their indices. The kernel is doing exactly that; the problem is the print loop. With t++ the loop only runs t from 0 to 127, and each line prints Result[t] through Result[t+7], so consecutive lines overlap by seven elements and you never get past Result[134]. Stepping t by 8 prints each element exactly once.
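
For completeness, here is a minimal sketch of the whole program with only that loop changed. The comments and the cudaFree call are additions for illustration; they are not part of the original code.

#include <stdio.h>

__global__ void testSet(int *MyBlock)
{
    // Global thread index: thread index within the block plus the block offset.
    unsigned int ThreadIDX = threadIdx.x + blockDim.x * blockIdx.x;
    MyBlock[ThreadIDX] = ThreadIDX;
}

int main()
{
    int *MyInts;
    int Result[1024];

    cudaMalloc((void **)&MyInts, sizeof(int) * 1024);

    // 2 blocks x 512 threads = 1024 threads, one per element of MyInts.
    testSet<<<2, 512>>>(MyInts);

    // cudaMemcpy waits for the kernel to finish before copying back.
    cudaMemcpy(Result, MyInts, sizeof(int) * 1024, cudaMemcpyDeviceToHost);

    // Step by 8 so each line prints a distinct group of eight values:
    // 0..7, 8..15, ..., 1016..1023.
    for (unsigned int t = 0; t < 1024; t += 8) {
        printf("Results: %d %d %d %d %d %d %d %d\n",
               Result[t],     Result[t + 1], Result[t + 2], Result[t + 3],
               Result[t + 4], Result[t + 5], Result[t + 6], Result[t + 7]);
    }

    cudaFree(MyInts);
    return 0;
}

With this loop the output should run from "Results: 0 1 2 3 4 5 6 7" through "Results: 1016 1017 1018 1019 1020 1021 1022 1023".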
