What is the difference between "Maximum number of threads per multiprocessor" and "Maximum number of threads per block" in the device query results?
When executing device query, I want to know the difference between "Maximum number of threads per multiprocessor" and "Maximum number of threads per block". As I understand it, SM = multiprocessor = block on the GPU, but I do not understand why the two values are different. Are there multiple blocks in a multiprocessor?
Maximum number of threads per multiprocessor: 1536
Maximum number of threads per block: 1024
An additional question concerns the relationship between a thread and a core: is it correct to treat thread = core?
Yes, there can be.
Quite simply, SM == multiprocessor. SM != block.
An SM (multiprocessor) is a hardware entity. A threadblock is a software entity, basically a collection of threads.
An SM or multiprocessor can have more than one block resident. To get full occupancy of an SM that has a maximum of 1536 threads, you would need to have something like three 512-thread blocks resident.
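The arithmetic behind that statement can be sketched in a few lines. This is only a rough model: the function names `resident_blocks` and `thread_occupancy` are made up for illustration, and it considers the two thread limits from the question alone. On a real device, the number of resident blocks is also limited by registers, shared memory, and a per-SM cap on block count, so treat the result as an upper bound.

```python
def resident_blocks(max_threads_per_sm, threads_per_block, max_threads_per_block=1024):
    """Upper bound on how many blocks of this size fit on one SM.

    Only the thread limits are modeled (hypothetical helper); register,
    shared-memory, and blocks-per-SM limits are deliberately ignored.
    """
    if not 0 < threads_per_block <= max_threads_per_block:
        raise ValueError("block size must be between 1 and the per-block limit")
    return max_threads_per_sm // threads_per_block


def thread_occupancy(max_threads_per_sm, threads_per_block):
    """Fraction of the SM's thread slots filled by whole resident blocks."""
    blocks = resident_blocks(max_threads_per_sm, threads_per_block)
    return blocks * threads_per_block / max_threads_per_sm


# Values taken from the device query output in the question.
MAX_THREADS_PER_SM = 1536

for size in (256, 512, 768, 1024):
    blocks = resident_blocks(MAX_THREADS_PER_SM, size)
    occupancy = thread_occupancy(MAX_THREADS_PER_SM, size)
    print(f"{size:4d} threads/block: {blocks} block(s) resident, {occupancy:.0%} of thread slots")
```

With these numbers, three 512-thread blocks fill all 1536 slots, while a 1024-thread block can only be resident once (two would need 2048 slots), leaving about a third of the thread slots unused.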
A thread represents a sequence of instructions. A "core" in GPU speak is a functional unit in the SM which processes certain instruction types, namely 32-bit floating point add, multiply, and multiply-add instructions. Other instruction types are handled by other (kinds of) functional units in the SM.
A thread will require a core when it has one of those 32-bit floating point instruction types to process. If it happens to have a different instruction to process, say a LD (load) instruction, it will require a different functional unit, specifically, a LD/ST (load/store) unit in that case/example.
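As a toy illustration of why thread != core, the sketch below routes each instruction of one thread to the kind of functional unit that would handle it. The table `UNIT_FOR_OPCODE`, the helper `units_used`, and the opcode names are illustrative assumptions, not a description of any specific GPU's instruction set or scheduler.

```python
# Hypothetical routing table: instruction type -> functional unit in the SM.
# Only the cases named in the answer are listed; everything else falls
# through to a generic "other functional unit".
UNIT_FOR_OPCODE = {
    "FADD": "FP32 core",   # 32-bit floating-point add
    "FMUL": "FP32 core",   # 32-bit floating-point multiply
    "FFMA": "FP32 core",   # 32-bit floating-point multiply-add
    "LD": "LD/ST unit",    # load
    "ST": "LD/ST unit",    # store
}


def units_used(instructions):
    """Return the functional unit each instruction of a thread would need."""
    return [UNIT_FOR_OPCODE.get(op, "other functional unit") for op in instructions]


# One thread is a sequence of instructions, and it needs different units over time.
thread_program = ["LD", "LD", "FFMA", "FADD", "ST"]

for op, unit in zip(thread_program, units_used(thread_program)):
    print(f"{op:5s} -> {unit}")
```

The same thread uses a "core" only for its floating-point instructions and an LD/ST unit for its loads and stores, so a thread is not tied to any single core.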