What types of code domains is OpenCL suited for?


I read the OpenCL overview, and it states it is suitable for code that runs on CPUs, GPGPUs, DSPs, etc. However, from looking through the command reference, it seems to be all math and image-type operations. I didn't see anything for, say, strings.

This makes me wonder what would you run on a CPU via OpenCL?

Further, I know OpenCL can be used to perform sorting on GPGPUs. But would one ever use it (or, for that matter, a current GPGPU) to perform string processing such as pattern matching, metaphone extraction, dictionary lookup, or anything else that requires the processing of arrays of strings?

EDIT
I noticed that Intel's upcoming Ivy Bridge is touted as "OpenCL compliant" with reference to its graphics units. Does this imply that the CPU cores are not OpenCL compliant, or is there no such implication?

EDIT
In the interest of staying constructive and avoiding debate, I would appreciate it if anyone could point me to official references that would answer my question.

Comments (3)

七度光 2025-01-05 13:17:32

You can think of OpenCL as a combination of a runtime (for device discovery and queueing) and a C-based programming language. This programming language has native vector types and built-in functions and operations for doing all sorts of fun stuff to these vectors. This is nice in that you can write a vectorized kernel in OpenCL, and it is the responsibility of the implementation to map that to the actual vector ISA of your hardware.
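
As a rough sketch of what that looks like (the kernel name and arguments here are made up for illustration), an OpenCL C kernel written directly against the built-in float4 vector type might be:

```c
/* OpenCL C device code: a vectorized "a*x + y" kernel.
 * float4 is a native OpenCL C vector type; the implementation is
 * responsible for lowering the arithmetic to the hardware's vector ISA
 * (SSE/AVX on x86, NEON on ARM, or the GPU's SIMD units). */
__kernel void saxpy_vec4(const float a,
                         __global const float4 *x,
                         __global const float4 *y,
                         __global float4 *out)
{
    size_t i = get_global_id(0);
    /* mad() is a built-in that may compile to a fused multiply-add. */
    out[i] = mad((float4)(a), x[i], y[i]);
}
```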

From this 4/2011 article, which might vanish:

There are two major CPU architectures out there, x86 and ARM, both of which should soon run OpenCL code.

If you write an OpenCL application that targets both of these architectures, you wouldn't have to worry about writing two versions, one SSE and one NEON. Just write OpenCL C and be done with it. Yes, I know. This assumes the vendor has done his job and written a solid implementation that fully utilizes the underlying ISA. But if he doesn't, complain!

In addition, some CL implementations offer auto-vectorization of scalar kernels, which are usually easier to write. A good auto-vectorizer would give you a solid performance increase for no effort. Since CL kernels are compiled "online," obtaining such a benefit wouldn't require shipping rebuilt code.
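
For comparison, the scalar form of the same illustrative kernel is what an auto-vectorizing CPU implementation would try to fuse across adjacent work-items:

```c
/* OpenCL C device code: scalar version of the same operation.
 * Easier to write; an auto-vectorizing implementation can pack
 * neighbouring work-items into SIMD lanes on its own. */
__kernel void saxpy_scalar(const float a,
                           __global const float *x,
                           __global const float *y,
                           __global float *out)
{
    size_t i = get_global_id(0);
    out[i] = a * x[i] + y[i];
}
```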

瀟灑尐姊 2025-01-05 13:17:32

No links, but I would assume this is because algorithms that use strings tend to do a lot of dynamic memory allocation and branching, neither of which GPGPUs are well suited for. GPGPUs also have a lot in common with vector processing, so doing units of work on differently sized blocks of memory (which a string algorithm generally has to, since you usually don't have a homogeneous group of strings) yields poorer performance and is hard to program.

GPUs were designed to do the same work, with little to no branching, on a homogeneous group of data (such as per-vector or per-pixel operations). Algorithms that can mimic this type of behavior are great on GPUs.
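
To make the contrast concrete, here is a hypothetical pair of kernels (names and data layout invented for illustration): the first does identical, branch-free work per element; the second loops over variable-length strings, so neighbouring work-items diverge and SIMD lanes go idle:

```c
/* OpenCL C device code: uniform per-element work, no branching. */
__kernel void brighten(__global uchar4 *pixels, const float gain)
{
    size_t i = get_global_id(0);
    pixels[i] = convert_uchar4_sat(convert_float4(pixels[i]) * gain);
}

/* OpenCL C device code: per-work-item loop length depends on the data,
 * so work-items in the same SIMD group finish at different times. */
__kernel void string_lengths(__global const char *chars,   /* packed strings  */
                             __global const int *offsets,  /* start of each string */
                             __global int *lengths)
{
    size_t i = get_global_id(0);
    int n = 0;
    for (int p = offsets[i]; chars[p] != '\0'; ++p)
        ++n;
    lengths[i] = n;
}
```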

浸婚纱 2025-01-05 13:17:32

This makes me wonder what would you run on a CPU via OpenCL?

I prefer to use ocl to offload work from the cpu to my graphics hardware. Sometimes there is a limitation with my video card, so I like having a backup kernel for cpu use. Such limitations can be memory size, memory bottleneck, low clock speed, or when the pci-e bus gets in the way.

I say I like using a separate kernel for cpu, because I think all kernels should be tweaked to run on their target hardware. I even like to have an openmp backup plan, as most algorithms I use get tested out in this manner ahead of time.

I suppose it is best practice to test out a gpu kernel on the cpu to make sure it runs as expected. If a user of your software has opencl installed, but only a cpu (or a low-end gpu), it's nice to be able to execute the same code on the different devices.
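
A minimal host-side sketch of that fallback idea (error handling and the rest of the setup omitted; the helper name is made up) could look like:

```c
/* Host C code: prefer a GPU device, fall back to the CPU device,
 * then build whichever kernel suits the device that was found. */
#include <CL/cl.h>

static cl_device_id pick_device(void)
{
    cl_platform_id platform;
    cl_device_id device;

    clGetPlatformIDs(1, &platform, NULL);

    /* Try a GPU first; if the platform has none, take a CPU device. */
    if (clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, NULL) != CL_SUCCESS)
        clGetDeviceIDs(platform, CL_DEVICE_TYPE_CPU, 1, &device, NULL);

    return device;
}
```

In a real application you would iterate over every platform, and, as the answer suggests, keep separately tuned kernels for each device type rather than reusing one kernel everywhere.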
