SIMD programming languages

Posted on 2024-08-04 12:53:46

Comments (8)

凡尘雨 2024-08-11 12:53:46

Intel has since released ISPC, which is exactly what I was looking for when I asked this question. It is a language that links with ordinary C code, has an implicit execution model, and supports all the features mentioned in the original post (swizzle operators, branching, data structs, vector ops, shader-like syntax). It compiles to SSE2, SSE4, AVX, AVX2, and Xeon Phi vector instructions.
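To make "implicit execution model" and "branching" a little more concrete, here is a rough plain-C emulation of how an SPMD language can map a per-instance `if`/`else` onto SIMD lanes: both branches are evaluated for every lane and a per-lane mask picks the result. This is only a sketch of the general masking idea, not ISPC syntax and not ISPC's actual code generation; `GANG_WIDTH` and `gang_step` are hypothetical names chosen for illustration.

```c
#include <assert.h>

/* Hypothetical gang size, e.g. four float lanes of one SSE register. */
enum { GANG_WIDTH = 4 };

/* Per-instance logic as it might be written in an SPMD language:
 *     if (v < 0) v = -v; else v = v * 2;
 * Emulated here for a whole gang at once: every lane computes both
 * branches, and a per-lane mask selects which value is kept. */
void gang_step(float v[GANG_WIDTH])
{
    int mask[GANG_WIDTH];

    assert(v);

    /* Evaluate the branch condition for all lanes. */
    for (int lane = 0; lane < GANG_WIDTH; ++lane)
        mask[lane] = v[lane] < 0.0f;

    /* Run both sides of the branch, then blend using the mask. */
    for (int lane = 0; lane < GANG_WIDTH; ++lane) {
        float then_val = -v[lane];
        float else_val = v[lane] * 2.0f;
        v[lane] = mask[lane] ? then_val : else_val;
    }
}
```

Calling `gang_step` on the lanes (-1, 2, -3, 0.5) leaves (1, 4, 3, 1) in the array: the negative lanes take the first branch and the others take the second.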

死开点丶别碍眼 2024-08-11 12:53:46

Your best bet is probably OpenCL. It has mostly been hyped as a way to run code on GPUs, but OpenCL kernels can also be compiled and run on CPUs. OpenCL is basically C with a few restrictions:

  1. No function pointers
  2. No recursion

and a bunch of additions, in particular vector types:

float4 x = (float4)(1.0f, 2.0f, 3.0f, 4.0f);
float4 y = (float4)(10.0f, 10.0f, 10.0f, 10.0f);

// add the vector y to a swizzle of x that reverses the element order
float4 z = y + x.s3210;

One big caveat is that the code has to be cleanly separable: OpenCL can't call out to arbitrary libraries, etc. But if your compute kernels are reasonably independent, you basically get a vector-enhanced C in which you don't need to use intrinsics.

Here is a quick reference/cheatsheet with all of the extensions.
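For readers unfamiliar with swizzles, the expression `y + x.s3210` can be spelled out in plain scalar C. The sketch below is only an emulation of the semantics; `float4_t`, `swizzle4`, `add4`, and `reversed_add` are hypothetical helper names, not OpenCL types or functions.

```c
#include <assert.h>

/* Hypothetical stand-in for OpenCL's built-in float4 (not an OpenCL type). */
typedef struct { float s[4]; } float4_t;

/* Emulates a four-component swizzle such as x.s3210:
 * component k of the result is component idx[k] of v. */
float4_t swizzle4(float4_t v, const int idx[4])
{
    float4_t r;
    for (int k = 0; k < 4; ++k) {
        assert(idx[k] >= 0 && idx[k] < 4);
        r.s[k] = v.s[idx[k]];
    }
    return r;
}

/* Component-wise addition, which is what `+` does on OpenCL vectors. */
float4_t add4(float4_t a, float4_t b)
{
    float4_t r;
    for (int k = 0; k < 4; ++k)
        r.s[k] = a.s[k] + b.s[k];
    return r;
}

/* Scalar equivalent of `z = y + x.s3210`. */
float4_t reversed_add(float4_t x, float4_t y)
{
    static const int reverse[4] = {3, 2, 1, 0};
    return add4(y, swizzle4(x, reverse));
}
```

With x = (1, 2, 3, 4) and y = (10, 10, 10, 10), `reversed_add(x, y)` yields (14, 13, 12, 11). In OpenCL the whole thing is a single expression that the compiler can lower to a shuffle and a vector add.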

爱要勇敢去追 2024-08-11 12:53:46

It's not really part of the language itself, but there is a library for Mono (Mono.Simd) that exposes vector types to you and optimises the operations on them into SSE whenever possible.

眼趣 2024-08-11 12:53:46

It's a library for C++ rather than something built into the language, but Eigen is pretty much invisible once your variables are declared.

箜明 2024-08-11 12:53:46

Currently the best solution is to do it myself by writing a back end for the open-source Cg front end that Nvidia released, but I'd like to save myself the effort, so I'm curious whether it has been done before. Ideally I'd start using it right away.

会傲 2024-08-11 12:53:46

The D programming language also provides access to SIMD, in a similar way to Mono.Simd.

阳光的暖冬 2024-08-11 12:53:46

That would be Fortran that you are looking for. If memory serves, even the open-source compilers (g95, gfortran) will take advantage of SSE if it's implemented on your hardware.
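The kind of compiler-driven vectorization described here is not unique to Fortran. As a rough C analogue, a simple loop over `restrict`-qualified arrays gives the compiler the same no-aliasing guarantee that Fortran array arguments normally carry, so it is free to emit SSE or wider instructions. Whether it actually does depends on your compiler and flags (for example, GCC enables its loop vectorizer at `-O3`); `saxpy` below is just an illustrative name.

```c
#include <assert.h>
#include <stddef.h>

/* Computes out[i] = a * x[i] + y[i] for i in [0, n).
 * `restrict` promises the compiler that the three arrays do not overlap,
 * which removes the main obstacle to vectorizing the loop. */
void saxpy(size_t n, float a,
           const float *restrict x,
           const float *restrict y,
           float *restrict out)
{
    assert(n == 0 || (x && y && out));

    for (size_t i = 0; i < n; ++i)
        out[i] = a * x[i] + y[i];
}
```

The source stays ordinary scalar code with no intrinsics; any SIMD use is left entirely to the optimizer.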

装迷糊 2024-08-11 12:53:46

I know this question is a bit old, but I found myself in a similar predicament and decided I should just make my own.

I haven't gotten very far yet, but if you're interested in the directions I'm exploring, it might be worth a look. :)

https://github.com/HappMacDonald/MasterBlaster

MasterBlaster is a functional programming language, but it is going to compile down into a bytecode that is ultimately its own, much simpler stack-based language called Crude. Crude then compiles directly into assembly.

My strategy is SIMD-first: unoptimized executables will use SIMD almost exclusively, and one of the potential optimizations will then be to simplify code that isn't benefiting from SIMD into using only general-purpose registers.

Crude has reached the Turing-complete stage, but presently exists only as a few dozen GAS macros. I'm working towards a self-contained compiler for it and building out the iterator/generator features that are the stars of the show when it comes to SIMD acceleration.

There is no vector/matrix support just yet, but that is on the roadmap, and I'll probably bear your description in mind when writing up that syntax. :)
