Benefits of using multiple SIMD instruction sets simultaneously

Posted on 2024-09-01 20:00:49

I'm writing a highly parallel, multithreaded application. I've already written an SSE-accelerated thread class. If I were to write an MMX-accelerated thread class and then run both at the same time (one SSE thread and one MMX thread per core), would performance improve noticeably?

I would think that this setup would help hide memory latency, but I'd like to be sure before I start pouring time into it.

Comments (2)

没有心的人 2024-09-08 20:00:49

The SSE and MMX instruction sets share the same set of vector execution units in the CPU. Running an SSE thread and an MMX thread on a core therefore gives each thread the same resources as running two SSE threads (or two MMX threads). The only difference lies in the instructions that exist in SSE but not in MMX (since SSE is an extension of MMX). In that case the MMX thread is probably going to be slower, because it doesn't have those more advanced instructions available to it.

So the answer is: No, you would not see a performance improvement compared to running two SSE threads.
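To make the "more advanced instructions" point concrete, here is a minimal sketch, assuming integer data and a GCC- or Clang-style compiler that defines `__MMX__` and `__SSE2__`. The helper names `add_i16_mmx` and `add_i16_sse2` are made up for illustration. Both do the same packed 16-bit add, but the MMX version handles 4 lanes per instruction and the SSE2 version handles 8. On other targets the code falls back to a plain scalar loop, and a compiler targeting x86-64 may implement the MMX intrinsics with XMM registers internally.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__MMX__) && defined(__SSE2__)
#include <mmintrin.h>   /* MMX intrinsics: __m64, _mm_add_pi16, _mm_empty */
#include <emmintrin.h>  /* SSE2 intrinsics: __m128i, _mm_add_epi16 */
#define HAVE_X86_SIMD 1
#else
#define HAVE_X86_SIMD 0
#endif

/* Hypothetical helper: out[i] = a[i] + b[i], 4 x int16 lanes per MMX add. */
void add_i16_mmx(const int16_t *a, const int16_t *b, int16_t *out, size_t n)
{
    size_t i = 0;
#if HAVE_X86_SIMD
    for (; i + 4 <= n; i += 4) {
        __m64 va, vb, vr;
        memcpy(&va, a + i, sizeof va);   /* unaligned-safe 64-bit load */
        memcpy(&vb, b + i, sizeof vb);
        vr = _mm_add_pi16(va, vb);       /* packed 16-bit add, 4 lanes */
        memcpy(out + i, &vr, sizeof vr);
    }
    _mm_empty();  /* EMMS: MMX state aliases x87, so clear it before FP code */
#endif
    for (; i < n; i++)                   /* scalar tail, or full fallback */
        out[i] = (int16_t)(a[i] + b[i]);
}

/* Hypothetical helper: same operation, 8 x int16 lanes per SSE2 add. */
void add_i16_sse2(const int16_t *a, const int16_t *b, int16_t *out, size_t n)
{
    size_t i = 0;
#if HAVE_X86_SIMD
    for (; i + 8 <= n; i += 8) {
        __m128i va = _mm_loadu_si128((const __m128i *)(a + i));
        __m128i vb = _mm_loadu_si128((const __m128i *)(b + i));
        _mm_storeu_si128((__m128i *)(out + i), _mm_add_epi16(va, vb));
    }
#endif
    for (; i < n; i++)                   /* scalar tail, or full fallback */
        out[i] = (int16_t)(a[i] + b[i]);
}
```

Both functions produce identical results for the same inputs; the SSE2 loop simply issues half as many add instructions for the same amount of data, which is one reason an MMX thread is unlikely to keep up with an SSE thread.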

哽咽笑 2024-09-08 20:00:49

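SSE and MMX are served by the same vector execution hardware, so you gain nothing by mixing the two (apart from MMX sucking and SSE being useful, of course). Strictly speaking the registers differ: MMX uses the 64-bit MM0 to MM7 registers, which alias the x87 floating-point registers, while SSE has its own 128-bit XMM registers.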

The better question is how SSE is implemented on your target CPU. Does it have an SSE unit per core? (Probably.) If so, then you might as well run SSE instructions on every thread.

If it has an SSE unit shared between cores, then different threads will be fighting over it, so there won't be much gained by executing SSE instructions in multiple threads. (I don't know if any CPUs actually share the SSE unit between threads, though, so take this as a hypothetical case.)
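Whether threads end up fighting over a vector unit can be checked empirically rather than guessed. The following is a rough sketch, assuming POSIX threads and `clock_gettime` are available (build with `-pthread`). The names `simd_job`, `simd_worker`, and `time_simd_threads` are made up for illustration. Each thread runs a register-only SSE2 loop, so the timing reflects the execution units rather than memory; without SSE2 it falls back to an equivalent scalar loop.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdint.h>
#include <time.h>

#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2 intrinsics */
#endif

#define MAX_THREADS 64

/* Hypothetical per-thread job: amount of work in, one result word out. */
typedef struct {
    uint64_t iters;
    uint32_t result;
} simd_job;

/* Register-only dependent chain: acc = (acc ^ k) + (acc << 1). */
static void *simd_worker(void *arg)
{
    simd_job *job = (simd_job *)arg;
#if defined(__SSE2__)
    __m128i acc = _mm_set1_epi32(1);
    const __m128i k = _mm_set1_epi32(0x01000193);
    for (uint64_t i = 0; i < job->iters; i++)
        acc = _mm_add_epi32(_mm_xor_si128(acc, k), _mm_slli_epi32(acc, 1));
    job->result = (uint32_t)_mm_cvtsi128_si32(acc);  /* lane 0 */
#else
    uint32_t acc = 1;
    for (uint64_t i = 0; i < job->iters; i++)
        acc = (acc ^ 0x01000193u) + (acc << 1);
    job->result = acc;
#endif
    return NULL;
}

/* Runs nthreads copies of the kernel concurrently.
 * Returns elapsed wall-clock seconds, or -1.0 on error. */
double time_simd_threads(int nthreads, uint64_t iters, uint32_t *results)
{
    pthread_t tid[MAX_THREADS];
    simd_job jobs[MAX_THREADS];
    struct timespec t0, t1;
    int started = 0;

    if (nthreads < 1 || nthreads > MAX_THREADS)
        return -1.0;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < nthreads; i++) {
        jobs[i].iters = iters;
        jobs[i].result = 0;
        if (pthread_create(&tid[i], NULL, simd_worker, &jobs[i]) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    if (started != nthreads)
        return -1.0;
    for (int i = 0; i < nthreads; i++)
        results[i] = jobs[i].result;
    return (double)(t1.tv_sec - t0.tv_sec)
         + (double)(t1.tv_nsec - t0.tv_nsec) * 1e-9;
}
```

Comparing `time_simd_threads(1, N, r)` against `time_simd_threads(2, N, r)` for a large `N` gives a first impression: if the two-thread run takes about as long as the one-thread run, the threads are not contending for a vector unit. The sketch does not pin threads, so the scheduler decides where they run; forcing two threads onto the same core would need an OS-specific affinity call, which is left out here.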
