What's a good multi-core, 64-bit "Hello World" program?

Posted 2024-08-09 08:04:36

I recently got my home PC upgraded to a quad-core CPU and 64-bit OS. I have some former experience with C/C++ and I'm really "itching" to try exercising some 64-bit CPU capabilities. What's a good "Hello World" type program that demonstrates 64-bit multi-core capabilities by doing some simple things that don't work well at all in 32-bit single-core code?

I'm just trying to get a "feel" for how these new CPUs can impact the performance of C/C++ code in extreme cases.

Comments (3)

始终不够爱げ你 2024-08-16 08:04:36

OpenMP would be an easy way to play around with multicore programming in C++. The Wikipedia example doesn't really do anything processor-intensive, but you could replace the `cout` with some independent, long-running function.

OpenMP
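As a rough sketch of that idea (not the Wikipedia example itself), the loop below swaps the print statement for a CPU-bound function. The names `collatz_steps` and `total_collatz_steps` are made up for illustration, and the Collatz iteration is just an assumed stand-in for "some independent, long-running function".

```cpp
#include <cstdint>

// Hypothetical stand-in for an independent, long-running function:
// counts how many Collatz steps it takes to get from n down to 1.
std::uint64_t collatz_steps(std::uint64_t n) {
    std::uint64_t steps = 0;
    while (n != 1) {
        n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
        ++steps;
    }
    return steps;
}

// Sums the step counts for every start value in 1..limit.
// Each iteration is independent, so OpenMP can hand chunks of the
// loop to different cores and combine the partial sums at the end.
std::uint64_t total_collatz_steps(std::int64_t limit) {
    std::uint64_t total = 0;
    #pragma omp parallel for reduction(+ : total) schedule(dynamic, 1024)
    for (std::int64_t n = 1; n <= limit; ++n) {
        total += collatz_steps(static_cast<std::uint64_t>(n));
    }
    return total;
}
```

Calling `total_collatz_steps(10)` from `main` should return 67. With GCC, building with `-fopenmp` enables the pragma; without that flag the pragma is ignored and the same code runs on a single core, which makes for an easy before/after timing comparison on a larger `limit`.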

As far as 64-bit goes, a lot of your performance increase is going to come from a few places.

Increased throughput: because data elements are wider, the processor can process more data in any given clock cycle. Take a look at some of the Microsoft benchmarks for Exchange Server; they have now moved to support 64-bit only because the throughput increases are incredible.

More registers: since the 64-bit architecture has a larger number of registers, most function parameters and the return value can be passed in registers.

In the 32-bit x86 ABI, some calling conventions pass one or maybe two parameters in registers, and the rest have to be pushed onto the stack. With a common calling convention like cdecl, not a single parameter is placed in a register (only the return value comes back in one). Since the stack lives in memory rather than in registers, this can be a big performance hit.
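To get a feel for the wider-data point, one option is a loop built around 64-bit integer multiplication, which a 64-bit build can do in a single instruction while a 32-bit build has to assemble it from several 32-bit operations. The sketch below uses the 64-bit FNV-1a hash purely as a convenient example of such a loop; the helper names `is_64bit_build` and `fnv1a64` are illustrative, not from any particular library.

```cpp
#include <cstdint>
#include <string>

// True when compiled for a target with 8-byte pointers (a 64-bit build).
constexpr bool is_64bit_build() { return sizeof(void*) == 8; }

// 64-bit FNV-1a hash: one XOR and one 64-bit multiply per input byte.
std::uint64_t fnv1a64(const std::string& data) {
    std::uint64_t hash = 0xcbf29ce484222325ULL;  // FNV-1a 64-bit offset basis
    for (unsigned char byte : data) {
        hash ^= byte;
        hash *= 0x100000001b3ULL;                // FNV-1a 64-bit prime
    }
    return hash;
}
```

Hashing a large buffer with `fnv1a64` and timing the same source compiled with `-m32` and `-m64` (assuming your toolchain has both targets installed) is one simple way to see the difference; the hash values themselves are identical in both builds.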

哆兒滾 2024-08-16 08:04:36

You probably want to do something that performs computationally expensive operations on big numbers or large areas of memory in an independent fashion, such as raytracing or protein folding.

The important thing to keep in mind is that 64-bit or multicore processors can't really do anything that single-core processors CANNOT do; essentially, they just do it faster and to bigger numbers.
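A much smaller stand-in for that kind of workload is counting primes, where every number can be tested independently of the others. The following is only a sketch using `std::thread`; `is_prime` and `count_primes_parallel` are hypothetical helpers, and the interleaved split of the work is an arbitrary choice rather than a tuned one.

```cpp
#include <algorithm>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical CPU-bound task: primality test by trial division.
bool is_prime(std::uint64_t n) {
    if (n < 2) return false;
    for (std::uint64_t d = 2; d * d <= n; ++d) {
        if (n % d == 0) return false;
    }
    return true;
}

// Counts the primes in [0, limit). Thread t tests t, t + T, t + 2T, ...
// and writes only to its own slot, so no locking is needed.
std::uint64_t count_primes_parallel(std::uint64_t limit,
                                    unsigned num_threads = 0) {
    if (num_threads == 0) {
        // hardware_concurrency() may return 0 if the count is unknown.
        num_threads = std::max(1u, std::thread::hardware_concurrency());
    }
    std::vector<std::uint64_t> partial(num_threads, 0);
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        workers.emplace_back([&partial, t, num_threads, limit] {
            std::uint64_t count = 0;
            for (std::uint64_t n = t; n < limit; n += num_threads) {
                if (is_prime(n)) ++count;
            }
            partial[t] = count;
        });
    }
    for (auto& worker : workers) worker.join();

    std::uint64_t total = 0;
    for (std::uint64_t count : partial) total += count;
    return total;
}
```

`count_primes_parallel(100, 4)` should return 25. Running it with a large `limit` and `num_threads` set to 1 and then 4 shows the point made above: the answer is the same either way, the extra cores only get you there sooner. With GCC you may need to build with `-pthread`.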

勿忘初心 2024-08-16 08:04:36

Considering how many different parallelism models there are and how they are each adapted to different tasks, there is no satisfactory answer to your question. It all depends on what you really want to do eventually. You should pick the model that's adapted to what you want to do (if it doesn't contradict the previous constraint, try message-passing; it's refreshingly easy compared to others).

I would have to say that Jherico's tongue-in-cheek answer in the comments is right. For such a simple task as "hello world", the best model is no parallelism at all.
