Tracing CPU instruction reordering

Posted 2025-01-05 12:42:53

I have studied a few things about instruction re-ordering by processors and Tomasulo's algorithm.

In an attempt to understand this topic a bit more, I want to know whether there is any way to get a trace of the actual dynamic reordering done for a given program.

I want to give an input program and see the "out of order instruction execution trace" of my program.

I have access to an IBM-P7 machine and an Intel Core2Duo laptop. Also please tell me if there is an easy alternative.

Comments (2)

冷…雨湿花 2025-01-12 12:42:53

You have no access to the actual reordering done inside the CPU (there is no publicly known way to enable tracing). But there are some simulators that model reordering, and some of them can give you useful hints.

For modern Intel CPUs (Core 2, Nehalem, Sandy Bridge and Ivy Bridge) there is the "Intel(R) Architecture Code Analyzer" (IACA) from Intel. Its homepage is http://software.intel.com/en-us/articles/intel-architecture-code-analyzer/

This tool lets you see how a linear fragment of code will be split into micro-operations and how those will be scheduled onto execution ports. The tool has some limitations, and it is only an inexact model of the CPU's u-op reordering and execution.

There are also some "external" tools for simulating x86/x86_64 CPU internals. I can recommend PTLsim (or its derivative MARSSx86):

PTLsim models a modern superscalar out of order x86-64 compatible processor core at a configurable level of detail ranging ... down to RTL level models of all key pipeline structures. In addition, all microcode, the complete cache hierarchy, memory subsystem and supporting hardware devices are modeled with true cycle accuracy.

But PTLsim models its own "PTL" CPU, not a real AMD or Intel CPU. The good news is that this PTL core is out-of-order and based on ideas from real cores:

The basic microarchitecture of this model is a combination of design features from the Intel Pentium 4, AMD K8 and Intel Core 2, but incorporates some ideas from IBM Power4/Power5 and Alpha EV8.

Also, the work at http://es.cs.uni-kl.de/publications/datarsg/Senf11.pdf says that the JavaHASE applet is capable of emulating different simple CPUs and even supports a Tomasulo example.
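
If the goal is only to build intuition for what an "out of order execution trace" looks like, a toy model may be an easy alternative. The sketch below is not a model of any real CPU and is unrelated to IACA or PTLsim. It assumes a made-up five-instruction program, made-up latencies, and an issue width of 2. The names `Instr` and `simulate` are hypothetical. It tracks only true (read-after-write) dependencies, which is roughly the effect of Tomasulo-style renaming, and starts each instruction as soon as its operands are ready.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    name: str      # label printed in the trace
    dest: str      # register written
    srcs: tuple    # registers read
    latency: int   # cycles from start until the result is available (>= 1)


def simulate(program, issue_width=2):
    """Return [(cycle, name), ...] in the order instructions start executing."""
    # Link every source register to the most recent earlier writer.
    # Only these true dependencies constrain the order (renaming is assumed).
    last_writer = {}
    deps = []
    for i, ins in enumerate(program):
        deps.append([last_writer[s] for s in ins.srcs if s in last_writer])
        last_writer[ins.dest] = i

    done_at = {}                        # instruction index -> cycle result is ready
    waiting = list(range(len(program)))
    trace = []
    cycle = 0
    while waiting:
        issued = []
        for i in waiting:               # scan oldest first
            if len(issued) == issue_width:
                break
            if all(d in done_at and done_at[d] <= cycle for d in deps[i]):
                issued.append(i)
        for i in issued:
            done_at[i] = cycle + program[i].latency
            waiting.remove(i)
            trace.append((cycle, program[i].name))
        cycle += 1
    return trace


# Hypothetical program: two independent load/compute chains, then a join.
program = [
    Instr("i0", "r1", (), 3),            # load r1
    Instr("i1", "r2", ("r1",), 1),       # add  r2 <- r1
    Instr("i2", "r3", (), 3),            # load r3
    Instr("i3", "r4", ("r3",), 2),       # mul  r4 <- r3
    Instr("i4", "r5", ("r2", "r4"), 1),  # add  r5 <- r2, r4
]

trace = simulate(program)
for cycle, name in trace:
    print(f"cycle {cycle}: start {name}")
```

In this run `i2` starts in cycle 0 together with `i0`, ahead of the older `i1`, because `i1` has to wait for the load in `i0`. Changing the latencies or `issue_width` changes the trace, which is the kind of experiment the simulators mentioned above let you do at far greater fidelity.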

如梦亦如幻 2025-01-12 12:42:53

Unfortunately, unless you work for one of these companies, the answer is no. Intel/AMD processors don't even schedule the (macro) instructions you give them. They first convert those instructions into micro-operations and then schedule those. What these micro-operations are, and how the whole reordering process works, are closely guarded secrets, so the vendors don't exactly want you to know what is going on.
