"cpuid" before "rdtsc"
Sometimes I encounter code that reads the TSC with the rdtsc instruction, but calls cpuid right before.
Why is calling cpuid necessary? I realize it may have something to do with different cores having different TSC values, but what exactly happens when you call those two instructions in sequence?
It's to prevent out-of-order execution. The following text is from an article entitled "Performance monitoring" by one John Eckerdal; the original link has since disappeared from the web, but the text was fortuitously copied here before it did:
Two reasons:
CPUID is serializing, preventing out-of-order execution of RDTSC.
These days you can safely use LFENCE instead. It's documented as serializing on the instruction stream (but not stores to memory) on Intel CPUs, and now also on AMD after their microcode update for Spectre.
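To make the two options concrete, here is a minimal sketch of a fenced TSC read, once with CPUID and once with LFENCE. It assumes GCC or Clang on x86-64; the function names `tsc_after_cpuid` and `tsc_after_lfence` are made up for illustration and do not come from the answers above.

```c
#include <stdint.h>
#include <x86intrin.h>  /* __rdtsc, _mm_lfence (GCC/Clang, x86-64 assumed) */

/* Older idiom: CPUID is a serializing instruction, so everything before it
 * has finished by the time RDTSC reads the counter. */
static inline uint64_t tsc_after_cpuid(void)
{
    unsigned int eax = 0, ebx, ecx = 0, edx;  /* leaf 0; outputs are ignored */
    __asm__ __volatile__("cpuid"
                         : "+a"(eax), "=b"(ebx), "+c"(ecx), "=d"(edx)
                         :
                         : "memory");
    (void)eax; (void)ebx; (void)ecx; (void)edx;
    return __rdtsc();
}

/* Cheaper alternative: LFENCE does not let later instructions start
 * until the earlier ones have completed. */
static inline uint64_t tsc_after_lfence(void)
{
    _mm_lfence();
    return __rdtsc();
}
```

Both functions return the raw 64-bit counter. CPUID also overwrites EAX, EBX, ECX and EDX and is considerably slower, which is one reason the LFENCE variant is usually preferred where it is available.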
https://hadibrais.wordpress.com/2018/05/14/the-significance-of-the-x86-lfence-instruction/ explains more about LFENCE.
See also https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/ia-32-ia-64-benchmark-code-execution-paper.pdf for a way to use RDTSCP that keeps CPUID (or LFENCE) out of the timed region.
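As a rough sketch of that pattern (not the white paper's own code), the timed region can start with LFENCE + RDTSC and end with RDTSCP + LFENCE, so that no barrier executes inside the measured interval. The names `time_region` and `busy_work` are hypothetical, and GCC or Clang on x86-64 with RDTSCP support is assumed.

```c
#include <stdint.h>
#include <x86intrin.h>  /* __rdtsc, __rdtscp, _mm_lfence */

typedef void (*work_fn)(void);

/* Returns the elapsed TSC ticks (reference cycles, not core clock cycles). */
static uint64_t time_region(work_fn work)
{
    unsigned int aux;                 /* receives IA32_TSC_AUX; unused here */

    _mm_lfence();                     /* don't start until earlier code has executed */
    uint64_t start = __rdtsc();

    work();                           /* code under test */

    uint64_t end = __rdtscp(&aux);    /* waits for the earlier instructions itself */
    _mm_lfence();                     /* keep later code from running ahead of the read */

    return end - start;
}

/* Hypothetical workload, only here so there is something to time. */
static void busy_work(void)
{
    volatile unsigned int sink = 0;
    for (unsigned int i = 0; i < 1000; ++i)
        sink = sink + i;
}
```

A call such as `time_region(busy_work)` yields a tick count for one run. In practice you would repeat the measurement many times and look at the minimum or the distribution rather than a single value.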
See also "Get CPU cycle count?" for more about RDTSC caveats, like constant_tsc and nonstop_tsc.
As a bonus, RDTSCP gives you a core ID. You could use RDTSCP for the start time as well, if you want to check for core migration. But if your CPU has the constant_tsc feature, all cores in the package should have their TSCs synced, so you typically don't need this on modern x86. You could get the core ID from CPUID instead, as @Tony's answer points out.
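A migration check along those lines might look like the sketch below. RDTSCP returns the contents of IA32_TSC_AUX alongside the counter; what the OS stores there is up to the OS. The decoding helpers assume the Linux convention of `(node << 12) | cpu`, which is an assumption about the platform rather than something stated above, and all names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>
#include <x86intrin.h>  /* __rdtscp, _mm_lfence */

/* Assumption: Linux-style encoding of IA32_TSC_AUX, (node << 12) | cpu. */
static inline unsigned int aux_cpu(unsigned int aux)  { return aux & 0xfffu; }
static inline unsigned int aux_node(unsigned int aux) { return aux >> 12; }

struct tsc_sample {
    uint64_t     tsc;  /* timestamp counter value */
    unsigned int aux;  /* IA32_TSC_AUX as read by the same RDTSCP */
};

/* Read the counter and the core tag in one instruction. */
static inline struct tsc_sample sample_tsc(void)
{
    struct tsc_sample s;
    s.tsc = __rdtscp(&s.aux);
    _mm_lfence();      /* keep later code from starting before the read */
    return s;
}

/* True if both samples were taken on the same logical CPU. */
static inline bool same_core(struct tsc_sample a, struct tsc_sample b)
{
    return a.aux == b.aux;
}
```

Taking one sample before and one after the region lets you discard a measurement when `same_core` returns false. Note that this only detects that the start and end landed on different CPUs; a thread that migrated away and back would go unnoticed.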