How much time does it take to fetch one word from memory?

Posted on 2024-09-07 17:08:13, 201 characters, 1 view, 0 comments

Taking Peter Norvig's advice, I am pondering the question:

How much time does it take to fetch one word from memory, with and without a cache miss?

(Assume standard hardware and architecture. To simplify calculations, assume a 1 GHz clock.)

Comments (3)

新人笑 2024-09-14 17:08:22

There is a nice visualization of the data in that table published on github.org. They also have a "human scale" reinterpretation of the time values there.
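
To get a feel for what a "human scale" reading looks like, here is a minimal sketch. It assumes a scale factor of 1 nanosecond to 1 second, which is my own choice and not necessarily the one the visualization uses. The latency values are copied from Norvig's table quoted elsewhere in this thread; the names `LATENCIES_NS` and `human_scale` are hypothetical.

```python
# Assumed scale: 1 nanosecond is stretched to 1 second.
# This factor is an illustration, not taken from the visualization itself.

# A few rows from Norvig's table, in nanoseconds.
LATENCIES_NS = {
    "fetch from L1 cache memory": 0.5,
    "fetch from main memory": 100,
    "fetch from new disk location (seek)": 8_000_000,
    "send packet US to Europe and back": 150_000_000,
}


def human_scale(ns: float) -> str:
    """Render a latency as if 1 ns lasted 1 s, using the largest fitting unit."""
    seconds = ns  # 1 ns -> 1 s, so the numeric value carries over unchanged
    units = (
        ("years", 365 * 86_400),
        ("days", 86_400),
        ("hours", 3_600),
        ("minutes", 60),
    )
    for unit, size in units:
        if seconds >= size:
            return f"{seconds / size:.1f} {unit}"
    return f"{seconds:.1f} seconds"


for name, ns in LATENCIES_NS.items():
    print(f"{name:38s} {human_scale(ns)}")
```

On this assumed scale an L1 hit takes half a second, a main-memory fetch takes well over a minute, and a disk seek takes about three months.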

極樂鬼 2024-09-14 17:08:20

There is a fair summary here, though with some imprecision. When it was written (2+ years ago), it estimated for a mid-range PC of the time: memory access, 60 ns; L1 cache, 10 ns; L2 cache, 20-30 ns (no estimates for L3 cache access times). It all varies a lot, of course, depending on contention and access patterns: cache layers are typically filled "by lines" from slower memory, so if you access address X and then address X+1, the second access may be a bit faster because the cache line fill was already started by the first access.

And, of course, a high-end, well-tuned server will be much faster (relative differences between such machines in memory access latency are typically much larger than those in "raw" CPU speeds).
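
The "by lines" point above can be sketched with a little address arithmetic. This is only a model, not a measurement: it assumes a 64-byte cache line (a common size, but hardware-dependent), and the helper names `cache_line`, `same_line`, and `lines_touched` are hypothetical.

```python
LINE_SIZE = 64  # bytes per cache line; an assumed value, check your hardware


def cache_line(addr: int, line_size: int = LINE_SIZE) -> int:
    """Index of the cache line that the byte at `addr` falls into."""
    return addr // line_size


def same_line(a: int, b: int, line_size: int = LINE_SIZE) -> bool:
    """True if both addresses are served by the same cache line fill."""
    return cache_line(a, line_size) == cache_line(b, line_size)


def lines_touched(addresses, line_size: int = LINE_SIZE) -> int:
    """Number of distinct cache lines needed to serve all the addresses."""
    return len({cache_line(a, line_size) for a in addresses})


base = 0x1000  # a line-aligned starting address, chosen for illustration

# X and X+1 normally share a line, unless X is the last byte of its line.
print(same_line(base, base + 1))        # neighbours inside one line
print(same_line(base + 63, base + 64))  # neighbours across a line boundary

# 1024 eight-byte words read back to back versus one word per line.
sequential = [base + 8 * i for i in range(1024)]
strided = [base + LINE_SIZE * i for i in range(1024)]
print(lines_touched(sequential), lines_touched(strided))
```

Under these assumptions the sequential scan needs 128 line fills for 1024 words, while the strided scan needs 1024, which is why the access pattern matters so much for the effective latency.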

葵雨 2024-09-14 17:08:19

Seems like Norvig answers this himself:

execute typical instruction         1/1,000,000,000 sec = 1 nanosec
fetch from L1 cache memory          0.5 nanosec
branch misprediction                5 nanosec
fetch from L2 cache memory          7 nanosec
Mutex lock/unlock                   25 nanosec
fetch from main memory              100 nanosec
send 2K bytes over 1Gbps network    20,000 nanosec
read 1MB sequentially from memory   250,000 nanosec
fetch from new disk location (seek) 8,000,000 nanosec
read 1MB sequentially from disk     20,000,000 nanosec
send packet US to Europe and back   150 milliseconds = 150,000,000 nanosec 

The part where it says "execute typical instruction" = 1 ns implies a 1 GHz CPU (assuming efficient pipelining, of course).

I don't know where he gets this information from, but I trust Peter Norvig to be reliable :-)
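
Since the question assumes a 1 GHz clock, the table's nanosecond figures translate directly into clock cycles. Here is a minimal sketch of that conversion; the function name `ns_to_cycles` is hypothetical, and the 3 GHz case is only an illustration of how the count scales if the latency in nanoseconds stays the same.

```python
def ns_to_cycles(ns: float, clock_ghz: float = 1.0) -> float:
    """Clock cycles that elapse during `ns` nanoseconds at `clock_ghz` GHz.

    A 1 GHz clock ticks once per nanosecond, so cycles = ns * GHz.
    """
    return ns * clock_ghz


# Figures from Norvig's table above, in nanoseconds.
L1_HIT_NS = 0.5
L2_HIT_NS = 7
MAIN_MEMORY_NS = 100

for label, ns in (("L1 hit", L1_HIT_NS),
                  ("L2 hit", L2_HIT_NS),
                  ("main memory (miss in both caches)", MAIN_MEMORY_NS)):
    print(f"{label:36s} {ns_to_cycles(ns):6.1f} cycles at 1 GHz")

# The same 100 ns latency costs more cycles on a faster clock.
print(ns_to_cycles(MAIN_MEMORY_NS, clock_ghz=3.0))

# Going to main memory is this many times slower than an L1 hit.
print(MAIN_MEMORY_NS / L1_HIT_NS)
```

By the table's numbers, then, a word that hits in L1 costs about half a cycle at 1 GHz, while one that misses all the way to main memory costs about 100 cycles, a factor of roughly 200.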
