Prefetch instructions
It appears the general logic for prefetch usage is that a prefetch can be added provided the code stays busy with processing until the prefetch instruction completes its operation. However, it seems that if too many prefetch instructions are used, they can hurt the performance of the system. I find that we first need working code without any prefetch instructions. Later we need to try various combinations of prefetch instructions in various locations of the code and analyse the results to determine which locations actually benefit from the prefetch. Is there a better way to determine the exact locations in which prefetch instructions should be used?
Comments (3)
In the majority of cases prefetch instructions are of little or no benefit, and can even be counter-productive in some cases. Most modern CPUs have an automatic prefetch mechanism which works well enough that adding software prefetch hints achieves little, or even interferes with automatic prefetch, and can actually reduce performance.
In some rare cases, such as when you are streaming large blocks of data on which you are doing very little actual processing, you may manage to hide some latency with software-initiated prefetching, but it's very hard to get it right - you need to start the prefetch several hundred cycles before you are going to be using the data - do it too late and you still get a cache miss, do it too early and your data may get evicted from cache before you are ready to use it. Often this will put the prefetch in some unrelated part of the code, which is bad for modularity and software maintenance. Worse still, if your architecture changes (new CPU, different clock speed, etc), such that DRAM access latency increases or decreases, you may need to move your prefetch instructions to another part of the code to keep them effective.
Anyway, if you feel you really must use prefetch, I recommend #ifdefs around any prefetch instructions so that you can compile your code with and without prefetch and see if it is actually helping (or hindering) performance, e.g.
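As a minimal sketch of that idea (assuming GCC/Clang's __builtin_prefetch and a made-up USE_PREFETCH build flag, neither of which is prescribed by the answer itself), the prefetch can be hidden behind a macro that compiles to nothing when the flag is absent:

    #include <stddef.h>

    /* Hypothetical switch: build once with -DUSE_PREFETCH and once without,
       then compare the two binaries under a profiler. */
    #ifdef USE_PREFETCH
    #  define MAYBE_PREFETCH(addr) __builtin_prefetch(addr)  /* GCC/Clang builtin */
    #else
    #  define MAYBE_PREFETCH(addr) ((void)0)                 /* compiles away */
    #endif

    void process_chunk(const double *chunk, const double *next_chunk, size_t n)
    {
        MAYBE_PREFETCH(next_chunk);      /* hint issued while we are still busy below */
        for (size_t i = 0; i < n; i++) {
            /* ... real work on chunk[i] ... */
        }
    }

Building both variants and comparing the profiler output is the point; the macro itself is just scaffolding.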
In general though, I would recommend leaving software prefetch on the back burner as a last resort micro-optimisation after you've done all the more productive and obvious stuff.
To even consider prefetching, code performance must already be an issue.
1: Use a code profiler. Trying to use prefetch without a profiler is a waste of time.
2: Whenever you find an instruction in a critical place that is anomalously slow, you have a candidate for a prefetch. Often the actual problem is a memory access on a line before the one the profiler flags as slow, rather than the flagged instruction itself. Work out which memory access is causing the problem (not always easy) and prefetch it.
3: Run your profiler again and see if it made any difference. If it didn't, take the prefetch out.
On occasion I have sped up loops by >300% this way. It's generally most effective if you have a loop accessing memory in a non-sequential way.
I disagree completely about it being less useful on modern CPUs; I have found completely the opposite. On older CPUs, prefetching about 100 instructions ahead was optimal; these days I'd put that number closer to 500.
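As a rough illustration of the kind of non-sequential loop this tends to work on (gather_sum, table, keys and PREFETCH_DISTANCE are invented for the example; the distance is exactly the knob you re-profile after changing):

    #include <stddef.h>
    #include <stdint.h>

    /* Non-sequential (gather) access: the hardware prefetcher cannot predict
       table[keys[i]], so a software hint a few iterations ahead can help.
       PREFETCH_DISTANCE is the value to tune with the profiler. */
    #define PREFETCH_DISTANCE 8

    uint64_t gather_sum(const uint64_t *table, const uint32_t *keys, size_t n)
    {
        uint64_t sum = 0;
        for (size_t i = 0; i < n; i++) {
            if (i + PREFETCH_DISTANCE < n)
                __builtin_prefetch(&table[keys[i + PREFETCH_DISTANCE]], 0, 1);
            sum += table[keys[i]];
        }
        return sum;
    }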
Sure, you have to experiment a bit, but note that you need to issue the fetch some hundred cycles (100-300) before the data is needed. The L2 cache is big enough that the prefetched data can stay there for a while.
This kind of prefetching is very effective in front of a loop (a few hundred cycles ahead, of course), especially if it is the inner loop and the loop is started thousands of times or more per second.
Also, for your own fast linked-list implementation or a tree implementation, prefetching can gain a measurable advantage, because the CPU doesn't yet know that the data will be needed soon.
But remember that prefetch instructions eat some decoder/queue bandwidth, so overusing them hurts performance for that reason.
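A small sketch of the linked-list case mentioned above (the node layout and function are illustrative, and again assume GCC/Clang's __builtin_prefetch):

    #include <stddef.h>

    struct node {
        struct node *next;
        long value;
        char payload[120];    /* illustrative: a node larger than one cache line */
    };

    /* The hardware prefetcher cannot guess where 'next' points, so we hint at
       the following node while the current one is still being processed. */
    long sum_list(const struct node *n)
    {
        long sum = 0;
        while (n) {
            if (n->next)
                __builtin_prefetch(n->next);
            sum += n->value;
            /* ... more work on n->payload here would give the prefetch time to land ... */
            n = n->next;
        }
        return sum;
    }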