Loop unrolling and data cache performance
Does loop unrolling affect data cache performance in any way?
This is related to a homework assignment that requires me to simulate code on SimpleScalar's sim-cache to test the effect of loop tiling, memory access in the inner loop, etc. on cache accesses and cache miss rate. The assignment specifically asks us to do loop unrolling, but I do not understand how it can affect the data cache.
Comments (1)
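In general, loop unrolling will not affect the L1 data cache, only the instruction cache, since the two are separate in most architectures. Unrolling enlarges the loop body's code but leaves the sequence of data addresses it loads and stores essentially unchanged.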
However, in a multi-level cache hierarchy, the L2 cache in most architectures is unified: it backs both the L1 instruction cache and the L1 data cache. So if you unroll far too aggressively, the larger code footprint can compete with data for L2 space, effectively decreasing the performance of L2 as a data cache.
Here is a diagram of the Core i7 (Nehalem) architecture, which has separate L1 instruction and data caches but a single L2 cache shared by both:
http://upload.wikimedia.org/wikipedia/commons/6/64/Intel_Nehalem_arch.svg