How to put data into the L2 cache on a Cortex-A72 core?

Posted on 2025-01-28 17:14:17


I have an array of data that looks like this :

uint32_t data[128]; //Could be more than L1D Cache size

In order to do computation on it, I want to put the data as close as possible to my computing unit, so in the L2 cache.

My target runs a Linux kernel and some additional apps.

I know that I can get access to a certain area of memory with mmap, and I have successfully done it in some part of my available memory shared between cores.

How to do the same thing, but in the L2 cache area?

I've read part of the GCC documentation and the AArch64 instruction set, but cannot figure out a way to achieve this.


Comments (2)

空城旧梦 2025-02-04 17:14:17


How to do the same thing, but in the L2 cache area?

Your hardware doesn't support that.

In general, the ARMv8 architecture doesn't make any guarantees about the contents of caches and does not provide any means to explicitly manipulate or query them - it only makes guarantees and provides tools for dealing with coherency.

Specifically, from section D4.4.1 "General behavior of the caches" of the spec:

[...] the architecture cannot guarantee whether:

• A memory location present in the cache remains in the cache.
• A memory location not present in the cache is brought into the cache.

Instead, the following principles apply to the behavior of caches:

• The architecture has a concept of an entry locked down in the cache.
  How lockdown is achieved is IMPLEMENTATION DEFINED, and lockdown might
  not be supported by:

  — A particular implementation.
  — Some memory attributes.

• An unlocked entry in a cache might not remain in that cache. The
  architecture does not guarantee that an unlocked cache entry remains in
  the cache or remains incoherent with the rest of memory. Software must
  not assume that an unlocked item that remains in the cache remains dirty.

• A locked entry in a cache is guaranteed to remain in that cache. The
  architecture does not guarantee that a locked cache entry remains
  incoherent with the rest of memory, that is, it might not remain dirty.

[...]

• Any memory location is not guaranteed to remain incoherent with the rest of memory.

So basically you want cache lockdown. Consulting the manual of your CPU though:

• The Cortex-A72 processor does not support TLB or cache lockdown.

So you can't put something in cache on purpose. Now, you might be able to tell whether something has been cached by trying to observe side effects. The two common side effects of caches are latency and coherency. So you could try and time access times or modify the contents of DRAM and check whether you see that change in your cached mapping... but that's still a terrible idea.
For one, both of these are destructive operations, meaning they will change the property you're measuring, by measuring it. And for another, just because you observe them once does not mean you can rely on that happening.

Bottom line: you cannot guarantee that something is held in any particular cache by the time you use it.

美胚控场 2025-02-04 17:14:17

缓存 - 不是应该存储数据的地方,只是...缓存吗? :)

我的意思是,您的处理器确定它应该缓存的数据以及何处(L1/L2/L3)和逻辑取决于CPU实现。

如果您愿意,您可以尝试通过使用专用说明来预获取数据,然后使用非临床说明来维护您的数据,然后使用专用说明来查找缓存和替换数据的算法并播放此算法(当然没有保证)其他程序。

也许对于现代手臂,我从x86/x64的角度讲话了,但我的全部观点是“您真的确定需要这个吗?”?
CPU足够聪明,可以缓存所需的数据,并且它们逐年做得更好。

我建议您使用任何可能向您显示缓存失误的探测器,以确保比缓存中没有显示数据。
如果没有,那么优化的第一件事就是算法。尝试弄清为什么会有缓存失误 - 也许您应该使用临时变量加载更少的循环数据,甚至可以手动移动循环以控制访问的位置以及所访问的内容。

Cache is not a place where data should be stored, it's just... a cache? :)

I mean, your processor decides which data it should cache and where (L1/L2/L3), and that logic depends on the CPU implementation.

If you wanted to, you could try to figure out the algorithm for placing and replacing data in the cache and play with it (without guarantees, of course): use dedicated instructions to prefetch your data, then keep it cached by using non-temporal (non-caching) instructions for the rest of your program's accesses.
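To illustrate the prefetch idea, here is a hedged sketch using GCC's `__builtin_prefetch`. The locality argument (0–3) is only a hint; which cache level, if any, the line ends up in is implementation-defined (on AArch64 the compiler typically lowers this builtin to a PRFM instruction):

```c
/* Hedged sketch: software-prefetch ahead of the read pointer while
 * summing an array. The distance of 16 elements is an arbitrary
 * illustrative choice, not a tuned value. */
#include <stdint.h>
#include <stddef.h>

#define N 128

static uint64_t sum_with_prefetch(const uint32_t *p)
{
    uint64_t s = 0;
    for (size_t i = 0; i < N; i++) {
        if (i + 16 < N)
            /* args: address, 0 = prefetch for read, 2 = moderate
             * temporal locality (only a hint to the hardware) */
            __builtin_prefetch(&p[i + 16], 0, 2);
        s += p[i];
    }
    return s;
}
```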

Maybe for modern ARM there are easier ways (I'm speaking from an x86/x64 perspective), but my whole point is: "are you really sure that you need this?"
CPUs are smart enough to cache the data they need, and they get better and better at it year by year.

I'd recommend using any profiler that can show you cache misses, to make sure your data isn't already present in the cache.
If it isn't, the first thing to optimize is the algorithm. Try to figure out why there was a cache miss: maybe you should load less data per loop iteration by using temp variables, for example, or even unroll the loop manually to control where and what is being accessed.
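The manual-unrolling suggestion can be sketched like this. Whether it helps is workload-dependent, so profile before and after:

```c
/* Hedged sketch of manual unrolling: four independent accumulators
 * reduce loop overhead and make the access pattern explicit. */
#include <stdint.h>
#include <stddef.h>

static uint64_t sum_unrolled(const uint32_t *p, size_t n)
{
    uint64_t s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += p[i];
        s1 += p[i + 1];
        s2 += p[i + 2];
        s3 += p[i + 3];
    }
    for (; i < n; i++)  /* remainder when n is not a multiple of 4 */
        s0 += p[i];
    return s0 + s1 + s2 + s3;
}
```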
