Virtual memory beyond page tables

Posted on 2024-10-26 02:44:54

I am working on a research project to develop an OS for a many-core (1000+) chip. We are looking into implementing a virtual-memory-style system for memory permissions (read/write/execute) that would allow memory to be safely shared across cores.

Basically, we want a system that lets us mark a 'page' as readable by some subset of cores, writable by another subset, and so on. We are not going to do address translation (at least at this point), but we need a way to set and query permissions efficiently. It is going to be a software-filled data structure with a simple TLB-style cache.
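
To make the requirement concrete, here is a minimal sketch of the interface we have in mind, written in Python purely for illustration. Every name in it (`PermissionTable`, `CoreCache`, `grant`, `query`, `check`, `invalidate`) is hypothetical, and the per-page core bitmask is only one possible layout, assumed here for the example and not a design we have settled on.

```python
R, W, X = 0b001, 0b010, 0b100  # permission bits


class PermissionTable:
    """Shared table: page number -> one core bitmask per permission bit."""

    def __init__(self):
        self._masks = {}  # page -> {R: mask, W: mask, X: mask}

    def grant(self, page, cores, perms):
        """Give every core in `cores` the permissions in `perms` on `page`."""
        entry = self._masks.setdefault(page, {R: 0, W: 0, X: 0})
        core_mask = sum(1 << c for c in set(cores))
        for bit in (R, W, X):
            if perms & bit:
                entry[bit] |= core_mask

    def query(self, page, core):
        """Return the R/W/X bits that `core` holds on `page` (0 if none)."""
        entry = self._masks.get(page)
        if entry is None:
            return 0
        return sum(bit for bit, mask in entry.items() if (mask >> core) & 1)


class CoreCache:
    """Direct-mapped, software-filled cache of one core's permissions."""

    def __init__(self, core, table, slots=64):
        self.core, self.table = core, table
        self.slots = [None] * slots  # each slot holds (page, perms) or None
        self.misses = 0

    def check(self, page, wanted):
        """True if this core holds every permission bit in `wanted`."""
        idx = page % len(self.slots)
        slot = self.slots[idx]
        if slot is None or slot[0] != page:
            self.misses += 1
            slot = (page, self.table.query(page, self.core))  # software fill
            self.slots[idx] = slot
        return (slot[1] & wanted) == wanted

    def invalidate(self, page):
        """Drop a cached entry after the shared table changed for `page`."""
        idx = page % len(self.slots)
        if self.slots[idx] is not None and self.slots[idx][0] == page:
            self.slots[idx] = None


table = PermissionTable()
table.grant(page=7, cores=range(0, 4), perms=R)  # cores 0-3 may read page 7
table.grant(page=7, cores=[2], perms=W)          # core 2 may also write it
cache = CoreCache(core=2, table=table)
print(cache.check(7, R | W))
```

Note that any change to the shared table has to be followed by `invalidate` on every core that might have cached the page, which is one of the costs we are worried about at 1000+ cores.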

Our intuition is that simply replicating page tables for each core will be too expensive (in terms of memory usage).
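
A back-of-the-envelope sketch of that intuition follows. The numbers are assumptions chosen only for illustration (64 GiB of memory, 4 KiB pages, 3 permission bits per page, and a flat table per core with no sharing or compression), and `replicated_table_bytes` is a hypothetical helper, not part of any existing system.

```python
GIB = 1 << 30


def replicated_table_bytes(num_cores, mem_bytes, page_bytes=4096, bits_per_entry=3):
    """Bytes needed if every core keeps its own flat permission table."""
    pages = mem_bytes // page_bytes
    bits = num_cores * pages * bits_per_entry
    return (bits + 7) // 8  # round up to whole bytes


for cores in (12, 128, 1024):
    total = replicated_table_bytes(cores, 64 * GIB)
    print(f"{cores:5d} cores: {total / GIB:.3f} GiB of permission bits")
```

Under these assumed numbers the replicated tables take 72 MiB for 12 cores but 6 GiB for 1024 cores, since the cost grows linearly with the core count.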

What data structures would be efficient for this kind of problem?

Thanks.

Comments (1)

可爱咩 2024-11-02 02:44:54

Have you looked at how common multi-core (2-12 core) CPUs address this problem?

Do you know where, when, why, and how the solution used in these common multi-core CPUs fails to scale to 1,000+ cores?

In other words, can you quantify what is wrong with the existing solution, which works, and has long worked, on common CPUs with 12 or fewer cores?

If you know that, then the answer is closer than you think, because it just requires understanding how AMD/Intel solved the problem on a smaller scale and what is needed to make their solution work on a larger one (maybe more memory for tables, algorithm tweaks, etc.).

Look at AMD's/Intel's data structures, then build a software simulator for 1,000+ cores with those data structures, and see where, when, why, and how your simulation fails, if it fails at all.

Ideally, build your simulator with a user-selectable number of cores, then TEST, TEST, TEST with different numbers of cores, working your way up and noting bottlenecks along the way.
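
A minimal sketch of what such a harness could look like is below. It is a toy model under stated assumptions: each core owns a private dict from page number to permission bits, which stands in for "one table per core" and is not AMD's or Intel's actual page-table format. The names `simulate_replicated` and `sweep` are made up for this example.

```python
def simulate_replicated(num_cores, num_pages=256):
    """Toy model: every core owns a private page -> permission-bits table.

    Marks each page readable by all cores and counts the work involved.
    """
    tables = [dict() for _ in range(num_cores)]
    updates = 0
    for page in range(num_pages):
        # Sharing one page with every core costs one write per core's table.
        for table in tables:
            table[page] = 0b001  # read-only
            updates += 1
    entries = sum(len(table) for table in tables)
    return {"cores": num_cores, "entries": entries, "updates": updates}


def sweep(core_counts, num_pages=256):
    """Run the same workload at each core count and collect the metrics."""
    return [simulate_replicated(n, num_pages) for n in core_counts]


for result in sweep([2, 12, 128, 1024]):
    print(result)
```

In this toy model both the stored entries and the updates per shared page grow linearly with the core count. A real simulator would swap the dicts for the vendor's actual structures and add the metrics you care about, such as cache misses and invalidation traffic.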

At the same core count as one of their chips, your simulator should work EXACTLY as well as AMD's hardware (if you're using AMD data structures) or Intel's (if you're using Intel data structures). AMD and Intel are doing what they do correctly, so matching their behavior helps prove that your simulation program is doing its simulation correctly at that specific number of cores.

Wishing you luck!
