realloc() for NUMA systems using HWLOC

Posted on 2024-11-27 13:01:21

I have several custom allocators that provide different means of allocating memory based on different policies. One of them allocates memory on a defined NUMA node. The interface to the allocator is straightforward:

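#include <cstddef>

template<typename config>
class NumaNodeStrategy
{
public:
    // Allocate sz bytes on the configured NUMA node.
    static void *allocate(std::size_t sz);

    // Resize a block of old_sz bytes to sz bytes.
    static void *reallocate(void *old, std::size_t sz, std::size_t old_sz);

    // Release a block of sz bytes.
    static void deallocate(void *p, std::size_t sz);
};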

The allocation itself is handled by hwloc_alloc_membind_nodeset(), with the appropriate parameters set for the allocation policy etc. However, hwloc only provides functions for allocating and freeing memory, and I was wondering how I should implement reallocate().

Two possible solutions:

  1. Allocate new memory area and memcpy() the data
  2. Use hwloc_set_membind_nodeset() to set the memory allocation / binding policy for the nodeset and use plain malloc() / posix_memalign() and realloc().
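As a rough illustration of option 1, here is a minimal sketch of an allocate-copy-free reallocate(). The names MallocBackend and reallocate_by_copy are made up for this example. In the setting described above, alloc() would presumably wrap hwloc_alloc_membind_nodeset() and release() would wrap hwloc_free(); malloc/free stand in here only so the sketch builds without hwloc.

```cpp
#include <algorithm>  // std::min
#include <cassert>
#include <cstddef>
#include <cstdlib>    // std::malloc, std::free (stand-ins only)
#include <cstring>    // std::memcpy

// Hypothetical backend: replace the bodies with the NUMA-aware allocation
// and free calls of your choice. malloc/free are placeholders.
struct MallocBackend {
    static void *alloc(std::size_t sz) { return std::malloc(sz); }
    static void release(void *p, std::size_t /*sz*/) { std::free(p); }
};

// Option 1: allocate a new block, copy the payload, free the old block.
template <typename Backend>
void *reallocate_by_copy(void *old, std::size_t sz, std::size_t old_sz)
{
    assert(old != nullptr || old_sz == 0);
    if (sz == 0) {                        // resizing to nothing: just free
        if (old) Backend::release(old, old_sz);
        return nullptr;
    }
    void *fresh = Backend::alloc(sz);
    if (!fresh) return nullptr;           // on failure the old block stays valid
    if (old) {
        std::memcpy(fresh, old, std::min(sz, old_sz));
        Backend::release(old, old_sz);
    }
    return fresh;
}
```

This always copies, so it does not answer the "without moving pages" part, but it keeps the new block under whatever policy the backend applies.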

Can anyone help me in getting this right?

Update:

To make the question more specific: is there a way to perform a realloc() using hwloc without allocating new memory and moving the pages around?

Comments (3)

若水般的淡然安静女子 2024-12-04 13:01:22

You're wrong. mbind can move pages that have been touched. You just need to add MPOL_MF_MOVE. That's what hwloc_set_area_membind_nodeset() does if you add the flag HWLOC_MEMBIND_MIGRATE.

move_pages is just a different way to do it (more flexible, but a bit slower, because you can move independent pages to different places). Both mbind with MPOL_MF_MOVE and move_pages (and migrate_pages) end up using the exact same migrate_pages() function in mm/migrate.c once they have converted their input into a list of pages.
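For illustration, a minimal Linux-only sketch of the mbind + MPOL_MF_MOVE path, calling the raw syscall so it builds without libnuma or hwloc. The helper names bind_and_migrate and touch_then_rebind are invented for this example, and the single-word node mask is a simplifying assumption. The call can fail with EPERM or ENOSYS in containers or on kernels without NUMA support, so the result is returned rather than asserted.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <cstring>            // std::memset
#include <linux/mempolicy.h>  // MPOL_BIND, MPOL_MF_MOVE
#include <sys/mman.h>         // mmap, munmap
#include <sys/syscall.h>      // SYS_mbind
#include <unistd.h>           // syscall, sysconf

// Bind [addr, addr + len) to one NUMA node and ask the kernel to migrate
// pages that were already touched. Returns 0 on success, else errno.
// Assumption: a one-word node mask, so only low node numbers are supported.
int bind_and_migrate(void *addr, std::size_t len, unsigned node)
{
    const unsigned long bits = 8 * sizeof(unsigned long);
    const auto page = static_cast<std::uintptr_t>(sysconf(_SC_PAGESIZE));
    assert(reinterpret_cast<std::uintptr_t>(addr) % page == 0);  // mbind needs page alignment
    if (node + 1 >= bits) return EINVAL;  // conservative bound for the mask
    unsigned long nodemask = 1UL << node;
    long rc = syscall(SYS_mbind, addr, len,
                      static_cast<unsigned long>(MPOL_BIND), &nodemask, bits,
                      static_cast<unsigned long>(MPOL_MF_MOVE));
    return rc == 0 ? 0 : errno;
}

// Demo: touch one page first, then rebind it. Returns true if the contents
// survived; *bind_result receives 0 or the errno from mbind.
bool touch_then_rebind(unsigned node, int *bind_result)
{
    const std::size_t len = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    void *p = mmap(nullptr, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return false;
    std::memset(p, 0x5a, len);            // first touch: the page now exists
    *bind_result = bind_and_migrate(p, len, node);
    const unsigned char *bytes = static_cast<const unsigned char *>(p);
    const bool intact = bytes[0] == 0x5a && bytes[len - 1] == 0x5a;
    munmap(p, len);
    return intact;
}
```

With hwloc, the corresponding call would be hwloc_set_area_membind_nodeset() with HWLOC_MEMBIND_MIGRATE, as described above.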

睡美人的小仙女 2024-12-04 13:01:21

To reply to the edit:
There's no realloc in hwloc, and we currently have no plan to add one. If you know precisely what you want (the C prototype of the function), feel free to add a ticket at https://svn.open-mpi.org/trac/hwloc

To reply to ogsx: The memory binding isn't specific, it's virtual memory area specific, and possibly thread-specific. If you realloc, the libc doesn't do anything special.
1) If it can realloc within the same page, you get memory on the same node. Good, but rare, especially for large buffers.
2) If it reallocs into a different page (most cases for large buffers), it depends on whether the corresponding pages were already allocated in physical memory by the malloc library in the past (malloc'ed and freed in virtual memory, but still allocated in physical memory).
2.a) If the virtual page has already been allocated, it may have been placed on another node for various reasons in the past, and you're screwed.
2.b) If the new virtual page has not been allocated yet, the default is to allocate it on the current node. If you specified a binding with set_area_membind() or mbind() earlier, it will be allocated on the right node. You may be happy in this case.

In short, it depends on a lot of things. If you don't want to bother with the malloc lib doing complex/hidden internal things, and especially if your buffers are large, doing mmap(MAP_ANONYMOUS) instead of malloc is a simple way to be sure that pages are allocated when you really want them. And you even have mremap to do something similar to realloc.

alloc becomes mmap(length) + set_area_membind
realloc becomes mremap + set_area_membind (on the entire mremap'ed buffer)
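The recipe above could be sketched roughly as follows (Linux-only, since mremap is Linux-specific). The names numa_mmap_alloc, numa_mmap_realloc, numa_mmap_free and the BindFn hook are hypothetical. The hook is where a call to hwloc_set_area_membind_nodeset() or mbind() would go; it is left pluggable so the sketch builds without hwloc.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>  // mmap, mremap, munmap

// Hypothetical hook: apply the NUMA binding to [addr, addr + len).
// Returns true on success. Pass nullptr to skip binding.
using BindFn = bool (*)(void *addr, std::size_t len);

// alloc = mmap(length) + set_area_membind
void *numa_mmap_alloc(std::size_t len, BindFn bind)
{
    assert(len > 0);
    void *p = mmap(nullptr, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return nullptr;
    if (bind && !bind(p, len)) {          // binding failed: give the range back
        munmap(p, len);
        return nullptr;
    }
    return p;
}

// realloc = mremap + set_area_membind on the entire remapped buffer.
// On mremap failure, nullptr is returned and the old mapping stays valid.
// If only the binding fails, the resized buffer is still returned and
// *bound (if given) is set to false so the caller can decide what to do.
void *numa_mmap_realloc(void *old, std::size_t old_len, std::size_t new_len,
                        BindFn bind, bool *bound = nullptr)
{
    assert(old != nullptr && old_len > 0 && new_len > 0);
    void *p = mremap(old, old_len, new_len, MREMAP_MAYMOVE);
    if (p == MAP_FAILED) return nullptr;
    const bool ok = !bind || bind(p, new_len);
    if (bound) *bound = ok;
    return p;
}

void numa_mmap_free(void *p, std::size_t len)
{
    if (p) munmap(p, len);
}
```

With MREMAP_MAYMOVE the kernel may relocate the mapping in virtual memory, but it does so by remapping rather than copying the data with memcpy(). Whether the existing pages stay on their node is not something this sketch verifies.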

I have never used that, but it looks interesting.

长伴 2024-12-04 13:01:21

The hwloc_set_area_membind_nodeset does the trick, doesn't it?

HWLOC_DECLSPEC int
hwloc_set_area_membind_nodeset(hwloc_topology_t topology,
    const void *addr, size_t len, hwloc_const_nodeset_t nodeset,
    hwloc_membind_policy_t policy, int flags);

Bind the already-allocated memory identified by (addr, len) to the NUMA node(s) in nodeset.

Returns:

  • -1 with errno set to ENOSYS if the action is not supported
  • -1 with errno set to EXDEV if the binding cannot be enforced

On Linux, this call is implemented via mbind. By default it works only if the pages in the area have not been touched yet, so it is simply the more correct way to bind a memory region in your second solution. UPDATE: there are MPOL_MF_MOVE* flags to move data that has already been touched.

The only syscall I know of that moves pages without reallocate-and-copy is move_pages.

move_pages moves a set of pages in the address space of a running process to a different NUMA node.
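A minimal sketch of calling move_pages for a single page of the current process, again through the raw syscall so that libnuma is not required. The wrapper name move_one_page and its return convention are choices made for this example. In containers the syscall is often blocked, in which case a negative errno comes back.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <linux/mempolicy.h>  // MPOL_MF_MOVE
#include <sys/syscall.h>      // SYS_move_pages
#include <unistd.h>           // syscall, sysconf

// Ask the kernel to migrate the page containing addr (in the calling
// process) to target_node. Returns the per-page status reported by the
// kernel (the node number on success, a negative errno value otherwise),
// or -errno if the syscall itself failed.
int move_one_page(void *addr, int target_node)
{
    assert(addr != nullptr);
    const auto page = static_cast<std::uintptr_t>(sysconf(_SC_PAGESIZE));
    // Round down to the start of the page, as move_pages expects.
    void *pages[1] = {reinterpret_cast<void *>(
        reinterpret_cast<std::uintptr_t>(addr) & ~(page - 1))};
    int nodes[1] = {target_node};
    int status[1] = {-EINVAL};
    long rc = syscall(SYS_move_pages, 0L /* pid 0: calling process */, 1UL,
                      pages, nodes, status, static_cast<long>(MPOL_MF_MOVE));
    if (rc < 0) return -errno;
    return status[0];
}
```

Because each page is named individually, a caller can spread the pages of one buffer across several nodes, which is the extra flexibility mentioned in the other answer.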
