How to create a writable shared array in Dask

Posted 2025-01-18 07:10:13

I'm new to Dask.
What I'm trying to find is an array shared between processes that can be written to by any process.
Could someone show me how to do that?

Comments (1)

烟酒忠诚 2025-01-25 07:10:13

Dask's internal abstraction is a DAG, a functional graph in which it is assumed that tasks act the same should you rerun them ("functionally pure"), since it's always possible that a task runs in two places, or that a worker which holds a task's output dies.

Dask therefore does not normally support mutable data structures as task inputs or outputs. You can, however, run tasks that mutate state as a side effect, such as any function that writes to disk.
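As a rough illustration of a side-effecting task that tolerates being rerun, here is a minimal sketch. The names `write_chunk` and `write_all`, the file naming scheme, and the assumption that `out_dir` is a directory every worker can reach are all hypothetical; `client` is assumed to be a `dask.distributed.Client`.

```python
import os
import tempfile


def write_chunk(out_dir, chunk_id, values):
    """Side-effecting task: write one chunk to its own file.

    Each chunk has a fixed target path, so a rerun overwrites the same
    file with the same content instead of appending or duplicating.
    """
    final_path = os.path.join(out_dir, f"chunk-{chunk_id}.txt")
    # Write to a temporary file first, then rename it into place,
    # so readers never see a half-written chunk.
    fd, tmp_path = tempfile.mkstemp(dir=out_dir)
    with os.fdopen(fd, "w") as f:
        f.write("\n".join(str(v) for v in values))
    os.replace(tmp_path, final_path)  # atomic on POSIX within one filesystem
    return final_path


def write_all(client, out_dir, chunks):
    """Submit one write task per chunk (assumes a dask.distributed.Client)."""
    futures = [
        # pure=False tells Dask not to deduplicate calls by their arguments,
        # which matters when the point of the task is its side effect.
        client.submit(write_chunk, out_dir, i, chunk, pure=False)
        for i, chunk in enumerate(chunks)
    ]
    return client.gather(futures)
```

Because every task owns exactly one output path, running a task twice leaves the directory in the same state as running it once, which is the property the rerun caveat asks for.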

If you are prepared to set up your own shared memory and pass around handles to it, there is nothing stopping you from writing functions that mutate that memory. The caveats around tasks running multiple times still hold, and you would be on your own. There is currently no mechanism that does this kind of thing for you, but it is something I personally intend to investigate within the next few months.
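One way to read "set up your own shared memory and pass around handles" is sketched below using the standard library's `multiprocessing.shared_memory`, where the handle is simply the segment name. This is not a Dask feature. The helper names (`create_shared`, `write_slot`, `read_all`, `fill_with_dask`) are hypothetical, `client` is assumed to be a `dask.distributed.Client`, and all workers are assumed to run on the same machine as the process that created the segment.

```python
from multiprocessing import shared_memory

ITEM_SIZE = 8  # bytes per signed 64-bit slot (memoryview format "q")


def create_shared(n_slots):
    """Allocate a zeroed segment of n_slots int64 values; the caller owns it."""
    shm = shared_memory.SharedMemory(create=True, size=n_slots * ITEM_SIZE)
    shm.buf[: n_slots * ITEM_SIZE] = bytes(n_slots * ITEM_SIZE)
    return shm


def write_slot(shm_name, index, value):
    """Task body: attach to the segment by name, write one slot, detach."""
    shm = shared_memory.SharedMemory(name=shm_name)
    view = shm.buf.cast("q")  # reinterpret the raw bytes as int64 slots
    try:
        view[index] = value
    finally:
        view.release()  # views must be released before close()
        shm.close()
    return index


def read_all(shm, n_slots):
    """Read the first n_slots values from a segment the caller has open."""
    view = shm.buf.cast("q")
    try:
        return [view[i] for i in range(n_slots)]
    finally:
        view.release()


def fill_with_dask(client, shm_name, n_slots):
    """Submit one writer task per slot (assumes a dask.distributed.Client)."""
    futures = [
        client.submit(write_slot, shm_name, i, i * 10, pure=False)
        for i in range(n_slots)
    ]
    return client.gather(futures)
```

Only the segment name travels through Dask, so the scheduler never sees the mutable data. Each task here writes a fixed value to its own slot, so a rerun is harmless; tasks that read and then modify the same slot would need their own synchronization. The creating process remains responsible for calling `shm.close()` and `shm.unlink()` when the work is done. If NumPy is available, `np.ndarray(shape, dtype=np.int64, buffer=shm.buf)` gives an array view over the same buffer.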
