Iterating through the chunks of a dask array
I am trying to manually iterate through the chunks of a dask array, one by one, and apply my computation. I understand that a benefit of dask is that it can do the iteration for me, but my computation is failing (for reasons that I don't think are related to dask) and I want to iterate through manually for the purpose of debugging. How would I do that?
I am imagining something like:
import dask.array as da

data = da.random.randint(0, 30, size=(1_000, 100, 100), chunks=(-1, 10, 10))
for chunk in data.iterchunks():
    # chunk would contain some information about which chunk I have access to,
    # and I could somehow get the data contained in that chunk
    chunk_data = get_chunk(chunk)
    my_function(chunk_data)
Here the chunk that I get back would carry some information about which chunk I am in, and there would also be a way to get the data for that chunk.
Comments (3)
Access the data within each chunk using the arr.blocks property. The BlockView object has an array-like interface, but indexing it returns the selected chunk(s) of the original array rather than individual elements. So in your case, you could loop over every block index and pull out one block at a time. This will be slow, but I think it does what you're looking for.
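A minimal sketch of that loop, not necessarily the code the answer originally showed. It assumes a smaller array than the one in the question so it runs quickly, and my_function here is only a hypothetical stand-in for the failing computation. Block indices are generated from data.numblocks with np.ndindex.

```python
import numpy as np
import dask.array as da

# Smaller than the array in the question so the sketch runs quickly (assumption).
data = da.random.randint(0, 30, size=(20, 40, 40), chunks=(-1, 10, 10))


def my_function(block):
    # Hypothetical stand-in for the computation being debugged.
    return block.sum()


results = {}
# data.numblocks is the number of blocks along each axis, here (1, 4, 4).
for idx in np.ndindex(*data.numblocks):
    chunk = data.blocks[idx]      # still a lazy dask array holding one block
    chunk_data = chunk.compute()  # materialize that block as a NumPy array
    results[idx] = my_function(chunk_data)
```

Each idx is the position of the block in the block grid, so when my_function raises you know exactly which block triggered it and can inspect chunk_data directly.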
Try using data.chunks instead of data.iterchunks().
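Note that data.chunks holds the chunk sizes along each axis, not the chunk data, so it needs one more step before it can drive a loop. A rough sketch of one way to turn those sizes into slices follows. The helper chunk_slices is hypothetical (it is not a dask function), and the array is smaller than the one in the question.

```python
import itertools
import dask.array as da

# Smaller than the array in the question so the sketch runs quickly (assumption).
data = da.random.randint(0, 30, size=(20, 40, 40), chunks=(-1, 10, 10))

# A tuple of per-axis chunk sizes, not the chunks themselves.
chunk_sizes = data.chunks


def chunk_slices(chunks):
    """Hypothetical helper: one tuple of slices per chunk, in block order."""
    per_axis = []
    for sizes in chunks:
        stops = list(itertools.accumulate(sizes))
        starts = [0] + stops[:-1]
        per_axis.append([slice(a, b) for a, b in zip(starts, stops)])
    return list(itertools.product(*per_axis))


slices = chunk_slices(chunk_sizes)
# Indexing with one tuple of slices selects exactly one chunk.
first_chunk = data[slices[0]].compute()
```

The slices also tell you where each chunk sits in the full array, which is useful when reporting which region made the computation fail.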
You can use da.map_blocks and avoid the for loop.
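A sketch of what that might look like for debugging, not necessarily the code the answer originally showed. It assumes the function accepts the optional block_info keyword that map_blocks can pass in, and it runs on the synchronous scheduler so blocks are processed one at a time in the main thread. my_function and the seen_locations list are hypothetical, and the input is a small deterministic array.

```python
import numpy as np
import dask.array as da

# Small deterministic input (assumption) so the result is easy to check.
source = np.arange(20 * 40 * 40).reshape(20, 40, 40)
data = da.from_array(source, chunks=(-1, 10, 10))

seen_locations = []


def my_function(block, block_info=None):
    # Hypothetical stand-in for the computation being debugged.
    # dask may call the function without block information while it
    # infers output metadata, so guard against None.
    if block_info is not None:
        # Position of this block in the block grid, e.g. (0, 2, 3).
        seen_locations.append(block_info[0]["chunk-location"])
    return block * 2


lazy = da.map_blocks(my_function, data, dtype=data.dtype)
# The synchronous scheduler runs every task serially in this process,
# which makes print statements and pdb usable inside my_function.
result = lazy.compute(scheduler="synchronous")
```

With this approach dask still drives the iteration, but block_info tells you which block is being processed when the error occurs.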