Reducing the run time of a long for loop in Python

Posted on 2024-10-31 23:12:26

Another stupid question from my side ;) I have some issues with the following snippet, where len(x) = len(y) = 7,700,000:

from numpy import *

for k in range(len(x)):
    if x[k] == xmax:
        xind = -1
    else:
        xind = int(floor((x[k]-xmin)/xdelta))
    if y[k] == ymax:
        yind = -1
    else:
        yind = int(floor((y[k]-ymin)/ydelta))

    arr = append(arr,grid[xind,yind])

All variables are floats or integers except arr and grid. arr is a 1D-array and grid is a 2D-array.

My problem is that it takes a long time to run through the loop (several minutes). Can anyone explain to me why this takes so long? Does anyone have a suggestion? Even if I replace range() with arange(), I only save about a second.

Thanks.

1st EDIT

Sorry, I forgot to mention that I'm importing numpy.

2nd EDIT

I have some points in a 2D grid. Each cell of the grid has a value stored in it. I have to find out which cell each point falls in and copy that cell's value into a new array. That's my problem and my idea.

P.S.: Look at the picture if you want to understand it better. The values of the cells are represented by different colors.

[Image: idea]

Comments (5)

臻嫒无言 2024-11-07 23:12:26

How about something like:

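import numpy as np

# Compute the cell index of every point at once (x and y must be numpy arrays)
xind = np.floor((x-xmin)/xdelta).astype(int)
yind = np.floor((y-ymin)/ydelta).astype(int)

# Points lying exactly on the upper edge go into the last cell, as in the original loop.
# A boolean mask catches every such point; argmax would only catch the first one.
xind[x == xmax] = -1
yind[y == ymax] = -1

# Index the grid with both index arrays to look up all values in one step
arr = grid[xind,yind]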

Note: if you're using numpy and want things to run efficiently, don't treat arrays like Python lists.
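
For anyone who wants to try this out, here is a minimal, self-contained sketch of the vectorized lookup. The grid size, the bounds and the random points are made up for illustration (they are not the values from the question), and the function names are hypothetical. It checks that the vectorized version gives the same result as the loop logic from the question.

```python
import numpy as np

def lookup_loop(x, y, grid, xmin, xmax, ymin, ymax, xdelta, ydelta):
    """Reference version: same index logic as the loop in the question."""
    out = []
    for xk, yk in zip(x, y):
        xind = -1 if xk == xmax else int(np.floor((xk - xmin) / xdelta))
        yind = -1 if yk == ymax else int(np.floor((yk - ymin) / ydelta))
        out.append(grid[xind, yind])
    return np.array(out)

def lookup_vectorized(x, y, grid, xmin, xmax, ymin, ymax, xdelta, ydelta):
    """Vectorized version: no Python-level loop over the points."""
    xind = np.floor((x - xmin) / xdelta).astype(int)
    yind = np.floor((y - ymin) / ydelta).astype(int)
    # Points exactly on the upper edge belong to the last cell
    xind[x == xmax] = -1
    yind[y == ymax] = -1
    return grid[xind, yind]

# Made-up test data (assumption: a regular 4 x 5 grid)
rng = np.random.default_rng(0)
nx, ny = 4, 5
xmin, xmax, ymin, ymax = 0.0, 2.0, 0.0, 10.0
xdelta = (xmax - xmin) / nx
ydelta = (ymax - ymin) / ny
grid = np.arange(nx * ny).reshape(nx, ny)

x = rng.uniform(xmin, xmax, size=1000)
y = rng.uniform(ymin, ymax, size=1000)
x[0] = xmax  # force one point onto each upper edge
y[1] = ymax

args = (grid, xmin, xmax, ymin, ymax, xdelta, ydelta)
assert np.array_equal(lookup_loop(x, y, *args), lookup_vectorized(x, y, *args))
```

The only Python-level work left in the vectorized version is a handful of array operations, so the cost per point is paid inside numpy rather than in the interpreter.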

北方。的韩爷 2024-11-07 23:12:26
for x_item, y_item in zip(x, y):
    # do stuff.

There's also izip (itertools.izip in Python 2) if you don't want to generate a giant extra list.

心安伴我暖 2024-11-07 23:12:26

I can't see an obvious problem besides the size of the data. Is your computer able to hold everything in memory? If not, you are probably "jumping around" in swapped memory, which will always be slow. If all the data fits in memory, give psyco a try. It might speed up your calculation a lot.

掌心的温暖 2024-11-07 23:12:26

I suspect the problem might be in the way you're storing the results:

arr = append(arr,grid[xind,yind])

The docs for append say it returns:

A copy of arr with values appended
to axis. Note that append does
not occur in-place: a new array is
allocated and filled.

This means you'll be deallocating and allocating a larger and larger array every iteration. I suggest allocating an array of the correct size up-front, then populating it with data in each iteration. e.g.:

arr = empty(len(x))

for k in range(len(x)):
    ...
    arr[k] = grid[xind,yind]
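
To see the difference for yourself, here is a rough, self-contained timing sketch. The array size n and the function names are made up for the demo (n is kept far smaller than the 7.7 million points in the question so that it finishes quickly), and the actual timings will depend on your machine.

```python
import time
import numpy as np

n = 20_000  # assumption: small demo size, not the size from the question
values = np.arange(n, dtype=float)

def fill_with_append(values):
    """Grow the array with append: a new array is allocated and copied each time."""
    arr = np.array([])
    for v in values:
        arr = np.append(arr, v)
    return arr

def fill_preallocated(values):
    """Allocate once up front, then write each result in place."""
    arr = np.empty(len(values))
    for k, v in enumerate(values):
        arr[k] = v
    return arr

t0 = time.perf_counter()
a = fill_with_append(values)
t1 = time.perf_counter()
b = fill_preallocated(values)
t2 = time.perf_counter()

assert np.array_equal(a, b)  # both approaches produce the same data
print(f"append:       {t1 - t0:.3f} s")
print(f"preallocated: {t2 - t1:.3f} s")
```

Note that empty() creates a float array by default; if grid holds integers, passing dtype=grid.dtype keeps the result type the same.
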
遗失的美好 2024-11-07 23:12:26

x's length is 7 million? I think that's why!
The loop body runs 7 million times.

You should probably use another kind of loop.
Is it really necessary to loop over 7 million times?
