Fastest way to iterate in Python

Posted on 2024-12-20 16:55:08

I've never had to worry about this before, but now I need to handle a large number of vertices that are buffered by PyOpenGL, and the Python iteration seems to be the bottleneck. Here is the situation: I have an array of 3D vertices, and at each step I have to compute a 4-component color for each vertex. My approach so far is:

upper_border = len(self.vertices) // 3
# Only generate at first step, otherwise use old one and replace values
if self.color_array is None:
    self.color_array = numpy.empty(4 * upper_border)

for i in range(upper_border):
    # Obtain a color between a start->end color
    diff_activity = (activity[i] - self.min) / abs_diff
    clr_idx = i * 4
    self.color_array[clr_idx] = start_colors[0] + diff_activity * end_colors[0]
    self.color_array[clr_idx + 1] = start_colors[1] + diff_activity * end_colors[1]
    self.color_array[clr_idx + 2] = start_colors[2] + diff_activity * end_colors[2]
    self.color_array[clr_idx + 3] = 1

I don't think I can remove any of the operations from each step of the loop, but I'm guessing there has to be a faster way to run it. I say that because in JavaScript, for example, the same calculation gives me 9 FPS, while in Python I'm only getting 2-3 FPS.

Regards,
Bogdan

Comments (2)

相守太难 2024-12-27 16:55:09

To make this code faster, you need to "vectorise" it: replace all explicit Python loops with implicit loops, using NumPy's broadcasting rules. I can try and give a vectorised version of your loop:

if self.color_array is None:
    self.color_array = numpy.empty((len(activity), 4))
diff_activity = (activity - self.min) / abs_diff
# Multiply (not add) by end_colors, matching the loop in the question
self.color_array[:, :3] = (start_colors +
                           diff_activity[:, numpy.newaxis] *
                           end_colors)
self.color_array[:, 3] = 1

Note that I had to do a lot of guessing, since I'm not sure what all your variables are and what the code is supposed to do, so I can't guarantee this code runs. I turned color_array into a two-dimensional array, since this seems more appropriate. This probably requires changes in other parts of the code (or you need to flatten the array again).
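
If other parts of the code still expect the original flat layout (four consecutive values per vertex), the two-dimensional array can be flattened again. A minimal sketch, assuming a C-contiguous array; the names color_array and flat_colors and the sample values are made up for illustration:

```python
import numpy

# Hypothetical 2-D colour array: one row per vertex, columns are R, G, B, A.
color_array = numpy.array([[0.1, 0.2, 0.3, 1.0],
                           [0.4, 0.5, 0.6, 1.0]])

# For a C-contiguous array, reshape(-1) returns a view, so no data is copied.
flat_colors = color_array.reshape(-1)

# Element i * 4 + k of the flat array is channel k of vertex i,
# which is the indexing scheme used by the loop in the question.
i, k = 1, 2
assert flat_colors[i * 4 + k] == color_array[i, k]
```

Because the flat array is a view here, writing into color_array also updates flat_colors.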

I assume that self.min and abs_diff are scalars and all other names reference NumPy arrays of the following shapes:

activity.shape == (len(vertices) // 3,)
start_colors.shape == (3,)
end_colors.shape == (3,)
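
Under these shape assumptions, the vectorised expression can be checked against the explicit loop from the question. The following is only a sketch: the function names, the sample data, and the choice of abs_diff as max minus min are assumptions made for illustration, not taken from the original code.

```python
import numpy

def colors_loop(activity, minimum, abs_diff, start_colors, end_colors):
    """Reference version: the explicit per-vertex loop from the question."""
    n = len(activity)
    color_array = numpy.empty(4 * n)
    for i in range(n):
        diff_activity = (activity[i] - minimum) / abs_diff
        clr_idx = i * 4
        color_array[clr_idx] = start_colors[0] + diff_activity * end_colors[0]
        color_array[clr_idx + 1] = start_colors[1] + diff_activity * end_colors[1]
        color_array[clr_idx + 2] = start_colors[2] + diff_activity * end_colors[2]
        color_array[clr_idx + 3] = 1
    return color_array

def colors_vectorised(activity, minimum, abs_diff, start_colors, end_colors):
    """Broadcast version: an (n, 1) column times a (3,) row gives (n, 3)."""
    color_array = numpy.empty((len(activity), 4))
    diff_activity = (activity - minimum) / abs_diff
    color_array[:, :3] = start_colors + diff_activity[:, numpy.newaxis] * end_colors
    color_array[:, 3] = 1
    return color_array

# Made-up sample data with the assumed shapes.
activity = numpy.linspace(2.0, 5.0, 1000)   # shape (1000,)
minimum = activity.min()
abs_diff = activity.max() - minimum         # assumption: range of the data
start_colors = numpy.array([0.0, 0.0, 1.0])
end_colors = numpy.array([1.0, 0.5, -1.0])

expected = colors_loop(activity, minimum, abs_diff, start_colors, end_colors)
result = colors_vectorised(activity, minimum, abs_diff, start_colors, end_colors)

# Both versions should agree once the 2-D result is flattened.
assert numpy.allclose(result.reshape(-1), expected)
```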

It also looks as if vertices is a one-dimensional array and should be a two-dimensional array.
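
As a rough illustration of that last point, a flat vertex buffer can be viewed as one row per vertex. This sketch assumes the buffer stores x, y, z consecutively for each vertex; the sample values and the name points are hypothetical.

```python
import numpy

# Hypothetical flat vertex buffer: x0, y0, z0, x1, y1, z1, ...
vertices = numpy.array([0.0, 0.0, 0.0,
                        1.0, 0.0, 0.0,
                        0.0, 1.0, 0.0])

# -1 lets NumPy infer the number of vertices from the buffer length.
points = vertices.reshape(-1, 3)

# The vertex count is now simply the number of rows.
num_vertices = len(points)
```

With this layout, len(points) takes the place of len(vertices) // 3.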

虫児飞 2024-12-27 16:55:09
  1. First of all: profile your code with cProfile.
  2. On Python 2, use xrange instead of range (on Python 3, range is already lazy and xrange no longer exists).
  3. Avoid looking up self.color_array four times per iteration; create a local variable before the loop and use it inside the loop: local_array = self.color_array
  4. Pre-compute start_colors[N] and end_colors[N] before the loop: start_colors_0 = start_colors[0]
  5. Try list.extend() to reduce the number of statements in the loop (this requires local_array to be a Python list; NumPy arrays have no extend() method):

    local_array.extend([
       start_colors_0 + diff_activity * end_colors_0,
       start_colors_1 + diff_activity * end_colors_1,
       start_colors_2 + diff_activity * end_colors_2,
       1
    ])
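
The suggestions above can be combined into one small, pure-Python sketch. The function name compute_colors, its parameters, and the sample data are assumptions made for illustration; the result is a plain list, so it would still need converting (for example with numpy.array) before being handed to PyOpenGL.

```python
import cProfile
import io
import pstats

def compute_colors(activity, minimum, abs_diff, start_colors, end_colors):
    """Loop version using local names and list.extend() on a plain list."""
    # Look the colour components up once, outside the loop.
    start_0, start_1, start_2 = start_colors
    end_0, end_1, end_2 = end_colors
    colors = []
    extend = colors.extend  # bind the method once instead of per iteration
    for value in activity:
        diff_activity = (value - minimum) / abs_diff
        extend((
            start_0 + diff_activity * end_0,
            start_1 + diff_activity * end_1,
            start_2 + diff_activity * end_2,
            1.0,
        ))
    return colors

# Made-up sample data: 10000 activity values between 0 and 99.
activity = [float(i % 100) for i in range(10000)]

# Profile only the call we care about.
profiler = cProfile.Profile()
profiler.enable()
colors = compute_colors(activity, 0.0, 99.0, (0.0, 0.0, 1.0), (1.0, 0.5, -1.0))
profiler.disable()

# Print the five most expensive entries by cumulative time.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
print(stream.getvalue())
```

Profiling before and after each change shows whether a given micro-optimisation actually helps on the interpreter in use.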
    