Vectorizing a for loop in numpy/scipy?

Posted on 2024-08-29 14:37:55

I'm trying to vectorize a for loop that I have inside a class method. The loop iterates over a set of points and, depending on whether a certain variable (called "self.condition_met" below) is true, calls one of two pairs of functions on each point and appends the result to a list. Each point is a row of a 2-D array, i.e. a data structure that looks like array([[1,2,3], [4,5,6], ...]). Here is the problematic function:

class myClass:
   def my_inefficient_method(self):
       final_vector = []
       # Assume 'my_vector' and 'my_other_vector' are defined numpy arrays
       for point in all_points:
         if not self.condition_met:
             a = self.my_func1(point, my_vector)
             b = self.my_func2(point, my_other_vector)
         else:
             a = self.my_func3(point, my_vector)
             b = self.my_func4(point, my_other_vector)
         c = a + b
         final_vector.append(c)
       # Choose random element from resulting vector 'final_vector'

self.condition_met is set before my_inefficient_method is called, so it seems unnecessary to check it on every iteration, but I am not sure how to write this better. Since there are no destructive operations here, it seems like I could rewrite the entire thing as a vectorized operation. Is that possible? Any ideas on how to do this?

Comments (3)

夜灵血窟げ 2024-09-05 14:37:55

This only takes a couple of lines of code in NumPy (the rest is just creating a data set, a couple of functions, and set-up).

import numpy as NP

# create two functions 
fnx1 = lambda x : x**2
fnx2 = lambda x : NP.sum(fnx1(x))

# create some data
M = NP.random.randint(10, 99, 40).reshape(8, 5)

# create an index array based on condition satisfaction
# (is the sum of that row/data point even?)
# note: axis=1 sums each row; axis=0 would sum each column instead
ndx = NP.where( NP.sum(M, 1) % 2 == 0 )[0]

# only those data points that satisfy the condition (even row sum)
# are passed to one function then another, and the result of applying both
# functions to each data point is stored in an array
res = NP.apply_along_axis( fnx2, 1, M[ndx] )

print(res)
# prints one value (the sum of squares) per selected row;
# the values differ from run to run because M is random

From your description I abstracted this flow:

  • check the condition (boolean), i.e. 'if True'
  • call the pair of functions on those data points (rows) that satisfy the condition
  • append the result from each set of calls to a list ('res' above)

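For these particular functions, the same flow can also be written without apply_along_axis, which still calls the Python function once per row. The following is a minimal sketch, assuming the same square-then-sum computation as fnx1/fnx2; the fixed seed and the names mask and expected are additions made here only for illustration and repeatability.

```python
import numpy as np

rng = np.random.default_rng(0)           # seed is an assumption, used only for repeatability
M = rng.integers(10, 99, size=(8, 5))    # 8 data points (rows), 5 values each

# Boolean mask: True for rows whose sum is even
mask = M.sum(axis=1) % 2 == 0

# Square every element of the selected rows, then sum along each row
res = (M[mask] ** 2).sum(axis=1)

# Reference result from an explicit Python loop, for comparison
expected = np.array(
    [sum(int(v) ** 2 for v in row) for row in M if row.sum() % 2 == 0]
)
```

Both res and expected hold one value per selected row; the array expression just moves the per-row loop into NumPy.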
殊姿 2024-09-05 14:37:55

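Can you rewrite the my_funcX functions (my_func1 through my_func4) so that they are vectorized, i.e. so that they accept the whole all_points array at once? If so, you can do: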

class myClass:
   def my_efficient_method(self):
       # Assume 'all_points', 'my_vector' and 'my_other_vector' are defined numpy arrays
       if not self.condition_met:
           a = self.my_func1(all_points, my_vector)
           b = self.my_func2(all_points, my_other_vector)
       else:
           a = self.my_func3(all_points, my_vector)
           b = self.my_func4(all_points, my_other_vector)
       final_vector = a + b
       # Choose random element from resulting vector 'final_vector'
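What "vectorized" means for the my_funcX functions depends on what they compute, which the question does not say. As a rough sketch, assume (hypothetically) that my_func1 is a dot product with my_vector and my_func2 is a squared distance to my_other_vector. The per-point and whole-array versions could then look like this:

```python
import numpy as np

def my_func1_point(point, vec):
    # hypothetical stand-in: dot product of a single point with vec
    return np.dot(point, vec)

def my_func2_point(point, vec):
    # hypothetical stand-in: squared distance from a single point to vec
    return np.sum((point - vec) ** 2)

def my_func1_batch(points, vec):
    # same computation for all rows at once: (n, d) @ (d,) -> (n,)
    return points @ vec

def my_func2_batch(points, vec):
    # broadcasting subtracts vec from every row, then sums along each row
    return np.sum((points - vec) ** 2, axis=1)

all_points = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
my_vector = np.array([1, 0, 2])
my_other_vector = np.array([0, 1, 1])

# Original style: one Python-level call pair per point
looped = np.array([
    my_func1_point(p, my_vector) + my_func2_point(p, my_other_vector)
    for p in all_points
])

# Vectorized style: one call pair for the whole array
final_vector = (my_func1_batch(all_points, my_vector)
                + my_func2_batch(all_points, my_other_vector))

# Choose a random element from the result, as in the original method
choice = np.random.default_rng().choice(final_vector)
```

The two versions give the same values; the batch functions only work because each underlying operation (matrix product, subtraction, sum along an axis) already operates on whole arrays.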
蘑菇王子 2024-09-05 14:37:55

It's probably best to do what mtrw suggests, but if you're not sure how to vectorize the functions yourself, you can try numpy.vectorize on the my_func functions:

http://docs.scipy.org/doc/numpy/reference/generated/numpy.vectorize.html
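A small sketch of how numpy.vectorize might be applied here, assuming a hypothetical my_func that takes one point and one vector and returns a scalar. The signature argument tells NumPy that both inputs are 1-D vectors and the output is a scalar, so the wrapper loops over the rows of all_points. Note that the NumPy documentation describes vectorize as a convenience rather than a performance tool; internally it is essentially a for loop.

```python
import numpy as np

def my_func(point, vec):
    # hypothetical function written for a single point (1-D array)
    return float(np.dot(point, vec))

# "(n),(n)->()" : two length-n vectors in, one scalar out
my_func_v = np.vectorize(my_func, signature="(n),(n)->()")

all_points = np.array([[1, 2, 3], [4, 5, 6]])
my_vector = np.array([1, 0, 2])

# my_func is called once per row of all_points; my_vector is reused each time
result = my_func_v(all_points, my_vector)
```

This keeps the calling code tidy, but any real speed-up would have to come from rewriting my_func itself in terms of whole-array operations.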
