Matrices are not aligned error: Python SciPy fmin_bfgs
Problem Synopsis:
When attempting to use the scipy.optimize.fmin_bfgs minimization (optimization) function, the function throws a

    derphi0 = np.dot(gfk, pk)
    ValueError: matrices are not aligned

error. According to my error checking, this occurs at the very end of the first iteration through fmin_bfgs--just before any values are returned or any calls to the callback.
Configuration:
Windows Vista
Python 3.2.2
SciPy 0.10
IDE = Eclipse with PyDev
Detailed Description:
I am using scipy.optimize.fmin_bfgs to minimize the cost of a simple logistic regression implementation (converting from Octave to Python/SciPy). Basically, the cost function is named cost_arr and the gradient is computed in gradient_descent_arr.
I have manually tested and fully verified that *cost_arr* and *gradient_descent_arr* work properly and return all values correctly. I also tested to verify that the proper parameters are passed to *fmin_bfgs*. Nevertheless, when run, I get the ValueError: matrices are not aligned. Reviewing the source, the exact error occurs in the line_search_wolfe1 function in Minpack's Wolfe line and scalar searches, as supplied by the scipy package.
Notably, if I use scipy.optimize.fmin instead, the fmin function runs to completion.
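That run looks like this (reconstructed to mirror the fmin_bfgs call below; the relevant difference is that fmin is derivative-free and never calls gradient_descent_arr):

    optcost = scipy.optimize.fmin(self.cost_arr, initialtheta, args=myargs, maxiter=maxnumit)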
Exact Error:
    File "D:\Users\Shannon\Programming\Eclipse\workspace\SBML\sbml\LogisticRegression.py", line 395, in fminunc_opt
      optcost = scipy.optimize.fmin_bfgs(self.cost_arr, initialtheta, fprime=self.gradient_descent_arr, args=myargs, maxiter=maxnumit, callback=self.callback_fmin_bfgs, retall=True)
    File "C:\Python32x32\lib\site-packages\scipy\optimize\optimize.py", line 533, in fmin_bfgs
      old_fval, old_old_fval)
    File "C:\Python32x32\lib\site-packages\scipy\optimize\linesearch.py", line 76, in line_search_wolfe1
      derphi0 = np.dot(gfk, pk)
    ValueError: matrices are not aligned
I call the optimization function with:
    optcost = scipy.optimize.fmin_bfgs(self.cost_arr, initialtheta, fprime=self.gradient_descent_arr, args=myargs, maxiter=maxnumit, callback=self.callback_fmin_bfgs, retall=True)
I have spent a few days trying to fix this and cannot seem to determine what is causing the "matrices are not aligned" error.
ADDENDUM: 2012-01-08
I worked with this a lot more and seem to have narrowed the issues (but am baffled on how to fix them). First, fmin (using just fmin) works with these functions--cost, gradient. Second, the cost and the gradient functions both accurately return expected values when tested in a single iteration in a manual implementation (NOT using fmin_bfgs). Third, I added error-checking code to optimize.linesearch, and the error is thrown in line_search_wolfe1, at the line derphi0 = np.dot(gfk, pk).
Here, according to my tests, inside scipy.optimize.optimize:

    pk = [[ 12.00921659]
          [ 11.26284221]]   pk type = <class 'numpy.ndarray'>
    gfk = [[-12.00921659]
           [-11.26284221]]  gfk type = <class 'numpy.ndarray'>
Note: according to my tests, the error is thrown on the very first iteration through fmin_bfgs (i.e., fmin_bfgs never even completes a single iteration or update).
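The mismatch is easy to reproduce in isolation with the pk and gfk values above (a minimal sketch, independent of my code):

    import numpy as np

    gfk = np.array([[-12.00921659], [-11.26284221]])  # (2, 1) column vector
    pk = np.array([[12.00921659], [11.26284221]])     # (2, 1) column vector

    # np.dot treats 2-D inputs as matrices: a (2,1) x (2,1) matrix product
    # is undefined, hence "ValueError: matrices are not aligned".
    try:
        np.dot(gfk, pk)
    except ValueError as err:
        print(err)

    # With 1-D (2,) arrays, np.dot computes an inner product and succeeds.
    print(np.dot(gfk.flatten(), pk.flatten()))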
I appreciate ANY guidance or insights.
My Code Below (logging, documentation removed):
Assume theta = 2x1 ndarray (Actual: theta Info Size=(2, 1) Type = <class 'numpy.ndarray'>)
Assume X = 100x2 ndarray (Actual: X Info Size=(2, 100) Type = <class 'numpy.ndarray'>)
Assume y = 100x1 ndarray (Actual: y Info Size=(100, 1) Type = <class 'numpy.ndarray'>)
    def cost_arr(self, theta, X, y):
        theta = scipy.resize(theta, (2, 1))
        m = scipy.shape(X)
        m = 1 / m[1]  # Use m[1] because this is the length of X
        logging.info(__name__ + "cost_arr reports m = " + str(m))
        z = scipy.dot(theta.T, X)  # Must transpose the vector theta
        hypthetax = self.sigmoid(z)
        yones = scipy.ones(scipy.shape(y))
        hypthetaxones = scipy.ones(scipy.shape(hypthetax))
        costright = scipy.dot((yones - y).T, ((scipy.log(hypthetaxones - hypthetax)).T))
        costleft = scipy.dot((-1 * y).T, ((scipy.log(hypthetax)).T))
        # return statement truncated in the post; presumably m * (costleft - costright)
    def gradient_descent_arr(self, theta, X, y):
        theta = scipy.resize(theta, (2, 1))
        m = scipy.shape(X)
        m = 1 / m[1]  # Use m[1] because this is the length of X
        x = scipy.dot(theta.T, X)  # Must transpose the vector theta
        sig = self.sigmoid(x)
        sig = sig.T - y
        grad = scipy.dot(X, sig)
        grad = m * grad
        return grad
    def fminunc_opt_bfgs(self, initialtheta, X, y, maxnumit):
        myargs = (X, y)
        optcost = scipy.optimize.fmin_bfgs(self.cost_arr, initialtheta, fprime=self.gradient_descent_arr, args=myargs, maxiter=maxnumit, retall=True, full_output=True)
        return optcost
Comments (2)
In case anyone else encounters this problem ....
1) ERROR 1: As noted in the comments, I incorrectly returned the value from my gradient as a multidimensional array, (m,n) or (m,1). fmin_bfgs seems to require a 1d array output from the gradient (that is, you must return an (m,) array, NOT an (m,1) array). Use scipy.shape(myarray) to check the dimensions if you are unsure of the return value.
The fix involved adding:

    grad = numpy.ndarray.flatten(grad)

just before returning the gradient from your gradient function. This "flattens" the array from (m,1) to (m,), which fmin_bfgs can take as input.
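Applied to the question's gradient function, the whole fix looks like this (a sketch using numpy directly; sigmoid here is an assumed stand-in for self.sigmoid):

    import numpy as np

    def sigmoid(z):
        # assumed stand-in for self.sigmoid: the standard logistic function
        return 1.0 / (1.0 + np.exp(-z))

    def gradient_descent_arr(theta, X, y):
        theta = np.resize(theta, (2, 1))   # fmin_bfgs passes theta in as a (2,) array
        m = 1.0 / X.shape[1]               # X is (2, 100): columns are samples
        sig = sigmoid(np.dot(theta.T, X))  # (1, 100) row of predictions
        sig = sig.T - y                    # (100, 1) residuals against y
        grad = m * np.dot(X, sig)          # (2, 1) column vector
        return grad.flatten()              # (2,) -- the 1-D shape fmin_bfgs expects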
2) ERROR 2: Remember, fmin_bfgs seems to work with NONlinear functions. In my case, the sample I was initially working with was a LINEAR function. This appears to explain some of the anomalous results even after the flatten fix mentioned above. For LINEAR functions, fmin, rather than fmin_bfgs, may work better.
QED
As of the current scipy version, you need not pass the fprime argument; it will compute the gradient for you without any issues. You can also use the 'minimize' function and pass method='BFGS' instead, without providing the gradient as an argument.
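For example (a minimal sketch with made-up data; cost, X, and y below are stand-ins for the question's originals):

    import numpy as np
    from scipy.optimize import minimize

    def cost(theta, X, y):
        # scalar logistic-regression cost; because jac is not supplied,
        # minimize() estimates the gradient numerically
        h = 1.0 / (1.0 + np.exp(-X.dot(theta)))
        return -np.mean(y * np.log(h) + (1.0 - y) * np.log(1.0 - h))

    X = np.array([[1.0, -2.0], [1.0, -1.0], [1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
    y = np.array([0.0, 0.0, 1.0, 0.0, 1.0])  # deliberately not separable
    result = minimize(cost, x0=np.zeros(2), args=(X, y), method='BFGS')
    print(result.x, result.success)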