How do you vectorize equations?
I'm trying to implement the Softmax regression algorithm to solve the K-classifier problem after watching Professor Andrew Ng's lectures on GLM. I thought I understood everything he was saying until it finally came to writing the code to implement the cost function for Softmax regression, which is as follows:
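The formula itself is not reproduced in this post. Judging from the terms the answers discuss (a ratio of exponentials inside a logarithm, plus a double sum over \theta^2), it is presumably the regularized Softmax cost in the form used in the Stanford UFLDL notes. This is an assumption, and the symbols m (examples), K (classes), n (features) and \lambda (regularization weight) are named here only for illustration:

```latex
J(\theta) = -\frac{1}{m} \left[ \sum_{i=1}^{m} \sum_{j=1}^{K}
    1\{ y^{(i)} = j \} \,
    \log \frac{e^{\theta_j^{T} x^{(i)}}}{\sum_{l=1}^{K} e^{\theta_l^{T} x^{(i)}}}
  \right]
  + \frac{\lambda}{2} \sum_{j=1}^{K} \sum_{k=1}^{n} \theta_{jk}^{2}
```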
The problem I am having is trying to figure out a way to vectorize this. Again I thought I understood how to go about vectorizing equations like this since I was able to do it for linear and logistic regression, but after looking at that formula I am stuck.
While I would love to figure out a vectorized solution for this (I realize there is a similar question posted already: Vectorized Implementation of Softmax Regression), what I am more interested in is whether any of you can tell me a way (your way) to methodically convert equations like this into vectorized forms. For example, for those of you who are experts or seasoned veterans in ML, when you read of new algorithms in the literature for the first time, and see them written in similar notation to the equation above, how do you go about converting them to vectorized forms?
I realize I might be coming off like the student who asks Mozart, "How do you play the piano so well?" But my question is simply motivated by a desire to become better at this material. I assume that not everyone was born knowing how to vectorize equations, so someone out there must have devised their own system. If so, please share! Many thanks in advance!
Cheers
Comments (2)
This one looks fairly hard to vectorize in one step, since you are taking exponentials inside your summations (I assume you are raising e to arbitrary powers). What you can vectorize right away is the second term of the expression, \sum \sum \theta^2; just make sure to use the elementwise operators in MATLAB (.* or .^) to compute \theta^2.
The same goes for the inner terms of the ratio that goes into the logarithm: \theta' x^(i) is a vectorizable expression.
You might also benefit from memoization or a dynamic programming technique: try to reuse the results of computing e^(\theta' x^(i)).
Generally, in my experience, the way to vectorize is first to get a non-vectorized implementation working. Then try to vectorize the most obvious parts of your computation. At every step, tweak your function very little and always check that you get the same result as the non-vectorized computation. Having multiple test cases is also very helpful.
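As a rough sketch of that workflow, the following Octave code first computes a Softmax cost with plain loops and then with whole-array operations, and checks that the two agree. It assumes the cost has the regularized form from the Stanford UFLDL notes (the formula in the question is not shown, so this is a guess), and the data, sizes, and variable names (X, y, theta, lambda) are made up for illustration.

```octave
% Hypothetical toy data: m examples, n features, K classes.
m = 5; n = 3; K = 4;
lambda = 0.1;                          % assumed regularization weight
X = reshape(1:m*n, m, n) / 10;         % m x n, one example per row
theta = reshape(1:K*n, K, n) / 20 - 0.3;  % K x n, one class per row
y = [1; 2; 3; 4; 2];                   % class labels in 1..K

% Step 1: literal, loop-based translation of the summations.
data_term = 0;
for i = 1:m
  denom = 0;
  for l = 1:K
    denom = denom + exp(theta(l, :) * X(i, :)');
  end
  for j = 1:K
    if y(i) == j                       % indicator 1{y(i) = j}
      data_term = data_term + log(exp(theta(j, :) * X(i, :)') / denom);
    end
  end
end
reg_term = 0;
for j = 1:K
  for k = 1:n
    reg_term = reg_term + theta(j, k)^2;
  end
end
J_loop = -data_term / m + (lambda / 2) * reg_term;

% Step 2: the same quantities as whole-array operations.
scores = X * theta';                   % m x K, entry (i,j) = theta_j' * x_i
P = exp(scores);                       % every exponential computed once
P = P ./ repmat(sum(P, 2), 1, K);      % normalize each row to sum to 1
Y = (repmat(y, 1, K) == repmat(1:K, m, 1));  % m x K indicator matrix
J_vec = -sum(sum(Y .* log(P))) / m + (lambda / 2) * sum(theta(:) .^ 2);

% Step 3: check against the loop version before trusting the rewrite.
assert(abs(J_loop - J_vec) < 1e-10);
```

The pattern in step 2 is that each summation index becomes an array dimension: the sum over i and j turns into an m x K matrix that is reduced with sum, and the indicator becomes a logical mask. For large scores, subtracting the row maximum before exp would be a sensible safeguard against overflow; that is left out here to keep the correspondence with the loops obvious.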
The help files that come with Octave have this entry:
19.1 Basic Vectorization
To a very good first approximation, the goal in vectorization is to
write code that avoids loops and uses whole-array operations. As a
trivial example, consider adding two matrices element by element in a
pair of nested loops, compared to the much simpler whole-array sum
c = a + b;
This isn't merely easier to write; it is also internally much easier to
optimize. Octave delegates this operation to an underlying
implementation which, among other optimizations, may use special vector
hardware instructions or could conceivably even perform the additions in
parallel. In general, if the code is vectorized, the underlying
implementation has more freedom about the assumptions it can make in
order to achieve faster execution.
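A minimal sketch of the trivial example described above, assuming two small same-sized matrices a and b (the values and names are illustrative only):

```octave
a = [1 2 3; 4 5 6];
b = [10 20 30; 40 50 60];
[n, m] = size(a);

% Loop version: one scalar addition per element.
c_loop = zeros(n, m);
for i = 1:n
  for j = 1:m
    c_loop(i, j) = a(i, j) + b(i, j);
  end
end

% Vectorized version: a single whole-array operation.
c = a + b;

assert(isequal(c, c_loop));
```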
This is especially important for loops with "cheap" bodies. Often it
suffices to vectorize just the innermost loop to get acceptable
performance. A general rule of thumb is that the "order" of the
vectorized body should be greater or equal to the "order" of the
enclosing loop.
As a less trivial example, instead of a loop that computes
a(i) = b(i+1) - b(i) for each i from 1 to n-1, write
a = b(2:n) - b(1:n-1);
This shows an important general concept about using arrays for
indexing instead of looping over an index variable. See Index Expressions.
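The idea of indexing with a whole range instead of a loop variable can be sketched like this, assuming a short row vector b chosen only for illustration:

```octave
b = [1 4 9 16 25];
n = numel(b);

% Loop version: walk an index variable over the vector.
a_loop = zeros(1, n - 1);
for i = 1:n-1
  a_loop(i) = b(i + 1) - b(i);
end

% Vectorized version: two shifted index ranges, subtracted at once.
a = b(2:n) - b(1:n-1);

assert(isequal(a, a_loop));
```

The loop index i simply becomes the range 1:n-1, and i+1 becomes 2:n.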
Also use boolean indexing generously. If a condition
needs to be tested, this condition can also be written as a boolean
index. For instance, instead of looping over the elements and
subtracting 20 from each a(i) that satisfies a(i) > 5, write
a(a>5) -= 20;
which exploits the fact that 'a > 5' produces a boolean index.
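A small sketch of boolean indexing under assumed data: every element greater than 5 is reduced by 20, first with a loop and an if, then with a logical mask. The explicit assignment form is used instead of Octave's -= operator so that the snippet should also run in MATLAB.

```octave
a = [3 7 5 12 -1];

% Loop version: test the condition element by element.
a_loop = a;
for i = 1:numel(a_loop)
  if a_loop(i) > 5
    a_loop(i) = a_loop(i) - 20;
  end
end

% Vectorized version: the comparison yields a logical mask,
% which selects the elements to update.
a_vec = a;
mask = a_vec > 5;
a_vec(mask) = a_vec(mask) - 20;

assert(isequal(a_vec, a_loop));
```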
Use elementwise vector operators whenever possible to avoid looping
(operators like '.*' and '.^'). See Arithmetic Ops. For simple
inline functions, the 'vectorize' function can do this automatically.
-- Built-in Function: vectorize (FUN)
Create a vectorized version of the inline function FUN by replacing
all occurrences of '*', '/', etc., with '.*', './', etc.
Also exploit broadcasting in these elementwise operators both to
avoid looping and unnecessary intermediate memory allocations.
See Broadcasting.
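As an illustration of broadcasting, the sketch below subtracts the column means from every row of a matrix. The data are made up; the direct form X - mu relies on automatic broadcasting, which to my knowledge is available from Octave 3.6 and MATLAB R2016b onward, while bsxfun is the older explicit spelling of the same operation.

```octave
X = [1 2; 3 4; 5 6];
mu = mean(X, 1);                 % 1 x 2 row of column means

% Loop version: subtract the mean row from each row in turn.
X_loop = zeros(size(X));
for i = 1:size(X, 1)
  X_loop(i, :) = X(i, :) - mu;
end

% Broadcasting: the 1 x 2 row is stretched across all rows
% without building a repeated copy of mu.
X_bc = X - mu;

% Explicit form for versions without automatic broadcasting.
X_bsx = bsxfun(@minus, X, mu);

assert(isequal(X_bc, X_loop));
assert(isequal(X_bsx, X_loop));
```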
Use built-in and library functions if possible. Built-in and
compiled functions are very fast. Even with an m-file library function,
chances are good that it is already optimized, or will be optimized more
in a future release.
For instance, even better than
a = b(2:n) - b(1:n-1);
is
a = diff (b);
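A short check, on an assumed sample vector, that the built-in diff gives the same result as the hand-written slice difference:

```octave
b = [1 4 9 16 25];

% Hand-vectorized first difference using two index ranges.
a_slice = b(2:end) - b(1:end-1);

% Built-in function that expresses the same operation directly.
a_diff = diff(b);

assert(isequal(a_slice, a_diff));
```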
Most Octave functions are written with vector and array arguments in
mind. If you find yourself writing a loop with a very simple operation,
chances are that such a function already exists. The following
functions occur frequently in vectorized code:
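Index manipulation
* find
* sub2ind
* ind2sub
* sort
* unique
* lookup
* ifelse / merge
Repetition
* repmat
* repelems
Vectorized arithmetic
* sum
* prod
* cumsum
* cumprod
* sumsq
* diff
* dot
* cummax
* cummin
Shape of higher dimensional arrays
* reshape
* resize
* permute
* squeeze
* deal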
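As one example of the index-manipulation functions, sub2ind can replace a loop that picks a different column from each row, which is the kind of lookup a Softmax cost needs when selecting the probability of the true class. The matrix P and labels y below are hypothetical:

```octave
P = [0.1 0.7 0.2;
     0.5 0.3 0.2;
     0.2 0.2 0.6];               % one row of class probabilities per example
y = [2; 1; 3];                   % column to pick in each row
m = size(P, 1);

% Loop version: index one (row, column) pair at a time.
picked_loop = zeros(m, 1);
for i = 1:m
  picked_loop(i) = P(i, y(i));
end

% Vectorized version: convert all (row, column) pairs to linear
% indices, then index once.
idx = sub2ind(size(P), (1:m)', y);
picked = P(idx);

assert(isequal(picked, picked_loop));
```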
Also look at these pages from a Stanford ML wiki for some more guidance with examples.
http://ufldl.stanford.edu/wiki/index.php/Vectorization
http://ufldl.stanford.edu/wiki/index.php/Logistic_Regression_Vectorization_Example
http://ufldl.stanford.edu/wiki/index.php/Neural_Network_Vectorization