How do you vectorize an equation?

Posted 2025-01-07 08:16:59

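After watching Prof. Andrew Ng's lecture on GLMs, I tried to implement the Softmax regression algorithm for a K-class classification problem. I thought I understood everything he said until it came to writing the code for the Softmax regression cost function, shown below: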

[Image: cost function for Softmax regression with weight decay]
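The image itself did not survive. Assuming it showed the usual weight-decayed Softmax cost (the form used in the Stanford UFLDL tutorial, which matches the terms the first answer below refers to), it would read roughly as follows, with m examples, k classes, n features, and weight-decay strength λ:

```latex
J(\theta) = -\frac{1}{m}\left[\sum_{i=1}^{m}\sum_{j=1}^{k}
    1\{y^{(i)} = j\}\,
    \log\frac{e^{\theta_j^{T} x^{(i)}}}{\sum_{l=1}^{k} e^{\theta_l^{T} x^{(i)}}}\right]
  + \frac{\lambda}{2}\sum_{i=1}^{k}\sum_{j=0}^{n}\theta_{ij}^{2}
```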

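The problem I have is working out how to vectorize it. Again, I thought I understood how to vectorize an equation like this, since I was able to do it for linear regression and logistic regression, but after looking at this formula I am stuck.

While I would love to find a vectorized solution for this (I realize a similar question has already been posted: Vectorized Implementation of Softmax Regression), what I am more interested in is whether any of you can describe a methodical way (your way) of converting equations like this into vectorized form. For example, for the experts and seasoned veterans in ML: when you first read about a new algorithm in the literature and see it written in notation similar to the equation above, how do you go about converting it to vectorized form?

I realize I may come across like the student who asks Mozart, "How do you play the piano so well?" But my question is motivated simply by a desire to get better at this material, and by the assumption that not everyone is born knowing how to vectorize equations, so someone out there must have devised their own system. If so, please share! Thanks very much!

Cheers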

痴者 2025-01-14 08:17:00

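This one looks hard to vectorize, since you are exponentiating inside the summations. I assume you are raising e to arbitrary powers. What you can vectorize is the second term of the expression, \sum \sum \theta^2: just make sure to use the .* operator in MATLAB to compute \theta^2.

The same goes for the inner terms of the ratio that goes into the logarithm: \theta' x^(i) is a vectorizable expression.

You might also benefit from memoization or dynamic programming techniques, reusing the results of the e^(\theta' x^(i)) computations.

In general, in my experience, the way to vectorize is first to get a non-vectorized implementation working. Then try to vectorize the most obvious parts of the computation. At every step, make only small changes to your function and always check that you get the same result as the non-vectorized computation. Having several test cases also helps a lot.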
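A minimal sketch of that workflow in Octave, assuming the cost in the question is the standard weight-decayed Softmax cost. The function names, the argument shapes, and the tiny test data are all made up for illustration, and subtracting the column maximum is an extra numerical-stability step that is not part of the formula (it does not change the result).

```octave
1;  % marks this file as a script so the functions below can be defined inline

% Assumed shapes (hypothetical): theta is k-by-n, X is n-by-m with one
% example per column, y is 1-by-m with labels in 1..k, lambda is a scalar.

% Step 1: a direct, loop-based transcription of the formula.
function J = softmax_cost_loop (theta, X, y, lambda)
  k = rows (theta);
  m = columns (X);
  J = 0;
  for i = 1:m
    denom = 0;
    for l = 1:k
      denom += exp (theta(l,:) * X(:,i));
    endfor
    J -= log (exp (theta(y(i),:) * X(:,i)) / denom);
  endfor
  J = J / m + (lambda / 2) * sum (theta(:) .^ 2);
endfunction

% Step 2: the same computation with whole-array operations.
function J = softmax_cost_vec (theta, X, y, lambda)
  m = columns (X);
  scores = theta * X;                           % every theta_j' * x^(i) at once (k-by-m)
  scores = scores - max (scores, [], 1);        % stability only; cancels in the ratio
  logp = scores - log (sum (exp (scores), 1));  % log of the ratio, per class and example
  idx = sub2ind (size (logp), y, 1:m);          % pick the true-class entry in each column
  J = -sum (logp(idx)) / m + (lambda / 2) * sum (theta(:) .^ 2);
endfunction

% Step 3: check that both versions agree on a small test case.
theta = [0.1 -0.2; 0.3 0.4; -0.5 0.6];
X = [1 2 3 4; 0.5 -1 1.5 -2];
y = [1 3 2 3];
J_loop = softmax_cost_loop (theta, X, y, 0.1);
J_vec = softmax_cost_vec (theta, X, y, 0.1);
assert (J_loop, J_vec, 1e-12);
```

Note that `exp` is applied to the whole score matrix once, so each e^(\theta' x^(i)) value is computed a single time and reused for both the numerator and the denominator.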


远昼 2025-01-14 08:17:00

The help files that come with Octave have this entry:

19.1 Basic Vectorization

To a very good first approximation, the goal in vectorization is to
write code that avoids loops and uses whole-array operations. As a
trivial example, consider

 for i = 1:n
   for j = 1:m
     c(i,j) = a(i,j) + b(i,j);
   endfor
 endfor

compared to the much simpler

 c = a + b;

This isn't merely easier to write; it is also internally much easier to
optimize. Octave delegates this operation to an underlying
implementation which, among other optimizations, may use special vector
hardware instructions or could conceivably even perform the additions in
parallel. In general, if the code is vectorized, the underlying
implementation has more freedom about the assumptions it can make in
order to achieve faster execution.

This is especially important for loops with "cheap" bodies. Often it
suffices to vectorize just the innermost loop to get acceptable
performance. A general rule of thumb is that the "order" of the
vectorized body should be greater or equal to the "order" of the
enclosing loop.

As a less trivial example, instead of

 for i = 1:n-1
   a(i) = b(i+1) - b(i);
 endfor

write

 a = b(2:n) - b(1:n-1);

This shows an important general concept about using arrays for
indexing instead of looping over an index variable (see Index
Expressions). Also use boolean indexing generously. If a condition
needs to be tested, this condition can also be written as a boolean
index. For instance, instead of

 for i = 1:n
   if (a(i) > 5)
     a(i) -= 20;
   endif
 endfor

write

 a(a>5) -= 20;

which exploits the fact that 'a > 5' produces a boolean index.

Use elementwise vector operators whenever possible to avoid looping
(operators like '.*' and '.^'; see Arithmetic Ops). For simple
inline functions, the 'vectorize' function can do this automatically.

-- Built-in Function: vectorize (FUN)
Create a vectorized version of the inline function FUN by replacing
all occurrences of '*', '/', etc., with '.*', './', etc.

 This may be useful, for example, when using inline functions with
 numerical integration or optimization where a vector-valued
 function is expected.

      fcn = vectorize (inline ("x^2 - 1"))
         => fcn = f(x) = x.^2 - 1
      quadv (fcn, 0, 3)
         => 6

 See also: inline, formula, argnames.

Also exploit broadcasting in these elementwise operators both to
avoid looping and unnecessary intermediate memory allocations (see
Broadcasting).
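As a small illustration of the broadcasting point, here is a sketch that subtracts the column means from a matrix in three ways. The matrix `A` and the variable names are made up for the example; it assumes an Octave version with automatic broadcasting (3.6 or later).

```octave
% A is a hypothetical 3-by-4 data matrix, one example per row.
A = [1 2 3 4; 5 6 7 8; 9 10 11 12];
mu = mean (A, 1);            % 1-by-4 row vector of column means

% Loop version: subtract the mean row from each row in turn.
B_loop = zeros (size (A));
for i = 1:rows (A)
  B_loop(i,:) = A(i,:) - mu;
endfor

% repmat version: no loop, but it builds a full 3-by-4 copy of mu first.
B_rep = A - repmat (mu, rows (A), 1);

% Broadcasting version: the 1-by-4 row is stretched along the rows
% implicitly, with no explicit loop and no intermediate copy.
B_bc = A - mu;

assert (B_rep, B_loop);
assert (B_bc, B_loop);
```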

Use built-in and library functions if possible. Built-in and
compiled functions are very fast. Even with an m-file library function,
chances are good that it is already optimized, or will be optimized more
in a future release.

For instance, even better than

 a = b(2:n) - b(1:n-1);

is

 a = diff (b);

Most Octave functions are written with vector and array arguments in
mind. If you find yourself writing a loop with a very simple operation,
chances are that such a function already exists. The following
functions occur frequently in vectorized code:

Index manipulation

* find

* sub2ind

* ind2sub

* sort

* unique

* lookup

* ifelse / merge

Repetition

* repmat

* repelems

Vectorized arithmetic

* sum

* prod

* cumsum

* cumprod

* sumsq

* diff

* dot

* cummax

* cummin

Shape of higher dimensional arrays

* reshape

* resize

* permute

* squeeze

* deal
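To show how a couple of these functions replace loops, here is a sketch that builds the class-indicator matrix 1{y^(i) = j} that appears in Softmax-style costs, first with a double loop and then with `sub2ind`, and finally counts examples per class with `sum`. The labels `y` and the sizes are hypothetical.

```octave
y = [1 3 2 3];     % hypothetical class labels, one per example
k = 3;             % number of classes
m = numel (y);     % number of examples

% Loop version: test every (class, example) pair.
Y_loop = zeros (k, m);
for i = 1:m
  for j = 1:k
    if (y(i) == j)
      Y_loop(j,i) = 1;
    endif
  endfor
endfor

% Vectorized version: convert (row, column) = (y(i), i) pairs to linear
% indices and set all of those entries in a single assignment.
Y = zeros (k, m);
Y(sub2ind ([k, m], y, 1:m)) = 1;

assert (Y, Y_loop);

% Whole-array reduction: number of examples in each class.
counts = sum (Y, 2);
```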

Also look at these pages from a Stanford ML wiki for some more guidance with examples.

http://ufldl.stanford.edu/wiki/index.php/Vectorization

http://ufldl.stanford.edu/wiki/index.php/Logistic_Regression_Vectorization_Example

http://ufldl.stanford.edu/wiki/index.php/Neural_Network_Vectorization
