BLAS: gemm vs. gemv
Why does BLAS have a gemm function for matrix-matrix multiplication and a separate gemv function for matrix-vector multiplication? Isn't matrix-vector multiplication just a special case of matrix-matrix multiplication where one matrix only has one row/column?
Comments (3)
Mathematically, matrix-vector multiplication is a special case of matrix-matrix multiplication, but that's not necessarily true of them as realized in a software library.
They support different options. For example, gemv supports strided access to the vectors on which it is operating, whereas gemm does not support strided matrix layouts. In the C language bindings, gemm requires that you specify the storage ordering of all three matrices, whereas that is unnecessary in gemv for the vector arguments because it would be meaningless.
Besides supporting different options, there are families of optimizations that might be performed on gemm that are not applicable to gemv. If you know that you are doing a matrix-vector product, you don't want the library to waste time figuring out that's the case before switching into a code path that is optimized for that case; you'd rather call it directly instead.
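To make the strided-access point concrete, here is a minimal sketch in plain C. It is not the real BLAS or CBLAS interface: sketch_gemv and sketch_demo are hypothetical names, the matrix is assumed to be row-major, and only positive strides are handled. It only illustrates how increment arguments in the spirit of gemv's incx and incy let a routine read a vector that is not stored contiguously, for example one column of a row-major matrix, without copying it first.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference routine, NOT the real BLAS dgemv signature.
 * Computes y := alpha * A * x + beta * y for a row-major m-by-n matrix A
 * with leading dimension lda. incx and incy are the distances (in elements)
 * between consecutive entries of x and y. */
void sketch_gemv(size_t m, size_t n, double alpha,
                 const double *a, size_t lda,
                 const double *x, size_t incx,
                 double beta, double *y, size_t incy)
{
    assert(lda >= n && incx > 0 && incy > 0);
    for (size_t i = 0; i < m; ++i) {
        double acc = 0.0;
        for (size_t j = 0; j < n; ++j)
            acc += a[i * lda + j] * x[j * incx];   /* strided read of x */
        y[i * incy] = alpha * acc + beta * y[i * incy];
    }
}

/* Multiplies a 2x3 matrix A by the second column of a row-major 3x2
 * matrix B. The column is addressed in place: start at &b[1], stride 2.
 * Returns entry `row` of the result. */
double sketch_demo(size_t row)
{
    const double a[6] = {1, 2, 3,
                         4, 5, 6};
    const double b[6] = {9, 1,
                         9, 0,
                         9, 2};      /* second column is (1, 0, 2) */
    double y[2] = {0.0, 0.0};

    sketch_gemv(2, 3, 1.0, a, 3, &b[1], 2, 0.0, y, 1);
    return y[row];
}
```

With these inputs the result vector is (7, 16). A routine without a stride argument would need the column copied into contiguous storage before it could be used as the vector operand.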
When you optimize gemv and gemm, different techniques apply. Let me know if you want more details.
I think it just fits the BLAS hierarchy better, with its level 1 (vector-vector), level 2 (matrix-vector), and level 3 (matrix-matrix) routines. And it may be optimizable a bit better if you know it is only a vector.
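As a rough illustration of the three levels, the sketch below writes one naive routine per level in plain C. All of the names (sketch_dot, sketch_matvec, sketch_matmul, sketch_levels_agree) are hypothetical, the matrices are assumed to be row-major and contiguous, and none of this reflects how an optimized BLAS is actually implemented. It only shows that the loop nest gets one level deeper each time, and that the matrix-matrix routine called with a single-column right-hand side computes the same numbers as the matrix-vector routine.

```c
#include <assert.h>
#include <stddef.h>

/* Level 1 flavor: dot product of two contiguous vectors (one loop). */
double sketch_dot(size_t n, const double *x, const double *y)
{
    double acc = 0.0;
    for (size_t i = 0; i < n; ++i)
        acc += x[i] * y[i];
    return acc;
}

/* Level 2 flavor: y := A * x for a row-major m-by-n matrix (two loops). */
void sketch_matvec(size_t m, size_t n,
                   const double *a, const double *x, double *y)
{
    for (size_t i = 0; i < m; ++i)
        y[i] = sketch_dot(n, a + i * n, x);
}

/* Level 3 flavor: C := A * B with row-major A (m-by-k), B (k-by-n),
 * and C (m-by-n) (three loops). */
void sketch_matmul(size_t m, size_t n, size_t k,
                   const double *a, const double *b, double *c)
{
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j) {
            double acc = 0.0;
            for (size_t p = 0; p < k; ++p)
                acc += a[i * k + p] * b[p * n + j];
            c[i * n + j] = acc;
        }
    }
}

/* Returns 1 if the matrix-matrix routine with n = 1 reproduces the
 * matrix-vector result for a small 2x3 example, 0 otherwise. */
int sketch_levels_agree(void)
{
    const double a[6] = {1, 2, 3,
                         4, 5, 6};
    const double x[3] = {1, 0, 2};
    double y_mv[2], y_mm[2];

    sketch_matvec(2, 3, a, x, y_mv);
    sketch_matmul(2, 1, 3, a, x, y_mm);   /* x viewed as a 3-by-1 matrix */
    return y_mv[0] == y_mm[0] && y_mv[1] == y_mm[1];
}
```

The two calls agree on the result, which is the mathematical special case the question describes. The separate entry points matter for the interface and for how a library can tune each case, not for the arithmetic.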