BLAS sgemm/dgemm 如何工作?
我正在尝试使用 python 中的 ctypes 来使用 BLAS 中的函数 sgemm 。尝试求解C = A x B,以下代码工作得很好:
no_trans = c_char("n")
m = c_int(number_of_rows_of_A)
n = c_int(number_of_columns_of_B)
k = c_int(number_of_columns_of_A)
one = c_float(1.0)
zero = c_float(0.0)
blaslib.sgemm_(byref(no_trans), byref(no_trans), byref(m), byref(n), byref(k),
byref(one), A, byref(m), B, byref(k), byref(zero), C, byref(m))
现在我想求解这个方程:C = A' x A,其中A' 是 A 的转置,以下代码运行时没有异常,但返回的结果是错误的:
trans = c_char("t")
no_trans = c_char("n")
m = c_int(number_of_rows_of_A)
n = c_int(number_of_columns_of_A)
one = c_float(1.0)
zero = c_float(0.0)
blaslib.sgemm_(byref(trans), byref(no_trans), byref(n), byref(n), byref(m),
byref(one), A, byref(m), A, byref(m), byref(zero), C, byref(n))
对于测试,我插入了一个矩阵 A = [1 2; 3 4]。正确的结果是C = [10 14; 14 20] 但 sgemm
例程会输出 C = [5 11; 11 25]。
据我了解,矩阵 A 不必由我转置,因为算法会处理它。在第二种情况下我的参数传递有什么问题?
如有任何帮助、链接、文章、建议,我们将不胜感激!
I am trying to make use of the function sgemm
in BLAS using ctypes in python. Trying to solve C = A x B the following code works just fine:
no_trans = c_char("n")
m = c_int(number_of_rows_of_A)
n = c_int(number_of_columns_of_B)
k = c_int(number_of_columns_of_A)
one = c_float(1.0)
zero = c_float(0.0)
blaslib.sgemm_(byref(no_trans), byref(no_trans), byref(m), byref(n), byref(k),
byref(one), A, byref(m), B, byref(k), byref(zero), C, byref(m))
Now I would like to solve this equation: C = A' x A where A' is the transpose of A and the following code runs without an exception but the result returned is wrong:
trans = c_char("t")
no_trans = c_char("n")
m = c_int(number_of_rows_of_A)
n = c_int(number_of_columns_of_A)
one = c_float(1.0)
zero = c_float(0.0)
blaslib.sgemm_(byref(trans), byref(no_trans), byref(n), byref(n), byref(m),
byref(one), A, byref(m), A, byref(m), byref(zero), C, byref(n))
For a test I inserted a matrix A = [1 2; 3 4]. The correct result is C = [10 14; 14 20] but the sgemm
routine spits out C = [5 11; 11 25].
As far as I understand it, the matrix A does not have to be transposed by me since the algorithm takes care of it. What is wrong with my parameter passing in the second case?
Any help, link, article, advice is appreciated!
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
Blas 通常使用列主矩阵(如 Fortran),因此 A = [1 2; 3 4] 表示
结果是正确的(假设您的 Python 库也这样做)。请参阅此自述文件
Blas typically uses column-major matrices (like Fortran), hence
A = [1 2; 3 4]
meansand the result is correct (assuming that your Python library does the same). See this read-me
您得到的结果表明 sgemm 已计算出 A*A' 而不是您想要的 A'*A。简单的解决方案是将两个输入切换到该函数。
The result you have indicates
sgemm
had computed A*A' rather than A'*A as you wanted. The simple solution is to switch the two inputs to the function.