BLAS sgemm/dgemm 如何工作?

发布于 2024-11-02 06:15:15 字数 1087 浏览 4 评论 0原文

我正在尝试使用 python 中的 ctypes 来使用 BLAS 中的函数 sgemm 。尝试求解C = A x B,以下代码工作得很好:

no_trans = c_char("n")
m = c_int(number_of_rows_of_A)
n = c_int(number_of_columns_of_B)
k = c_int(number_of_columns_of_A)
one = c_float(1.0)
zero = c_float(0.0)

blaslib.sgemm_(byref(no_trans), byref(no_trans), byref(m), byref(n), byref(k),
               byref(one), A, byref(m), B, byref(k), byref(zero), C, byref(m))

现在我想求解这个方程:C = A' x A,其中A' A 的转置,以下代码运行时没有异常,但返回的结果是错误的:

trans = c_char("t")
no_trans = c_char("n")
m = c_int(number_of_rows_of_A)
n = c_int(number_of_columns_of_A)
one = c_float(1.0)
zero = c_float(0.0)

blaslib.sgemm_(byref(trans), byref(no_trans), byref(n), byref(n), byref(m),
               byref(one), A, byref(m), A, byref(m), byref(zero), C, byref(n))

对于测试,我插入了一个矩阵 A = [1 2; 3 4]。正确的结果是C = [10 14; 14 20]sgemm 例程会输出 C = [5 11; 11 25]

据我了解,矩阵 A 不必由我转置,因为算法会处理它。在第二种情况下我的参数传递有什么问题?

如有任何帮助、链接、文章、建议,我们将不胜感激!

I am trying to make use of the function sgemm in BLAS using ctypes in python. Trying to solve C = A x B the following code works just fine:

no_trans = c_char("n")
m = c_int(number_of_rows_of_A)
n = c_int(number_of_columns_of_B)
k = c_int(number_of_columns_of_A)
one = c_float(1.0)
zero = c_float(0.0)

blaslib.sgemm_(byref(no_trans), byref(no_trans), byref(m), byref(n), byref(k),
               byref(one), A, byref(m), B, byref(k), byref(zero), C, byref(m))

Now I would like to solve this equation: C = A' x A where A' is the transpose of A and the following code runs without an exception but the result returned is wrong:

trans = c_char("t")
no_trans = c_char("n")
m = c_int(number_of_rows_of_A)
n = c_int(number_of_columns_of_A)
one = c_float(1.0)
zero = c_float(0.0)

blaslib.sgemm_(byref(trans), byref(no_trans), byref(n), byref(n), byref(m),
               byref(one), A, byref(m), A, byref(m), byref(zero), C, byref(n))

For a test I inserted a matrix A = [1 2; 3 4]. The correct result is C = [10 14; 14 20] but the sgemm routine spits out C = [5 11; 11 25].

As far as I understand it, the matrix A does not have to be transposed by me since the algorithm takes care of it. What is wrong with my parameter passing in the second case?

Any help, link, article, advice is appreciated!

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

野鹿林 2024-11-09 06:15:15

Blas 通常使用列主矩阵(如 Fortran),因此 A = [1 2; 3 4] 表示

    |1 3|   
A = |   |
    |2 4|

结果是正确的(假设您的 Python 库也这样做)。请参阅此自述文件

Blas typically uses column-major matrices (like Fortran), hence A = [1 2; 3 4] means

    |1 3|   
A = |   |
    |2 4|

and the result is correct (assuming that your Python library does the same). See this read-me

挽清梦 2024-11-09 06:15:15

您得到的结果表明 sgemm 已计算出 A*A' 而不是您想要的 A'*A。简单的解决方案是将两个输入切换到该函数。

The result you have indicates sgemm had computed A*A' rather than A'*A as you wanted. The simple solution is to switch the two inputs to the function.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文