C vs. Fortran for BLAS 2
I have an application that needs to compute a lot of norms, dot products and, most importantly, matrix-vector multiplications.
The matrix and vectors are huge. The matrix dimension tends to be 100000x100000.
The loop structure is:
while(condition)
    /* usually iterations=dimension of matrix, so around 1 million iterations are *at least* required (if not more) */
    matrix-vector multiplication
    3 dot prods
    2 norms
I am currently using Intel Fortran with Intel MKL. Would rewriting my code in Intel C with Intel MKL help at all?
Has anyone ever carried out a benchmark of any kind (for DGEMV especially)?
Rewriting the code is a major pain, but I would not mind doing it if I see a reason to.
EDIT: I misspoke: the matrix dimension is 100000, not a million. Pretty serious error :|
And yes, the matrix is dense and it needs to be dense.
Moreover, it is not symmetric and not even positive definite.
My algorithm is a modified version of QMR.
Comments (1)
The performance will be identical in C and Fortran, because the implementation behind the library calls is the same, and essentially all of the time in your code is spent in those library calls.
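A rough operation count makes the same point. The sketch below assumes about 2n^2 floating-point operations for a dense n-by-n matrix-vector product and about 2n for each dot product or norm; the function names are hypothetical, and the count ignores memory traffic, which in practice matters a lot for DGEMV.

```c
#include <stdint.h>

/* Approximate flops for one dense n-by-n matrix-vector product:
   one multiply and one add per matrix entry. */
uint64_t gemv_flops(uint64_t n)
{
    return 2 * n * n;
}

/* Approximate flops for the 3 dot products and 2 norms,
   counting each as roughly 2n. */
uint64_t level1_flops(uint64_t n)
{
    return 5 * 2 * n;
}

/* Share of the per-iteration arithmetic done by the matvec. */
double gemv_fraction(uint64_t n)
{
    double g = (double)gemv_flops(n);
    return g / (g + (double)level1_flops(n));
}

/* Bytes needed to hold a dense n-by-n matrix of 8-byte doubles. */
uint64_t dense_matrix_bytes(uint64_t n)
{
    return 8 * n * n;
}
```

For n = 100000 this gives about 2e10 flops per matvec against about 1e6 for the five vector operations combined, and 8e10 bytes (80 GB) for the matrix itself. Everything the calling language does outside the library is scalar bookkeeping next to that.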