Disabling loop vectorization in FORTRAN?
Is it possible to bypass loop vectorization in FORTRAN? I'm writing to the F77 standard for a particular project, but GNU gfortran also compiles later standards such as F95. Does anyone know whether certain FORTRAN standards avoided loop vectorization, or whether gfortran has any flags/options to turn it off?
UPDATE: So, I think the final solution to my specific problem has to "DO" with FORTRAN DO loops not allowing the iteration variable to be updated inside the loop. This is mentioned in @High Performance Mark's reply on the related thread "Loop vectorization and how to avoid it".
[Into the FORT, RAN the noobs for shelter.]
Comments (4)
The Fortran standards are generally silent on how the language is to be implemented, leaving that to the compiler writers who are in a better position to determine the best, or good (and bad) options for implementation of the language's various features on whatever chip architecture(s) they are writing for.
What do you mean when you write that you want to bypass loop vectorisation? And why do you suggest, in the next sentence, that this would be unavailable to FORTRAN77 programs? It is perfectly normal for a compiler for a modern CPU to generate vector instructions if the CPU is capable of executing them. This is true whatever version of the language the program is written in.
If you really don't want to generate vector instructions then you'll have to examine the gfortran documentation carefully -- it's not a compiler I use so I can't point you to specific options or flags. You might want to look at its capabilities for architecture-specific code generation, paying particular attention to SSE level.
You might be able to coerce the compiler into not vectorising loops if all your loops are explicit (so no whole-array operations) and if you make your code hard to vectorise in other ways (dependencies between loop iterations, for example). But a good modern compiler, without interference, is going to try its damnedest to vectorise loops for your own good.
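As a rough illustration of the point about dependencies between loop iterations, the sketch below writes a small fixed-form source file and asks gfortran to report which loops it vectorised. The file name `dep.f` and the routines `INDEP` and `RECUR` are made up for this example; `-fopt-info-vec` is the GCC option that prints a note for each vectorised loop.

```shell
# Write a small fixed-form (F77-style) source file.
# dep.f, INDEP and RECUR are hypothetical names used only for this sketch.
cat > dep.f <<'EOF'
      SUBROUTINE INDEP(N, A, B)
      INTEGER N, I
      REAL A(N), B(N)
C     Iterations are independent of each other.
      DO 10 I = 1, N
         A(I) = A(I) + 2.0*B(I)
   10 CONTINUE
      END

      SUBROUTINE RECUR(N, A, B)
      INTEGER N, I
      REAL A(N), B(N)
C     Each iteration reads the value written by the previous one.
      DO 20 I = 2, N
         A(I) = A(I-1) + B(I)
   20 CONTINUE
      END
EOF

# Compile only (no main program needed) and report vectorised loops on stderr.
if command -v gfortran >/dev/null 2>&1; then
  gfortran -O3 -fopt-info-vec -c dep.f -o dep.o
else
  echo "gfortran not found; dep.f was written but not compiled"
fi
```

One would expect the report to mention the loop in `INDEP` but not the recurrence in `RECUR`, although the exact messages depend on the GCC version and target. Rewriting a loop this way changes what it computes, so it only makes sense when the algorithm really has that dependency.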
It seems rather perverse to me to try to force the compiler to go against its nature. Perhaps you could explain in more detail why you want to do that.
As High Performance Mark wrote, the compiler is free to select machine instructions to implement your source code as long as the results follow the rules of the language. You should not be able to observe any difference in the output values as a result of loop vectorization; your code should simply run faster. So why do you care?
Sometimes differences can be observed across optimization levels, e.g., on some architectures registers have extra precision.
The place to look for these sorts of compiler optimizations is the gcc manual. They are located there since they are common across the gcc compiler suite.
With most modern compilers, the command-line option -O0 should turn off all optimisations, including loop vectorisation.
I have sometimes found that this causes bugs to apparently disappear. However, this usually means that there is something wrong with my code, so if this sort of thing is happening to you then you have almost certainly written a buggy program.
It is theoretically possible, but much less likely, that there is a bug in the compiler. You can easily check this by compiling your code with another Fortran compiler (e.g. gfortran or g95).
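gfortran doesn't auto-vectorize unless you have set -O3 or -ftree-vectorize (more recent GCC releases also enable some vectorization at -O2), so it's easy to avoid vectorization; -fno-tree-vectorize switches it off explicitly. You will probably need to read (skim) the gcc manual as well as the gfortran one.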
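A minimal sketch of how the flags could be compared on one routine, assuming gfortran is on the PATH. The file name `scale.f` and the routine `SCALE` are hypothetical; `-fno-tree-vectorize` and `-fopt-info-vec` are standard GCC options.

```shell
# Write a one-loop fixed-form routine; scale.f and SCALE are made-up names.
cat > scale.f <<'EOF'
      SUBROUTINE SCALE(N, S, X)
      INTEGER N, I
      REAL S, X(N)
      DO 10 I = 1, N
         X(I) = S*X(I)
   10 CONTINUE
      END
EOF

if command -v gfortran >/dev/null 2>&1; then
  # -O3 turns the vectorizer on; -fopt-info-vec reports vectorized loops on stderr.
  gfortran -O3 -fopt-info-vec -c scale.f -o scale_vec.o
  # Same optimization level with the vectorizer explicitly disabled.
  gfortran -O3 -fno-tree-vectorize -fopt-info-vec -c scale.f -o scale_novec.o
else
  echo "gfortran not found; scale.f was written but not compiled"
fi
```

If the first command prints a vectorization note and the second prints none, the flag is doing what is wanted while the other optimizations stay enabled. The exact wording of the report varies between GCC versions.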
Auto-vectorization has been a well-known feature of Fortran compilers for over 35 years, and even the Fortran 77 definition of DO loops was set with this in mind (and also in view of some known non-portable abuses of the F66 standard). You cannot count on turning off vectorization as a way of making incorrect code work, although turning it on might expose symptoms of incorrect code.