Can this concept be optimized with OpenMP?
I'd rather not use code since it's a common concept:
Say we have the scenario of a function which is neither too big nor too small, and which also can't easily be optimized in itself with OpenMP for-loop optimizations.
However, it is a function which is called millions of times throughout the project's run, from a few hundred unrelated places in the code.
[`inline` in itself doesn't seem to do much (it's on by default in optimized gcc builds), and turning the function into a macro, while not parallelization either, would be an undertaking to make compatible.]
2 Answers
OpenMP is for "making things run in parallel" in general, not only `for` loops... Well, you don't even need to have any loops at all to make good use of OpenMP and speed up your code. The only thing that matters is: "do I have several independent operations which run one after another, and which could work at the same time instead?" If so, then you've found an easy spot for optimization with OpenMP.
When the function is called, is it called multiple times, particularly in a loop? The question is a little vague -- maybe yes (it's called thousands of times in each of a few hundred unrelated places -> millions) or maybe no (it's called once in each of a hundred unrelated places, and you hit those sections of code thousands of times -> millions).
In the first case, then yes: parallelizing the `map' -- that is, applying the function independently to a bunch of cases -- is easy, and OpenMPs very well.
In the second case, if the function is called a million times but only once each time, then no. There's repetition of execution there, but no exposed concurrency; there's no list of pending tasks that could be done independently and at the same time. All you can do there, if the function is likely to be called with repeated parameters, is use memoization, which is a memory/compute-time tradeoff, not a parallelization technique.
In the second case, it may be that you can restructure the code so that a bunch of those function calls are made at once, thus exposing the concurrency and allowing parallelization -- but it's not something that OpenMP (or any parallel programming model) can do for you automatically.