Can a C compiler prefetch data across function calls?
Can a good C compiler, with high optimization enabled, insert prefetches into the code and place them before a function call? For example:
struct abc_t *abc;   /* struct tag is a placeholder */
//...
function_first(&(abc->field1));
abc->field2 = abc->field3 + abc->field4 + abc->field5 + ...;
// a lot of work on struct fields
function_second(&(abc->field1));
That is, after optimization, can the compiled code contain prefetches for the fields of abc, hoisted above the function_first() call, like this:
struct abc_t *abc;   /* struct tag is a placeholder */
//...
__prefetch(&abc->field2);   /* prefetch takes an address */
__prefetch(&abc->field5);
function_first(&(abc->field1));
abc->field2 = abc->field3 + abc->field4 + abc->field5 + ...;
// a lot of work on struct fields
function_second(&(abc->field1));
The function function_first() can be annotated as clean (it has no side effects on the fields of abc other than field1), or the program can be compiled with whole-program optimization (-ipo or /Qipo for Intel), so that the compiler can check what function_first does.
UPDATE: Without the calls, prefetching is possible; this question is specifically about mixing calls and prefetches.
Thanks.
Comments (1)
Yes, Intel's ICC compiler can do this (*). Whether it actually makes any difference to performance is debatable, though.
(*) See the -opt-prefetch=n switch.
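If the compiler does not hoist the prefetches on its own, they can be written by hand. Below is a minimal sketch of that, assuming GCC or Clang, whose __builtin_prefetch(addr, rw, locality) builtin stands in for the question's __prefetch. The struct layout, the bodies of function_first() and function_second(), and the process() wrapper are all hypothetical, since the question does not show them.

```c
#include <stddef.h>

/* Hypothetical layout: the question never shows the struct definition. */
struct abc_t {
    int field1;
    int field2;
    int field3;
    int field4;
    int field5;
};

/* Stand-in for function_first(): touches only the field it is given. */
static void function_first(int *field1) { *field1 += 1; }

/* Stand-in for function_second(). */
static void function_second(int *field1) { *field1 *= 2; }

/* Hypothetical wrapper around the sequence from the question. */
static int process(struct abc_t *abc)
{
    /* Prefetches are hints only; they never change program results. */
    __builtin_prefetch(&abc->field2, 1, 3); /* rw=1: will be written */
    __builtin_prefetch(&abc->field5, 0, 3); /* rw=0: will be read */

    function_first(&abc->field1);
    abc->field2 = abc->field3 + abc->field4 + abc->field5;
    function_second(&abc->field1);
    return abc->field2;
}
```

Because a prefetch is only a hint, moving it above the call is always legal as far as program semantics go; whether it helps depends on how long function_first() runs and on whether the fields are already cached. In a struct this small the fields most likely share one cache line, so a single prefetch would probably be enough.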