Loop unrolling and loop tiling

Posted 2024-10-27 01:39:51

Can someone please tell me whether these two optimization techniques are the same or different?

Also, is it the responsibility of the programmer or the compiler to apply them?

Comments (2)

冷清清 2024-11-03 01:39:51

The two techniques are different. See the descriptions of loop unrolling and loop tiling.

Loop unrolling removes the overhead of looping, that is, the counter update and the branch back to the top of the loop. It is (usually) only worthwhile for fairly small loops whose iteration count is small and known at compile time. It is mostly done by the compiler.
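As a minimal sketch of what that means, here is a four-iteration loop next to its fully unrolled form. The function names `dot4_rolled` and `dot4_unrolled` are made up for illustration, and an optimising compiler would very likely produce the second form from the first on its own.

```c
#include <stddef.h>

/* Rolled form: the counter is incremented and tested on every iteration. */
float dot4_rolled(const float a[4], const float b[4])
{
    float sum = 0.0f;
    for (size_t i = 0; i < 4; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}

/* Fully unrolled form: the same additions in the same order,
 * with no loop counter and no branch. */
float dot4_unrolled(const float a[4], const float b[4])
{
    float sum = 0.0f;
    sum += a[0] * b[0];
    sum += a[1] * b[1];
    sum += a[2] * b[2];
    sum += a[3] * b[3];
    return sum;
}
```

Both functions perform the same arithmetic in the same order, so they return the same value; only the loop bookkeeping differs.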

In the past, when computers were slower and compilers more primitive, programmers unrolled loops by hand. Today it would be unusual for a programmer to do so, except possibly on a very constrained embedded system.

Loop tiling is commonly applied to very large data sets. The aim is to load a block of data into cache memory and perform all operations on it before bringing in new data.

Depending on the operations being performed and the internal organisation of the data, a simple loop might jump between different data pages, causing a lot of cache misses (and page loads). Careful planning of the order of execution can significantly improve run-times for certain problems.
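A matrix transpose is a common illustration of this. In the sketch below, the naive version writes `dst` with a stride of `n` elements, so for large `n` almost every write touches a different cache line. The tiled version visits the matrix one `TILE` x `TILE` block at a time. The function names and the tile size of 32 are assumptions for illustration; a sensible tile size depends on the target cache and should be measured.

```c
#include <stddef.h>

#define TILE 32 /* assumed tile edge; tune for the target cache */

/* Naive transpose: src is read sequentially, but dst is written
 * with a stride of n elements. */
void transpose_naive(size_t n, const double *src, double *dst)
{
    for (size_t i = 0; i < n; ++i)
        for (size_t j = 0; j < n; ++j)
            dst[j * n + i] = src[i * n + j];
}

/* Tiled transpose: the two outer loops pick a block, the two inner
 * loops transpose that block before moving on to the next one. */
void transpose_tiled(size_t n, const double *src, double *dst)
{
    for (size_t ii = 0; ii < n; ii += TILE) {
        for (size_t jj = 0; jj < n; jj += TILE) {
            /* Clamp the block so n need not be a multiple of TILE. */
            size_t i_end = (ii + TILE < n) ? ii + TILE : n;
            size_t j_end = (jj + TILE < n) ? jj + TILE : n;
            for (size_t i = ii; i < i_end; ++i)
                for (size_t j = jj; j < j_end; ++j)
                    dst[j * n + i] = src[i * n + j];
        }
    }
}
```

Both functions produce the same result; only the order in which elements are visited changes. Whether the tiled version is actually faster depends on the matrix size and the hardware, so it is worth profiling both.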

Although a compiler may perform loop tiling, there are times when the programmer does it manually and possibly does a better job than the compiler.

In general, don't attempt these kinds of optimisation: they add a lot of complexity (and bugs) to the code and usually provide only modest performance gains. However, if your code is slow and profiling points to a particular kind of bottleneck, then something like loop tiling should be considered and may lead to large performance gains.

于我来说 2024-11-03 01:39:51

These are two totally different performance optimisations.

Loop unrolling is a code optimisation where the loop body is replicated and the total number of loop iterations is reduced. The benefits are reduced loop overhead (normally only relevant for very small loops) and better instruction scheduling, with fewer dependency stalls on superscalar CPUs. It can be done manually, by the compiler, or both.
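The scheduling benefit can be sketched with a sum that is unrolled by four and uses four independent accumulators, so that consecutive additions do not all wait on the same running total. The function name `sum_unrolled4` and the unroll factor are assumptions for illustration. Note that this changes the order of floating-point additions, so the result can differ in the last bits from a plain left-to-right sum, which is one reason a compiler will not always make this transformation by default.

```c
#include <stddef.h>

/* Sum of n doubles, unrolled by 4 with independent accumulators. */
double sum_unrolled4(const double *x, size_t n)
{
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    size_t i = 0;

    /* Main loop: four elements per iteration, one loop test per four adds.
     * The four accumulators form four separate dependency chains. */
    for (; i + 4 <= n; i += 4) {
        s0 += x[i];
        s1 += x[i + 1];
        s2 += x[i + 2];
        s3 += x[i + 3];
    }

    /* Remainder loop: handles the last n % 4 elements. */
    for (; i < n; ++i) {
        s0 += x[i];
    }

    return (s0 + s1) + (s2 + s3);
}
```

The remainder loop is the usual price of partial unrolling: the iteration count is rarely a multiple of the unroll factor, and forgetting the leftover elements is an easy bug to introduce.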

Tiling is a memory optimisation which aims to make better use of cache by processing tiles (small blocks within a larger data structure), typically in the context of an image or other 2D data structure. This is normally implemented at the source code level, as part of the overall design of an algorithm implementation.
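Matrix multiplication is a standard case where tiling is designed into the algorithm. The sketch below computes `C += A * B` for square row-major matrices one block at a time, so that each block of `B` is reused while it is still likely to be in cache. The names `matmul_tiled` and `min_size` and the block size of 64 are assumptions for illustration, not tuned values.

```c
#include <stddef.h>

#define BLOCK 64 /* assumed block edge; tune for the target cache */

static size_t min_size(size_t a, size_t b)
{
    return a < b ? a : b;
}

/* C += A * B for n x n row-major matrices.
 * The caller is responsible for initialising C (for example to zero). */
void matmul_tiled(size_t n, const double *A, const double *B, double *C)
{
    for (size_t ii = 0; ii < n; ii += BLOCK) {
        for (size_t kk = 0; kk < n; kk += BLOCK) {
            for (size_t jj = 0; jj < n; jj += BLOCK) {
                /* Clamp block bounds so n need not be a multiple of BLOCK. */
                size_t i_end = min_size(ii + BLOCK, n);
                size_t k_end = min_size(kk + BLOCK, n);
                size_t j_end = min_size(jj + BLOCK, n);

                /* Multiply one block of A by one block of B. */
                for (size_t i = ii; i < i_end; ++i) {
                    for (size_t k = kk; k < k_end; ++k) {
                        double a = A[i * n + k];
                        for (size_t j = jj; j < j_end; ++j) {
                            C[i * n + j] += a * B[k * n + j];
                        }
                    }
                }
            }
        }
    }
}
```

The three outer loops exist only to choose blocks; the arithmetic is the same as in an untiled triple loop, which is why this kind of restructuring is usually decided when the algorithm is designed rather than bolted on afterwards.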
