编译器优化问题

发布于 2024-07-19 09:17:32 字数 110 浏览 8 评论 0原文

  1. 编译器消除重复的子表达式重新计算的方法有哪些? 你如何跟踪子表达式? 以及如何识别重复的?
  2. 除了使用按位运算符之外,常见编译器还使用哪些强度降低技术?
  1. What are some of the ways a compiler eliminates repeated subexpressions recomputations? How do you keep track of the sub-expressions? And How do you identify the repeated ones?
  2. Besides the usage of bitwise operators, what are some of the strength reduction techniques common compilers use?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(5

待天淡蓝洁白时 2024-07-26 09:17:32

对于 1,您正在寻找的优化的名称是公共子表达式消除 (CSE)。 根据您的表现,这可能相当容易。 通常,编译器会有一些程序的中间表示,其中操作被尽可能地分解和线性化。 例如,表达式 c = a * b + a * b 可能会被分解为:

v1 = a * b
v2 = a * b
c = v1 + v2

因此,您可以通过查找具有相同运算符和操作数的操作来在非常低的级别上执行 CSE。 当您遇到重复项(在本例中为 v2)时,您可以将其所有实例替换为原始项。 因此,我们可以将上面的代码简化为:

v1 = a * b
c = v1 + v1

这通常假设您只为每个变量分配一次(单个静态分配形式),但您可以在没有该限制的情况下实现类似的东西。 当您尝试跨分支执行此优化时,这会变得更加复杂。 正如 Zifre 提到的,研究部分冗余消除。

无论哪种方式,您都会获得一些基本的改进,并且您需要跟踪的只是基本表达式。 您可能想更进一步并寻找算术恒等式。 例如,a * bb * a 相同。 另外,x * (y + z) = x * y + x * z。 这使您的优化变得更加复杂,并且尚不清楚它是否会给您带来如此多的性能改进。 有趣的是,CSE 优化的大部分好处来自地址计算(例如数组访问),并且您不需要像上面那样的复杂身份。

对于 2,什么强度降低是有用的实际上取决于您编译的架构。 通常这仅涉及将乘法和除法转换为移位、加法和减法。

For 1, The name of the optimization you're looking for is common subexpression elimination (CSE). Depending on your representation, this can be fairly easy. Usually, a compiler will have some intermediate representation of a program where operations are broken down as much as possible and linearized. So for example, the expression c = a * b + a * b might be broken down as:

v1 = a * b
v2 = a * b
c = v1 + v2

So you could do CSE at a very low level by looking for operations with the same operator and operands. When you encounter a duplicate (v2 in this case), you replace all instances of it with the original. So we could simplify the code above to be:

v1 = a * b
c = v1 + v1

This generally assumes that you only assign each variable once (single static assignment form), but you can implement something like this without that restriction. This gets more complicated when you try and perform this optimization across branches. As Zifre mentions, look into Partial Redundancy Elimination.

Either way, you get some basic improvement, and all you need to keep track of are basic expressions. You may want to take this a step further and look for arithmetic identities. For instance, a * b is the same as b * a. Also, x * (y + z) = x * y + x * z. This makes your optimization more complicated, and it's not clear that it would give you that much performance improvement. Anecdotally, most of the benefit from a CSE optimization comes from address computations like array accesses, and you won't need complicated identities like the ones above.

For 2, what strength reductions are useful really depends on the architecture you compile for. Usually this just involves transforming multiplications and divisions into shifts, additions, and subtractions.

猫七 2024-07-26 09:17:32

我强烈推荐关于这些主题的两本印刷参考资料:

  1. 高级编译器设计和编译器设计 实现,作者:Steven S. Muchnick
  2. 构建优化编译器,作者:Robert Morgan

Muchnick 的书比较正式,但可读性很强,并且对所有重要的优化技术都有很好的描述。 摩根的书具有更多的实践感觉,并且将成为专注于优化技术的编译器项目的良好基础。 这两本书都没有太多关于词法分析或语法分析的内容,假设您了解这些主题。

I would highly recommend two printed references on these subjects:

  1. Advanced Compiler Design & Implementation by Steven S. Muchnick
  2. Building an Optimizing Compiler by Robert Morgan

The Muchnick book is on the formal side but is very readable and has good descriptions of all of the important optimization techniques. The Morgan book has a much more hands-on feel and would be a great basis for a compiler project focused on optimization techniques. Neither book has much to say about lexical analysis or parsing, knowledge of these subjects is assumed.

荒岛晴空 2024-07-26 09:17:32
  1. 我相信很多编译器都使用SSAPRE(静态单赋值部分冗余消除)来消除重复的表达式。 这要求代码采用SSA 形式,允许更多优化。

  2. 我对此不太确定,但请查看此 LLVM 通行证列表LLVM 是一种针对编译器的优化 IR,其速度通常比 GCC 还要快。 每个通道都有一个小解释。 如果您需要更多信息,请查看这些通道的 LLVM 源代码。 它是用 C++ 编写的,但非常干净且易于理解。

编辑:顺便说一句,如果您正在开发编译器,我强烈推荐 LLVM,它非常易于使用并生成高度优化的代码。

  1. I believe many compilers use SSAPRE (Static Single Assignment Partial Redundancy Elimination) to eliminate repeated expressions. This requires the code to be in SSA form, allowing many more optimizations.

  2. I'm not really sure about this one, but look at this list of LLVM passes. LLVM is an optimizing IR for compilers that is often faster than even GCC. There is a small explanation of each pass. If you need more info, look at the LLVM source for these passes. It is written in C++ but is quite clean and understandable.

Edit: By the way, if you're developing a compiler, I highly recommend LLVM, it is very easy to use and generates highly optimized code.

那小子欠揍 2024-07-26 09:17:32

要将另一本书添加到推荐列表中,请查看“黑客之乐” 亨利·沃伦 (Henry S. Warren)。 它是优化常见运算(例如将整数除法转换为乘法)的技术的重要概要。

To add one more book to the list of recommendations, check out "Hacker's Delight" by Henry S. Warren. It's a great compendium of techniques for optimizing common operations, like transforming integer divisions into multiplications.

等风也等你 2024-07-26 09:17:32

您正在寻找部分冗余消除(PRE)。 CSE(来自其他答案)和循环不变代码运动都包含在 PRE 中。 (PRE 的一个变体是 Lazy Code Motion,我认为这是最佳的)。

查看 Keith Cooper 的讲义,其中似乎描述了技术非常好。

不要不要使用 SSAPRE。 AFAIK,这需要一种称为 HSSA 的特殊形式的 SSA,它有一些缺点:

  • 它非常复杂
  • 它需要全局值编号(因此 SSAPRE 不提供值编号,因为它预计已经存在)。
  • 如果您的语言不支持指向堆栈变量的指针,它不会提供任何内容(如果支持,请停止编写自己的分析并使用 LLVM 或 gcc)。
  • gcc 使用 HSSA 有一段时间了,但他们已经放弃它了。
  • LLVM 尝试过它,但据我所知他们不再使用它了。

编辑:

穆奇尼克的书有详细的描述,它链接在另一个答案中。

You're looking for partial-redundancy elimination (PRE). Both CSE (from the other answers) and loop-invariant code motion are subsumed by PRE. (A variation of PRE is Lazy Code Motion, which I believe is optimal).

Check out Keith Cooper's lecture notes, which seem to describe the techniques very well.

Do NOT use SSAPRE. AFAIK, this requires a particular form of SSA known as HSSA, which has a few downsides:

  • Its pretty complicated
  • It requires global value numbering (and so SSAPRE doesn't provide value numbering, as its expected to exist already).
  • It doesn't provide anything if your language doesn't support pointers to stack variables (and if it does, stop writing your own analysis and use LLVM or gcc).
  • gcc used HSSA for a while, but they have moved away from it.
  • LLVM experimented with it, but AFAIK they don't use it anymore.

EDIT:

Muchnick's book has a detailed description, its linked in another answer.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文