Testing a code-generator optimization

Posted 2024-11-08 17:47:28

I have written a low-level optimization for the LLVM code-generator backend. Basically, the optimization will reorder assembly instructions at the basic block level to allow a later (existing) optimization to more efficiently optimize the resultant code. There are a number of test cases I'd like to verify, and I'd like some suggestions for the testing process, as this is the first time I've attempted something like this.

Things I've considered so far:

Compile benchmarks written in C and examine the resulting ASM generated using the -S option. I have done this, and compared the results with my optimization to the original results. This method allows me to see that my optimization works, but even if I write custom non-executable C files I will not be able to examine all of my desired instruction ordering test cases.
Compile benchmarks to LLVM assembly, edit that, then lower the ASM down to the target machine assembly. This may work, but because of the different level of abstraction between LLVM and target ASM, I doubt that I'd be able to examine all the test cases by hacking at the LLVM ASM until it generates what I want it to.
Use the target ASM test cases as input to LLVM and recompile using the new optimization. I was unable to find an option for either LLVM or gcc (most of whose options LLVM accepts) to accept ASM as an input.

What is a good strategy for testing specific ASM test cases when validating a low-level ASM compiler optimization? Does LLVM (or gcc) have some command line options that would make this process easier?

Edit: To clarify, I'm not asking about automatically generating ASM test cases; my problem is that I have those test cases (e.g., ASM_before.s and reference_ASM_after.s) but I need to be able to pass ASM_before.s into LLVM and ensure that the optimized output ASM_after.s matches known good reference_ASM_after.s. I'm looking for a way to do this without having to "decompile" ASM_before.s into a high-level language and then compile it (with the optimization) down to ASM_after.s.
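The comparison half of this workflow (checking ASM_after.s against reference_ASM_after.s) can be sketched independently of how the pass is run. The Python sketch below is only an illustration: the function names `normalize_asm` and `asm_diff` are hypothetical, and the comment prefix is an assumption that depends on the target assembler. It does not address feeding ASM_before.s into LLVM.

```python
import difflib


def normalize_asm(text, comment_prefix="#"):
    """Return the significant lines of an assembly listing.

    Assumption: everything after comment_prefix is a comment. This is
    naive (it would misfire on '#' inside string literals or on targets
    where '#' marks an immediate), so pick the prefix for your target.
    """
    lines = []
    for raw in text.splitlines():
        code = raw.split(comment_prefix, 1)[0]
        code = " ".join(code.split())  # collapse runs of whitespace
        if code:                       # drop blank and comment-only lines
            lines.append(code)
    return lines


def asm_diff(actual, reference, comment_prefix="#"):
    """Return a unified diff as a list of lines; empty when listings match."""
    ref_lines = normalize_asm(reference, comment_prefix)
    act_lines = normalize_asm(actual, comment_prefix)
    return list(
        difflib.unified_diff(ref_lines, act_lines, "reference", "actual", lineterm="")
    )
```

An empty result means the two listings agree after normalization; a non-empty result shows exactly which instructions ended up in a different order, which is the property an instruction-reordering pass needs checked.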

Answer by 寄离, 2024-11-15 17:47:28

Benchmarking is a slippery slope: you can come up with a benchmark that makes any language or tool look good or bad, depending on what you are trying to prove.

First off, I normally work on ARM platforms with no operating system, so timing execution is quite simple, sometimes down to the clock cycle (plus or minus one), when comparing compilers or options.

On platforms with a cache, things get worse. If you add or remove nops in your startup code, the whole program moves in memory and everything changes its cache alignment. Without any change to the compiler's optimizations, you can sometimes see larger performance differences from the cache than from differences in compiler or backend optimizations.
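One way to keep layout effects from being mistaken for an optimization win is to time each build under several layouts and compare the ranges. The sketch below assumes you have already collected cycle counts yourself; the function name and the strict "worst optimized run beats best baseline run" rule are illustrative choices, not something the answer prescribes.

```python
def speedup_exceeds_layout_noise(baseline_cycles, optimized_cycles):
    """Compare two builds timed under several code layouts.

    Each argument is a list of cycle counts, one per layout (for example,
    one per number of nops padded into the startup code). The change only
    counts as a win if the slowest optimized run still beats the fastest
    baseline run, so layout effects alone cannot explain the difference.
    """
    if not baseline_cycles or not optimized_cycles:
        raise ValueError("need at least one measurement per build")
    return max(optimized_cycles) < min(baseline_cycles)
```

The rule is deliberately conservative: overlapping ranges are reported as "not proven", which is the safer answer when cache alignment can swing results by more than the optimization does.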

I normally run Dhrystone, but I don't declare victory or failure based on that alone. You might want to run Whetstone as well if you use floating point, or run Whetstone against a soft FPU.

As already mentioned by someone above, self-checking tests are a good idea, and so is real-world code. Take compression routines, for example: take some text (perhaps a portion of a book from Project Gutenberg), compress it, decompress it, and compare the output to the input. You can add an extra validation by compressing the text on a control platform such as your host and hardcoding the compressed size into the test; if the compressed version under test does not match that size, the test fails even if the round trip produces the right output. I have also used the JPEG library to convert images to and from JPEG. If the image is not expected to return to its original state because the compression is lossy, you can do a single conversion and then checksum the result, verify its size, or carry a copy of the expected output and compare against it. AES and DES encryption and decryption work as well.
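The shape of such a self-checking round-trip test is sketched below in Python, using a toy run-length codec purely for illustration. In an actual compiler test the codec would be C code built with the compiler under test, and `EXPECTED_SIZE` would come from a build on the trusted control platform; all names here are hypothetical.

```python
def rle_compress(data: bytes) -> bytes:
    """Toy run-length encoder: emits (count, byte) pairs, count in 1..255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        out += bytes((run, data[i]))
        i += run
    return bytes(out)


def rle_decompress(blob: bytes) -> bytes:
    """Inverse of rle_compress: expand each (count, byte) pair."""
    out = bytearray()
    for k in range(0, len(blob), 2):
        out += bytes([blob[k + 1]]) * blob[k]
    return bytes(out)


def self_check(text: bytes, expected_size: int) -> bool:
    """Pass only if the compressed size matches AND the round trip is exact."""
    packed = rle_compress(text)
    if len(packed) != expected_size:
        return False  # wrong size fails even if decompression would succeed
    return rle_decompress(packed) == text


SAMPLE = b"aaaabbbcccd"   # four runs: a*4, b*3, c*3, d*1
EXPECTED_SIZE = 8         # four (count, byte) pairs, recorded on the control platform
```

Checking the size before the round trip catches miscompilations that happen to cancel out between compressor and decompressor, which a plain input-versus-output comparison would miss.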

There are volumes of open source projects that you can build with your modified compiler and compare against the stock compiler or other compilers. Being real-world code, it is the kind of thing your compiler will be used on anyway. Note that Tom's Hardware and other benchmark sites run many different benchmarks: the time it takes to render something, to compile gcc or Linux, or to perform a database search, all real-world applications. The various applications get various scores, and it is very rare for one platform or solution to sweep the whole battery of tests.

When your performance drops as you make changes, that is the time to examine the assembler output and try to figure out why. Remember what Michael Abrash (and others) said: no matter how good you think your assembler is, you still have to time it. Also try crazy things that you are sure are going to be slow, because sometimes they turn out to be fast for reasons you never thought of.
