How to do inline assembly in C++ (Visual Studio 2010)

Posted 2024-09-01 14:55:31

I'm writing a performance-critical, number-crunching C++ project in which 70% of the run time is spent in a 200-line core module.

I'd like to optimize the core using inline assembly, but I'm completely new to this. I do, however, know some x86 assembly, including the syntaxes used by GCC and NASM.

All I know:

I have to put the assembler instructions in _asm{} where I want them to be.

Problem:

  • I have no clue where to start. What is in which register at the moment my inline assembly comes into play?

Comments (6)

鲸落 2024-09-08 14:55:31

You can access variables by their name and copy them to registers.
Here's an example from MSDN:

int power2( int num, int power )
{
   __asm
   {
      mov eax, num    ; Get first argument
      mov ecx, power  ; Get second argument
      shl eax, cl     ; EAX = EAX * ( 2 to the power of CL )
   }
   // Return with result in EAX
}

The MSDN topic "Using C or C++ in __asm Blocks" may also be of interest to you.
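The MSDN sample above relies on the convention that an int return value is left in EAX, which can trigger a "no return value" warning. Below is a minimal sketch, assuming a 32-bit MSVC build, that copies the result into a C++ local and returns it explicitly. The name power2_checked and the portable fallback branch are assumptions of this sketch, added so the code also builds where __asm is unavailable; they are not part of the MSDN sample.

```cpp
#include <cassert>

// Hypothetical variant of the MSDN power2 sample: returns num * 2^power.
int power2_checked(int num, int power)
{
#if defined(_MSC_VER) && defined(_M_IX86)
    int result;
    __asm
    {
        mov eax, num     ; load the first argument by name
        mov ecx, power   ; load the second argument by name
        shl eax, cl      ; EAX = EAX * ( 2 to the power of CL )
        mov result, eax  ; store into a C++ local instead of relying on EAX
    }
    return result;
#else
    // Portable fallback (an assumption of this sketch, not in the MSDN sample).
    return num << power;
#endif
}
```

For example, power2_checked(3, 5) computes 3 * 32 and returns 96 on either branch.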

仙女 2024-09-08 14:55:31

The Microsoft compiler is very poor at optimisation once inline assembly gets involved. It has to back up registers: if your block uses eax, the compiler will not move its own value to another free register; it will continue using eax. GCC's inline assembler is far more advanced on this front.

To get round this, Microsoft started offering intrinsics. These are a far better way to do your optimisation because they let the compiler work with you. As Chris mentioned, inline assembly also doesn't work under x64 with the MS compiler, so on that platform you REALLY are better off just using the intrinsics.

They are easy to use and give good performance. I will admit I am often able to squeeze a few more cycles out by using an external assembler, but intrinsics are bloody good for the productivity improvement they provide.
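To give a feel for what intrinsics look like, here is a minimal sketch, not taken from the answer, that sums an array of floats four lanes at a time with SSE intrinsics from <xmmintrin.h>. The function name sum_floats and the SKETCH_HAVE_SSE macro are hypothetical; the preprocessor check is an assumption that lets the code fall back to a plain loop on targets without SSE.

```cpp
#include <cassert>
#include <cstddef>

#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>      // SSE intrinsics: __m128, _mm_add_ps, ...
#define SKETCH_HAVE_SSE 1   // hypothetical macro used only in this sketch
#endif

// Hypothetical kernel: returns the sum of n floats.
float sum_floats(const float* data, std::size_t n)
{
    float total = 0.0f;
    std::size_t i = 0;
#ifdef SKETCH_HAVE_SSE
    __m128 acc = _mm_setzero_ps();                       // four partial sums, all zero
    for (; i + 4 <= n; i += 4)
        acc = _mm_add_ps(acc, _mm_loadu_ps(data + i));   // unaligned load of 4 floats
    float lanes[4];
    _mm_storeu_ps(lanes, acc);                           // spill the partial sums
    total = lanes[0] + lanes[1] + lanes[2] + lanes[3];
#endif
    for (; i < n; ++i)                                   // scalar tail (or everything, without SSE)
        total += data[i];
    return total;
}
```

Because the compiler sees ordinary function calls rather than an opaque __asm block, it stays free to allocate registers and schedule the surrounding code, and the same source builds for 32-bit and 64-bit targets.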

靑春怀旧 2024-09-08 14:55:31

You cannot rely on anything in particular being in the registers when the _asm block is executed. You need to move values into the registers yourself. If there is a variable 'a', then you would need to write:

__asm {
  mov eax, [a]
}

It is worth pointing out that VS2010 comes with Microsoft's assembler (MASM). Right-click on a project, go to build rules and turn on the assembler build rules, and the IDE will then process .asm files.

This is a somewhat better solution, as VS2010 supports both 32-bit and 64-bit projects and the __asm keyword does NOT work in 64-bit builds. You MUST use an external assembler for 64-bit code :/

酒浓于脸红 2024-09-08 14:55:31

I prefer writing entire functions in assembly rather than using inline assembly. This allows you to swap out the high level language function with the assembly one during the build process. Also, you don't have to worry about compiler optimizations getting in the way.

Before you write a single line of assembly, print out the assembly language listing for your function. This gives you a foundation to build upon or modify. Another helpful tool is the interweaving of assembly with source code. This will tell you how the compiler is coding specific statements.

If you need to insert inline assembly into a large function, move the code you want to hand-optimize into a new function. Again, swap between the C++ and assembly versions at build time.

These are my suggestions; your mileage may vary (YMMV).
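One way to arrange the build-time swap described above is sketched here, under assumptions: the build switch SKETCH_USE_ASM_KERNEL and the function sum_ints are hypothetical names, and the assembly version is assumed to live in a separate .asm file that exports the same symbol. With the switch undefined, the reference C++ body is compiled instead.

```cpp
#include <cassert>
#include <cstddef>

#ifdef SKETCH_USE_ASM_KERNEL
// Assumed to be defined in a separately assembled .asm file.
// extern "C" keeps the symbol name unmangled so the assembler side can match it.
extern "C" long long sum_ints(const int* data, std::size_t n);
#else
// Reference C++ version with the same signature and linkage.
extern "C" long long sum_ints(const int* data, std::size_t n)
{
    long long total = 0;
    for (std::size_t i = 0; i < n; ++i)
        total += data[i];
    return total;
}
#endif
```

Callers see a single declaration either way, so the C++ version doubles as a correctness reference for testing the hand-written one.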

忆依然 2024-09-08 14:55:31

Go for the low hanging fruit first...

As others have said, the Microsoft compiler is pretty poor at optimisation. You may be able to save yourself a lot of effort just by investing in a decent compiler, such as Intel's ICC, and re-compiling the code "as is". You can get a 30-day free evaluation license from Intel and try it out.

Also, if you have the option to build a 64-bit executable, then running in 64-bit mode can yield a performance improvement on the order of 30% for some code, thanks to the doubling of the number of available general-purpose registers (from 8 to 16).

信仰 2024-09-08 14:55:31

I really like assembly, so I'm not going to be a nay-sayer here. It appears that you've profiled your code and found the 'hotspot', which is the correct way to start. I also assume that the 200 lines in question don't use a lot of high-level constructs like vector.

I do have to give one bit of warning: if the number-crunching involves floating-point math, you are in for a world of pain, specifically a whole set of specialized instructions, and a college term's worth of algorithmic study.

All that said: if I were you, I'd step through the code in question in the VS debugger, using the Disassembly view. If you feel comfortable reading the code as you go along, that's a good sign. After that, do a Release compile (Debug turns off optimization) and generate an ASM listing for that module. Then if you think you see room for improvement...you have a place to start. Other people's answers have linked to the MSDN documentation, which is really pretty skimpy but still a reasonable start.
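As a sketch of the listing step, the small stand-in kernel below can be compiled with the commands shown in its header comment to see what the optimizer already produces. The file name kernel.cpp and the function dot are hypothetical; the real 200-line core would take their place.

```cpp
// kernel.cpp (hypothetical file name)
// Produce an assembly listing for this translation unit in an optimized build:
//   MSVC:      cl /c /O2 /FAs kernel.cpp   (writes kernel.asm with source lines interleaved)
//   GCC/Clang: g++ -O2 -S kernel.cpp       (writes kernel.s)
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the hot loop: dot product of two arrays of length n.
double dot(const double* a, const double* b, std::size_t n)
{
    double acc = 0.0;
    for (std::size_t i = 0; i < n; ++i)
        acc += a[i] * b[i];
    return acc;
}
```

Reading the generated listing for a loop this small is a low-risk way to judge whether the compiler's output leaves any room for hand-written assembly.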
