Is there a performance penalty for passing a unique_ptr by value compared to a plain pointer?

Posted on 2025-01-28 14:29:10

Common wisdom is that std::unique_ptr does not introduce a performance penalty (and no memory penalty when no custom deleter is used), but I recently stumbled over a discussion showing that it actually introduces an additional indirection, because the unique_ptr cannot be passed in a register on platforms that use the Itanium ABI. The example posted was similar to:

#include <memory>

int foo(std::unique_ptr<int> u) {
    return *u;
}

int boo(int* i) {
    return *i;
}

This generates an additional assembly instruction in foo compared to boo:

foo(std::unique_ptr<int, std::default_delete<int> >):
        mov     rax, QWORD PTR [rdi]
        mov     eax, DWORD PTR [rax]
        ret
boo(int*):
        mov     eax, DWORD PTR [rdi]
        ret

The explanation was that the Itanium ABI requires that the unique_ptr not be passed in a register because of its non-trivial constructor, so it is created on the stack and the address of that object is passed in a register instead.
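The property the ABI keys on can be probed with type traits. The following is a minimal sketch, assuming a C++11 or later compiler; the helper name roughly_trivial_for_calls is hypothetical, and it only approximates the ABI's "non-trivial for the purposes of calls" rule rather than reproducing its exact definition.

```cpp
#include <memory>
#include <type_traits>

// Hypothetical helper (not part of any library): approximates the Itanium
// C++ ABI notion of "trivial for the purposes of calls". The real rule is
// stated in terms of the copy/move constructors and the destructor; this
// trait-based check is only an approximation of it.
template <typename T>
constexpr bool roughly_trivial_for_calls() {
    return std::is_trivially_destructible<T>::value
        && std::is_trivially_move_constructible<T>::value
        && (!std::is_copy_constructible<T>::value
            || std::is_trivially_copy_constructible<T>::value);
}

// A raw pointer is a scalar: nothing has to run when it is copied or destroyed.
static_assert(roughly_trivial_for_calls<int*>(),
              "int* should look trivial for calls");

// unique_ptr has a user-provided move constructor and destructor, so the
// approximation classifies it as non-trivial.
static_assert(!roughly_trivial_for_calls<std::unique_ptr<int>>(),
              "unique_ptr<int> should look non-trivial for calls");
```

Both assertions are evaluated at compile time, so the block compiling at all is the result.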

I know that this does not really impact performance on a modern PC platform, but I am wondering if somebody could provide more details on the reasons why it shall not be copied to a register. Since zero-cost abstractions are one of the major goals of C++, I am wondering if this has been discussed in the standardization process as an accepted deviation or if it is a quality of implementation issue. The performance penalty is certainly small enough when considering the benefits, especially on modern PC platforms.

Commenters have pointed out that the two functions are not fully equivalent and thus the comparison is flawed, since foo will also call the deleter on the unique_ptr parameter but boo does not release the memory. However, I was only interested in the difference resulting from passing a unique_ptr by value compared to passing a plain pointer. I've modified the example code and included a call to delete to free the plain pointer; the call is in the caller because the unique_ptr's deleter also gets called in the caller's context, which makes the generated code more comparable. In addition, the manual delete also checks ptr != nullptr because the destructor does this too. Still, foo does not receive the parameter in a register and has to do an indirect access.
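The modified example itself is not reproduced in this post. One plausible shape for it, as a sketch only, is shown here; the caller names call_foo and call_boo are assumptions made for illustration and are not taken from the original code.

```cpp
#include <cassert>
#include <memory>

int foo(std::unique_ptr<int> u) {
    return *u;
}

int boo(int* i) {
    return *i;
}

// Hypothetical caller: the temporary unique_ptr lives in the caller's frame,
// and under the Itanium ABI its destructor runs here, after foo returns.
int call_foo(int value) {
    return foo(std::unique_ptr<int>(new int(value)));
}

// Hypothetical caller: mirrors what the unique_ptr destructor does, namely
// a null check followed by delete, performed in the caller.
int call_boo(int value) {
    int* p = new int(value);
    int result = boo(p);
    if (p != nullptr) {
        delete p;
    }
    return result;
}
```

Comparing the generated code for call_foo and call_boo in a compiler explorer is one way to isolate the parameter-passing difference from the cleanup cost.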

I also wonder why the compiler does not elide the check for nullptr before calling operator delete, since deleting a null pointer is defined to be a no-op anyway. I guess that unique_ptr could be specialized for the default deleter to not perform the check in the destructor, but that would be a very small micro-optimization.
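One reason the check exists in the generic template is that the unique_ptr destructor only invokes the deleter when the stored pointer is non-null, so a custom deleter never sees nullptr. A minimal sketch of that behavior, where CountingDeleter and deleter_calls_for are hypothetical names introduced for illustration:

```cpp
#include <cassert>
#include <memory>

// Hypothetical deleter that counts how often it is invoked.
struct CountingDeleter {
    int* calls;
    void operator()(int* p) const {
        ++*calls;
        delete p;
    }
};

// Wraps `raw` in a unique_ptr, lets it go out of scope, and reports how many
// times the deleter ran.
int deleter_calls_for(int* raw) {
    int calls = 0;
    {
        std::unique_ptr<int, CountingDeleter> u(raw, CountingDeleter{&calls});
    }  // destructor runs here; the deleter is skipped if u holds nullptr
    return calls;
}
```

With the default deleter the skipped call would have been a plain delete, which is why the check looks redundant in that specific case.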

Comments (1)

冰雪梦之恋 2025-02-04 14:29:10

The System V ABI uses the Itanium C++ ABI and refers to it. In particular, the Itanium C++ ABI specifies that:

If the parameter type is non-trivial for the purposes of calls, the
caller must allocate space for a temporary and pass that temporary by
reference.

Specifically:

...

If the type has a non-trivial destructor, the caller calls that destructor after control returns to it (including when the caller throws an exception), at the end of enclosing full-expression.

So a simple answer to the question "why is it not passed in a register" is "because it can't be".

Now, an interesting question might be "why did the Itanium C++ ABI decide to go with that?"

While I wouldn't claim to have intimate knowledge of the rationale, two things come to mind:

  • This allows for copy elision if the argument to the function is a temporary.
  • This makes tail-call optimization (TCO) more powerful. If the callee had to call the destructors of its arguments, TCO wouldn't be possible for any function that accepts non-trivial arguments.
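The copy-elision point can be observed with a small tracing type. This is a sketch under stated assumptions: Tracer, Counts, callee and trace_call are hypothetical names, and the zero-move result is guaranteed only from C++17 on (earlier standards permit the elision, and common compilers perform it by default).

```cpp
#include <cassert>

// Hypothetical bookkeeping for the lifetime events of Tracer objects.
struct Counts {
    int constructed = 0;
    int moved = 0;
    int destroyed = 0;
};

Counts g_counts;

// Hypothetical non-trivial type: every special member records that it ran.
struct Tracer {
    Tracer() { ++g_counts.constructed; }
    Tracer(Tracer&&) noexcept { ++g_counts.moved; }
    ~Tracer() { ++g_counts.destroyed; }
};

int callee(Tracer) {
    return 42;
}

// Passes a temporary by value and reports what happened to it. The temporary
// is built directly in the parameter slot the caller allocated, so no move is
// needed; by the time the statement ends the parameter has been destroyed.
Counts trace_call() {
    g_counts = Counts{};
    int result = callee(Tracer{});
    (void)result;
    return g_counts;
}
```

Whether the destructor runs in the caller at the end of the full-expression (as the Itanium ABI specifies) or in the callee is implementation-defined, so the sketch only counts events after the whole statement has finished.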