Is there a performance penalty for passing a unique_ptr by value compared to a plain pointer?

Posted on 2025-01-28 14:29:10

Common wisdom is that std::unique_ptr introduces no performance penalty (and no memory penalty when no custom deleter is used), but I recently stumbled over a discussion showing that it actually introduces an additional indirection, because a unique_ptr cannot be passed in a register on platforms that use the Itanium ABI. The example posted was similar to this:

#include <memory>

int foo(std::unique_ptr<int> u) {
    return *u;
}

int boo(int* i) {
    return *i;
}

This generates one additional assembly instruction in foo compared to boo:

foo(std::unique_ptr<int, std::default_delete<int> >):
        mov     rax, QWORD PTR [rdi]
        mov     eax, DWORD PTR [rax]
        ret
boo(int*):
        mov     eax, DWORD PTR [rdi]
        ret

The explanation was that the Itanium ABI demands that the unique_ptr not be passed in a register because of its non-trivial constructor, so it is created on the stack and the address of that object is passed in a register instead.
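
The property this explanation relies on can be checked with type traits. The following is a minimal sketch, not part of the original discussion. It assumes a mainstream standard library (libstdc++, libc++ or MSVC's); the final size check in particular is an assumption about those implementations and is not guaranteed by the standard. As far as the Itanium C++ ABI is concerned, the relevant question appears to be whether the copy constructor, move constructor or destructor is non-trivial, rather than constructors in general.

```cpp
#include <memory>
#include <type_traits>

// A raw pointer is trivially copyable and trivially destructible,
// so the calling convention is free to pass it directly in a register.
static_assert(std::is_trivially_copyable<int*>::value,
              "raw pointers are trivially copyable");
static_assert(std::is_trivially_destructible<int*>::value,
              "raw pointers are trivially destructible");

// std::unique_ptr has a user-provided destructor and move constructor
// (and a deleted copy constructor), so none of these traits hold.
static_assert(!std::is_trivially_destructible<std::unique_ptr<int>>::value,
              "unique_ptr has a non-trivial destructor");
static_assert(!std::is_trivially_move_constructible<std::unique_ptr<int>>::value,
              "unique_ptr has a non-trivial move constructor");
static_assert(!std::is_trivially_copyable<std::unique_ptr<int>>::value,
              "unique_ptr is not trivially copyable");

// Assumption: true on libstdc++, libc++ and MSVC with the default deleter,
// but not required by the standard.
static_assert(sizeof(std::unique_ptr<int>) == sizeof(int*),
              "no size overhead with the default deleter");
```

If the file compiles, every check holds: the object is no larger than a raw pointer, yet it is not a type that can simply be copied bit by bit into a register.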

I know that this does not really impact performance on a modern PC platform, but I am wondering whether somebody could provide more details on why it must not be passed in a register. Since zero-cost abstractions are one of the major goals of C++, I am wondering whether this was discussed in the standardization process as an accepted deviation or whether it is a quality-of-implementation issue. The performance penalty is certainly small enough when weighed against the benefits, especially on modern PC platforms.

Commenters have pointed out that the two functions are not fully equivalent, so the comparison is flawed: foo also calls the deleter on its unique_ptr parameter, whereas boo does not release the memory. However, I am only interested in the difference that results from passing a unique_ptr by value instead of a plain pointer. I've modified the example code and added a call to delete that frees the plain pointer; the call is in the caller because the unique_ptr's deleter is also called in the caller's context, which makes the generated code more comparable. In addition, the manual delete checks ptr != nullptr because the destructor does this too. Still, foo does not receive its parameter in a register and has to do an indirect access.
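
The modified code itself is not reproduced here. The following is only a sketch of what such a comparison might look like, based on the description above. The caller names call_foo and call_boo are hypothetical and do not come from the original post.

```cpp
#include <memory>

int foo(std::unique_ptr<int> u) {
    return *u;
}

int boo(int* i) {
    return *i;
}

// Hypothetical caller for the unique_ptr version. As described in the
// question, the parameter object is destroyed in the caller's context
// once the call has completed.
int call_foo(int value) {
    return foo(std::unique_ptr<int>(new int(value)));
}

// Hypothetical caller for the raw pointer version. The explicit null
// check mirrors the check that unique_ptr's destructor performs.
int call_boo(int value) {
    int* p = new int(value);
    int result = boo(p);
    if (p != nullptr) {
        delete p;
    }
    return result;
}
```

With both callers releasing the memory themselves, the remaining difference between foo and boo is how the argument reaches the callee, which is the part the question is about.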

I also wonder why the compiler does not elide the check for nullptr before calling operator delete, since deleting a null pointer is defined to be a no-op anyway. I guess that unique_ptr could be specialized for the default deleter so that it does not perform the check in the destructor, but that would be a very small micro-optimization.
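
One likely reason the check lives in the destructor is that it is needed for arbitrary deleters: the destructor only invokes the deleter when the stored pointer is non-null. A minimal sketch of that behaviour follows; CountingDeleter and deleter_calls_for are hypothetical names introduced purely for illustration.

```cpp
#include <memory>

// Hypothetical deleter that counts how often it is invoked.
struct CountingDeleter {
    int* calls;
    void operator()(int* p) const {
        ++*calls;
        delete p;
    }
};

// Returns how many times the deleter ran when a unique_ptr that is
// either empty or owns an int goes out of scope.
int deleter_calls_for(bool empty) {
    int calls = 0;
    {
        std::unique_ptr<int, CountingDeleter> u(
            empty ? nullptr : new int(1), CountingDeleter{&calls});
    }  // u is destroyed here; the deleter runs only if u owns a pointer
    return calls;
}

// A delete-expression on a null pointer is well-defined, so the
// default deleter would not strictly need the extra check.
void delete_null_pointer() {
    int* p = nullptr;
    delete p;
}
```

A custom deleter may not tolerate a null argument, so the generic destructor has to guard the call. For the default deleter the guard duplicates what the delete-expression already tolerates, which is the redundancy the paragraph above refers to.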
