Is there a performance penalty for passing a unique_ptr by value compared to a plain pointer?
Common wisdom is that std::unique_ptr does not introduce a performance penalty (and no memory penalty when not using a deleter parameter), but I recently stumbled over a discussion showing that it actually introduces an additional indirection because the unique_ptr cannot be passed in a register on platforms with the Itanium ABI. The example posted was similar to
#include <memory>
int foo(std::unique_ptr<int> u) {
return *u;
}
int boo(int* i) {
return *i;
}
which generates an additional assembler instruction in foo compared to boo:
foo(std::unique_ptr<int, std::default_delete<int> >):
mov rax, QWORD PTR [rdi]
mov eax, DWORD PTR [rax]
ret
boo(int*):
mov eax, DWORD PTR [rdi]
ret
The explanation was that the Itanium ABI demands that the unique_ptr shall not be passed in a register because of the non-trivial constructor, so it is created on the stack and then the address of this object is passed in a register.
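To make "non-trivial" concrete, here is a minimal sketch (C++17) that asks the compiler which triviality properties hold for the two parameter types. It rests on my understanding that the Itanium C++ ABI treats a type as non-trivial for the purposes of calls when it has a non-trivial copy constructor, move constructor, or destructor; the constant name kUniquePtrIsNonTrivial is made up for illustration.

```cpp
#include <memory>
#include <type_traits>

// A plain pointer is trivially copyable and trivially destructible,
// so nothing stops the ABI from passing it directly in a register.
static_assert(std::is_trivially_copyable_v<int*>);
static_assert(std::is_trivially_destructible_v<int*>);

// std::unique_ptr has a user-provided destructor and move constructor
// (and a deleted copy constructor), so none of these traits hold.
static_assert(!std::is_trivially_copyable_v<std::unique_ptr<int>>);
static_assert(!std::is_trivially_destructible_v<std::unique_ptr<int>>);
static_assert(!std::is_trivially_move_constructible_v<std::unique_ptr<int>>);

// Hypothetical summary constant: true when unique_ptr<int> has a
// non-trivial move constructor or destructor.
constexpr bool kUniquePtrIsNonTrivial =
    !std::is_trivially_move_constructible_v<std::unique_ptr<int>> ||
    !std::is_trivially_destructible_v<std::unique_ptr<int>>;
```

If this compiles, the asserted properties hold for the standard library in use. The traits only describe the type; the decision to pass the object by address is made by the ABI, not by the language standard.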
I know that this does not really impact performance on a modern PC platform, but I am wondering if somebody could provide more details on the reasons why it shall not be copied to a register. Since zero-cost abstractions are one of the major goals of C++, I am wondering if this has been discussed in the standardization process as an accepted deviation or if it is a quality of implementation issue. The performance penalty is certainly small enough when considering the benefits, especially on modern PC platforms.
Commenters have pointed out that the two functions are not fully equivalent and thus the comparison is flawed, since foo will also call the deleter on the unique_ptr parameter but boo does not release the memory. However, I was only interested in the difference resulting from passing a unique_ptr by value compared to passing a plain pointer. I've modified the example code and included a call to delete to free the plain pointer; the call is in the caller because the unique_ptr's deleter also gets called in the caller's context, which makes the generated code more similar. In addition, the manual delete also checks ptr != nullptr because the destructor also does this. Still, the parameter of foo is not passed in a register, so foo has to do an indirect access.
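The modified code itself is not shown here, so the following is only a sketch of what such a comparison could look like. foo and boo are taken from the example above; the caller names call_foo and call_boo and the value 42 are assumptions for illustration.

```cpp
#include <memory>
#include <utility>

int foo(std::unique_ptr<int> u) { return *u; }
int boo(int* i) { return *i; }

// Hypothetical caller for foo: the by-value parameter is built from p,
// and on Itanium ABI platforms its destructor runs in the caller.
int call_foo() {
    auto p = std::make_unique<int>(42);
    return foo(std::move(p));
}

// Hypothetical caller for boo: frees the plain pointer by hand,
// including the same null check the unique_ptr destructor performs.
int call_boo() {
    int* p = new int(42);
    int result = boo(p);
    if (p != nullptr) {
        delete p;
    }
    return result;
}
```

When inspecting the generated assembly, foo and boo need to stay as separate functions (for example by compiling them in their own translation unit), otherwise the optimizer may inline them into the callers and hide the difference in parameter passing.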
I also wonder why the compiler does not elide the check for nullptr before calling operator delete, since this is defined to be a no-op anyway. I guess that unique_ptr could be specialized for the default deleter to not perform the check in the destructor, but that would be a very small micro-optimization.
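The two behaviours mentioned here can be observed with a small sketch: deleting a null pointer is well defined and has no effect, and the unique_ptr destructor only invokes its deleter when the stored pointer is non-null. CountingDeleter and the function names are hypothetical helpers introduced only for this illustration.

```cpp
#include <memory>

// Deleting a null pointer is well defined and has no effect.
void delete_null_pointer() {
    int* p = nullptr;
    delete p;
}

// Hypothetical deleter that counts how often it is invoked.
struct CountingDeleter {
    int* calls;
    void operator()(int* p) const {
        ++*calls;
        delete p;
    }
};

// An empty unique_ptr must not call its deleter on destruction.
int deleter_calls_for_empty() {
    int calls = 0;
    {
        int* raw = nullptr;
        std::unique_ptr<int, CountingDeleter> empty(raw, CountingDeleter{&calls});
    }
    return calls;
}

// An owning unique_ptr calls its deleter exactly once on destruction.
int deleter_calls_for_owning() {
    int calls = 0;
    {
        std::unique_ptr<int, CountingDeleter> owner(new int(7), CountingDeleter{&calls});
    }
    return calls;
}
```

Because the destructor is specified to skip the deleter for an empty unique_ptr, the check is part of unique_ptr's contract for arbitrary deleters; dropping it would only be safe for deleters that tolerate a null argument, such as the default one.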
Comments (1)
The System V ABI uses the Itanium C++ ABI and refers to it. In particular, the Itanium C++ ABI specifies that
So a simple answer to the question "why is it not passed in a register" is "because it can't be".
Now, an interesting question might be why the Itanium C++ ABI decided to go with that.
While I wouldn't claim to have intimate knowledge of the rationale, two things come to mind: