Array placement-new requires unspecified overhead in the buffer?
5.3.4 [expr.new] of the C++11 Feb draft gives the example:

new(2,f) T[5] results in a call of operator new[](sizeof(T)*5+y,2,f).

Here, x and y are non-negative unspecified values representing array allocation overhead; the result of the new-expression will be offset by this amount from the value returned by operator new[]. This overhead may be applied in all array new-expressions, including those referencing the library function operator new[](std::size_t, void*) and other placement allocation functions. The amount of overhead may vary from one invocation of new to another. —end example ]

Now take the following example code:

    void* buffer = malloc(sizeof(std::string) * 10);
    std::string* p = ::new (buffer) std::string[10];

According to the above quote, the second line, new (buffer) std::string[10], will internally call operator new[](sizeof(std::string) * 10 + y, buffer) (before constructing the individual std::string objects). The problem is that if y > 0, the pre-allocated buffer will be too small!

So how do I know how much memory to pre-allocate when using array placement-new?

    void* buffer = malloc(sizeof(std::string) * 10 + how_much_additional_space);
    std::string* p = ::new (buffer) std::string[10];

Or does the standard somewhere guarantee that y == 0 in this case? Again, the quote says:

This overhead may be applied in all array new-expressions, including those referencing the library function operator new[](std::size_t, void*) and other placement allocation functions.
Update

Nicol Bolas correctly points out in the comments below that this has been fixed such that the overhead is always zero for operator new[](std::size_t, void* p).

This fix was done as a defect report in November 2019, which makes it retroactive to all versions of C++.

Original Answer

Don't use operator new[](std::size_t, void* p) unless you know a priori the answer to this question. The answer is an implementation detail and can change with compiler/platform, though it is typically stable for any given platform. E.g. this is something specified by the Itanium ABI.

If you don't know the answer to this question, write your own placement array new that can check this at run time:

By varying the array size and inspecting n in the example above, you can infer y for your platform. For my platform y is 1 word. The sizeof(word) varies depending on whether I'm compiling for a 32 bit or 64 bit architecture.
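A checking overload of the kind this answer describes might look like the sketch below. It is not the answer's own code: the three-argument placement form, the global last_requested_bytes, and the helpers construct_checked and destroy_all are all hypothetical names introduced for illustration. The overload records the size n that the new-expression asks for and throws std::bad_alloc if that exceeds the buffer size passed in.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// Hypothetical global: the byte count most recently requested by an
// array new-expression that used the checking overload below.
std::size_t last_requested_bytes = 0;

// Hypothetical three-argument placement form. It behaves like the library's
// operator new[](std::size_t, void*), except that it also receives the size
// of the buffer and refuses requests that would not fit.
void* operator new[](std::size_t n, void* p, std::size_t limit)
{
    last_requested_bytes = n;
    if (n > limit)
        throw std::bad_alloc();
    return p;
}

// Matching placement delete; only called if an element constructor throws.
void operator delete[](void*, void*, std::size_t) noexcept {}

// Constructs `count` default-initialized objects of type T in `buffer`,
// which holds `size` bytes. Throws std::bad_alloc if the request
// (array data plus any overhead) does not fit.
template <class T>
T* construct_checked(void* buffer, std::size_t size, std::size_t count)
{
    return new (buffer, size) T[count];
}

// delete[] must not be used on such an array, so destroy elements by hand.
template <class T>
void destroy_all(T* first, std::size_t count)
{
    while (count > 0)
        first[--count].~T();
}
```

After a call such as construct_checked<std::string>(buf, sizeof buf, 3), the difference last_requested_bytes - 3 * sizeof(std::string) is the overhead y that the implementation added for this user-defined placement form.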
Update: After some discussion, I understand that my answer no longer applies to the question. I'll leave it here, but a real answer is definitely still called for.

I'll be happy to support this question with some bounty if a good answer isn't found soon.

I'll restate the question here as far as I understand it, hoping that a shorter version might help others understand what's being asked. The question is:

Is the following construction always correct? Is arr == addr at the end?

We know from the standard that #1 causes the call ::operator new[](???, addr), where ??? is an unspecified number no smaller than N * sizeof(T), and we also know that that call only returns addr and has no other effects. We also know that arr is offset from addr correspondingly. What we do not know is whether the memory pointed to by addr is sufficiently large, or how we would know how much memory to allocate.

You seem to confuse a few things:

1. Your example calls operator new[](), not operator new().
2. The allocation functions do not construct anything. They allocate.

What happens is that the expression T * p = new T[10]; causes:

1. a call to operator new[]() with size argument 10 * sizeof(T) + x,
2. ten calls to the default constructor of T, effectively ::new (p + i) T().

The only peculiarity is that the array-new expression asks for more memory than what is used by the array data itself. You don't see any of this and cannot make use of this information in any way other than by silent acceptance.

If you are curious how much memory was actually allocated, you can simply replace the array allocation functions operator new[] and operator delete[] and make it print out the actual size.

Update: As a random piece of information, you should note that the global placement-new functions are required to be no-ops. That is, when you construct an object or array in-place like so:

Then the corresponding calls to ::operator new(std::size_t, void*) and ::operator new[](std::size_t, void*) do nothing but return their second argument. However, you do not know what buf10 is supposed to point to: It needs to point to 10 * sizeof(T) + y bytes of memory, but you cannot know y.
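To see how much memory an ordinary array new-expression actually requests, something like the following sketch could be used. Note the substitution: instead of replacing the global operator new[] and operator delete[] as suggested above, it gives a hypothetical type Tracked class-specific versions, which keeps the example self-contained. The names Tracked, last_request and array_overhead are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical element type. The user-provided destructor makes it
// non-trivially destructible, which is the case where implementations
// typically need to remember the element count.
struct Tracked {
    static std::size_t last_request;  // bytes most recently requested

    int value = 0;
    ~Tracked() {}

    // Class-specific array allocation function: records the request size.
    static void* operator new[](std::size_t n)
    {
        void* p = std::malloc(n);
        if (!p)
            throw std::bad_alloc();
        last_request = n;
        return p;
    }

    static void operator delete[](void* p) noexcept { std::free(p); }
};

std::size_t Tracked::last_request = 0;

// Returns how many bytes beyond count * sizeof(Tracked) the
// new-expression asked for (the x in the text above).
std::size_t array_overhead(std::size_t count)
{
    Tracked* arr = new Tracked[count];
    std::size_t overhead = Tracked::last_request - count * sizeof(Tracked);
    delete[] arr;
    return overhead;
}
```

The value returned by array_overhead is unspecified by the standard; on many platforms it is the size of one std::size_t, but nothing in the sketch relies on that.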
Calling any version of operator new[]() won't work too well with a fixed size memory area. Essentially, it is assumed that it delegates to some real memory allocation function rather than just returning a pointer to the allocated memory. If you already have a memory arena where you want to construct an array of objects, you want to use std::uninitialized_fill() or std::uninitialized_copy() to construct the objects (or some other form of individually constructing the objects).

You might argue that this means that you have to destroy the objects in your memory arena manually as well. However, calling delete[] array on the pointer returned from the placement new won't work: it would use the non-placement version of operator delete[]()! That is, when using placement new you need to manually destroy the object(s) and release the memory.
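As a rough illustration of this approach, the sketch below constructs objects in a pre-allocated buffer with std::uninitialized_fill_n (the counted sibling of std::uninitialized_fill) and destroys them by hand. The helper names fill_in_place and destroy_range are hypothetical, and the caller is assumed to supply storage that is suitably aligned and at least count * sizeof(T) bytes long.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <string>

// Constructs `count` copies of `value` in the raw storage at `raw`.
// No array new-expression is involved, so no hidden overhead is requested
// and the returned pointer equals `raw`.
template <class T>
T* fill_in_place(void* raw, std::size_t count, const T& value)
{
    T* first = static_cast<T*>(raw);
    std::uninitialized_fill_n(first, count, value);
    return first;
}

// Runs the destructors in reverse order. The storage itself is released
// separately by whoever allocated it.
template <class T>
void destroy_range(T* first, std::size_t count)
{
    while (count > 0)
        first[--count].~T();
}
```

A typical use would allocate with std::malloc(sizeof(std::string) * 10), call fill_in_place on the result, later call destroy_range, and finally std::free the buffer.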
As mentioned by Kerrek SB in comments, this defect was first reported in 2004, and it was resolved in 2012 as:

Then the defect was reported to EWG in 2013, but closed as NAD (presumably means "Not A Defect") with the comment:

which presumably means that the suggested workaround is to use a loop with a call to non-array placement new once for each object being constructed.

A corollary not mentioned elsewhere on the thread is that this code causes undefined behaviour for all T:

Even if we comply with the lifetime rules (i.e. T either has trivial destruction, or the program does not depend on the destructor's side-effects), the problem is that ptr has been adjusted for this unspecified cookie, so it is the wrong value to pass to operator delete[].
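The loop workaround mentioned above could be sketched as follows. This is only an illustration under assumptions: construct_each and destroy_each are hypothetical helper names, and the buffer passed in is assumed to be suitably aligned for T and at least count * sizeof(T) bytes long. Each element is created with the non-array form of placement new, so no array overhead is involved; if a constructor throws, the elements built so far are destroyed before the exception propagates.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// Value-initializes `count` objects of type T in `raw`, one at a time.
template <class T>
T* construct_each(void* raw, std::size_t count)
{
    T* first = static_cast<T*>(raw);
    std::size_t built = 0;
    try {
        for (; built < count; ++built)
            ::new (static_cast<void*>(first + built)) T();
    } catch (...) {
        // Roll back: destroy what was already constructed, newest first.
        while (built > 0)
            first[--built].~T();
        throw;
    }
    return first;
}

// Counterpart to construct_each; never use delete[] on such a pointer.
template <class T>
void destroy_each(T* first, std::size_t count)
{
    while (count > 0)
        first[--count].~T();
}
```

Because the returned pointer is exactly the buffer address, the buffer size needed is just count * sizeof(T), which is the guarantee the array form could not give before the defect was fixed.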
Note that C++20 changes this answer.

C++17's (and before) [expr.new]/11 clearly says that this function may get an implementation defined offset to its size:

This permits, but does not require, that the size given to the array allocation function could be increased from sizeof(T) * size.

C++20 explicitly disallows this. From [expr.new]/15:

Emphasis added. Even the non-normative note you quoted was changed:
After reading the corresponding standard sections I am starting to think that placement new for array types is simply a useless idea, and the only reason it is allowed by the standard is the generic way in which the new-operator is described:

To me it seems that array placement new simply stems from the compactness of the definition (all possible uses as one scheme), and it seems there is no good reason for it to be forbidden.

This leaves us in a situation where we have a useless operator, which needs memory allocated before it is known how much of it will be needed. The only solutions I see would be to either overallocate memory and hope that the compiler will not want more than supplied, or re-allocate memory in an overridden array placement new function/method (which rather defeats the purpose of using array placement new in the first place).

To answer the question pointed out by Kerrek SB:

Your example:

is not always correct. In most implementations arr != addr (and there are good reasons for it), so your code is not valid and your buffer will be overrun.

About those "good reasons": note that the standard's creators release you from some housekeeping when you use the array new operator, and array placement new is no different in this respect. Note that you do not need to inform delete[] about the length of the array, so this information must be kept in the array itself. Where? Exactly in this extra memory. Without it, delete[]'ing would require keeping the array length separately (as the STL does using loops and non-placement new).
This is a defect in the standard. Rumor has it they couldn't find a volunteer to write an exception to it (Message #1173).

The non-replaceable array placement-new cannot be used with delete[] expressions, so you need to loop through the array and call each destructor.

The overhead is targeted at the user-defined array placement-new functions, which allocate memory just like the regular T* tp = new T[length]. Those are compatible with delete[], hence the overhead that carries the array length.