Array placement-new requires unspecified overhead in the buffer?

Posted on 2024-12-24 19:59:13


5.3.4 [expr.new] of the C++11 Feb draft gives the example:

new T[5] results in a call of operator new[](sizeof(T)*5+x), and
new(2,f) T[5] results in a call of operator new[](sizeof(T)*5+y,2,f).
Here, x and y are non-negative unspecified values representing array allocation overhead; the result of the new-expression will be offset by this amount from the value returned by operator new[]. This overhead may be applied in all array new-expressions, including those referencing the library function operator new[](std::size_t, void*) and other placement allocation functions. The amount of overhead may vary from one invocation of new to another. —end example ]

Now take the following example code:

void* buffer = malloc(sizeof(std::string) * 10);
std::string* p = ::new (buffer) std::string[10];

According to the above quote, the second line new (buffer) std::string[10] will internally call operator new[](sizeof(std::string) * 10 + y, buffer) (before constructing the individual std::string objects). The problem is that if y > 0, the pre-allocated buffer will be too small!

So how do I know how much memory to pre-allocate when using array placement-new?

void* buffer = malloc(sizeof(std::string) * 10 + how_much_additional_space);
std::string* p = ::new (buffer) std::string[10];

Or does the standard somewhere guarantee that y == 0 in this case? Again, the quote says:

This overhead may be applied in all array new-expressions, including those referencing the library function operator new[](std::size_t, void*) and other placement allocation functions.


左岸枫 2024-12-31 19:59:13


Update

Nicol Bolas correctly points out in the comments below that this has been fixed such that the overhead is always zero for operator new[](std::size_t, void* p).

This fix was done as a defect report in November 2019, which makes it retroactive to all versions of C++.

Original Answer

Don't use operator new[](std::size_t, void* p) unless you know a priori the answer to this question. The answer is an implementation detail and can change with compiler/platform, though it is typically stable for any given platform. E.g. this is something specified by the Itanium ABI.

If you don't know the answer to this question, write your own placement array new that can check this at run time:

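#include <cstddef>
#include <iostream>
#include <new>
#include <string>

// Placement array new that also receives the buffer size, so the request
// (including any array overhead) can be checked at run time.
inline
void*
operator new[](std::size_t n, void* p, std::size_t limit)
{
    if (n <= limit)
        std::cout << "life is good\n";
    else
        throw std::bad_alloc();
    return p;
}

int main()
{
    alignas(std::string) char buffer[100];
    std::string* p = new(buffer, sizeof(buffer)) std::string[3];
}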

By varying the array size and inspecting n in the example above, you can infer y for your platform. For my platform y is 1 word. The sizeof(word) varies depending on whether I'm compiling for a 32 bit or 64 bit architecture.
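The inference described above can be automated. The following is a minimal sketch, not a definitive implementation: `Probe` and `placement_overhead` are hypothetical names introduced for illustration, and the 64 bytes of slack in the buffer is an arbitrary assumption rather than a guaranteed upper bound on the overhead.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// Hypothetical helper type for this sketch: carries the buffer size in and
// the size requested by the new-expression out.
struct Probe {
    std::size_t limit;      // bytes available in the buffer
    std::size_t requested;  // bytes the new-expression asked for
};

// User-defined placement array new; records n and refuses to overrun.
void* operator new[](std::size_t n, void* p, Probe& probe) {
    probe.requested = n;
    if (n > probe.limit) throw std::bad_alloc();
    return p;
}

// Matching placement delete, called only if an element constructor throws.
void operator delete[](void*, void*, Probe&) noexcept {}

// Returns y, the overhead in bytes, for one new (buffer, probe) T[N].
template <class T, std::size_t N>
std::size_t placement_overhead() {
    // 64 bytes of slack is an arbitrary assumption, not a guaranteed bound.
    alignas(T) unsigned char buffer[N * sizeof(T) + 64];
    Probe probe{sizeof(buffer), 0};
    T* arr = new (buffer, probe) T[N];
    for (std::size_t i = N; i > 0; --i) arr[i - 1].~T();  // destroy by hand
    return probe.requested - N * sizeof(T);
}
```

Calling `placement_overhead<std::string, 3>()` reports the overhead your implementation adds for this user-defined placement form. The value is unspecified by the standard, so treat it as an observation about one compiler and platform, not as a portable constant.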


雨落□心尘 2024-12-31 19:59:13


Update: After some discussion, I understand that my answer no longer applies to the question. I'll leave it here, but a real answer is definitely still called for.

I'll be happy to support this question with some bounty if a good answer isn't found soon.

I'll restate the question here as far as I understand it, hoping that a shorter version might help others understand what's being asked. The question is:

Is the following construction always correct? Is arr == addr at the end?

void * addr = std::malloc(N * sizeof(T));
T * arr = ::new (addr) T[N];                // #1

We know from the standard that #1 causes the call ::operator new[](???, addr), where ??? is an unspecified number no smaller than N * sizeof(T), and we also know that that call only returns addr and has no other effects. We also know that arr is offset from addr correspondingly. What we do not know is whether the memory pointed to by addr is sufficiently large, or how we would know how much memory to allocate.

You seem to confuse a few things:

Your example calls operator new[](), not ~~operator new()~~.
The allocation functions do not construct anything. They allocate.

What happens is that the expression T * p = new T[10]; causes:

a call to operator new[]() with size argument 10 * sizeof(T) + x,
ten calls to the default constructor of T, effectively ::new (p + i) T().

The only peculiarity is that the array-new expression asks for more memory than what is used by the array data itself. You don't see any of this and cannot make use of this information in any way other than by silent acceptance.

If you are curious how much memory was actually allocated, you can simply replace the array allocation functions operator new[] and operator delete[] and make it print out the actual size.
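As a rough sketch of that suggestion, the replacement below records the requested size instead of printing it. The names `g_last_array_request`, `g_sink` and `array_new_request` are hypothetical and exist only for this illustration; the sketch also assumes that T's constructor does not itself call `operator new[]`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>

// Hypothetical globals for this sketch.
static std::size_t g_last_array_request = 0;  // size of the latest request
static void* volatile g_sink = nullptr;       // keeps the allocation observable

// Replacement for the global array allocation function.
void* operator new[](std::size_t n) {
    g_last_array_request = n;
    if (void* p = std::malloc(n ? n : 1)) return p;
    throw std::bad_alloc();
}

// Matching replacement for the global array deallocation function.
void operator delete[](void* p) noexcept { std::free(p); }

// Returns the number of bytes that new T[count] requested.
template <class T>
std::size_t array_new_request(std::size_t count) {
    T* arr = new T[count];
    g_sink = arr;  // let the pointer escape so the call is not optimized away
    std::size_t requested = g_last_array_request;
    delete[] arr;
    return requested;
}
```

Subtracting `count * sizeof(T)` from the result gives x for that particular call. On many implementations the difference is zero for trivially destructible types and one word for types with a non-trivial destructor, but that is an observation about common ABIs, not something the standard promises.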

Update: As a random piece of information, you should note that the global placement-new functions are required to be no-ops. That is, when you construct an object or array in-place like so:

T * p = ::new (buf1) T;
T * arr = ::new (buf10) T[10];

Then the corresponding calls to ::operator new(std::size_t, void*) and ::operator new[](std::size_t, void*) do nothing but return their second argument. However, you do not know what buf10 is supposed to point to: It needs to point to 10 * sizeof(T) + y bytes of memory, but you cannot know y.


花开柳相依 2024-12-31 19:59:13

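Calling any version of operator new[]() won't work too well with a fixed-size memory area. Essentially, it is assumed that it delegates to some real memory allocation function rather than just returning a pointer to already allocated memory. If you already have a memory arena where you want to construct an array of objects, you want to use std::uninitialized_fill() or std::uninitialized_copy() to construct the objects (or some other form of individually constructing the objects).

You might argue that this means you also have to destroy the objects in your memory arena manually. However, calling delete[] array on the pointer returned from the placement new won't work: it would use the non-placement version of operator delete[]()! That is, when using placement new you need to manually destroy the object(s) and release the memory.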
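A minimal sketch of that approach, assuming `count > 0` and that T is not over-aligned beyond what `std::malloc` guarantees. The function name `fill_and_take_last` is hypothetical and only serves the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <new>
#include <string>

// Builds `count` copies of `value` in malloc'ed storage, copies the last
// element out, then destroys the elements by hand and frees the storage.
template <class T>
T fill_and_take_last(std::size_t count, const T& value) {
    void* raw = std::malloc(count * sizeof(T));  // exactly count elements
    if (!raw) throw std::bad_alloc();
    T* first = static_cast<T*>(raw);
    try {
        // Constructs each element in place; no array cookie is involved.
        std::uninitialized_fill(first, first + count, value);
    } catch (...) {
        std::free(raw);  // uninitialized_fill already destroyed what it built
        throw;
    }
    T last = first[count - 1];
    // No delete[] here: destroy in reverse order, then release the memory.
    for (std::size_t i = count; i > 0; --i) first[i - 1].~T();
    std::free(raw);
    return last;
}
```

For example, `fill_and_take_last<std::string>(10, "abc")` builds ten strings in a buffer of exactly `10 * sizeof(std::string)` bytes, which is the size the question wanted to be able to rely on.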


私藏温柔 2024-12-31 19:59:13


As mentioned by Kerrek SB in comments, this defect was first reported in 2004, and it was resolved in 2012 as:

The CWG agreed that EWG is the appropriate venue for dealing with this issue.

Then the defect was reported to EWG in 2013, but closed as NAD (presumably means "Not A Defect") with the comment:

The problem is in trying to use array new to put an array into pre-existing storage. We don't need to use array new for that; just construct them.

which presumably means that the suggested workaround is to use a loop with a call to non-array placement new once for each object being constructed.
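That workaround might look like the sketch below. `construct_array_in` and `destroy_array_in` are hypothetical helper names, and the caller is assumed to supply a buffer of at least `count * sizeof(T)` bytes that is suitably aligned for T.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// Default-constructs `count` objects of type T in `buffer`, one
// non-array placement new per element, and returns the first element.
template <class T>
T* construct_array_in(void* buffer, std::size_t count) {
    T* first = static_cast<T*>(buffer);
    std::size_t built = 0;
    try {
        for (; built < count; ++built)
            ::new (static_cast<void*>(first + built)) T();
    } catch (...) {
        // Roll back the elements that were already constructed.
        while (built > 0) first[--built].~T();
        throw;
    }
    return first;
}

// Destroys the elements in reverse order; the storage stays with the caller.
template <class T>
void destroy_array_in(T* first, std::size_t count) {
    while (count > 0) first[--count].~T();
}
```

Because each element is created by the single-object form, no array overhead is requested, and the returned pointer is the buffer address itself.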

A corollary not mentioned elsewhere on the thread is that this code causes undefined behaviour for all T:

T *ptr = new T[N];
::operator delete[](ptr);

Even if we comply with the lifetime rules (i.e. T either has trivial destruction, or the program does not depend on the destructor's side-effects), the problem is that ptr has been adjusted for this unspecified cookie, so it is the wrong value to pass to operator delete[].


忆离笙 2024-12-31 19:59:13

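Note that C++20 has changed this answer.

C++17 (and before) [expr.new]/11 says clearly that this function may get an implementation-defined offset to its size:

When a new-expression calls an allocation function and that allocation has not been extended, the new-expression passes the amount of space requested to the allocation function as the first argument of type std::size_t. That argument shall be no less than the size of the object being created; it may be greater than the size of the object being created only if the object is an array.

This permits, but does not require, that the size given to the array allocation function be increased from sizeof(T) * size.

C++20 explicitly disallows this. From [expr.new]/15:

When a new-expression calls an allocation function and that allocation has not been extended, the new-expression passes the amount of space requested to the allocation function as the first argument of type std::size_t.
That argument shall be no less than the size of the object being created; it may be greater than the size of the object being created only if the object is an array and the allocation function is not a non-allocating form ([new.delete.placement]).

The new part is the final condition, "and the allocation function is not a non-allocating form". Even the non-normative note you quoted has changed:

This overhead may be applied in all array new-expressions, including those referencing a placement allocation function, except when referencing the library function operator new[](std::size_t, void*).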


陪你到最终 2024-12-31 19:59:13


After reading the corresponding sections of the standard, I am starting to think that placement new for array types is simply a useless idea, and that the only reason the standard allows it is the generic way in which the new-expression is described:

The new-expression attempts to create an object of the type-id (8.1) or
new-type-id to which it is applied. The type of that object is the
allocated type. This type shall be a complete object type, but not an
abstract class type or array thereof (1.8, 3.9, 10.4). [Note: because
references are not objects, references cannot be created by
new-expressions. ] [Note: the type-id may be a cv-qualified type, in
which case the object created by the new-expression has a cv-qualified
type. ]

new-expression: 
    ::(opt) new new-placement(opt) new-type-id new-initializer(opt)
    ::(opt) new new-placement(opt) ( type-id ) new-initializer(opt)

new-placement: ( expression-list )

new-type-id:
    type-specifier-seq new-declarator(opt)

new-declarator:
    ptr-operator new-declarator(opt)
    direct-new-declarator

direct-new-declarator:
    [ expression ]
    direct-new-declarator [ constant-expression ]

new-initializer: ( expression-list(opt) )

To me it seems that array placement new simply stems from the compactness of the definition (all possible uses covered by one scheme), and there seems to be no good reason to forbid it.

This leaves us with a useless operator that needs its memory allocated before it is known how much will be needed. The only solutions I see are either to over-allocate and hope the compiler will not want more than was supplied, or to re-allocate memory in an overridden array placement new function/method (which rather defeats the purpose of using array placement new in the first place).

To answer question pointed out by Kerrek SB:
Your example:

void * addr = std::malloc(N * sizeof(T));
T * arr = ::new (addr) T[N];                // #1

is not always correct. In most implementations arr != addr (and there are good reasons for it), so your code is not valid and your buffer will be overrun.

About those "good reasons": the array new operator relieves you of some housekeeping, and array placement new is no different in this respect. You do not need to tell delete[] the length of the array, so that information must be kept with the array itself. Where? Exactly in this extra memory. Without it, delete[] would require the array length to be kept separately (as the STL does, using loops and non-placement new).
