Is it reasonable to use std::basic_string as a contiguous buffer when targeting C++03?

Posted on 2024-08-21 13:15:27

I know that in C++03, technically the std::basic_string template is not required to have contiguous memory. However, I'm curious how many implementations exist for modern compilers that actually take advantage of this freedom. For example, if one wants to use basic_string to receive the results of some C API (like the example below), it seems silly to allocate a vector just to turn it into a string immediately.

Example:

DWORD valueLength = 0;
DWORD type;
LONG errorCheck = RegQueryValueExW(
        hWin32,
        value.c_str(),
        NULL,
        &type,
        NULL,
        &valueLength);

if (errorCheck != ERROR_SUCCESS)
    WindowsApiException::Throw(errorCheck);
else if (valueLength == 0)
    return std::wstring();

std::wstring buffer;
do
{
    buffer.resize(valueLength/sizeof(wchar_t));
    errorCheck = RegQueryValueExW(
            hWin32,
            value.c_str(),
            NULL,
            &type,
            reinterpret_cast<LPBYTE>(&buffer[0]), // lpData is an LPBYTE; this write assumes contiguous storage
            &valueLength);
} while (errorCheck == ERROR_MORE_DATA);

if (errorCheck != ERROR_SUCCESS)
    WindowsApiException::Throw(errorCheck);

return buffer;

I know code like this might slightly reduce portability because it implies that std::wstring is contiguous -- but I'm wondering just how unportable that makes this code. Put another way, how many compilers actually take advantage of the freedom that noncontiguous memory allows?


EDIT: I updated this question to mention C++03. Readers should note that C++11 requires basic_string to be contiguous, so the above question is a non-issue when targeting that standard.

Comments (5)

泪之魂 2024-08-28 13:15:27

I'd consider it quite safe to assume that std::string allocates its storage contiguously.

At the present time, all known implementations of std::string allocate space contiguously.

Moreover, the current draft of C++0x (N3000) [Edit: Warning, direct link to large PDF] requires that the space be allocated contiguously (§21.4.1/5):

The char-like objects in a basic_string object shall be stored contiguously. That is, for any basic_string object s, the identity &*(s.begin() + n) == &*s.begin() + n shall hold for all values of n such that 0 <= n < s.size().

As such, the chances of a current or future implementation of std::string using non-contiguous storage are essentially nil.
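The identity quoted from the draft can be checked directly at run time. The following is a minimal sketch, not part of the original answer; the helper name is_contiguous is hypothetical, and on a truly non-contiguous implementation the pointer arithmetic itself would be questionable, so treat it as an illustration of the wording rather than a portable detection tool.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: returns true if s satisfies the draft's identity
//   &*(s.begin() + n) == &*s.begin() + n   for all 0 <= n < s.size().
template <typename CharT>
bool is_contiguous(const std::basic_string<CharT>& s)
{
    if (s.empty())
        return true; // nothing to check, and begin() must not be dereferenced

    const CharT* first = &*s.begin();
    for (std::size_t n = 0; n < s.size(); ++n) {
        if (&*(s.begin() + n) != first + n)
            return false;
    }
    return true;
}
```

Calling is_contiguous(std::wstring(L"some value")) on any mainstream standard library is expected to return true, which matches the claim that known implementations store their characters contiguously.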

不离久伴 2024-08-28 13:15:27

A while back there was a question about being able to write to the storage for a std::string as if it were an array of characters, and it hinged on whether the contents of a std::string were contiguous:

My answer indicated that, according to a couple of well-regarded sources (Herb Sutter and Matt Austern), the current C++ standard does require std::string to store its data contiguously under certain conditions (once you call str[0], assuming str is a std::string), and that this fact pretty much forces the hand of any implementation.

Basically, if you combine the promises made by string::data() and string::operator[]() you conclude that &str[0] needs to return a contiguous buffer. Therefore Austern suggests that the committee just make that explicit, and apparently that's what'll happen in the 0x standard (or are they calling it the 1x standard now?).

So strictly speaking an implementation doesn't have to implement std::string using contiguous storage, but it has to do so pretty much on demand. And your example code does just that by passing in &buffer[0].
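To make the &buffer[0] pattern concrete without depending on the Windows headers, here is a small sketch. The function fake_c_api is a hypothetical stand-in for a C API that fills a caller-supplied buffer, and read_via_string is likewise an illustrative name. The sketch assumes the string's storage is contiguous, which is exactly the assumption under discussion.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for a C API: copies up to cap characters into out
// and returns how many characters were written.
std::size_t fake_c_api(char* out, std::size_t cap)
{
    static const char payload[] = "contiguous";
    const std::size_t len = sizeof(payload) - 1;
    const std::size_t n = len < cap ? len : cap;
    std::memcpy(out, payload, n);
    return n;
}

// Lets the C API write straight into the string's storage via &buffer[0].
std::string read_via_string(std::size_t cap)
{
    if (cap == 0)
        return std::string(); // avoid taking &buffer[0] of an empty string

    std::string buffer(cap, '\0');
    const std::size_t written = fake_c_api(&buffer[0], buffer.size());
    buffer.resize(written); // trim to what the API actually produced
    return buffer;
}
```

The pointer is obtained through the non-const operator[] rather than data(), because in C++03 data() returns a pointer to const that the program must not write through.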

Links:

归途 2024-08-28 13:15:27

Edit: You want to call &buffer[0], not buffer.data(), because [] returns a non-const reference and does notify the object that its contents can change unexpectedly.


It would be cleaner to do buffer.data(), but you should worry less about contiguous memory than memory shared between structures. string implementations can and do expect to be told when an object is being modified. string::data specifically requires that the program not modify the internal buffer returned.

There is a very high chance that some implementation will share a single buffer among all strings that are uninitialized apart from having their length set to 10 or whatever.
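The sharing concern can be illustrated with a short sketch. It assumes a conforming implementation: whether or not the library shares buffers between copies, taking a pointer through the non-const operator[] is expected to give the copy its own storage before any write happens. The function name write_leaves_original_intact is hypothetical.

```cpp
#include <cassert>
#include <string>

// Writes through a pointer taken from the non-const operator[] of a copy
// and reports whether the original string was left untouched.
bool write_leaves_original_intact()
{
    const std::string original(10, 'a');
    std::string copy = original; // a copy-on-write library may share the buffer here

    char* p = &copy[0];          // non-const access: the copy must own its storage now
    for (std::string::size_type i = 0; i < copy.size(); ++i)
        p[i] = 'b';

    return original == std::string(10, 'a') && copy == std::string(10, 'b');
}
```

Writing through a pointer obtained from data() or c_str() instead would bypass that notification, which is the hazard described above.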

Use a vector or even an array with new[]/delete[]. If you really can't copy the buffer, legally initialize the string to something unique before changing it.
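A sketch of the vector-based alternative follows. The function fake_wide_api is a hypothetical stand-in for a wide-character C API, and read_via_vector is an illustrative name; neither comes from the original post. std::vector is guaranteed contiguous in C++03, so this version relies on no unstated assumption, at the cost of one extra copy.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <string>
#include <vector>

// Hypothetical stand-in for a wide-character C API: copies up to cap
// characters into out and returns how many were written.
std::size_t fake_wide_api(wchar_t* out, std::size_t cap)
{
    static const wchar_t payload[] = L"value";
    const std::size_t len = sizeof(payload) / sizeof(payload[0]) - 1;
    const std::size_t n = len < cap ? len : cap;
    std::wmemcpy(out, payload, n);
    return n;
}

// Receives the data in a vector, then copies it into the returned string.
std::wstring read_via_vector(std::size_t cap)
{
    if (cap == 0)
        return std::wstring(); // &buffer[0] on an empty vector is not allowed

    std::vector<wchar_t> buffer(cap);
    const std::size_t written = fake_wide_api(&buffer[0], buffer.size());
    return std::wstring(buffer.begin(), buffer.begin() + written);
}
```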

吃不饱 2024-08-28 13:15:27

The result is undefined, and I would not do it. The cost of reading into a vector and then converting to a string is trivial with modern C++ heaps, versus the risk that your code will die in Windows 9.

Also, doesn't that need a const_cast on &buffer[0]?

自由如风 2024-08-28 13:15:27

Of course, allocating a vector here is silly. Using std::wstring here is not wise either. It's better to use a char array to call the WinAPI and construct a wstring when returning the value.
