Is std::string always null-terminated in C++11?

Posted 2024-11-09 03:14:42 · 471 words · 7 views · 0 comments

In a 2008 post on his site, Herb Sutter states the following:

There is an active proposal to tighten this up further in C++0x and require null-termination and possibly ban copy-on-write implementations, for concurrency-related reasons. Here’s the paper: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2534.html . I think that one or both of the proposals in this paper is likely to be adopted, but we’ll see at the next meeting or two.

I know that C++11 now guarantees that the std::string contents get stored contiguously, but did they adopt the above in the final draft?

Will it now be safe to use something like &str[0]?

Comments (3)

风尘浪孓 2024-11-16 03:14:42

Yes, per [string.accessors] p1, std::basic_string::c_str():

Returns: A pointer p such that p + i == &operator[](i) for each i in [0,size()].

Complexity: constant time.

Requires: The program shall not alter any of the values stored in the character array.

This means that given a string s, the pointer returned by s.c_str() must be the same as the address of the initial character in the string (&s[0]).
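The quoted requirement can be checked directly. The following is a minimal sketch, assuming a C++11 or later library; the helper name `c_str_matches_subscript` is made up for illustration and is not part of the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: returns true if c_str() and operator[] agree on every
// address in the closed range [0, size()], as the quoted wording requires.
bool c_str_matches_subscript(const std::string& s)
{
    const char* p = s.c_str();
    for (std::size_t i = 0; i <= s.size(); ++i) {
        if (p + i != &s[i]) {
            return false;
        }
    }
    // The character at index size() is the terminator returned by c_str().
    return p[s.size()] == '\0';
}
```

Calling it on any string, including an empty one, should return true on a conforming implementation. Note that the loop runs up to and including `size()`, matching the closed range in the quoted text.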

人事已非 2024-11-16 03:14:42

&str[0] is safe to use -- so long as you do not assume it points to a null-terminated string.

Since C++11 the requirements include (section [string.accessors]):

  • str.data() and str.c_str() point to a null-terminated string.
  • &str[i] == str.data() + i , for 0 <= i <= str.size()
    • note that this implies the storage is contiguous.

However, there is no requirement that &str[0] + str.size() points to a null terminator.

A conforming implementation must place the null terminator contiguously in storage when data(), c_str() or operator[](str.size()) is called; but there is no requirement to place it there in any other situation, such as a call to operator[] with another argument.
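One way to follow this advice is to treat &str[0] as a pointer to exactly size() characters and to call c_str() whenever a terminated string is needed. The sketch below assumes that approach; `copy_from_buffer` and `c_length` are hypothetical helpers introduced only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: copies `len` bytes from a C buffer into a std::string
// by writing through &out[0]. It relies only on contiguous storage and never
// reads or writes past the first size() characters.
std::string copy_from_buffer(const char* src, std::size_t len)
{
    std::string out(len, '\0');
    if (len != 0) {
        std::memcpy(&out[0], src, len);
    }
    return out;
}

// Hypothetical helper: when a C function needs a terminated string,
// obtain the pointer from c_str() rather than from &s[0].
std::size_t c_length(const std::string& s)
{
    return std::strlen(s.c_str());
}
```

For example, `copy_from_buffer("abcdef", 3)` yields "abc", and passing that result to `c_length` gives 3. The length is always passed explicitly on the write side, so nothing depends on where the implementation keeps its terminator.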


To save you reading the long chat discussion: an objection was raised that if c_str() were to write a null terminator, it would cause a data race under [res.on.data.races] p3; I disagreed that it would be a data race.

巡山小妖精 2024-11-16 03:14:42

Although c_str() returns a null terminated version of the std::string, surprises may await when mixing C++ std::string with C char* strings.

Null characters may end up within a C++ std::string, which can lead to subtle bugs as C functions will see a shorter string.

Buggy code may overwrite the null terminator. This results in undefined behaviour. C functions would then read beyond the string buffer, potentially causing a crash.

#include <string>
#include <iostream>
#include <cstdio>
#include <cstring>

int main()
{
    std::string embedded_null = "hello\n";
    embedded_null += '\0';
    embedded_null += "world\n";

    // C string functions finish early at embedded \0
    std::cout << "C++ size: " << embedded_null.size() 
              << " value: " << embedded_null;
    printf("C strlen: %zu value: %s\n", 
           strlen(embedded_null.c_str()), 
           embedded_null.c_str());

    std::string missing_terminator(3, 'n');
    missing_terminator[3] = 'a'; // BUG: Undefined behaviour

    // C string functions read beyond buffer and may crash
    std::cout << "C++ size: " << missing_terminator.size() 
              << " value: " << missing_terminator << '\n';
    printf("C strlen: %zu value: %s\n", 
           strlen(missing_terminator.c_str()), 
           missing_terminator.c_str());
}

Output:

$ c++ example.cpp
$ ./a.out
C++ size: 13 value: hello
world
C strlen: 6 value: hello

C++ size: 3 value: nnn
C strlen: 6 value: nnna�
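
Both pitfalls can be guarded against with standard member functions. This is only a sketch of one possible approach, not something the answer above prescribes; `has_embedded_null` and `try_set_char` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: true if a C function such as strlen would see a
// shorter string than size() reports.
bool has_embedded_null(const std::string& s)
{
    return s.find('\0') != std::string::npos;
}

// Hypothetical helper: bounds-checked write. Unlike operator[], at() throws
// std::out_of_range for pos >= size(), so the terminator cannot be
// overwritten by accident.
bool try_set_char(std::string& s, std::size_t pos, char c)
{
    try {
        s.at(pos) = c;
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```

With a string built as `std::string(3, 'n')`, `try_set_char(s, 3, 'a')` returns false and leaves the string unchanged, whereas the unchecked `s[3] = 'a'` in the example above is undefined behaviour.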