String c_str() vs. data()

Posted on 2024-07-06 17:28:15

I have read in several places that the difference between c_str() and data() (in STL and other implementations) is that c_str() is always null-terminated while data() is not.
As far as I have seen in actual implementations, they either do the same thing or data() calls c_str().

What am I missing here?
Which one is more correct to use in which scenarios?

Comments (6)

兮子 2024-07-13 17:28:15

The documentation is correct. Use c_str() if you want a null-terminated string.

If the implementers happened to implement data() in terms of c_str(), you don't have to worry. Still use data() if you don't need the string to be null-terminated; in some implementations it may turn out to perform better than c_str().

Strings don't necessarily have to be composed of character data; they could be composed of elements of any type. In those cases data() is more meaningful. c_str(), in my opinion, is only really useful when the elements of your string are character based.

Extra: From C++11 onwards, both functions are required to be the same, i.e. data() is now required to be null-terminated. According to cppreference: "The returned array is null-terminated, that is, data() and c_str() perform the same function."
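
As a minimal sketch of the C++11 guarantee described above (the helper name data_matches_c_str is made up for illustration), the following checks that both accessors return the same pointer and that the element at index size() is the null character of the string's element type. It is written as a template so it also covers strings whose elements are not plain char.

```cpp
#include <string>

// Hypothetical helper: returns true when data() and c_str() behave
// identically for `s`, which C++11 and later guarantee.
template <class CharT>
bool data_matches_c_str(const std::basic_string<CharT>& s) {
    const CharT* d = s.data();
    const CharT* c = s.c_str();
    // Same pointer, and the element one past the last character is CharT().
    return d == c && d[s.size()] == CharT();
}
```

Calling data_matches_c_str(std::string("abc")) or data_matches_c_str(std::u16string(u"hi")) should return true on any conforming C++11 (or later) library; under C++03 the check on d[s.size()] would not be safe to perform on data().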

回忆那么伤 2024-07-13 17:28:15

In C++11/C++0x, data() and c_str() are no longer different, and thus data() is required to have a null terminator at the end as well.

21.4.7.1 basic_string accessors [string.accessors]

const charT* c_str() const noexcept;

const charT* data() const noexcept;

1 Returns: A pointer p such that p + i == &operator[](i) for each i in [0,size()].


21.4.5 basic_string element access [string.access]

const_reference operator[](size_type pos) const noexcept;

1 Requires: pos <= size().
2 Returns: *(begin() + pos) if pos < size(), otherwise a reference to an object of type T with value charT(); the referenced value shall not be modified.

夜唯美灬不弃 2024-07-13 17:28:15

Even though you have seen that they do the same thing, or that .data() calls .c_str(), it is not correct to assume that this will be the case for other compilers. It is also possible that your compiler will change in a future release.

Two ways to use std::string:

std::string can be used for both text and arbitrary binary data.

//Example 1
//Plain text:
std::string s1;
s1 = "abc";

//Example 2
//Arbitrary binary data:
std::string s2;
s2.append("a\0b\0b\0", 6);

You should use the .c_str() method when you are using your string as example 1.

You should use the .data() method when you are using your string as in example 2. Not because it is dangerous to use .c_str() in these cases, but because it makes it more explicit to others reviewing your code that you are working with binary data.
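
To make the text versus binary distinction concrete, here is a small sketch (the function names make_binary, c_length and count_nuls are invented for this example). It rebuilds the string from Example 2 and contrasts what a C-style API sees through c_str() with what a length-aware traversal over data() and size() sees.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>

// Builds the 6-byte binary payload from Example 2.
std::string make_binary() {
    std::string s;
    s.append("a\0b\0b\0", 6);
    return s;
}

// Length as a C API would see it: strlen stops at the first '\0'.
std::size_t c_length(const std::string& s) {
    return std::strlen(s.c_str());
}

// Length-aware traversal: walks all size() bytes, embedded nulls included.
std::ptrdiff_t count_nuls(const std::string& s) {
    return std::count(s.data(), s.data() + s.size(), '\0');
}
```

For the binary payload, size() is 6 while c_length() stops after the first byte, which is why pairing data() with size() is the clearer choice for binary content.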

Possible pitfall with using .data()

The following code is wrong under C++03, where .data() is not guaranteed to be null-terminated, and could cause a segfault in your program:

std::string s;
s = "abc";
char sz[512];
strcpy(sz, s.data()); // C++03: could crash, depending on how .data() is implemented
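
One way to avoid depending on the terminator at all is to copy by length. The sketch below (copy_to_buffer is a hypothetical helper, not a standard function) uses std::string::copy, which copies at most the requested number of characters and returns how many it copied, and then writes the terminator itself.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: copies s into dst (capacity dst_size bytes) and
// always null-terminates. Returns the number of characters copied,
// not counting the terminator. Truncates if the buffer is too small.
std::size_t copy_to_buffer(const std::string& s, char* dst, std::size_t dst_size) {
    if (dst_size == 0) {
        return 0;  // no room even for the terminator
    }
    // std::string::copy never writes a terminator, so reserve one byte for it.
    const std::size_t n = s.copy(dst, dst_size - 1);
    dst[n] = '\0';
    return n;
}
```

This behaves the same under C++03 and later standards, and it also protects against overflowing the destination, which strcpy does not.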

Why is it common for implementers to make .data() and .c_str() do the same thing?

Because it is more efficient to do so. The only way to make .data() return something that is not null-terminated would be to have .c_str() or .data() copy the internal buffer, or to maintain two buffers. Keeping a single null-terminated buffer means that an implementation of std::string only ever needs one internal buffer.

两人的回忆 2024-07-13 17:28:15

All the previous comments are consistent, but I'd also like to add that starting in C++17, str.data() on a non-const string returns a char* instead of a const char*.
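
A short sketch of what the C++17 change makes possible (fill_via_data is a made-up name, and the code assumes a C++17 or later compiler): a string can be sized first and then filled by writing through the pointer that non-const data() returns, which c_str() still does not allow.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: builds a std::string by writing through the
// mutable pointer that non-const data() returns since C++17.
std::string fill_via_data(const char* src) {
    const std::size_t n = std::strlen(src);
    std::string s(n, '\0');         // allocate n characters up front
    std::memcpy(s.data(), src, n);  // char* here; const char* before C++17
    return s;
}
```

Before C++17 the usual workaround was to write through &s[0]. Only the first size() characters may be modified; the terminator at index size() should be left alone.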

肥爪爪 2024-07-13 17:28:15

This has been answered already, so here are some notes on the purpose: freedom of implementation.

std::string operations - e.g. iteration, concatenation and element mutation - don't need the zero terminator. Unless you pass the string to a function expecting a zero terminated string, it can be omitted.

This would allow an implementation to have substrings share the actual string data: string::substr could internally hold a reference to shared string data, and the start/end range, avoiding the copy (and additional allocation) of the actual string data. The implementation would defer the copy until you call c_str or modify any of the strings. No copy would ever be made if the sub-strings involved are just read.

(Copy-on-write implementations aren't much fun in multithreaded environments, plus the typical memory/allocation savings aren't worth the more complex code today, so it's rarely done.)

Similarly, string::data allows a different internal representation, e.g. a rope (linked list of string segments). This can improve insert / replace operations significantly. Again, the list of segments would have to be collapsed to a single segment when you call c_str or data.
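
The shared-substring idea can be illustrated with std::string_view from C++17. This is a swapped-in technique rather than how any std::string implementation works, and the helper names shared_substr and to_c_ready are invented for this sketch. The view shares the original characters without copying and, for that reason, is not null-terminated at its own end; a copy is made only when a terminated string is needed.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Non-owning substring: refers to the characters of `whole`, copies nothing.
// The view is only valid while `whole` is alive and unmodified.
std::string_view shared_substr(const std::string& whole,
                               std::size_t pos, std::size_t len) {
    return std::string_view(whole).substr(pos, len);
}

// Deferred copy: materialise an owning, null-terminated string only
// when something such as a C API actually needs one.
std::string to_c_ready(std::string_view part) {
    return std::string(part);
}
```

For a string holding "hello world", shared_substr(w, 0, 5) points into w's own buffer, and the character just past the view is the space from the original string rather than a terminator.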

甩你一脸翔 2024-07-13 17:28:15

Quote from ANSI ISO IEC 14882 2003 (C++03 Standard):

    21.3.6 basic_string string operations [lib.string.ops]

    const charT* c_str() const;

    Returns: A pointer to the initial element of an array of length size() + 1 whose first size() elements equal the corresponding elements of the string controlled by *this and whose last element is a null character specified by charT().
    Requires: The program shall not alter any of the values stored in the array. Nor shall the program treat the returned value as a valid pointer value after any subsequent call to a non-const member function of the class basic_string that designates the same object as this.

    const charT* data() const;

    Returns: If size() is nonzero, the member returns a pointer to the initial element of an array whose first size() elements equal the corresponding elements of the string controlled by *this. If size() is zero, the member returns a non-null pointer that is copyable and can have zero added to it.
    Requires: The program shall not alter any of the values stored in the character array. Nor shall the program treat the returned value as a valid pointer value after any subsequent call to a non-const member function of basic_string that designates the same object as this.
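
The "Requires" clauses above say that a pointer from c_str() or data() must not be used after a later call to a non-const member function. A minimal sketch of respecting that rule (append_and_refresh is a hypothetical helper) is to fetch the pointer again after every modification rather than caching it:

```cpp
#include <string>

// Hypothetical helper: appends to s and returns a freshly obtained
// pointer. Any pointer taken from s before the append may be invalid,
// because operator+= is a non-const member and may reallocate.
const char* append_and_refresh(std::string& s, const std::string& suffix) {
    s += suffix;
    return s.c_str();  // re-fetch instead of reusing an older pointer
}
```

The returned pointer is itself only good until the next non-const call on s, so callers should use it immediately or copy what they need.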