Getting the size of a std::string in bytes

Posted on 2024-11-14 05:50:52

I would like to get the number of bytes that a std::string's contents occupy in memory, not the number of characters. The string contains multibyte text. Would std::string::size() do this for me?

EDIT: Also, does size() include the terminating null character?

Comments (6)

如梦亦如幻 2024-11-21 05:50:52

std::string operates on bytes, not on Unicode characters, so std::string::size() will indeed return the size of the data in bytes (without the overhead that std::string needs to store the data, of course).

No, std::string stores only the data you tell it to store (it does not need the trailing NULL character). So it will not be included in the size, unless you explicitly create a string with a trailing NULL character.
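The two points above can be sketched as a pair of small helpers. The names `dataBytes` and `cStringBytes` are hypothetical, introduced only for illustration; neither is part of the standard library.

```cpp
#include <cstddef>
#include <string>

// Number of bytes of character data held by s.
// Excludes the terminator and std::string's own bookkeeping.
inline std::size_t dataBytes(const std::string& s)
{
    return s.size() * sizeof(std::string::value_type); // sizeof(char) == 1
}

// Bytes needed to copy s.c_str() into a raw buffer, terminator included.
inline std::size_t cStringBytes(const std::string& s)
{
    return s.size() + 1;
}
```

Assuming the text is UTF-8 encoded, `dataBytes(std::string("h\xC3\xA9llo"))` counts the two-byte "é" as two bytes. A null character is only counted by size() when it is stored explicitly, for example with `std::string("ab\0", 3)`.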

旧城烟雨 2024-11-21 05:50:52

You could be pedantic about it:

std::string x("X");

std::cout << x.size() * sizeof(std::string::value_type);

But std::string::value_type is char and sizeof(char) is defined as 1.

This only becomes important if you typedef the string type (because it may change in the future or because of compiler options).

// Some header file:
typedef   std::basic_string<T_CHAR>  T_string;

// Source a million miles away
T_string   x("X");

std::cout << x.size() * sizeof(T_string::value_type);

白日梦 2024-11-21 05:50:52

std::string::size() is indeed the size in bytes.

十年不长 2024-11-21 05:50:52

To get the amount of memory in use by the string you would have to sum the capacity() with the overhead used for management. Note that it is capacity() and not size(). The capacity determines the number of characters (charT) allocated, while size() tells you how many of them are actually in use.

In particular, std::string implementations don't usually *shrink_to_fit* the contents, so if you create a string and then remove elements from the end, the size() will be decremented, but in most cases (this is implementation defined) capacity() will not.

Some implementations might not allocate the exact amount of memory required, but rather obtain blocks of fixed sizes to reduce memory fragmentation. In an implementation that used power-of-two-sized blocks for strings, a string of size 17 could allocate as many as 32 characters.
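The difference between size() and capacity() can be sketched as follows. Both helper names are hypothetical, and `approxFootprint` is only a rough estimate: it assumes the buffer is a separate allocation, so it over-counts on implementations that use a small-string optimization, and it ignores any allocator overhead.

```cpp
#include <cstddef>
#include <string>

// Characters that are reserved but not currently in use.
inline std::size_t unusedCapacity(const std::string& s)
{
    return s.capacity() - s.size();
}

// Rough estimate of the memory tied up by one std::string:
// the object itself plus the reserved buffer and its terminator.
// Assumption: the buffer lives outside the object (no small-string
// optimization) and the allocator adds no overhead of its own.
inline std::size_t approxFootprint(const std::string& s)
{
    return sizeof(std::string) + (s.capacity() + 1) * sizeof(char);
}
```

After `s.reserve(100)` followed by appending three characters, size() is 3 while capacity() is at least 100, so `unusedCapacity(s)` is at least 97.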

夜灵血窟げ 2024-11-21 05:50:52

Yes, size() will give you the number of char elements in the string. One character in a multibyte encoding can take up several char elements.
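To see the gap between char count and character count, here is a minimal sketch that counts code points. It assumes the string holds well-formed UTF-8, which the question does not state, and `utf8CodePoints` is a hypothetical helper name.

```cpp
#include <cstddef>
#include <string>

// Counts code points in s, assuming s holds well-formed UTF-8.
// Continuation bytes have the bit pattern 10xxxxxx, so every byte
// that does not match that pattern starts a new code point.
inline std::size_t utf8CodePoints(const std::string& s)
{
    std::size_t count = 0;
    for (unsigned char byte : s) {
        if ((byte & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

For the UTF-8 bytes of "héllo" (`"h\xC3\xA9llo"`), size() reports the byte count while this helper reports one fewer, because "é" occupies two char elements. Note that it counts code points, not user-perceived characters, so combining sequences are still counted per code point.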

魂ガ小子 2024-11-21 05:50:52

There is inherent conflict in the question as written: std::string is defined as std::basic_string<char,...> -- that is, its element type is char (1-byte), but later you stated "the string contains a multibyte string" ("multibyte" == wchar_t?).

The size() member function does not count a trailing null. Its value represents the number of characters, that is, elements of the string's character type (not bytes).

Assuming you intended to say your multibyte string is std::wstring (alias for std::basic_string<wchar_t,...>), the memory footprint for the std::wstring's characters, including the null-terminator is:

std::wstring myString;
 ...
size_t bytesCount = (myString.size() + 1) * sizeof(wchar_t);

It's instructive to consider how one would write a reusable template function that would work for ANY potential instantiation of std::basic_string<> like this**:

// Return the number of bytes occupied by the characters of inString,
// optionally counting the null terminator that c_str() appends.
template <typename Elem>
inline std::size_t stringBytes(const std::basic_string<Elem>& inString, bool bCountNull)
{
   return (inString.size() + (bCountNull ? 1 : 0)) * sizeof(Elem);
}

** For simplicity, ignores the traits and allocator types rarely specified explicitly for std::basic_string<> (they have defaults).
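If the traits and allocator types should not be ignored, one possible variant names all three template parameters so that it also accepts strings with custom traits or allocators. This is a sketch, not the answer author's code, and `stringBytesAny` is a hypothetical name.

```cpp
#include <cstddef>
#include <string>

// Bytes occupied by the characters of inString, optionally counting
// the null terminator. Accepts any std::basic_string instantiation,
// including ones with non-default traits or allocator types.
template <typename CharT, typename Traits, typename Alloc>
inline std::size_t stringBytesAny(const std::basic_string<CharT, Traits, Alloc>& inString,
                                  bool countNull)
{
    return (inString.size() + (countNull ? 1 : 0)) * sizeof(CharT);
}
```

Called as `stringBytesAny(std::wstring(L"abc"), true)`, it returns `4 * sizeof(wchar_t)`, whose value depends on the platform's wchar_t width.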
