C++ equivalent of StringBuffer/StringBuilder?

Posted 2024-08-25 16:03:31, 286 characters, 11 views, 0 comments

Is there a C++ Standard Template Library class that provides efficient string concatenation functionality, similar to C#'s StringBuilder or Java's StringBuffer?

Comments (11)

追星践月 2024-09-01 16:03:31

The C++ way would be to use std::stringstream or just plain string concatenations. C++ strings are mutable so the performance considerations of concatenation are less of a concern.

With regard to formatting, you can do all the same formatting on a stream, just in a different way, as you would with cout. Alternatively, you can use a strongly typed functor that encapsulates this and provides a String.Format-like interface, e.g. boost::format.
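As a rough illustration of stream-based formatting, the sketch below pads a label and prints a number with two decimals using the standard manipulators from `<iomanip>`. The function name `format_price` and the column width are assumptions made for this example only.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Formats a label and a price through a string stream.
// format_price is a hypothetical helper name used only for this sketch.
std::string format_price(const std::string& label, double price) {
    std::ostringstream out;
    out << std::left << std::setw(8) << label   // pad the label to 8 columns, left-aligned
        << std::fixed << std::setprecision(2)   // two digits after the decimal point
        << price;                               // setw does not carry over to this field
    return out.str();
}
```

Calling `format_price("tea", 4.5)` pads the label to eight columns and appends `4.50`. The manipulators play the role that format specifiers play in String.Format.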

世界如花海般美丽 2024-09-01 16:03:31

The std::string::append function isn't a good option because it doesn't accept many forms of data. A more useful alternative is std::stringstream, like so:

#include <sstream>
// ...

std::stringstream ss;

//put arbitrary formatted data into the stream
ss << 4.5 << ", " << 4 << " whatever";

//convert the stream buffer into a string
std::string str = ss.str();
宛菡 2024-09-01 16:03:31

NOTE: this answer has received some attention recently. I am not advocating this as a solution (it is a solution I have seen in the past, before the STL). It is an interesting approach, and it should only be preferred over std::string or std::stringstream if profiling your code shows that it makes an improvement.

I normally use either std::string or std::stringstream. I have never had any problems with these. I would normally reserve some room first if I know the rough size of the string in advance.
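A minimal sketch of the "reserve first" habit mentioned above, assuming the pieces are already available in a `std::vector<std::string>`. The helper name `join` is hypothetical. It adds up the final length, reserves once, and then appends.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Joins the parts with a separator, reserving the final size up front so the
// result buffer is normally allocated only once. join is a hypothetical helper name.
std::string join(const std::vector<std::string>& parts, const std::string& sep) {
    std::size_t total = 0;
    for (const std::string& p : parts) total += p.size();
    if (!parts.empty()) total += sep.size() * (parts.size() - 1);

    std::string result;
    result.reserve(total);
    for (std::size_t i = 0; i < parts.size(); ++i) {
        if (i != 0) result += sep;   // separator between elements, not before the first
        result += parts[i];
    }
    return result;
}
```

Whether the extra pass over the parts pays off depends on the workload, so it is worth measuring before relying on it.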

I have seen other people make their own optimized string builder in the distant past.

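#include <string>

class StringBuilder {
private:
    std::string main;     // accumulated result
    std::string scratch;  // small staging area for short appends

    // Flush threshold; 1024 is an arbitrary number.
    static const std::string::size_type ScratchSize = 1024;

public:
    StringBuilder & append(const std::string & str) {
        scratch.append(str);
        // Move the staged text into the main string once it grows large enough.
        if (scratch.size() > ScratchSize) {
            main.append(scratch);
            scratch.clear();
        }
        return *this;
    }

    const std::string & str() {
        // Flush whatever is still staged before handing out the result.
        if (!scratch.empty()) {
            main.append(scratch);
            scratch.clear();
        }
        return main;
    }
};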

It uses two strings: one for the bulk of the string and another as a scratch area for concatenating short strings. It optimizes appends by batching the short append operations in one small string and then appending that to the main string, which reduces the number of reallocations required on the main string as it gets larger.

I have not needed this trick with std::string or std::stringstream. I think it was used with a third-party string library before std::string existed; it was that long ago. If you adopt a strategy like this, profile your application first.

手长情犹 2024-09-01 16:03:31

std::string is the C++ equivalent: It's mutable.

烟若柳尘 2024-09-01 16:03:31

You can use .append() for simply concatenating strings.

std::string s = "string1";
s.append("string2");

You can also use +=:

std::string s = "string1";
s += "string2";

As for the formatting operations of C#'s StringBuilder, I believe calling snprintf (or sprintf, if you want to risk writing buggy code ;-) ) into a character array and converting the result back to a string is about the only option.
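The snprintf route can be wrapped in a small helper. The sketch below is one possible shape, not a standard facility. The name `format_string` is an assumption, and it relies on the C99/C++11 behavior that `vsnprintf` with a null buffer and size 0 returns the length it would have written.

```cpp
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// printf-style formatting into a std::string. format_string is a hypothetical
// helper name; it sizes the buffer with a first vsnprintf call, then writes.
std::string format_string(const char* fmt, ...) {
    va_list args;
    va_start(args, fmt);
    va_list args_copy;
    va_copy(args_copy, args);   // a va_list can only be consumed once

    // With a null buffer and size 0, vsnprintf reports the required length.
    const int needed = std::vsnprintf(nullptr, 0, fmt, args);
    va_end(args);
    if (needed < 0) {           // formatting or encoding error
        va_end(args_copy);
        return std::string();
    }

    std::vector<char> buffer(static_cast<std::size_t>(needed) + 1);  // +1 for '\0'
    std::vsnprintf(buffer.data(), buffer.size(), fmt, args_copy);
    va_end(args_copy);
    return std::string(buffer.data(), static_cast<std::size_t>(needed));
}
```

As with any printf-style function, the format string and the argument types have to match. That lack of type checking is the main trade-off against streams.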

君勿笑 2024-09-01 16:03:31

Since std::string in C++ is mutable you can use that. It has a += operator and an append function.

If you need to append numerical data use the std::to_string functions.

If you want even more flexibility in the form of being able to serialise any object to a string then use the std::stringstream class. But you'll need to implement your own streaming operator functions for it to work with your own custom classes.
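A small sketch of both routes described above: `std::to_string` with `+=` for numbers, and a user-provided `operator<<` so that a custom type can be written into a `std::ostringstream`. `Point`, `describe_count` and `describe_point` are hypothetical names introduced only for this illustration.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Point is a hypothetical user-defined type used only for illustration.
struct Point {
    int x;
    int y;
};

// Streaming operator so Point works with any std::ostream, including string streams.
std::ostream& operator<<(std::ostream& os, const Point& p) {
    return os << '(' << p.x << ", " << p.y << ')';
}

// Builds a string with += and std::to_string for the number.
std::string describe_count(int n) {
    std::string s = "count: ";
    s += std::to_string(n);
    return s;
}

// Builds a string from a custom type through std::ostringstream.
std::string describe_point(const Point& p) {
    std::ostringstream out;
    out << "point: " << p;
    return out.str();
}
```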

安人多梦 2024-09-01 16:03:31

A convenient string builder for C++

As many people have answered before, std::stringstream is the method of choice.
It works well and has a lot of conversion and formatting options. IMO it has one pretty inconvenient flaw, though: you cannot use it as a one-liner or as an expression.
You always have to write:

std::stringstream ss;
ss << "my data " << 42;
std::string myString( ss.str() );

which is pretty annoying, especially when you want to initialize strings in the constructor.

The reason is that a) std::stringstream has no conversion operator to std::string, and b) the operator<< overloads of stringstream don't return a stringstream reference but a std::ostream reference, which cannot be used further as a string stream.

The solution is to derive from std::stringstream and give it better-matching operators:

namespace NsStringBuilder {
template<typename T> class basic_stringstream : public std::basic_stringstream<T>
{
public:
    basic_stringstream() {}

    operator const std::basic_string<T> () const                                { return std::basic_stringstream<T>::str();                     }
    basic_stringstream<T>& operator<<   (bool _val)                             { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (char _val)                             { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (signed char _val)                      { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (unsigned char _val)                    { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (short _val)                            { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (unsigned short _val)                   { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (int _val)                              { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (unsigned int _val)                     { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (long _val)                             { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (unsigned long _val)                    { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (long long _val)                        { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (unsigned long long _val)               { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (float _val)                            { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (double _val)                           { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (long double _val)                      { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (void* _val)                            { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (std::streambuf* _val)                  { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (std::ostream& (*_val)(std::ostream&))  { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (std::ios& (*_val)(std::ios&))          { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (std::ios_base& (*_val)(std::ios_base&)){ std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (const T* _val)                         { return static_cast<basic_stringstream<T>&>(std::operator << (*this,_val)); }
    basic_stringstream<T>& operator<<   (const std::basic_string<T>& _val)      { return static_cast<basic_stringstream<T>&>(std::operator << (*this,_val.c_str())); }
};

typedef basic_stringstream<char>        stringstream;
typedef basic_stringstream<wchar_t>     wstringstream;
}

With this, you can write things like

std::string myString( NsStringBuilder::stringstream() << "my data " << 42 );

even in the constructor.

I have to confess I didn't measure the performance, since I have not yet used it in an environment that makes heavy use of string building, but I assume it won't be much worse than std::stringstream, since everything is done via references (except the conversion to string, but that's a copy operation in std::stringstream as well).
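If the goal is only a one-line expression, a free function is a lighter alternative to deriving from the stream class. The sketch below is an assumption-laden illustration, not part of the answer above. `concat` is a hypothetical name, and it streams each argument into a local `std::ostringstream` using a C++11 pack expansion.

```cpp
#include <sstream>
#include <string>

// Streams every argument into one std::ostringstream and returns the result.
// concat is a hypothetical helper name, not part of the standard library.
template <typename... Args>
std::string concat(const Args&... args) {
    std::ostringstream out;
    // Expand the pack left to right; the array only exists to sequence the insertions.
    int expand[] = {0, ((void)(out << args), 0)...};
    (void)expand;
    return out.str();
}
```

With this, `std::string myString(concat("my data ", 42));` can be written as a single expression, including in a constructor's member initializer list.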

弥繁 2024-09-01 16:03:31

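std::string's += does accept a const char* (which is what literals like "string to add" are); what fails is using + on two const char* operands with no std::string involved. If you want one interface that also takes numbers and other types, stringstream is the closest to what is required: you just use << instead of +.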
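For reference, a minimal sketch of which combinations of std::string and string literals compile. The function name `build_greeting` is made up for this example, and the one combination that does not compile is left as a comment.

```cpp
#include <string>

// Shows which combinations involving string literals (const char*) compile.
// build_greeting is a hypothetical name used only for this sketch.
std::string build_greeting() {
    std::string s = "Hello";
    s += ", ";                       // std::string += const char* works
    s = s + "world";                 // std::string + const char* works
    s = "[" + s;                     // const char* + std::string works
    // std::string bad = "a" + "b";  // does not compile: both operands are const char*
    s += std::string("!") + "]";     // make one side a std::string to chain literals
    return s;
}
```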

深海里的那抹蓝 2024-09-01 16:03:31

A rope container may be worthwhile if you have to insert or delete strings at arbitrary positions in the destination string, or if you work with very long character sequences.
Here is an example from SGI's implementation:

crope r(1000000, 'x');          // crope is rope<char>. wrope is rope<wchar_t>
                                // Builds a rope containing a million 'x's.
                                // Takes much less than a MB, since the
                                // different pieces are shared.
crope r2 = r + "abc" + r;       // concatenation; takes on the order of 100s
                                // of machine instructions; fast
crope r3 = r2.substr(1000000, 3);       // yields "abc"; fast.
crope r4 = r2.substr(1000000, 1000000); // also fast.
reverse(r2.mutable_begin(), r2.mutable_end());
                                // correct, but slow; may take a
                                // minute or more.
瞎闹 2024-09-01 16:03:31

I wanted to add something new, for the following reason:

At first I failed to beat the efficiency of std::ostringstream's operator<<, but after more attempts I was able to make a StringBuilder that is faster in some cases.

Every time I append a string, I just store a reference to it somewhere and increase a counter holding the total size.
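The basic idea of remembering references and copying once can be sketched as follows. This is a deliberately simplified illustration, not the answer's actual opaque-buffer implementation. `DeferredBuilder` is a hypothetical name, and it stores plain pointers, so every appended string must outlive the call to `str()`.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Simplified sketch: remembers pointers to the appended strings and copies
// them once in str(). DeferredBuilder is a hypothetical name.
class DeferredBuilder {
public:
    // The caller must keep `s` alive and unchanged until str() is called.
    DeferredBuilder& append(const std::string& s) {
        parts_.push_back(&s);
        total_ += s.size();
        return *this;
    }

    // Reject temporaries (including literals converted to std::string),
    // since a pointer to them would dangle immediately.
    DeferredBuilder& append(std::string&&) = delete;

    std::string str() const {
        std::string result;
        result.reserve(total_);   // one allocation sized for the final string
        for (const std::string* p : parts_) result += *p;
        return result;
    }

private:
    std::vector<const std::string*> parts_;
    std::size_t total_ = 0;
};
```

Storing pointers is what makes appends cheap, and it is also the source of the lifetime hazard: destroying or modifying an appended string before `str()` runs is undefined behavior.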

The way I finally implemented it (horror!) is with an opaque buffer (std::vector<char>):

  • 1-byte header: 2 bits to tell whether the following data is a moved string, a string, or a byte[]
  • 6 bits to tell the length of the byte[]

for byte[]

  • I store the bytes of short strings directly (for sequential memory access)

for moved strings (strings appended with std::move)

  • The pointer to a std::string object (we have ownership)
  • set a flag in the class if there are unused reserved bytes there

for strings

  • The pointer to a std::string object (no ownership)

There's also one small optimization: if the last inserted string was moved in, it checks for reserved but unused bytes and stores further bytes there instead of using the opaque buffer. (This is to save some memory; it actually makes things slightly slower, possibly depending on the CPU, and it is rare to see strings with extra reserved space anyway.)

This was finally slightly faster than std::ostringstream, but it has a few downsides:

  • I assumed fixed-length char types (so 1, 2 or 4 bytes, not good for UTF-8). I'm not saying it will not work for UTF-8, just that I was too lazy to check.
  • I used bad coding practice (an opaque buffer, which is easy to make non-portable; I believe mine is portable, by the way)
  • Lacks all the features of ostringstream
  • If a referenced string is deleted before all the strings are merged: undefined behaviour.

Conclusion? Use
std::ostringstream

It already fixes the biggest bottleneck, and gaining a few percentage points in speed with my implementation is not worth the downsides.

余生一个溪 2024-09-01 16:03:31

A regular std::string is the equivalent of StringBuffer in Java, because it is mutable. Simplifying a lot, Java strings are something like immutable global constants, so a new object is created every time they have to change. In contrast, StringBuffer (and std::string) is the equivalent of a dynamic array of chars. Appending to it is amortized O(1), written O*(1) here: the worst case of a single append is O(n), when the buffer has to grow, but k consecutive appends cost O(k) in total, which is O(1) per append on average.


Read more about amortized complexity analysis of insertion into a dynamic array.
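One way to see the amortized behavior is to count how often the buffer actually grows. The sketch below is a rough experiment rather than a guarantee. `count_reallocations` is a hypothetical helper, and the growth policy of std::string is left to the implementation, so the exact count varies between standard libraries.

```cpp
#include <cstddef>
#include <string>

// Appends n characters one at a time and counts how often capacity() changes,
// i.e. how many times the buffer was reallocated. count_reallocations is a
// hypothetical helper name.
std::size_t count_reallocations(std::size_t n) {
    std::string s;
    std::size_t reallocations = 0;
    std::size_t last_capacity = s.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        s.push_back('x');
        if (s.capacity() != last_capacity) {   // the buffer grew during this append
            ++reallocations;
            last_capacity = s.capacity();
        }
    }
    return reallocations;
}
```

On common implementations the capacity grows geometrically, so a million appends trigger only a few dozen reallocations and the occasional O(n) copy is spread over many cheap appends.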
