Is there any way a std::stringstream can have its fail/bad bit set?

Posted on 2024-08-27 01:45:49

A common piece of code I use for simple string splitting looks like this:

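#include <sstream>
#include <string>
#include <vector>

// Splits s on delim by reading delim-terminated fields from a stringstream.
inline std::vector<std::string> split(const std::string &s, char delim) {
    std::vector<std::string> elems;
    std::stringstream ss(s);
    std::string item;
    // The loop ends when getline leaves the stream in a failed state.
    while(std::getline(ss, item, delim)) {
        elems.push_back(item);
    }
    return elems;
}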

Someone mentioned that this will silently "swallow" errors occurring in std::getline, and of course I agree that's the case. But it made me wonder: what could possibly go wrong here in practice that I would need to worry about? Basically, it all boils down to this:

inline std::vector<std::string> split(const std::string &s, char delim) {
    std::vector<std::string> elems;
    std::stringstream ss(s);
    std::string item;
    while(std::getline(ss, item, delim)) {
        elems.push_back(item);
    }

    if(/* what error can I catch here? */) {
        // *** How did we get here!? ***
    }

    return elems;
}

A stringstream is backed by a string, so we don't have to worry about any of the issues associated with reading from a file. There is no type conversion going on here, since getline simply reads until it sees the line delimiter or EOF. So we can't get any of the errors that something like boost::lexical_cast has to worry about.

I simply can't think of anything that could go wrong besides failing to allocate enough memory, and that will just throw a std::bad_alloc well before the std::getline even takes place. What am I missing?

Comments (1)

幸福丶如此 2024-09-03 01:45:49

I can't imagine what errors this person thinks might happen, and you should ask them to explain. Nothing can go wrong except allocation errors, as you mentioned, which are thrown and not swallowed.

The only thing I see that you're directly missing is that ss.fail() is guaranteed to be true after the while loop, because that's the condition being tested. (bool(stream) is equivalent to !stream.fail(), not stream.good().) As expected, ss.eof() will also be true, indicating the failure was due to EOF.
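
To make the state described above visible, one could record the flags once the loop exits. This is only a minimal sketch: `SplitState`, `split_with_state`, and `demo_split_state` are hypothetical names introduced for illustration, and the expectation that `bad()` stays false assumes an ordinary in-memory stringstream.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper type: the fields plus the stream flags seen after the loop.
struct SplitState {
    std::vector<std::string> elems;
    bool fail;  // ss.fail() after the loop
    bool eof;   // ss.eof() after the loop
    bool bad;   // ss.bad() after the loop
};

// Same loop as in the question, but the final stream state is reported
// instead of being discarded.
inline SplitState split_with_state(const std::string &s, char delim) {
    SplitState r{};
    std::stringstream ss(s);
    std::string item;
    while (std::getline(ss, item, delim)) {
        r.elems.push_back(item);
    }
    r.fail = ss.fail();
    r.eof = ss.eof();
    r.bad = ss.bad();
    return r;
}

// Usage sketch: the loop only exits once failbit is set, and for
// string-backed input the cause is end of input rather than badbit.
inline void demo_split_state() {
    SplitState r = split_with_state("a,b,c", ',');
    assert(r.elems.size() == 3);
    assert(r.fail && r.eof && !r.bad);
}
```

So the only check that could ever be meaningful after the loop is something like `ss.bad() || !ss.eof()`, and for a stringstream that condition is not expected to hold.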

However, there might be some confusion over what is actually happening. Because getline uses delim-terminated fields rather than delim-separated fields, input data such as "a\nb\n" has two instead of three fields, and this might be surprising. For lines this makes complete sense (and is POSIX standard), but how many fields, with a delim of '-', would you expect to find in "a-b-" after splitting?
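
The delim-terminated behavior is easy to check directly. The sketch below reuses the loop from the question under the hypothetical name `split_getline`; `demo_terminated_fields` is likewise just an illustrative name.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// The getline-based split from the question, renamed for this example.
inline std::vector<std::string> split_getline(const std::string &s, char delim) {
    std::vector<std::string> elems;
    std::stringstream ss(s);
    std::string item;
    while (std::getline(ss, item, delim)) {
        elems.push_back(item);
    }
    return elems;
}

inline void demo_terminated_fields() {
    // A trailing delimiter terminates the last field; it does not start a new one.
    assert(split_getline("a-b-", '-').size() == 2);
    assert(split_getline("a-b", '-').size() == 2);
    // A leading delimiter does produce an empty first field.
    assert(split_getline("-a", '-').size() == 2);
    assert(split_getline("-a", '-')[0].empty());
    // Empty input yields no fields at all.
    assert(split_getline("", '-').empty());
}
```

Note that "a-b-" and "a-b" are indistinguishable after splitting this way, which is exactly the surprise in question.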


Incidentally, here's how I'd write split (http://bitbucket.org/kniht/scraps/src/0b6b73529123/cpp/test_strutil.cpp#cl-47):

template<class OutIter>
OutIter split(std::string const& s, char delim, OutIter dest) {
  std::string::size_type begin = 0, end;
  while ((end = s.find(delim, begin)) != s.npos) {
    *dest++ = s.substr(begin, end - begin);
    begin = end + 1;
  }
  *dest++ = s.substr(begin);
  return dest;
}

This sidesteps iostreams and their problems entirely, avoids extra copies (there is no stringstream backing string, and the temporary returned by substr can be moved rather than copied where C++0x rvalue references are supported, as written), has the behavior I expect from split (which differs from yours), and works with any container.

std::deque<std::string> c;
split("a-b-", '-', std::back_inserter(c));
// c == {"a", "b", ""}
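
As a rough illustration of the "works with any container" point, the sketch below feeds the same function into a vector and a set. The `split` template is repeated from the answer so the example is self-contained; `demo_split_containers` is a hypothetical name used only here.

```cpp
#include <cassert>
#include <iterator>
#include <set>
#include <string>
#include <vector>

// Repeated from the answer above: delim-separated fields via std::string::find.
template<class OutIter>
OutIter split(std::string const& s, char delim, OutIter dest) {
  std::string::size_type begin = 0, end;
  while ((end = s.find(delim, begin)) != s.npos) {
    *dest++ = s.substr(begin, end - begin);
    begin = end + 1;
  }
  *dest++ = s.substr(begin);
  return dest;
}

inline void demo_split_containers() {
  // Sequence container: the trailing delimiter yields a final empty field.
  std::vector<std::string> v;
  split("a-b-", '-', std::back_inserter(v));
  assert(v.size() == 3 && v[0] == "a" && v[1] == "b" && v[2].empty());

  // Associative container: duplicates collapse, as with any set insertion.
  std::set<std::string> uniq;
  split("x,y,x", ',', std::inserter(uniq, uniq.end()));
  assert(uniq.size() == 2);

  // Empty input produces exactly one empty field, unlike the getline version.
  std::vector<std::string> e;
  split("", ',', std::back_inserter(e));
  assert(e.size() == 1 && e[0].empty());
}
```

Taking an output iterator rather than returning a vector leaves the choice of container, and of insertion policy, to the caller.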