An alternative to std::getline when input line endings are mixed

Posted on 2024-11-19 19:33:23

I'm trying to read in lines from a std::istream but the input may contain '\r' and/or '\n', so std::getline is no use.

Sorry to shout but this seems to need emphasis...

The input may contain either newline type or both.

Is there a standard way to do this? At the moment I'm trying

char c;
while (in >> c && '\n' != c && '\r' != c)
    out .push_back (c);

...but this skips over whitespace. D'oh! std::noskipws -- more fiddling required and now it's misbehaving.

Surely there must be a better way?!?

Comments (2)

晨光如昨 2024-11-26 19:33:23

OK, here's one way to do it. Basically I've made an implementation of std::getline which accepts a predicate instead of a delimiter character. This gets you two-thirds of the way there:

template <class Ch, class Tr, class A, class Pred>
std::basic_istream<Ch, Tr> &getline(std::basic_istream<Ch, Tr> &is, std::basic_string<Ch, Tr, A>& str, Pred p) {

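    typename std::basic_string<Ch, Tr, A>::size_type nread = 0;
    // Name the sentry so it lives for the whole read; 'true' keeps leading whitespace.
    typename std::basic_istream<Ch, Tr>::sentry guard(is, true);
    if (guard) {
        std::basic_streambuf<Ch, Tr> *sbuf = is.rdbuf();
        str.clear();

        while (nread < str.max_size()) {
            typename Tr::int_type c1 = sbuf->sbumpc();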
            if (Tr::eq_int_type(c1, Tr::eof())) {
                is.setstate(std::istream::eofbit);
                break;
            } else {
                ++nread;
                const Ch ch = Tr::to_char_type(c1);
                if (!p(ch)) {
                    str.push_back(ch);
                } else {
                    break;
                }
            }
        }
    }

    if (nread == 0 || nread >= str.max_size()) {
        is.setstate(std::istream::failbit);
    }

    return is;
}

with a functor similar to this:

struct is_newline {
    bool operator()(char ch) const {
        return ch == '\n' || ch == '\r';
    }
};

Now the only thing left is to determine whether the line ended on a '\r'. If it did and the next character is a '\n', just consume that character and discard it.

EDIT: So to put this all into a functional solution, here's an example:

#include <string>
#include <sstream>
#include <iostream>

namespace util {

    struct is_newline { 
        is_newline() : ch_('\0') {}

        // Non-const on purpose: records the last character tested.
        bool operator()(char ch) {
            ch_ = ch;
            return ch_ == '\n' || ch_ == '\r';
        }

        char ch_;
    };

    template <class Ch, class Tr, class A, class Pred>
    std::basic_istream<Ch, Tr> &getline(std::basic_istream<Ch, Tr> &is, std::basic_string<Ch, Tr, A>& str, Pred &p) {

        typename std::basic_string<Ch, Tr, A>::size_type nread = 0;

        // Name the sentry so it lives for the whole read; 'true' keeps leading whitespace.
        typename std::basic_istream<Ch, Tr>::sentry guard(is, true);
        if (guard) {
            std::basic_streambuf<Ch, Tr> *const sbuf = is.rdbuf();
            str.clear();

            while (nread < str.max_size()) {
                typename Tr::int_type c1 = sbuf->sbumpc();
                if (Tr::eq_int_type(c1, Tr::eof())) {
                    is.setstate(std::istream::eofbit);
                    break;
                } else {
                    ++nread;
                    const Ch ch = Tr::to_char_type(c1);
                    if (!p(ch)) {
                        str.push_back(ch);
                    } else {
                        break;
                    }
                }
            }
        }

        if (nread == 0 || nread >= str.max_size()) {
            is.setstate(std::istream::failbit);
        }

        return is;
    }
}

int main() {

    std::stringstream ss("this\ris a\ntest\r\nyay");
    std::string       item;
    util::is_newline  is_newline;

    while(util::getline(ss, item, is_newline)) {
        if(is_newline.ch_ == '\r' && ss.peek() == '\n') {
            ss.ignore(1);
        }

        std::cout << '[' << item << ']' << std::endl;
    }
}

I've made a couple of minor changes to my original example. The Pred p parameter is now a reference so that the predicate can store some data (specifically the last char tested). Likewise, I made the predicate's operator() non-const so it can store that character.

Then in main, I have a string in a std::stringstream which has all 3 versions of line breaks. I use my util::getline, and if the predicate object says that the last char was a '\r', then I peek() ahead and ignore 1 character if it happens to be '\n'.
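
One thing to keep in mind with the peek() approach: on an interactive stream, peek() can block until the next character arrives, so a line ending in a lone '\r' may not be delivered until more input shows up. A possible variation, sketched below, defers the check instead: it remembers that the previous line ended in '\r' and drops a leading '\n' on the next read. The names LineReader and collect_lines are hypothetical and not part of the answer above; this is a rough sketch for plain char streams, not a drop-in replacement for util::getline.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper class (an assumption, not from the answer): reads lines
// terminated by "\n", "\r", or "\r\n" without looking ahead in the stream.
class LineReader {
public:
    explicit LineReader(std::istream &is) : is_(is), pending_cr_(false) {}

    // Returns true if a line (possibly empty) was read into 'line'.
    bool next(std::string &line) {
        line.clear();
        char ch;
        bool got_any = false;
        while (is_.get(ch)) {
            if (pending_cr_) {
                pending_cr_ = false;
                if (ch == '\n') continue;  // second half of "\r\n": drop it
            }
            got_any = true;
            if (ch == '\n') return true;
            if (ch == '\r') {
                pending_cr_ = true;        // decide about a following '\n' later
                return true;
            }
            line += ch;
        }
        return got_any;  // last line without a terminator, or nothing left
    }

private:
    std::istream &is_;
    bool pending_cr_;
};

// Hypothetical convenience wrapper used to exercise LineReader.
std::vector<std::string> collect_lines(const std::string &text) {
    std::istringstream in(text);
    LineReader reader(in);
    std::vector<std::string> lines;
    std::string line;
    while (reader.next(line)) lines.push_back(line);
    return lines;
}
```

The trade-off is that the '\n' of a "\r\n" pair stays in the stream until the next call, so code that mixes this reader with other reads on the same stream would see that leftover character.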

一腔孤↑勇 2024-11-26 19:33:23

The usual way to read a line is with std::getline.

Edit: If your implementation of std::getline is broken, you could write something similar of your own, along these lines:

std::istream &getline(std::istream &is, std::string &s) { 
    char ch;

    s.clear();

    while (is.get(ch) && ch != '\n' && ch != '\r')
        s += ch;
    return is;
}

I should add that technically this probably isn't so much a matter of std::getline being broken as of the underlying stream implementation being broken: it's up to the stream to translate whatever characters signify the end of a line on the platform into a newline character. Regardless of exactly which parts are broken, however, if your implementation is broken, this may be able to make up for it (then again, if your implementation is broken badly enough, it's hard to be sure this will work either).
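
Note that the loop above stops at either character, so a "\r\n" pair is seen as two line endings and produces an extra empty line; it also returns a failed stream for a final line that has no terminator. A minimal sketch of one way to handle both cases follows. The names getline_any_eol and split_lines are made up for illustration and are not from the answer.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical variant of the loop above: reads one line terminated by
// "\n", "\r", or "\r\n". The terminator is consumed but not stored.
std::istream &getline_any_eol(std::istream &is, std::string &s) {
    s.clear();
    char ch;
    bool got_any = false;
    while (is.get(ch)) {
        got_any = true;
        if (ch == '\n') return is;
        if (ch == '\r') {
            // Treat "\r\n" as a single line ending.
            if (is.peek() == '\n') is.get();
            return is;
        }
        s += ch;
    }
    // get() failed at end of input. If characters were read first, report the
    // unterminated last line as a success by clearing failbit (eofbit stays).
    if (got_any) is.clear(is.rdstate() & ~std::ios_base::failbit);
    return is;
}

// Hypothetical convenience wrapper used to exercise getline_any_eol.
std::vector<std::string> split_lines(const std::string &text) {
    std::istringstream in(text);
    std::vector<std::string> lines;
    std::string line;
    while (getline_any_eol(in, line)) lines.push_back(line);
    return lines;
}
```

This assumes the stream delivers '\r' unchanged, for example a file opened in binary mode or an in-memory std::istringstream as used here.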
