Reading formatted data with C++'s stream operator >> when the data contains spaces

Posted on 2024-08-23 00:42:26

I have data in the following format:

4:How do you do?
10:Happy birthday
1:Purple monkey dishwasher
200:The Ancestral Territorial Imperatives of the Trumpeter Swan

The number can be anywhere from 1 to 999, and the string is at most 255 characters long. I'm new to C++ and it seems a few sources recommend extracting formatted data with a stream's >> operator, but when I want to extract a string it stops at the first whitespace character. Is there a way to configure a stream to stop parsing a string only at a newline or end-of-file? I saw that there was a getline method to extract an entire line, but then I still have to split it up manually [with find_first_of], don't I?

Is there an easy way to parse data in this format using only STL?

再见回来 2024-08-30 00:42:26

The C++ String Toolkit Library (StrTk) has the following solution to your problem:

#include <string>
#include <deque>
#include "strtk.hpp"

int main()
{
   struct line_type
   {
      unsigned int id;
      std::string str;
   };

   std::deque<line_type> line_list;

   const std::string file_name = "data.txt";

   strtk::for_each_line(file_name,
                        [&line_list](const std::string& line)
                        {
                           line_type temp_line;
                           const bool result = strtk::parse(line,
                                                            ":",
                                                            temp_line.id,
                                                            temp_line.str);
                           if (!result) return;
                           line_list.push_back(temp_line);
                        });

   return 0;
}

More examples can be found Here

鸠魁 2024-08-30 00:42:26

You can read the number before you use std::getline, which reads from a stream and stores into a std::string object. Something like this:

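int num;
string str;

while(cin>>num){
    cin.ignore();       // discard the ':' that >> leaves in the stream
    getline(cin,str);   // read the rest of the line, spaces included

}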
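
A fuller sketch of this approach, assuming the input arrives on any std::istream. The Entry struct and the read_entries function are illustrative names, not part of the answer. One detail to watch: after >> reads the number, the ':' is still sitting in the stream, so it is skipped before getline runs.

```cpp
#include <istream>
#include <sstream>   // std::istringstream, handy for trying this without a file
#include <string>
#include <vector>

// Hypothetical record type for one "number:text" line.
struct Entry {
    int num;
    std::string text;
};

// Reads "number:text" lines until extracting a number fails (e.g. at end of input).
std::vector<Entry> read_entries(std::istream& in) {
    std::vector<Entry> entries;
    Entry e;
    while (in >> e.num) {
        in.ignore();               // skip the ':' left behind by >>
        std::getline(in, e.text);  // rest of the line, spaces included
        entries.push_back(e);
    }
    return entries;
}
```

Passing std::cin, a std::ifstream, or a std::istringstream to read_entries should all work the same way, since the function only relies on the std::istream interface.
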
依 靠 2024-08-30 00:42:26

You've already been told about std::getline, but they didn't mention one detail that you'll probably find useful: when you call getline, you can also pass a parameter telling it what character to treat as the end of input. To read your number, you can use:

std::string number;
std::string name;

std::getline(infile, number, ':');
std::getline(infile, name);   

This will put the data up to the ':' into number, discard the ':', and read the rest of the line into name.
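
As a rough sketch of how those two calls might sit in a loop over the whole file: the function name read_pairs and the pair-based return type are assumptions made for illustration, and std::stoi (C++11) is one possible way to turn the number text into an int.

```cpp
#include <istream>
#include <sstream>   // std::istringstream, handy for trying this without a file
#include <string>
#include <utility>
#include <vector>

// Reads "number:text" lines, using getline with ':' as the first delimiter.
std::vector<std::pair<int, std::string>> read_pairs(std::istream& infile) {
    std::vector<std::pair<int, std::string>> result;
    std::string number;
    std::string name;
    // The loop stops as soon as either getline call fails (normally at end of input).
    while (std::getline(infile, number, ':') && std::getline(infile, name)) {
        // std::stoi throws std::invalid_argument if number is not numeric.
        result.emplace_back(std::stoi(number), name);
    }
    return result;
}
```

Reading the number as a string first keeps the two getline calls symmetrical; the cost is the explicit conversion step, which also has to deal with malformed input.
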

If you want to use >> to read the data, you can do that too, but it's a bit more difficult, and delves into an area of the standard library that most people never touch. A stream has an associated locale that's used for things like formatting numbers and (importantly) determining what constitutes "white space". You can define your own locale to define the ":" as white space, and the space (" ") as not white space. Tell the stream to use that locale, and it'll let you read your data directly.

#include <locale>
#include <vector>

struct colonsep: std::ctype<char> {
    colonsep(): std::ctype<char>(get_table()) {}

    static std::ctype_base::mask const* get_table() {
        static std::vector<std::ctype_base::mask> 
            rc(std::ctype<char>::table_size,std::ctype_base::mask());

        rc[':'] = std::ctype_base::space;
        rc['\n'] = std::ctype_base::space;
        return &rc[0];
    }
};

Now to use it, we "imbue" the stream with a locale:

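#include <fstream>
#include <iterator>
#include <algorithm>
#include <iostream>
#include <string>
#include <utility>

typedef std::pair<int, std::string> data;

// These overloads sit in namespace std so that istream_iterator and
// ostream_iterator can find them by argument-dependent lookup. Note that the
// standard does not actually permit adding overloads to namespace std.
namespace std { 
    std::istream &operator>>(std::istream &is, data &d) { 
       return is >> d.first >> d.second;
    }
    std::ostream &operator<<(std::ostream &os, data const &d) { 
        return os << d.first << ":" << d.second;
    }
}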

int main() {
    std::ifstream infile("testfile.txt");
    infile.imbue(std::locale(std::locale(), new colonsep));

    std::vector<data> d;

    std::copy(std::istream_iterator<data>(infile), 
              std::istream_iterator<data>(),
              std::back_inserter(d));

    // just for fun, sort the data to show we can manipulate it:
    std::sort(d.begin(), d.end());

    std::copy(d.begin(), d.end(), std::ostream_iterator<data>(std::cout, "\n"));
    return 0;
}

Now you know why that part of the library is so neglected. In theory, getting the standard library to do your work for you is great -- but in fact, most of the time it's easier to do this kind of job on your own instead.

天赋异禀 2024-08-30 00:42:26

Just read the data line by line (whole line) using getline and parse it.
To parse, use find_first_of().
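
A minimal sketch of the splitting step, assuming each line has already been read with getline. The helper name parse_line is made up for illustration, and std::stoi (C++11) is one possible way to convert the numeric prefix.

```cpp
#include <string>

// Splits one "number:text" line at the first ':'.
// Returns false if the line contains no ':' at all.
bool parse_line(const std::string& line, int& num, std::string& text) {
    const std::string::size_type pos = line.find_first_of(':');
    if (pos == std::string::npos) return false;
    num = std::stoi(line.substr(0, pos));  // throws if the prefix is not a number
    text = line.substr(pos + 1);           // everything after the first ':'
    return true;
}
```

Because only the first ':' is treated as the separator, any further colons stay in the text part unchanged.
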

我还不会笑 2024-08-30 00:42:26
int i;
char *string = (char*)malloc(256*sizeof(char)); // max is 255 chars, plus 1 for '\0'
// %255[^\n] reads at most 255 chars up to the newline; a scanset takes no trailing 's'
if (string != NULL && scanf("%d:%255[^\n]", &i, string) == 2) // 2 means both fields were converted
    printf("%s\n", string);
free(string);

It's C, and it will work in C++ too. scanf provides more control, but its only error reporting is the return value (the number of fields converted), so use it with caution :).
