What is a good way to tokenize lines from a file without using external libraries?

Posted on 2025-01-06 15:34:04, 600 characters, 0 views, 0 comments

I am trying to tokenize a database dump separated by commas. I only need to read the first word, which will tell me if this is the line I need and then tokenize the line and save each separated string in a vector.

I have had trouble keeping all of the datatypes in order. I am reading the file line by line with getline:

string line;
vector<string> tokens;

// Iterate through each line of the file
while( getline( file, line ) )
{
    // Here is where I want to tokenize. strtok, however, takes a character array and not a string.
}

The thing is, I only want to continue reading and tokenize a line if the first word is what I am after. Here is a sample of a line from the file:

example,1,200,200,220,10,550,550,550,0,100,0,-84,255

So, if I am after the string example, the program should tokenize the rest of that line for my use and then stop reading from the file.

Should I be using strtok, stringstream or something else?

Thank you!

Comments (3)

固执像三岁 2025-01-13 15:34:04
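#include <fstream>
#include <string>
#include <vector>
using namespace std;

// Returns the tokens of the first line that starts with "example,",
// or an empty vector if no such line exists.
// (The original name "do" is a C++ keyword and does not compile.)
vector<string> find_example(ifstream& file) {
    string line;
    const string prefix = "example,";
    vector<string> tokens;

    // Read lines until the wanted one is found; testing the stream itself
    // also handles a last line that has no trailing newline
    while (getline(file, line)) {
        // Compare the beginning of the line with the prefix
        if (line.compare(0, prefix.size(), prefix) == 0) {
            // Homemade tokenization
            string::size_type oldpos = 0;
            string::size_type pos;
            while ((pos = line.find(',', oldpos)) != string::npos) {
                tokens.push_back(line.substr(oldpos, pos - oldpos));
                oldpos = pos + 1;
            }
            tokens.push_back(line.substr(oldpos)); // don't forget the last bit
            break; // stop reading the file, as the question asks
        }
    }
    return tokens;
}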
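
The question also mentions trouble keeping the datatypes in order. Once a line has been split into strings, the numeric fields can be converted in a separate step. The following is only a sketch under assumptions: it requires C++11 for std::stoi, it assumes every field after the first is an integer (true for the sample line, not confirmed for the whole dump), and fields_to_ints is a hypothetical name.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: convert every token after the first (the record name)
// to int. Throws std::invalid_argument if a field is not entirely a number.
std::vector<int> fields_to_ints(const std::vector<std::string>& tokens)
{
    std::vector<int> values;
    for (std::size_t i = 1; i < tokens.size(); ++i) {
        std::size_t used = 0;
        // std::stoi itself throws if the field does not start with a number
        int value = std::stoi(tokens[i], &used);
        // Reject fields such as "12x" that std::stoi would silently truncate
        if (used != tokens[i].size())
            throw std::invalid_argument("trailing characters in field: " + tokens[i]);
        values.push_back(value);
    }
    return values;
}
```

Keeping the split and the conversion separate means the tokenizer stays generic, and a malformed field is reported instead of being read as a partial number.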
離殇 2025-01-13 15:34:04

How do I tokenize a string in C++?

http://www.daniweb.com/software-development/cpp/threads/27905

Hope this helps, though I am not a proficient C/C++ programmer. For the record, it would be nice if you could specify the language you are using in the tags or in the post.
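
For the comma-separated lines in the question, one standard-library option is std::getline with a delimiter on a std::istringstream. The sketch below is not taken from the linked thread; split_csv_line and find_record are hypothetical names, and it assumes plain fields with no quoting or embedded commas.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split one line on commas using std::getline.
// Empty fields in the middle of a line are kept; a trailing empty field is dropped.
std::vector<std::string> split_csv_line(const std::string& line)
{
    std::vector<std::string> tokens;
    std::istringstream fields(line);
    std::string field;
    while (std::getline(fields, field, ','))
        tokens.push_back(field);
    return tokens;
}

// Hypothetical helper: return the tokens of the first line whose first field
// equals `key`, and stop reading there. Returns an empty vector if none matches.
std::vector<std::string> find_record(std::istream& in, const std::string& key)
{
    std::string line;
    while (std::getline(in, line)) {
        // Compare only the first field before paying for a full split;
        // if the line has no comma, the whole line is compared
        if (line.compare(0, line.find(','), key) == 0)
            return split_csv_line(line);
    }
    return {};
}
```

Taking std::istream& rather than std::ifstream& lets the same function work on an open file or on an in-memory std::istringstream, for example find_record(file, "example").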

最初的梦 2025-01-13 15:34:04

Tokenizer.h

#ifndef TOKENIZER_H
#define TOKENIZER_H

#include <string>
#include <vector>
#include <sstream>

class Tokenizer
{
public:
    Tokenizer();
    ~Tokenizer();
    void Tokenize(std::string& str, std::vector<std::string>& tokens);
};

#endif /* TOKENIZER_H */

Tokenizer.cpp

#include "Tokenizer.h"

using namespace std;

// Split str on any character in delimiters and append the pieces to tokens.
// Runs of delimiters count as one separator, so empty fields are skipped.
// (The earlier seps() helper inserted '|' between all characters and corrupted
// the tokens; passing its temporary result to a non-const reference is also
// rejected by standard-conforming compilers, so it has been removed.)
static void tok(const string& str, vector<string>& tokens, const string& delimiters = ",")
{
    // Skip leading delimiters, then find the end of the first token
    string::size_type lastPos = str.find_first_not_of(delimiters, 0);
    string::size_type pos = str.find_first_of(delimiters, lastPos);

    while (string::npos != pos || string::npos != lastPos)
    {
        tokens.push_back(str.substr(lastPos, pos - lastPos));
        // Move past the delimiter(s) to the start of the next token
        lastPos = str.find_first_not_of(delimiters, pos);
        pos = str.find_first_of(delimiters, lastPos);
    }
}

Tokenizer::Tokenizer()
{
}

void Tokenizer::Tokenize(string& str, vector<string>& tokens)
{
    tok(str, tokens);
}

Tokenizer::~Tokenizer()
{
}

To tokenize a string:

#include "Tokenizer.h"
#include <string>
#include <vector>
#include <iostream>
#include <cstdlib>

using namespace std;

int main()
{
    // Required variables for later below
    vector<string> t;
    string s = "This is one string,This is another,And this is another one aswell.";
    // What you need to include:
    Tokenizer tokenizer;
    tokenizer.Tokenize(s, t); // s = a string to tokenize, t = vector to store tokens
    // Below is just to show the tokens in the vector<string> (C++11 or later)
    for (const auto& c : t)
        cout << c << endl;
    system("pause"); // Windows-only: keeps the console window open
    return 0;
}