Need help deleting entries from a large text file based on the contents of another text file

Posted on 2024-12-10 16:18:33

Good day. I could really use your help on this one. I have a stats text file in the following format.

ID=1000000 
Name=Name1
Field1=Value1 
...(Fields 2 to 25)
Field26=Value26 

ID=1000001
Name=Name2
Field1=Value1 
...(Fields 2 to 25) 
Field26=Value26

ID=1000002
Name=Name2
Field1=Value1 
...(Fields 2 to 25) 
Field26=Value26 

...goes up to 15000

I have an active people text file separated by line breaks.

Name2
Name5
Name11
Name12 
...goes up to 1400 Random Names

I need to delete a record (ID, Name, Fields 1 to 26) from the stats text file if its name is not found in the active people text file. In the example above, the record for Name1 (ID, Name, Fields 1 to 26) should be deleted since Name1 is not in the active people text file.

I've tried reformatting the stats file in Notepad++ using TextFX->Quick->Find/Replace to convert it to a comma-separated file with one record per line. I had it rearranged to

ID       Name    Field1  ...Fields2 to Fields 25... Field26
1000000  Name1   Value1  ...Value2 to Value 25...   Value26
1000001  Name2   Value1  ...Value2 to Value 25...   Value26
1000002  Name2   Value1  ...Value2 to Value 25...   Value26

I've opened it with Excel, and I've created two tables (a stats table and an active names table) in MySQL using the CSV file. I'm not sure how to process this automatically. Besides removing inactive records, the other problem I have is rewriting the result back to its old format.

I've been trying my best to figure this out for hours on end. Is there a solution that won't require me to find, copy, paste and switch between the two files 1400 times? Unfortunately, I have to keep the stats file in this format.

Please help. Thank you.

翻身的咸鱼 2024-12-17 16:18:33

Here's a C++ program that will process the files for you:

#include <algorithm>
#include <cctype>
#include <fstream>
#include <iostream>
#include <set>
#include <string>
#include <vector>

//trim functions adapted from:
//http://stackoverflow.com/questions/216823/whats-the-best-way-to-trim-stdstring/217605#217605
//rewritten with a plain predicate, since std::not1 and std::ptr_fun were removed in C++17
static bool myIsNotSpace(unsigned char test)
{
    return !std::isspace(test);
}

//removes trailing whitespace in place
static std::string &rtrim(std::string &s) {
    s.erase(std::find_if(s.rbegin(), s.rend(), myIsNotSpace).base(), s.end());
    return s;
}

//removes leading whitespace in place
static std::string &ltrim(std::string &s) {
    s.erase(s.begin(), std::find_if(s.begin(), s.end(), myIsNotSpace));
    return s;
}

static std::string &trim(std::string &s) {return ltrim(rtrim(s));}

int main(int argc,char * argv[])
{
    std::ifstream peopleFile;
    peopleFile.open("people.txt");

    if (!peopleFile.is_open()) {
        std::cout << "Could not open people.txt" << std::endl;
        return -1;
    }

    std::set<std::string> people;

    //loop on the result of getline rather than on eof(), so a failed read ends the loop
    std::string somePerson;
    while (std::getline(peopleFile,somePerson)) {
        trim(somePerson);
        if (!somePerson.empty()) {
            people.insert(somePerson);
        }
    }

    peopleFile.close();

    std::ifstream statsFile;
    statsFile.open("stats.txt");

    if (!statsFile.is_open()) {
        std::cout << "could not open stats.txt" << std::endl;
        return -2;
    }

    std::ofstream newStats;
    newStats.open("new_stats.txt");

    if (!newStats.is_open()) {
        std::cout << "could not open new_stats.txt" << std::endl;
        statsFile.close();
        return -3;
    }

    size_t totalRecords=0;
    size_t includedRecords=0;

    bool firstRecord=true;
    bool included=false;
    std::vector<std::string> record;
    //loop on the result of getline rather than on eof(), so a failed read ends the loop
    std::string recordLine;
    while (std::getline(statsFile,recordLine)) {
        std::string trimmedRecordLine(recordLine);
        trim(trimmedRecordLine);

        if (trimmedRecordLine.empty()) {
            if (!record.empty()) {
                ++totalRecords;

                if (included) {
                    ++includedRecords;

                    if (firstRecord) {
                        firstRecord=false;
                    } else {
                        newStats << std::endl;
                    }

                    for (std::vector<std::string>::iterator i=record.begin();i!=record.end();++i) {
                        newStats << *i << std::endl;
                    }
                    included=false;
                }

                record.clear();
            }
        } else {
            record.push_back(recordLine);
            if (!included) {
                if (0==trimmedRecordLine.compare(0,4,"Name")) {
                    trimmedRecordLine=trimmedRecordLine.substr(4);
                    ltrim(trimmedRecordLine);
                    if (!trimmedRecordLine.empty() && '='==trimmedRecordLine[0]) {
                        trimmedRecordLine=trimmedRecordLine.substr(1);
                        ltrim(trimmedRecordLine);
                        included=people.end()!=people.find(trimmedRecordLine);
                    }
                }
            }
        }
    }

    if (!record.empty()) {
        ++totalRecords;

        if (included) {
            ++includedRecords;

            if (firstRecord) {
                firstRecord=false;
            } else {
                newStats << std::endl;
            }

            for (std::vector<std::string>::iterator i=record.begin();i!=record.end();++i) {
                newStats << *i << std::endl;
            }
            included=false;
        }

        record.clear();
    }

    statsFile.close();
    newStats.close();

    std::cout << "Wrote new_stats.txt with " << includedRecords << " of the " << totalRecords
              << ((1==totalRecords)?" record":" records") << " found in stats.txt after filtering against the "
              << people.size() << ((1==people.size())?" person":" people") << " found in people.txt" << std::endl;

    return 0;
}
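
The filtering step in the program above can also be written as a function that works on streams instead of fixed file names, which makes it easy to try on a small sample before running it on the real files. The following is only a condensed sketch of the same idea, not a replacement for the program: the names `trimmed` and `filterRecords` are made up for illustration, and it assumes that each record's name line starts with exactly `Name=` (no spaces before the `=`) and that records are separated by blank lines.

```cpp
#include <cctype>
#include <istream>
#include <ostream>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: returns a copy of s without leading/trailing whitespace.
static std::string trimmed(const std::string &s) {
    std::size_t b = 0, e = s.size();
    while (b < e && std::isspace(static_cast<unsigned char>(s[b]))) ++b;
    while (e > b && std::isspace(static_cast<unsigned char>(s[e - 1]))) --e;
    return s.substr(b, e - b);
}

// Hypothetical helper: copies to `out` only the blank-line-separated records
// whose "Name=" value is in `active`. Returns the number of records kept.
static std::size_t filterRecords(std::istream &in,
                                 const std::set<std::string> &active,
                                 std::ostream &out) {
    std::vector<std::string> record; // lines of the record being read
    bool keep = false;               // true once an active name was seen in it
    std::size_t kept = 0;

    // Writes the buffered record if it should be kept, then resets the state.
    auto flushRecord = [&]() {
        if (keep && !record.empty()) {
            if (kept > 0) out << '\n'; // blank line between written records
            for (const std::string &l : record) out << l << '\n';
            ++kept;
        }
        record.clear();
        keep = false;
    };

    std::string line;
    while (std::getline(in, line)) {
        const std::string t = trimmed(line);
        if (t.empty()) { // a blank line ends the current record
            flushRecord();
            continue;
        }
        record.push_back(line); // keep the line exactly as it was read
        if (t.compare(0, 5, "Name=") == 0 &&
            active.count(trimmed(t.substr(5))) > 0) {
            keep = true;
        }
    }
    flushRecord(); // the last record may not be followed by a blank line
    return kept;
}
```

To use it on the real data, open `stats.txt` with a `std::ifstream` and `new_stats.txt` with a `std::ofstream`, fill a `std::set<std::string>` from `people.txt`, and pass all three to `filterRecords`. For a quick check, a `std::istringstream` holding two or three sample records and a `std::ostringstream` for the output work the same way.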