Merging rows in a 2D vector based on identical values in one column

Posted on 2025-01-13 21:21:22


I have a 2D vector with 6 columns and 500 rows. I want to combine rows by comparing a single column value (PDG_ID), i.e. if the PDG_ID column value is the same for several rows, I will take the mean of the other five columns and store these rows as one row.
Any idea how to do that in C++?
(2D vector with six columns)


Comments (1)

旧情别恋 2025-01-20 21:21:22

You need to understand the requirements and then select a fitting design.

In your case, you want to group the rows that have the same ID and calculate the mean values of their data entries.

So, there is a relation between one ID and one or many data entries. Or, in C++ terms, an ID has one or many associated entries.

In C++, we have so-called associative containers, like std::map or std::unordered_map. Here, we can store a key (the ID) together with its associated data.
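To illustrate the idea of a key with associated data, here is a minimal sketch that groups plain (id, value) pairs. The helper name groupById and the pair-based input are assumptions made for this illustration only; they are not part of the implementation further down.

```cpp
#include <map>
#include <utility>
#include <vector>

// Hypothetical helper: collect all values that share the same id.
// std::map iterates its keys in ascending order; std::unordered_map
// could be used instead when the iteration order does not matter.
std::map<int, std::vector<double>> groupById(
    const std::vector<std::pair<int, double>>& rows) {

    std::map<int, std::vector<double>> grouped;
    for (const auto& row : rows) {
        // operator[] creates an empty vector the first time an id is seen
        grouped[row.first].push_back(row.second);
    }
    return grouped;
}
```

With std::map, iterating over the result visits the IDs in ascending order (13 before 211, for example), regardless of the order in which the rows arrived.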

If we put all data of one row into one struct, we could write something like:

struct PDG {
    int ID{};
    int status{};
    double Px{};
    double Py{};
    double Pz{};
    double E{};
};

And, if we want to store each ID together with its many associated PDGs, we can define a map like this:

std::map<int, std::vector<PDG>> groupedPDGs{};

In this map, we can store IDs and associated data consisting of one or many PDGs.

Then we can add some very small and simple helper functions, for example for IO functionality or for the calculation of the mean values. With that, we break down a big, more complicated problem into simpler parts.

Then, the overall implementation could look like this:
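As one possible example of such a small helper, the following sketch parses a single comma-separated line into a PDG and reports whether the line was well formed. The name parseRow is hypothetical and the separator check is an addition of this sketch, not something the implementation in this answer does.

```cpp
#include <sstream>
#include <string>

// Same fields as the PDG struct shown earlier, repeated here so that
// the sketch compiles on its own.
struct PDG {
    int ID{};
    int status{};
    double Px{};
    double Py{};
    double Pz{};
    double E{};
};

// Hypothetical helper: parse one line such as
// "211, 1, 0.483, -0.312, 1.52, 1.63".
// Returns false if a field is missing or a separator is not a comma;
// 'out' is only modified on success.
bool parseRow(const std::string& line, PDG& out) {
    std::istringstream iss{ line };
    PDG tmp{};
    char c1{}, c2{}, c3{}, c4{}, c5{};

    // Extract the six fields and the five separators between them
    if (!(iss >> tmp.ID >> c1 >> tmp.status >> c2 >> tmp.Px >> c3
              >> tmp.Py >> c4 >> tmp.Pz >> c5 >> tmp.E))
        return false;

    // Every separator must be a comma
    if (c1 != ',' || c2 != ',' || c3 != ',' || c4 != ',' || c5 != ',')
        return false;

    out = tmp;
    return true;
}
```

A helper like this could be called once per line read with std::getline, so that a header line or a malformed row is skipped instead of silently ending the read loop.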

#include <iostream>
#include <fstream>
#include <vector>
#include <string>
#include <iterator>
#include <map>
#include <iomanip>

// Simple data struct with all necessary values and IO functions
struct PDG {

    // Data part
    int ID{};
    int status{};
    double Px{};
    double Py{};
    double Pz{};
    double E{};

    // Input of one row
    friend std::istream& operator >> (std::istream& is, PDG& pdg) {
        char c{};
        return is >> pdg.ID >> c >> pdg.status >> c >> pdg.Px >> c >> pdg.Py >> c >> pdg.Pz >> c >> pdg.E;
    }
    // Output of one row
    friend std::ostream& operator << (std::ostream& os, const PDG& pdg) {
        return os << "ID: " << std::setw(5) << pdg.ID << "\tStatus: " << pdg.status << "\tPx: " << std::setw(9) << pdg.Px 
            << "\tPy: " << std::setw(9) << pdg.Py << "\tPz: " << std::setw(9) << pdg.Pz << "\tE: " << std::setw(9) << pdg.E;
    }
};

// Alias/Abbreviation for a vector of PDGs
using PDGS = std::vector<PDG>;

// Calculate a mean value for vector of PDG data
PDG calculateMean(const PDGS& pdgs) {

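    // Here we store the result. Initialize with ID and status from the first row and zeroes for the sums.
    // Precondition: pdgs must not be empty, because front() on an empty vector is undefined behavior.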
    PDG result{ pdgs.front().ID, pdgs.front().status, 0.0, 0.0, 0.0, 0.0};

    // Add up data fields according to type
    for (const PDG& pdg : pdgs) {
        result.Px += pdg.Px;
        result.Py += pdg.Py;
        result.Pz += pdg.Pz;
        result.E += pdg.E;
    }
    // Get mean value
    const double count = static_cast<double>(pdgs.size());
    result.Px /= count;
    result.Py /= count;
    result.Pz /= count;
    result.E /= count;

    // And return result to calling function
    return result;
}

int main() {

    // Open the source file containing the data and check whether it could be opened
    // (an if statement with initializer requires C++17)
    if (std::ifstream ifs{ "pdg.txt" }; ifs) {

        // Read header line and throw away
        std::string header{}; std::getline(ifs, header);
        
        // Here we will store the PDGs grouped by their ID
        std::map<int, PDGS> groupedPDGs{};

        // Read all source lines
        PDG pdg{};
        while (ifs >> pdg)

            // Store read values grouped by their ID
            groupedPDGs[pdg.ID].push_back(pdg);

        // Result with mean values
        PDGS result{};

        // Calculate mean values and store in additional vector
        for (const auto& [id, pdgs] : groupedPDGs)
            result.push_back(calculateMean(pdgs));

        // Debug: Show output to user
        for (const PDG& p : result)
            std::cout << p << '\n';
    }
    else
        std::cerr << "\nError: Could not open source datafile\n\n";
}

With an input file like:

PDG ID, Status, Px, Py, Pz, E
22, 1, 0.00658, 0.0131, -0.00395, 0.0152
13, 1, -43.2, -44.7, -49.6, 79.6
14, 1, 3.5, 21.4, 0.499, 21.7
16, 1, 41.1, -18, 27.8, 52.8
211, 1, 0.483, -0.312, 1.52, 1.63
211, 1, -0.247, -1.75, 45.2, 45.2
321, 1, 0.717, 0.982, 52.6, 52.6
321, 1, 0.112, 0.423, 33.2, 33.2
211, 1, 0.191, -0.68, -178, 178
2212, 1, 1.08, -0.428, -1.78E+03, 1.78E+03
2212, 1, 7.61, 4.28, 76.3, 76.8
211, 1, 0.176, 0.247, 8.9, 8.9
211, 1, 0.456, -0.73, 0.342, 0.937
2112, 1, 0.633, -0.904, 0.423, 1.51
2112, 1, 1, -0.645, 0.366, 1.56
211, 1, -0.0722, 0.147, -0.153, 0.264
211, 1, 0.339, 0.402, 0.304, 0.623
211, 1, 3.64, 2.58, -2.84, 5.29
211, 1, 0.307, 0.208, -5.69, 5.71
2212, 1, 0.118, 0.359, -3.29, 3.45

we get the following output:

ID:    13       Status: 1       Px:     -43.2   Py:     -44.7   Pz:     -49.6   E:      79.6
ID:    14       Status: 1       Px:       3.5   Py:      21.4   Pz:     0.499   E:      21.7
ID:    16       Status: 1       Px:      41.1   Py:       -18   Pz:      27.8   E:      52.8
ID:    22       Status: 1       Px:   0.00658   Py:    0.0131   Pz:  -0.00395   E:    0.0152
ID:   211       Status: 1       Px:  0.585867   Py: 0.0124444   Pz:  -14.4908   E:   27.3949
ID:   321       Status: 1       Px:    0.4145   Py:    0.7025   Pz:      42.9   E:      42.9
ID:  2112       Status: 1       Px:    0.8165   Py:   -0.7745   Pz:    0.3945   E:     1.535
ID:  2212       Status: 1       Px:     2.936   Py:   1.40367   Pz:  -568.997   E:   620.083
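If the data is already in memory as a 2D vector, as described in the question, the same grouping idea can be applied without a struct. The following is only a sketch under some assumptions: the name mergeRowsByKey and the alias Row are made up for this illustration, every row is assumed to have the same width, and the key column is assumed to hold whole numbers (as PDG IDs do). It keeps running sums and a row count per key instead of storing each group, and it averages all non-key columns, as the question asks (so the status column is averaged as well, unlike in the program above).

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

using Row = std::vector<double>;

// Hypothetical helper: merge all rows that share the same value in
// column keyCol. The other columns are replaced by their mean.
std::vector<Row> mergeRowsByKey(const std::vector<Row>& rows, std::size_t keyCol) {

    // key -> (column-wise sums, number of rows seen for this key)
    std::map<long long, std::pair<Row, std::size_t>> acc;

    for (const Row& row : rows) {
        assert(keyCol < row.size());
        const long long key = static_cast<long long>(row[keyCol]);
        std::pair<Row, std::size_t>& entry = acc[key];
        if (entry.second == 0)
            entry.first.assign(row.size(), 0.0);   // first row for this key
        assert(entry.first.size() == row.size());  // all rows must have equal width
        for (std::size_t c = 0; c < row.size(); ++c)
            entry.first[c] += row[c];
        ++entry.second;
    }

    std::vector<Row> merged;
    merged.reserve(acc.size());
    for (const auto& kv : acc) {
        Row out = kv.second.first;
        for (std::size_t c = 0; c < out.size(); ++c)
            out[c] /= static_cast<double>(kv.second.second);
        out[keyCol] = static_cast<double>(kv.first);  // keep the key itself exact
        merged.push_back(std::move(out));
    }
    return merged;   // one row per distinct key, in ascending key order
}
```

Called as mergeRowsByKey(data, 0) on a table whose first column is the PDG ID, this returns one averaged row per distinct ID.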
