MongoDB aggregation pipeline: getting the minimum value with grouping and multiple conditions

Posted on 2025-01-16 23:34:26

Please help me write a valid MongoDB aggregate pipeline for the following query:

Get the minimum votes of the 'lab' party in any area where no more than 5 parties including the 'lab' party had candidates.

Here's the pipeline I have written:

    pipeline = [
    // stage 1: unwind the nested results
    { $unwind: "$results" },

    // stage 2: group the results by area, get party count
    {
        $group: {
            _id: "$area",
            partyCount: {
                $sum: 1
            },
            results: {
                $push: {
                    party: "$results.party",
                    votes: "$results.votes"
                }
            }
        }
    },

    // stage 3: filter by the given party and partyCount <= 5
    {
        $match: {
            $and: [
                { partyCount: { $lte: 5 } },
                { "results.party": "lab" }
            ]
        }
    },

    // stage 4: unwind the results again
    { $unwind: "$results" },

    // stage 5: filter the results now to only include 'lab' party
    {
        $match: { "results.party": "lab" }
    },

    // stage 6: project the results with the area name
    {
        $project: {
            name: "$_id",
            _id: 0,
            "results.party": 1,
            "results.votes": 1
        }
    },

    // stage 7: group the results by the party and get the minimum votes of the 'lab' party
    {
        $group: {
            _id: "$results.party",
            minVotes: {
                $min: "$results.votes"
            }
        }
    }]

I am not sure whether this gives me the correct result, and I can't get it to include the area name in the final result as well.

Sample data

    { "num": 27, "area": "basildon", "electors": 56793, "results": [ { "party": "con", "leader": "thatcher", "votes": 17516}, { "party": "lab", "leader": "foot", "votes": 16137}, { "party": "sdp", "leader": "jenkins", "votes": 11634}]}
    { "num": 28, "area": "basingstoke", "electors": 60414, "results": [ { "party": "con", "leader": "thatcher", "votes": 28381}, { "party": "lab", "leader": "foot", "votes": 10646}, { "party": "sdp", "leader": "jenkins", "votes": 15931}, { "party": "bnp", "leader": "tyndall", "votes": 344}]}
    { "num": 29, "area": "bassetlaw", "electors": 61807, "results": [ { "party": "con", "leader": "thatcher", "votes": 18400}, { "party": "lab", "leader": "foot", "votes": 22231}, { "party": "sdp", "leader": "jenkins", "votes": 8124}]}
    { "num": 30, "area": "bath", "electors": 62355, "results": [ { "party": "con", "leader": "thatcher", "votes": 22544}, { "party": "lab", "leader": "foot", "votes": 7259}, { "party": "sdp", "leader": "jenkins", "votes": 17240}, { "party": "eco", "leader": "whittaker", "votes": 441}]}
    { "num": 31, "area": "batley and spen", "electors": 71206, "results": [ { "party": "con", "leader": "thatcher", "votes": 21433}, { "party": "lab", "leader": "foot", "votes": 20563}, { "party": "sdp", "leader": "jenkins", "votes": 11678}, { "party": "eco", "leader": "whittaker", "votes": 493}]}

Dataset is available here.
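
As a quick sanity check on the sample documents above, the expected answer can be worked out in plain JavaScript, independent of MongoDB. This is only a sketch: the helper name `minVotesForParty` is hypothetical, and `sample` is a trimmed copy of the five sample documents (only `area`, `party` and `votes` are kept).

```javascript
// Trimmed copy of the five sample documents (assumption: only these fields matter here).
const sample = [
  { area: "basildon", results: [{ party: "con", votes: 17516 }, { party: "lab", votes: 16137 }, { party: "sdp", votes: 11634 }] },
  { area: "basingstoke", results: [{ party: "con", votes: 28381 }, { party: "lab", votes: 10646 }, { party: "sdp", votes: 15931 }, { party: "bnp", votes: 344 }] },
  { area: "bassetlaw", results: [{ party: "con", votes: 18400 }, { party: "lab", votes: 22231 }, { party: "sdp", votes: 8124 }] },
  { area: "bath", results: [{ party: "con", votes: 22544 }, { party: "lab", votes: 7259 }, { party: "sdp", votes: 17240 }, { party: "eco", votes: 441 }] },
  { area: "batley and spen", results: [{ party: "con", votes: 21433 }, { party: "lab", votes: 20563 }, { party: "sdp", votes: 11678 }, { party: "eco", votes: 493 }] }
];

// Hypothetical helper: lowest vote for `party` over all areas with at most
// `maxParties` results, returned together with the area name (or null).
function minVotesForParty(docs, party, maxParties) {
  let best = null;
  for (const doc of docs) {
    if (doc.results.length > maxParties) continue; // too many parties stood in this area
    for (const r of doc.results) {
      if (r.party !== party) continue; // only look at the requested party
      if (best === null || r.votes < best.minVotes) {
        best = { area: doc.area, minVotes: r.votes };
      }
    }
  }
  return best;
}

const expected = minVotesForParty(sample, "lab", 5);
console.log(expected); // { area: 'bath', minVotes: 7259 }
```

Any pipeline run against these five documents should therefore report 7259 for the area "bath".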

Comments (1)

☆獨立☆ 2025-01-23 23:34:26

You can do the following in an aggregation pipeline:

  1. $match to keep only the records that have a lab result
  2. $addFields to add 2 auxiliary fields:
    1. partyCount: use $size to find the number of parties in that area (assuming no duplicate party results)
    2. minVote: use $reduce to conditionally find the minimum vote of lab
  3. $match by partyCount <= 5

    db.collection.aggregate([
      {
        $match: {
          "results.party": "lab"
        }
      },
      {
        "$addFields": {
          "partyCount": {
            $size: "$results"
          },
          "minVote": {
            "$reduce": {
              "input": "$results",
              "initialValue": null,
              "in": {
                "$cond": {
                  "if": {
                    $and: [
                      {
                        $eq: [
                          "$this.party",
                          "lab"
                        ]
                      },
                      {
                        $or: [
                          {
                            $eq: [
                              "$value",
                              null
                            ]
                          },
                          {
                            $lt: [
                              "$this.votes",
                              "$value"
                            ]
                          }
                        ]
                      }
                    ]
                  },
                  "then": "$this.votes",
                  "else": "$value"
                }
              }
            }
          }
        }
      },
      {
        "$match": {
          partyCount: {
            $lte: 5
          }
        }
      }
    ])

Here is the Mongo playground for your reference.
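
The pipeline above returns one minVote per area. If a single overall minimum together with its area name is wanted, one possible extension is to sort on minVote and keep the first document. The sketch below is not part of the original answer: it swaps $reduce for $filter + $map + $min (a different way to compute the same per-area value), and `collection` is a placeholder name.

```javascript
// Sketch under assumptions: field names partyCount/minVote follow the answer above.
const pipeline = [
  // keep only areas where 'lab' had a candidate
  { $match: { "results.party": "lab" } },
  {
    $addFields: {
      // number of parties in the area (assumes no duplicate party entries)
      partyCount: { $size: "$results" },
      // lowest 'lab' vote inside this area's results array
      minVote: {
        $min: {
          $map: {
            input: {
              $filter: {
                input: "$results",
                as: "r",
                cond: { $eq: ["$$r.party", "lab"] }
              }
            },
            as: "r",
            in: "$$r.votes"
          }
        }
      }
    }
  },
  // no more than 5 parties in the area
  { $match: { partyCount: { $lte: 5 } } },
  // smallest 'lab' vote first, then keep only that area
  { $sort: { minVote: 1 } },
  { $limit: 1 },
  // final shape: area name plus its minimum 'lab' vote
  { $project: { _id: 0, area: 1, minVote: 1 } }
];

// In mongosh: db.collection.aggregate(pipeline)
```

Note that $limit: 1 returns only one area if several areas tie for the lowest vote; grouping on minVote would be needed to list all of them.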
