Throwing the fattest people off of an overloaded airplane

Posted on 2024-12-09 19:26:12

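Let's say you've got an airplane, and it is low on fuel. Unless the plane drops 3000 pounds of passenger weight, it will not be able to reach the next airport. To save the maximum number of lives, we would like to throw the heaviest people off of the plane first.

And oh yeah, there are millions of people on the airplane, and we would like an optimal algorithm to find the heaviest passengers, without necessarily sorting the entire list.

This is a proxy problem for something I'm trying to code in C++. I would like to do a "partial_sort" on the passenger manifest by weight, but I don't know how many elements I'm going to need. I could implement my own "partial_sort" algorithm ("partial_sort_accumulate_until"), but I'm wondering if there's an easier way to do this using the standard STL.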


嘿咻 2024-12-16 19:26:12

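This does not help with your proxy problem, however:

For 1,000,000 passengers to drop 3000 pounds of weight, each passenger must lose (3000/1000000) = 0.003 lbs. That could be achieved by jettisoning everyone's shirt, or shoes, or probably even fingernail clippings, saving everyone. This assumes efficient collection and jettison before the required weight loss increases as the plane burns more fuel.

Actually, they don't allow fingernail clippers on board anymore, so that's out.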


锦上情书 2024-12-16 19:26:12

One way would be to use a min heap (std::priority_queue in C++). Here's how you'd do it, assuming you had a MinHeap class. (Yes, my example is in C#. I think you get the idea.)

int targetTotal = 3000;
int totalWeight = 0;
// this creates an empty heap!
var myHeap = new MinHeap<Passenger>(/* need comparer here to order by weight */);
foreach (var pass in passengers)
{
    if (totalWeight < targetTotal)
    {
        // unconditionally add this passenger
        myHeap.Add(pass);
        totalWeight += pass.Weight;
    }
    else if (pass.Weight > myHeap.Peek().Weight)
    {
        // If this passenger is heavier than the lightest
        // passenger already on the heap,
        // then remove the lightest passenger and add this one
        var oldPass = myHeap.RemoveFirst();
        totalWeight -= oldPass.Weight;
        myHeap.Add(pass);
        totalWeight += pass.Weight;
    }
}

// At this point, the heaviest people are on the heap,
// but there might be too many of them.
// Remove the lighter people until we have the minimum necessary
while ((totalWeight - myHeap.Peek().Weight) > targetTotal)
{
    var oldPass = myHeap.RemoveFirst();
    totalWeight -= oldPass.Weight; 
}
// The heap now contains the passengers who will be thrown overboard.

According to the standard references, running time should be proportional to n log k, where n is the number of passengers and k is the maximum number of items on the heap. If we assume that passengers' weights will typically be 100 lbs or more, then it's unlikely that the heap will contain more than 30 items at any time.

The worst case would be if the passengers are presented in order from lowest weight to highest. That would require that every passenger be added to the heap, and every passenger be removed from the heap. Still, with a million passengers and assuming that the lightest weighs 100 lbs, the n log k works out to a reasonably small number.

If you get the passengers' weights randomly, performance is much better. I use something quite like this for a recommendation engine (I select the top 200 items from a list of several million). I typically end up with only 50,000 or 70,000 items actually added to the heap.

I suspect that you'll see something quite similar: the majority of your candidates will be rejected because they're lighter than the lightest person already on the heap. And Peek is an O(1) operation.

For more information about the performance of heap select and quick select, see When theory meets practice. Short version: if you're selecting fewer than 1% of the total number of items, then heap select is a clear winner over quick select. Above 1%, use quick select or a variant like Introselect.
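Since the question asks for C++, here is a minimal sketch of the same heap idea using std::priority_queue. The `Passenger` struct and the name `select_heaviest` are illustrative assumptions, not part of the original answer, and this version trims the heap as it goes rather than in a final pass.

```cpp
#include <functional>
#include <queue>
#include <vector>

// Hypothetical passenger record; only the weight matters here.
struct Passenger {
    int id;
    int weight;  // pounds
};

// Returns the heaviest passengers whose combined weight reaches `target`
// (or everyone, if the whole manifest is lighter than `target`).
std::vector<Passenger> select_heaviest(const std::vector<Passenger>& passengers,
                                       long target) {
    auto heavier = [](const Passenger& a, const Passenger& b) {
        return a.weight > b.weight;
    };
    // With a "greater" comparator, top() is the lightest passenger on the heap.
    std::priority_queue<Passenger, std::vector<Passenger>, decltype(heavier)>
        heap(heavier);
    long total = 0;
    for (const Passenger& p : passengers) {
        if (total < target || heap.empty() || p.weight > heap.top().weight) {
            heap.push(p);
            total += p.weight;
            // Reprieve the lightest passengers while the rest still cover the target.
            while (!heap.empty() && total - heap.top().weight >= target) {
                total -= heap.top().weight;
                heap.pop();
            }
        }
    }
    std::vector<Passenger> chosen;  // filled lightest first
    while (!heap.empty()) {
        chosen.push_back(heap.top());
        heap.pop();
    }
    return chosen;
}
```

Most passengers are rejected by the single comparison against `heap.top()`, which is the O(1) peek described above.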


单挑你×的.吻 2024-12-16 19:26:12

Below is a rather simple implementation of the straightforward solution. I don't think there is a faster way that is 100% correct.

size_t total = 0;
std::set<passenger> dead;
for ( auto p : passengers ) {
    if (dead.empty()) {
       dead.insert(p);
       total += p.weight;
       continue;
    }
    if (total < threshold || p.weight > dead.begin()->weight)
    {
        dead.insert(p);
        total += p.weight;
        while (total > threshold)
        {
            if (total - dead.begin()->weight < threshold)
                break;
            total -= dead.begin()->weight;
            dead.erase(dead.begin());
        }
    }
 }

This works by filling up the set of "dead people" until it meets the threshold. Once the threshold is met, we keep going through the list of passengers trying to find any that are heavier than the lightest dead person. When we have found one, we add them to the list and then start "Saving" the lightest people off the list until we can't save any more.

In the worst case, this will perform about the same as a sort of the entire list. But in the best case (the "dead list" is filled up properly with the first X people) it will run in O(n).


⒈起吃苦の倖褔 2024-12-16 19:26:12

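Assuming all passengers will cooperate: use a parallel sorting network. (See also this.)

~~Here is a live demonstration~~

Update: alternative video (jump to 1:00)

Ask pairs of people to compare-exchange; you can't get faster than this.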


自演自醉 2024-12-16 19:26:12

@Blastfurnace was on the right track. You use quickselect where the pivots are weight thresholds. Each partition splits one set of people into sets, and returns the total weight for each set of people. You continue breaking the appropriate bucket until your buckets corresponding to the highest weight people are over 3000 pounds, and your lowest bucket that is in that set has 1 person (that is, it can't be split any further.)

This algorithm is linear time amortized, but quadratic worst case. I think it is the only linear time algorithm.

Here's a Python solution that illustrates this algorithm:

#!/usr/bin/env python
import math
import numpy as np
import random

OVERWEIGHT = 3000.0
in_trouble = [math.floor(x * 10) / 10
              for x in np.random.standard_gamma(16.0, 100) * 8.0]
dead = []
spared = []

dead_weight = 0.0

while in_trouble:
    m = np.median(list(set(random.sample(in_trouble, min(len(in_trouble), 5)))))
    print("Partitioning with pivot:", m)
    lighter_partition = []
    heavier_partition = []
    heavier_partition_weight = 0.0
    in_trouble_is_indivisible = True
    for p in in_trouble:
        if p < m:
            lighter_partition.append(p)
        else:
            heavier_partition.append(p)
            heavier_partition_weight += p
        if p != m:
            in_trouble_is_indivisible = False
    if heavier_partition_weight + dead_weight >= OVERWEIGHT and not in_trouble_is_indivisible:
        spared += lighter_partition
        in_trouble = heavier_partition
    else:
        dead += heavier_partition
        dead_weight += heavier_partition_weight
        in_trouble = lighter_partition

print("weight of dead people: {}; spared people: {}".format(
    dead_weight, sum(spared)))
print("Dead: ", dead)
print("Spared: ", spared)

Output:

Partitioning with pivot: 121.2
Partitioning with pivot: 158.9
Partitioning with pivot: 168.8
Partitioning with pivot: 161.5
Partitioning with pivot: 159.7
Partitioning with pivot: 158.9
weight of dead people: 3051.7; spared people: 9551.7
Dead:  [179.1, 182.5, 179.2, 171.6, 169.9, 179.9, 168.8, 172.2, 169.9, 179.6, 164.4, 164.8, 161.5, 163.1, 165.7, 160.9, 159.7, 158.9]
Spared:  [82.2, 91.9, 94.7, 116.5, 108.2, 78.9, 83.1, 114.6, 87.7, 103.0, 106.0, 102.3, 104.9, 117.0, 96.7, 109.2, 98.0, 108.4, 99.0, 96.8, 90.7, 79.4, 101.7, 119.3, 87.2, 114.7, 90.0, 84.7, 83.5, 84.7, 111.0, 118.1, 112.1, 92.5, 100.9, 114.1, 114.7, 114.1, 113.7, 99.4, 79.3, 100.1, 82.6, 108.9, 103.5, 89.5, 121.8, 156.1, 121.4, 130.3, 157.4, 138.9, 143.0, 145.1, 125.1, 138.5, 143.8, 146.8, 140.1, 136.9, 123.1, 140.2, 153.6, 138.6, 146.5, 143.6, 130.8, 155.7, 128.9, 143.8, 124.0, 134.0, 145.0, 136.0, 121.2, 133.4, 144.0, 126.3, 127.0, 148.3, 144.9, 128.1]
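For readers who want the same idea in C++, the sketch below partitions in place with std::partition. The function name `quickselect_by_weight`, the middle-element pivot, and the use of plain integer weights are assumptions made for illustration; the Python above instead picks the median of a small random sample.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Moves the passengers to drop to the front of `weights` and returns how many
// there are. Expected linear time, quadratic in the worst case.
std::size_t quickselect_by_weight(std::vector<int>& weights, long long target) {
    auto lo = weights.begin();  // [begin, lo) is already condemned
    auto hi = weights.end();    // [hi, end) is already spared
    long long dead = 0;
    while (lo != hi && dead < target) {
        const int pivot = *(lo + (hi - lo) / 2);
        // Three-way split of [lo, hi): heavier than pivot | equal | lighter.
        auto mid1 = std::partition(lo, hi, [pivot](int w) { return w > pivot; });
        auto mid2 = std::partition(mid1, hi, [pivot](int w) { return w == pivot; });
        const long long heavier = std::accumulate(lo, mid1, 0LL);
        if (dead + heavier >= target) {
            hi = mid1;  // the heavier group alone is enough; keep splitting it
        } else {
            dead += heavier;  // the whole heavier group has to go
            lo = mid1;
            // Take pivot-weight passengers one at a time until the target is met.
            while (lo != mid2 && dead < target) {
                dead += *lo;
                ++lo;
            }
            // If still short, the next pass continues in the lighter group.
        }
    }
    return static_cast<std::size_t>(lo - weights.begin());
}
```

Every pass either shrinks the candidate range to the strictly heavier group or condemns at least one pivot-weight passenger, so the loop always terminates.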


过期以后 2024-12-16 19:26:12

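Assuming that, as with people's weights, you have a good idea of what the maximum and minimum values are likely to be, use a radix sort to sort them in O(n). Then simply work from the heaviest end of the list towards the lightest. Total running time: O(n). Unfortunately, there isn't an implementation of radix sort in the STL, but it's pretty straightforward to write.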
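As a rough illustration, the sketch below uses a counting sort, the single-digit special case of radix sort, rather than a full multi-pass radix sort. It assumes integer weights in whole pounds and a known upper bound `max_weight`; both the bound and the name `drop_by_counting` are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Returns the weights of the passengers to drop, heaviest first.
// Weights outside [0, max_weight] are ignored in this sketch.
std::vector<int> drop_by_counting(const std::vector<int>& weights, long target,
                                  int max_weight) {
    // One bucket per possible weight: O(n + max_weight) time and space.
    std::vector<std::size_t> count(static_cast<std::size_t>(max_weight) + 1, 0);
    for (int w : weights) {
        if (w >= 0 && w <= max_weight) ++count[w];
    }
    std::vector<int> dropped;
    long total = 0;
    // Walk the buckets from heaviest to lightest until the target is reached.
    for (int w = max_weight; w >= 0 && total < target; --w) {
        for (std::size_t c = count[w]; c > 0 && total < target; --c) {
            dropped.push_back(w);
            total += w;
        }
    }
    return dropped;
}
```

To recover which passengers were chosen rather than just their weights, one option is a second pass over the manifest that picks everyone at or above the lightest dropped weight, with ties at that weight capped by the count.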


莫多说 2024-12-16 19:26:12

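Why not use a partial quicksort with a different stopping rule than "sorted"?
Run it, then keep only the heavier half and continue until the weight within that heavier half no longer covers the weight that has to be thrown out. At that point, go back one step in the recursion and sort that sublist. After that you can start throwing people out from the heavy end of the sorted list.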


新人笑 2024-12-16 19:26:12

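Massively parallel tournament sort:

Assuming a standard three seats on each side of the aisle:

1. Ask the passengers in the window seat to swap with the passenger in the middle seat if they are heavier.
2. Ask the passengers in the middle seat to swap with the passenger in the aisle seat if they are heavier.
3. Ask the passengers in the left aisle seat to swap with the passenger in the right aisle seat if they are heavier.
4. Bubble sort the passengers in the right aisle seats (takes n steps for n rows): ask the passengers in the right aisle seat to swap with the person in front n - 1 times.
5. Kick them out the door until you reach 3000 pounds.

3 steps + n steps, plus 30 steps if you have a really skinny passenger load.

For a two-aisle plane, the instructions are more complex but the performance is about the same.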


小嗷兮 2024-12-16 19:26:12

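I would probably use std::nth_element to partition off the 20 heaviest people in linear time. Then use a more complex method to find and bump off the heaviest of the heavies.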
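One way this could look is sketched below. The retry-with-doubling step for a guess that turns out too small is an assumption added here, not part of the answer, and `heaviest_prefix` is a hypothetical name.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Reorders `weights` so the passengers to drop sit at the front, heaviest
// first, and returns how many there are. `guess` is the initial estimate
// of the group size (the answer suggests 20).
std::size_t heaviest_prefix(std::vector<int>& weights, long target,
                            std::size_t guess) {
    std::size_t k = std::min(std::max<std::size_t>(guess, 1), weights.size());
    while (true) {
        // Average linear time: the k heaviest land in [begin, begin + k), unordered.
        if (k < weights.size()) {
            std::nth_element(weights.begin(), weights.begin() + k, weights.end(),
                             std::greater<int>());
        }
        // Sort only the small heavy group, then accumulate from the heaviest.
        std::sort(weights.begin(), weights.begin() + k, std::greater<int>());
        long total = 0;
        for (std::size_t i = 0; i < k; ++i) {
            total += weights[i];
            if (total >= target) return i + 1;
        }
        if (k == weights.size()) return k;  // everyone together is still too light
        k = std::min(k * 2, weights.size());  // guess was too small: double it
    }
}
```

Because the guess doubles on each retry, the total work stays proportional to the final pass on average.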


自控 2024-12-16 19:26:12

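You can make one pass over the list to get the mean and the standard deviation, then use that to approximate the number of people that have to go. Use partial_sort to generate the list based on that number. If the guess was low, use partial_sort again on the remainder with a new guess.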
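A simplified sketch of this idea follows. It uses only the mean, not the standard deviation: the k heaviest passengers always weigh at least k times the mean, so k = ceil(target / mean) is a safe upper bound and no second guess is needed. That simplification, and the name `drop_count_by_mean`, are assumptions of this sketch; a tighter estimate using the standard deviation would sort fewer elements but could require the retry the answer describes.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

// Sorts just enough of `weights` (heaviest first) and returns how many
// passengers from the front must be dropped to reach `target`.
std::size_t drop_count_by_mean(std::vector<int>& weights, long long target) {
    const long long n = static_cast<long long>(weights.size());
    const long long sum = std::accumulate(weights.begin(), weights.end(), 0LL);
    if (n == 0 || sum <= 0 || target <= 0) return 0;
    // k = ceil(target / mean), computed in integers as ceil(target * n / sum).
    long long k = (target * n + sum - 1) / sum;
    if (k > n) k = n;
    std::partial_sort(weights.begin(), weights.begin() + k, weights.end(),
                      std::greater<int>());
    long long total = 0;
    for (long long i = 0; i < k; ++i) {
        total += weights[i];
        if (total >= target) return static_cast<std::size_t>(i + 1);
    }
    return static_cast<std::size_t>(k);  // only when everyone together is too light
}
```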


蓝礼 2024-12-16 19:26:12

@James has the answer in the comments: a std::priority_queue if you can use any container, or a combination of std::make_heap and std::pop_heap (and std::push_heap) if you want to use something like a std::vector.
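One possible reading of the std::vector variant is sketched below: heapify the whole manifest once, then pop the heaviest until the target is met. The name `pop_heaviest` and the choice to heapify everything up front (rather than maintain a small heap with std::push_heap) are assumptions of this sketch.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Rearranges `weights` so the passengers to drop end up at the back of the
// vector and returns how many there are.
std::size_t pop_heaviest(std::vector<int>& weights, long target) {
    std::make_heap(weights.begin(), weights.end());  // max-heap, O(n)
    auto heap_end = weights.end();
    long total = 0;
    while (total < target && heap_end != weights.begin()) {
        // Moves the current heaviest to heap_end - 1 and re-heapifies the rest.
        std::pop_heap(weights.begin(), heap_end);
        --heap_end;
        total += *heap_end;
    }
    return static_cast<std::size_t>(weights.end() - heap_end);
}
```

This costs O(n) for std::make_heap plus O(log n) per passenger dropped, which is close to linear when only a few dozen people have to go.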


七婞 2024-12-16 19:26:12

Here's a heap-based solution using Python's built-in heapq module. It's in Python so doesn't answer the original question, but it's cleaner (IMHO) than the other posted Python solution.

import heapq

# Test data
from collections import namedtuple

Passenger = namedtuple("Passenger", "name seat weight")

passengers = [Passenger(*p) for p in (
    ("Alpha", "1A", 200),
    ("Bravo", "2B", 800),
    ("Charlie", "3C", 400),
    ("Delta", "4A", 300),
    ("Echo", "5B", 100),
    ("Foxtrot", "6F", 100),
    ("Golf", "7E", 200),
    ("Hotel", "8D", 250),
    ("India", "8D", 250),
    ("Juliet", "9D", 450),
    ("Kilo", "10D", 125),
    ("Lima", "11E", 110),
    )]

# Find the heaviest passengers: keep the smallest group
# whose total weight is still at least 3000

to_toss = []
total_weight = 0.0

for passenger in passengers:
    weight = passenger.weight
    total_weight += weight
    heapq.heappush(to_toss, (weight, passenger))

    # Reprieve the lightest while the rest still add up to 3000
    while total_weight - to_toss[0][0] >= 3000:
        weight, reprieved_passenger = heapq.heappop(to_toss)
        total_weight -= weight


if total_weight < 3000:
    # Not enough people!
    raise Exception("We're all going to die!")

# List the ones to toss. (Order doesn't matter.)

print("We can get rid of", total_weight, "pounds")
for weight, passenger in to_toss:
    print("Toss {p.name!r} in seat {p.seat} (weighs {p.weight} pounds)".format(p=passenger))

If k = the number of passengers to toss and N = the number of passengers, then the best case for this algorithm is O(N) and the worst case is O(N log N). The worst case occurs if k is near N for a long time. Here's an example of the worst case:

weights = [2500] + [1/(2**n+0.0) for n in range(100000)] + [3000]

However, in this case (throwing people off the plane, with a parachute, I presume), k must be less than 3000, which is << "millions of people". The average runtime should therefore be about N log(k), which is linear in the number of people.
