使用集合而不是堆？这是更好的＆＃x27;解决方案？ [最大CPU负载问题]

发布于 2025-02-09 02:07:28 字数 2520 浏览 2 评论 0原文

这是代码挑战：

我们得到了工作清单。每个作业在运行时都有开始时间，结束时间和CPU负载。我们的目标是如果所有作业都在同一台计算机上运行。
。
示例1：
作业： [[1,3,3]，[2,5,4]，[7,9,6]]
输出： 7
说明：自[1,4,3]和[2,5,4]重叠以来，它们的最大CPU负载（3+4 = 7）将是两个作业同时运行即，在时间间隔（2,4）中。
示例2：
作业： [[6,7,10]，[2,4,11]，[8,12,15]]
输出： 15
说明：没有任何工作重叠，因此我们将承担15个工作的最大负担。
示例3：
作业： [[1,4,2]，[2,4,1]，[3,6,5]]
输出： 8
说明：最大CPU负载将为8个，因为在时间间隔内所有作业重叠[3,4]。

给出的解决方案，使用堆：

from heapq import heappush, heappop

class job:
    def __init__(self, start, end, cpu_load):
        self.start = start
        self.end = end
        self.cpu_load = cpu_load

    def __lt__(self, other):
        # min heap based on job.end 
        return self.end < other.end

def find_max_cpu_load(jobs):
    # sort the jobs by start time 
    jobs.sort(key=lambda x: x.start)
    max_cpu_load, current_cpu_load = 0, 0
    min_heap = []
    
    for j in jobs:
        # remove all the jobs that have ended 
        while(len(min_heap) > 0 and j.start >= min_heap[0].end):
            current_cpu_load -= min_heap[0].cpu_load
            heappop(min_heap), 
        # add the current job into min_heap 
        heappush(min_heap, j) 
        current_cpu_load += j.cpu_load
        max_cpu_load = max(max_cpu_load, current_cpu_load) 
        return max_cpu_load

我只是使用了一个集合（赦免！） - ＆gt;

class job:
    def __init__(self, start, end, cpu_load):
        self.start = start
        self.end = end
        self.cpu_load = cpu_load

def find_max_cpu_load(jobs):
    all_vals = set()
    jobs.sort(key=lambda x: x.start)
    currmax = 0

    for j in range(len(jobs) - 1): 
        all_vals.add(jobs[j].cpu_load)
        if jobs[j+1].start <= jobs[j].end:
            currmax = jobs[j].cpu_load
        upper = min(jobs[j].end, jobs[j+1].end)
        while jobs[j+1].start <= upper:
            currmax += jobs[j + 1].cpu_load
            if j < len(jobs) - 1:
                j += 1
            if j == len(jobs) - 1:
                break
        all_vals.add(currmax) 

    all_vals.add(jobs[len(jobs) - 1].cpu_load)
    return max(all_vals)

哪种方法更好？我应该尝试模仿他们的方法吗？在这里使用堆似乎对我来说是不必要的。在技术访谈中，哪种解决方案将受到哪些优势？

原文

Here is the code challenge:

We are given a list of Jobs. Each job has a Start time, an End time, and a CPU load when it is running. Our goal is to find the maximum CPU load at any time if all the jobs are running on the same machine.
Example 1:
Jobs: [[1,3,3], [2,5,4], [7,9,6]]
Output: 7
Explanation: Since [1,4,3] and [2,5,4] overlap, their maximum CPU load (3+4=7) will be when both the jobs are running at the same time i.e., during the time interval (2,4).
Example 2:
Jobs: [[6,7,10], [2,4,11], [8,12,15]]
Output: 15
Explanation: None of the jobs overlap, therefore we will take the maximum load of any job which is 15.
Example 3:
Jobs: [[1,4,2], [2,4,1], [3,6,5]]
Output: 8
Explanation: Maximum CPU load will be 8 as all jobs overlap during the time interval [3,4].

A solution that is given, uses heaps:

from heapq import heappush, heappop

class job:
    def __init__(self, start, end, cpu_load):
        self.start = start
        self.end = end
        self.cpu_load = cpu_load

    def __lt__(self, other):
        # min heap based on job.end 
        return self.end < other.end

def find_max_cpu_load(jobs):
    # sort the jobs by start time 
    jobs.sort(key=lambda x: x.start)
    max_cpu_load, current_cpu_load = 0, 0
    min_heap = []
    
    for j in jobs:
        # remove all the jobs that have ended 
        while(len(min_heap) > 0 and j.start >= min_heap[0].end):
            current_cpu_load -= min_heap[0].cpu_load
            heappop(min_heap), 
        # add the current job into min_heap 
        heappush(min_heap, j) 
        current_cpu_load += j.cpu_load
        max_cpu_load = max(max_cpu_load, current_cpu_load) 
        return max_cpu_load

I just used a set instead (pardon the indentation!)-->

class job:
    def __init__(self, start, end, cpu_load):
        self.start = start
        self.end = end
        self.cpu_load = cpu_load

def find_max_cpu_load(jobs):
    all_vals = set()
    jobs.sort(key=lambda x: x.start)
    currmax = 0

    for j in range(len(jobs) - 1): 
        all_vals.add(jobs[j].cpu_load)
        if jobs[j+1].start <= jobs[j].end:
            currmax = jobs[j].cpu_load
        upper = min(jobs[j].end, jobs[j+1].end)
        while jobs[j+1].start <= upper:
            currmax += jobs[j + 1].cpu_load
            if j < len(jobs) - 1:
                j += 1
            if j == len(jobs) - 1:
                break
        all_vals.add(currmax) 

    all_vals.add(jobs[len(jobs) - 1].cpu_load)
    return max(all_vals)

Which method is better? Should I try to emulate their approach? Using heaps here just seemed unnecessary to me. What solution would be preferred in tech interviews?

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

淡水深流 2025-02-16 02:07:28

我只是使用了一套

您使用set的使用不会替换基于堆的逻辑。该SET只是收集CPU负载，目的是提取其最大值。为此，您不需要一套。您不妨使用列表，甚至可以在没有这种数据结构的情况下跟踪最大值。因此，如果您将进行以下替换，您的代码将返回相同的结果：

您当前的代码	替代
`all_vals = set（）`	`max_val = -1`
`all_vals.add （Jobs [J] .cpu_load）`	`max_val = max_val = max（max_val，jobs [j] .cpu_load）`
`all_vals.add（currmax）`	`max_val = max_val = max_val = max（max_val，currmax）`
`all_vals.add（jobs [len（jobs）-1] .cpu_load）`	`max_val = max_val = max（max_val，jobs [len（jobs）-1-1 ] .cpu_load）`
`return max（all_vals）`	`返回max_val`

您实际实际上替换了堆逻辑，就是向前看一项工作并结束定义上，然后迭代下一个作业。

哪种方法更好？

这是错误的问题，因为您的解决方案给出了错误的结果。您没有提供有效的选择。

这是我使用您的代码得到的结果：

输入	输出	预期输出
`[[1,3,3]，[2,5,4]，[7,9,6]]`	3	7
<代码> [[6,7,10]，[2,4,11]，[8,12,15]]	11	15
`[[1,4,2]，[2,4,1 ]，[3,6,5]]`	2	8

在技术访谈中首选哪种解决方案？

通过提出基于堆的算法，您肯定会给人留下深刻的印象。

事件

对该算法将工作分解为事件的。开始事件将增加当前负载，最终事件将减少负载。我们可以构建一个带有单个事件的列表，每个事件都有事件时间以及当时发生的负载变化。然后，我们可以按排序顺序迭代此列表，并保留当前负载的运行总和。这将使我们能够跟踪最大值：

def find_max_cpu_load3(jobs):
    # make a sorted list of events (time, load_change)
    events = [(j.start, j.cpu_load) for j in jobs] + [(j.end, -j.cpu_load) for j in jobs]
    events.sort()
    max_cpu_load, current_cpu_load = 0, 0

    for time, load_change in events:
        current_cpu_load += load_change
        max_cpu_load = max(max_cpu_load, current_cpu_load)
    return max_cpu_load

I just used a set instead

Your use of a set does not replace the logic that is based on a heap. That set just collects CPU loads with the purpose to extract their maximum. For that you don't need a set. You might as well use a list, or even just keep track of the maximum without such a data structure. So your code would return the same results if you would have made the following replacements:

Your current code	alternative
`all_vals = set()`	`max_val = -1`
`all_vals.add(jobs[j].cpu_load)`	`max_val = max(max_val, jobs[j].cpu_load)`
`all_vals.add(currmax)`	`max_val = max(max_val, currmax)`
`all_vals.add(jobs[len(jobs) - 1].cpu_load)`	`max_val = max(max_val, jobs[len(jobs) - 1].cpu_load)`
`return max(all_vals)`	`return max_val`

What you actually did to replace the heap logic, was to look ahead one job and take its end to define an upper and then iterate some of the next jobs.

Which method is better?

That is the wrong question, as your solution gives the wrong results. You have not given a valid alternative.

Here are the results I get with your code:

Input	Output	Expected output
`[[1,3,3], [2,5,4], [7,9,6]]`	3	7
`[[6,7,10], [2,4,11], [8,12,15]]`	11	15
`[[1,4,2], [2,4,1], [3,6,5]]`	2	8

What solution would be preferred in tech interviews?

You'd certainly make a good impression by coming up with the heap-based algorithm.

Sort the events

This algorithm breaks the jobs into events. A start event will increase the current load, and an end event will decrease the load. We can build a list with single events with each having an event time and the change in load that happens at that moment. We can then iterate over this list in sorted order and keep a running sum of what the current load is. This will allow us to track what the maximum is:

def find_max_cpu_load3(jobs):
    # make a sorted list of events (time, load_change)
    events = [(j.start, j.cpu_load) for j in jobs] + [(j.end, -j.cpu_load) for j in jobs]
    events.sort()
    max_cpu_load, current_cpu_load = 0, 0

    for time, load_change in events:
        current_cpu_load += load_change
        max_cpu_load = max(max_cpu_load, current_cpu_load)
    return max_cpu_load

回复收藏 0 原文

~没有更多了~