跟踪扩展数组的中位数

发布于 2024-10-15 09:08:09 字数 250 浏览 2 评论 0原文

面试问题:

编辑如下 给你一个数组。您可以从中创建 2 个堆,一个是最小堆,另一个是最大堆。现在使用这 2 个提供的堆在 O(nlog n) 时间内找到数组的中位数。

更正问题 数字是随机生成的并存储到(扩展)数组中。您将如何跟踪中位数?

解决方案 这个问题可以使用 2 个堆来解决,并且中位数总是可以在 O(1) 时间内访问到。

Interview Question:

Edited Below
You are given an array. You make 2 heaps out of it, one minheap and the other max heap. Now find the median of the array using these 2 provided heaps in O(nlog n) time.

Corrected Question
Numbers are randomly generated and stored into an (expanding) array. How would you keep track of the median?

Solution
This problem can be solved using 2 heaps and the median can always be accessed in O(1) time.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(5

把时间冻结 2024-10-22 09:08:09

以下是如何使用这两个堆。请注意,我假设您不知道元素的数量,这就是为什么我们必须弹出,直到我们从最小堆中弹出的内容大于或等于从最大堆中弹出的内容为止。请注意,我们返回平均值,因为在像 {1, 2, 3, 4} 这样的集合的情况下,中位数实际上是 2.5 (两个“中间”的平均值) “值)。我假设 double 作为值类型,但这显然可以是任何类型。这里:

double min = minheap.pop();
double max = maxheap.pop();
while(min < max) {
    min = minheap.pop();
    max = maxheap.pop();
}

return (min + max) / 2;

由于弹出的时间复杂度为O(log n),并且我们必须弹出O(n / 2)个值,所以这是O(n log n)代码>.

Here's how you use both heaps. Note that I'm assuming you don't know the number of elements, and this is why we must pop until we pop something from the min heap that is larger than or equal to what we pop from the max heap. Note that we return the average because in the case of a set like {1, 2, 3, 4} the median is actually 2.5 (the average of the two "middle" values). I'm assuming double as the value type, but this can obviously be anything. Here:

double min = minheap.pop();
double max = maxheap.pop();
while(min < max) {
    min = minheap.pop();
    max = maxheap.pop();
}

return (min + max) / 2;

Since popping is O(log n) and we have to pop O(n / 2) values, this is O(n log n).

黎夕旧梦 2024-10-22 09:08:09

java 中的一个有效实现,使用 2 个堆,O(n log n)。在任何时候,我都会保持两个堆的大小平衡(即,如果我们输入 n 个元素使得 n%2==1,它们最多相差 1)。获取中位数的时间复杂度为 O(1)。添加新元素的时间复杂度为 O(log n)。

public class MedianOfStream {

    private int count;
    private PriorityQueue<Integer> highs, lows;

    public MedianOfStream() {
        highs = new PriorityQueue<Integer>(11, new Comparator<Integer>() {
            @Override
            public int compare(Integer arg0, Integer arg1) {
                return arg0.compareTo(arg1);
            }
        });
        lows = new PriorityQueue<Integer>(11, new Comparator<Integer>() {
            @Override
            public int compare(Integer arg0, Integer arg1) {
                return arg1.compareTo(arg0);
            }
        });
    }

    private int getMedian() {
        if (count == 0)
            return 0;
        if (lows.size() == highs.size()) {
            return (lows.peek() + highs.peek()) / 2;
        } else if (lows.size() < highs.size()) {
            return highs.peek();
        }
        return lows.peek();
    }

    private void swap(){
        int h = highs.poll();
        int l = lows.poll();
        highs.add(l);
        lows.add(h);
    }

    public int updateMedian(int n) {
        count++;

        if (count == 1)
            lows.add(n);

        else if (count==2) {
            highs.add(n);
            if(highs.peek()<lows.peek()) {
                swap(); // O(log n)
            }
        }

        else {
            if (n > highs.peek()) {
                lows.add(highs.poll()); // O(log n)
                highs.add(n); // O(log n)
            } else {
                highs.add(lows.poll()); // O(log n)
                lows.add(n); // O(log n)
            }
            if(highs.peek()<lows.peek()) {
                swap(); // O(log n)
            }
        }

        // if we added an even # of items,
        // the heaps must be exactly the same size,
        // otherwise we tolerate a 1-off difference

        if (Math.abs(lows.size() - highs.size()) > (count % 2)) {
            if (lows.size() < highs.size()) {
                lows.add(highs.poll()); // O(log n)
            } else {
                highs.add(lows.poll()); // O(log n)
            }
        }

        return getMedian(); // O(1)
    }
}

A working implementation in java, using 2 heaps, O(n log n). At any time I keep the two heaps balanced in size (ie. they differ at most by 1, if we entered n elements such that n%2==1). Getting the median is O(1). Adding a new element is O(log n).

public class MedianOfStream {

    private int count;
    private PriorityQueue<Integer> highs, lows;

    public MedianOfStream() {
        highs = new PriorityQueue<Integer>(11, new Comparator<Integer>() {
            @Override
            public int compare(Integer arg0, Integer arg1) {
                return arg0.compareTo(arg1);
            }
        });
        lows = new PriorityQueue<Integer>(11, new Comparator<Integer>() {
            @Override
            public int compare(Integer arg0, Integer arg1) {
                return arg1.compareTo(arg0);
            }
        });
    }

    private int getMedian() {
        if (count == 0)
            return 0;
        if (lows.size() == highs.size()) {
            return (lows.peek() + highs.peek()) / 2;
        } else if (lows.size() < highs.size()) {
            return highs.peek();
        }
        return lows.peek();
    }

    private void swap(){
        int h = highs.poll();
        int l = lows.poll();
        highs.add(l);
        lows.add(h);
    }

    public int updateMedian(int n) {
        count++;

        if (count == 1)
            lows.add(n);

        else if (count==2) {
            highs.add(n);
            if(highs.peek()<lows.peek()) {
                swap(); // O(log n)
            }
        }

        else {
            if (n > highs.peek()) {
                lows.add(highs.poll()); // O(log n)
                highs.add(n); // O(log n)
            } else {
                highs.add(lows.poll()); // O(log n)
                lows.add(n); // O(log n)
            }
            if(highs.peek()<lows.peek()) {
                swap(); // O(log n)
            }
        }

        // if we added an even # of items,
        // the heaps must be exactly the same size,
        // otherwise we tolerate a 1-off difference

        if (Math.abs(lows.size() - highs.size()) > (count % 2)) {
            if (lows.size() < highs.size()) {
                lows.add(highs.poll()); // O(log n)
            } else {
                highs.add(lows.poll()); // O(log n)
            }
        }

        return getMedian(); // O(1)
    }
}
墨落成白 2024-10-22 09:08:09

从堆中弹出是一个 O(log N) 操作,因此您可以通过从其中一个堆中弹出一半元素并获取最后弹出的值来实现 O(N log N)(您必须处理边缘情况)。但这并没有利用其他堆。

您可以使用选择算法实现O(​​N),但常数因子非常高。如果您已经有堆,前一个建议可能会更好。

Popping from a heap is an O(log N) operation, so you can achieve O(N log N) by popping half the elements from one of the heaps and taking the last popped value (you'd have to handle edge cases). This doesn't take advantage of the other heap though.

You can achieve O(N) using the selection algorithm, but the constant factor is very high. The former suggestion is probably better if you already have a heap.

金兰素衣 2024-10-22 09:08:09

这是它的 python 实现。我测试了使用 Random.org 并在 中值查找器 其中大多数似乎都有效。

import heapq

def medianFinder(arr):
    minHeap = []
    maxHeap = []
    def handleMedianFinder(num: int):
        if not len(maxHeap):
            heapq.heappush(maxHeap, -num)
            return -maxHeap[0]
        elif not len(minHeap):
            heapq.heappush(minHeap, num)
            return (minHeap[0]-maxHeap[0])/2
        if num < minHeap[0]:
            if len(maxHeap) > len(minHeap):
                oldMedian = -heapq.heappop(maxHeap)
                heapq.heappush(minHeap, oldMedian)
                heapq.heappush(maxHeap, -num)
                return (minHeap[0]-maxHeap[0])/2 if (len(minHeap) + len(maxHeap))%2 == 0 else minHeap[0]
            heapq.heappush(maxHeap, -num)
        elif num > -maxHeap[0]:
            if len(maxHeap) < len(minHeap):
                oldMedian = heapq.heappop(minHeap)
                heapq.heappush(maxHeap, -oldMedian)
                heapq.heappush(minHeap, num)
                return (minHeap[0]-maxHeap[0])/2 if (len(minHeap) + len(maxHeap))%2 == 0 else -maxHeap[0]
            heapq.heappush(minHeap, num)
        else: # between the medians (new medians)
            if len(maxHeap) < len(minHeap):
                oldMedian = heapq.heappop(minHeap)
                heapq.heappush(maxHeap, -oldMedian)
                heapq.heappush(minHeap, num)
            else:
                oldMedian = -heapq.heappop(maxHeap)
                heapq.heappush(minHeap, oldMedian)
                heapq.heappush(maxHeap, -num)
        if len(minHeap) < len(maxHeap):
            return minHeap[0]
        elif len(maxHeap) < len(minHeap):
            return -maxHeap[0]
        return (minHeap[0]-maxHeap[0])/2
    for num in arr:
        # print('maxHeap => ', maxHeap)
        # print('minHeap => ', minHeap)
        print(handleMedianFinder(num))

        
arr1 = [11, 18, 16, 12, 14, 4, 15, 10, 19, 20]
medianFinder(arr1)

Here's a python implementation for it. I tested some of the examples generated using Random.org and tested them on median finder most of them seem to work.

import heapq

def medianFinder(arr):
    minHeap = []
    maxHeap = []
    def handleMedianFinder(num: int):
        if not len(maxHeap):
            heapq.heappush(maxHeap, -num)
            return -maxHeap[0]
        elif not len(minHeap):
            heapq.heappush(minHeap, num)
            return (minHeap[0]-maxHeap[0])/2
        if num < minHeap[0]:
            if len(maxHeap) > len(minHeap):
                oldMedian = -heapq.heappop(maxHeap)
                heapq.heappush(minHeap, oldMedian)
                heapq.heappush(maxHeap, -num)
                return (minHeap[0]-maxHeap[0])/2 if (len(minHeap) + len(maxHeap))%2 == 0 else minHeap[0]
            heapq.heappush(maxHeap, -num)
        elif num > -maxHeap[0]:
            if len(maxHeap) < len(minHeap):
                oldMedian = heapq.heappop(minHeap)
                heapq.heappush(maxHeap, -oldMedian)
                heapq.heappush(minHeap, num)
                return (minHeap[0]-maxHeap[0])/2 if (len(minHeap) + len(maxHeap))%2 == 0 else -maxHeap[0]
            heapq.heappush(minHeap, num)
        else: # between the medians (new medians)
            if len(maxHeap) < len(minHeap):
                oldMedian = heapq.heappop(minHeap)
                heapq.heappush(maxHeap, -oldMedian)
                heapq.heappush(minHeap, num)
            else:
                oldMedian = -heapq.heappop(maxHeap)
                heapq.heappush(minHeap, oldMedian)
                heapq.heappush(maxHeap, -num)
        if len(minHeap) < len(maxHeap):
            return minHeap[0]
        elif len(maxHeap) < len(minHeap):
            return -maxHeap[0]
        return (minHeap[0]-maxHeap[0])/2
    for num in arr:
        # print('maxHeap => ', maxHeap)
        # print('minHeap => ', minHeap)
        print(handleMedianFinder(num))

        
arr1 = [11, 18, 16, 12, 14, 4, 15, 10, 19, 20]
medianFinder(arr1)
携君以终年 2024-10-22 09:08:09

使用两个堆的 JavaScript 解决方案:

function addNewNumber(minHeap, maxHeap, randomNumber) {
  if (maxHeap.size() === minHeap.size()) {
    if (minHeap.peek() && randomNumber > minHeap.peek()) {
      maxHeap.insert(minHeap.remove());
      minHeap.insert(randomNumber);
    } else {
      maxHeap.insert(randomNumber);
    }
  } else {
    if (randomNumber < maxHeap.peek()) {
      minHeap.insert(maxHeap.remove());
      maxHeap.insert(randomNumber);
    } else {
      minHeap.insert(randomNumber);
    }
  }
}

function getMedian(minHeap, maxHeap) {
  if (!maxHeap.size()) {
    return 0;
  }
  if (minHeap.size() === maxHeap.size()) {
    return (minHeap.peek() + maxHeap.peek()) / 2;
  } else {
    return maxHeap.peek();
  }
}

JavaScript solution using two heaps:

function addNewNumber(minHeap, maxHeap, randomNumber) {
  if (maxHeap.size() === minHeap.size()) {
    if (minHeap.peek() && randomNumber > minHeap.peek()) {
      maxHeap.insert(minHeap.remove());
      minHeap.insert(randomNumber);
    } else {
      maxHeap.insert(randomNumber);
    }
  } else {
    if (randomNumber < maxHeap.peek()) {
      minHeap.insert(maxHeap.remove());
      maxHeap.insert(randomNumber);
    } else {
      minHeap.insert(randomNumber);
    }
  }
}

function getMedian(minHeap, maxHeap) {
  if (!maxHeap.size()) {
    return 0;
  }
  if (minHeap.size() === maxHeap.size()) {
    return (minHeap.peek() + maxHeap.peek()) / 2;
  } else {
    return maxHeap.peek();
  }
}
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文