Can I find the max/min of an unsorted array in sub-linear time?

Posted on 2024-12-19 10:48:51

Is this possible? If not, given an array of size n, how do I know whether it would be better to just sort the array?

末蓝 2024-12-26 10:48:51

With just the unsorted array, there is no way to do this in sub-linear time. Since you don't know which elements are the largest and smallest, you have to look at them all (any element you skip could be the one you're after), hence linear time.

The best general-purpose sort you'll find will be worse than that, typically O(n log n), so for this purpose it will be "better" to do the linear scan.
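As a rough illustration of that linear scan, here is a minimal Python sketch that finds both extremes in a single pass. The helper name `min_max` is made up for this example and is not part of the original answer.

```python
def min_max(values):
    """Return (smallest, largest) of a non-empty iterable in one pass."""
    it = iter(values)
    try:
        lo = hi = next(it)          # seed both extremes with the first element
    except StopIteration:
        raise ValueError("min_max() needs at least one element") from None
    for v in it:                    # every remaining element is inspected once
        if v < lo:
            lo = v
        elif v > hi:                # a value below lo cannot also be above hi
            hi = v
    return lo, hi


data = [7, 3, 9, 1, 9, 4]
print(min_max(data))  # (1, 9)
```

Sorting the same data and taking the first and last elements gives the same answer, but it does O(n log n) work to get there.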

There are other ways to speed up the process if you're allowed to store more information. You can store the minimum and maximum using the following rules:

When adding a value to an empty list, set min and max to that value. Constant time O(1).
When adding a value to a non-empty list, set min or max to that value if appropriate. Constant time O(1).
When deleting a value from the list, set min or max to 'unknown' if the value being deleted is equal to the current min or max. Constant time O(1). You can also make this more efficient if you store both the min/max and the counts of them. In other words, if your list has seven copies of the current maximum and you delete one, there's no need to set the maximum to unknown, just decrement the count. Only when the count reaches zero should you mark it unknown.
If you ask for the minimum or maximum for an empty list, return some special value. Constant time O(1).
If you ask for the minimum or maximum for a non-empty list where the values are known, return the relevant value. Constant time O(1).
If you ask for the minimum or maximum for a non-empty list where the values are unknown, do a linear search to discover them then return the relevant value. Linear time O(n).

By doing it that way, probably the vast majority of min/max retrievals are constant time. Only after you've removed a value that was the min or max (the last copy of it, if you keep counts) does the next retrieval take linear time.

The next retrieval after that will again be constant time since you've calculated and stored them, assuming you don't remove the min/max value in the interim again.

Pseudo-code for just the maximum could be as simple as:

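def initList():
    global list, maxval, maxcount
    list = []       # the stored values (shadows the built-in name `list`)
    maxval = 0      # only meaningful while maxcount > 0
    maxcount = 0    # 0 means "maximum unknown"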

In the initialisation code above, we simply create the list along with a maximum value and its count. It would be easy to add a minimum value and count as well.

To add to the list, we follow the rules above:

def addToList(val):
    global maxval, maxcount
    list.append(val)

    # Detect adding to empty list.
    if len(list) == 1:
        maxval = val
        maxcount = 1
        return

    # If no maximum known at this point, calc later.
    if maxcount == 0:
        return

    # Adding less than current max, ignore.
    if val < maxval:
        return

    # Adding another of current max, bump up count.
    if val == maxval:
        maxcount += 1
        return

    # Otherwise, new max, set value and count.
    maxval = val
    maxcount = 1

Deleting is quite simple. Just delete the value. If it was the maximum value, decrement the count of those maximum values. Note that this only makes sense if you know the current maximum - if not, you were already in the state where you were going to have to calculate it so just stay in that state.

The count becoming zero will indicate the maximum is now unknown (you've deleted them all):

def delFromList(val):
    global maxcount
    list.remove(val)  # raises ValueError if val is not in the list

    # Decrement count if max is known and the value is max.
    # The count will become 0 when all maxes deleted.
    if maxcount > 0 and val == maxval:
        maxcount -= 1

Getting the maximum is then a matter of knowing when it needs to be calculated (when maxcount is zero). If it doesn't need to be calculated, just return it:

def getMax():
    global maxval, maxcount

    # Raise an exception if the list is empty.
    if len(list) == 0:
        raise ValueError("list is empty")

    # If maximum unknown, calculate it on demand.
    if maxcount == 0:
        maxval = list[0]
        for val in list:
            if val == maxval:
                maxcount += 1
            elif val > maxval:
                maxval = val
                maxcount = 1

    # Now it is known, just return it.
    return maxval

All that pseudo-code uses seemingly global variables, list, maxval and maxcount. In a properly engineered system, they would of course be instance variables so that you can run multiple lists side-by-side.
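For what it's worth, here is one way the same idea might look with instance variables, as a minimal Python sketch. The class name `MaxTrackingList`, its method names and the `scans` counter are all invented for this illustration; `scans` exists only to show how often the linear path actually runs.

```python
class MaxTrackingList:
    """List wrapper that caches its maximum and the number of copies of it."""

    def __init__(self):
        self._items = []
        self._maxval = None   # only meaningful while _maxcount > 0
        self._maxcount = 0    # 0 means "maximum unknown"
        self.scans = 0        # illustration only: number of linear rescans

    def add(self, val):
        self._items.append(val)
        if len(self._items) == 1:          # adding to an empty list
            self._maxval, self._maxcount = val, 1
        elif self._maxcount == 0:          # max unknown, recompute on demand
            return
        elif val == self._maxval:          # another copy of the current max
            self._maxcount += 1
        elif val > self._maxval:           # new max
            self._maxval, self._maxcount = val, 1

    def delete(self, val):
        self._items.remove(val)            # raises ValueError if val is absent
        if self._maxcount > 0 and val == self._maxval:
            self._maxcount -= 1            # reaching 0 marks the max unknown

    def get_max(self):
        if not self._items:
            raise ValueError("list is empty")
        if self._maxcount == 0:            # recompute only when needed
            self.scans += 1
            self._maxval = max(self._items)
            self._maxcount = self._items.count(self._maxval)
        return self._maxval


a, b = MaxTrackingList(), MaxTrackingList()   # two independent lists
for v in (3, 9, 9, 4):
    a.add(v)
b.add(100)

a.delete(9)
print(a.get_max(), a.scans)  # 9 0  (one copy of 9 is still present)
a.delete(9)
print(a.get_max(), a.scans)  # 4 1  (max was unknown, one rescan)
print(a.get_max(), a.scans)  # 4 1  (cached again)
print(b.get_max())           # 100
```

Note that the recomputation here uses two passes (`max` then `count`) rather than one loop, which is still linear. Also, removing by value from a plain Python list is itself a linear search, so the constant-time claim for deletion applies to the min/max bookkeeping rather than to the container operation in this sketch.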
