python多线程“超过最大递归深度”

发布于 2024-08-30 02:26:09 字数 796 浏览 6 评论 0原文

我使用Python多线程来实现快速排序。 快速排序是在函数中实现的。它是一个递归函数。 每个线程调用 Quicksort 对其拥有的数组进行排序。每个线程都有自己的数组,用于存储需要排序的数字。 如果数组大小较小(<10,000)。运行正常。 但是,如果数组大小较大,则会显示“最大递归深度超出”。 于是,我使用setrecursionlimit()函数将递归深度重置为1500。但是程序直接崩溃了... 以下是快速排序代码。如果不是在多线程环境中,它也能很好地工作。 看来多线程是递归深度问题的原因。

def partition (array, p, r):
    x = array[r]
    i = (p-1)
    j = p
    while (1):
        if array[j] <= x:
            i = (i+1)
            temp = array[j]
            array[j] = array[i]
            array[i] = temp
        j+=1
        if j == r:
            break
    temp = array[i+1]
    array[i+1] = array[r]
    array[r] = temp
    return i+1

def quicksort (array, p, r):
    if p < r:
        q = partition (array, p, r)
        quicksort (array, p, q-1)
        quicksort (array, q+1, r)

I use Python multithread to realize Quicksort.
Quicksort is implement in a function. It is a recursive function.
Each thread calls Quicksort to sort the array it has. Each thread has its own array that stores the numbers needs to be sorted.
If the array size is smaller (<10,000). It runs ok.
However, if the array size is larger, it shows the "maximum recursion depth exceed".
So, I use setrecursionlimit () function to reset the recursion depth to 1500. But the program crash directly...
The following is quicksort code. It works well if not in multi-thread environment.
It seems multiple threads is the cause of recursion depth problem.

def partition (array, p, r):
    x = array[r]
    i = (p-1)
    j = p
    while (1):
        if array[j] <= x:
            i = (i+1)
            temp = array[j]
            array[j] = array[i]
            array[i] = temp
        j+=1
        if j == r:
            break
    temp = array[i+1]
    array[i+1] = array[r]
    array[r] = temp
    return i+1

def quicksort (array, p, r):
    if p < r:
        q = partition (array, p, r)
        quicksort (array, p, q-1)
        quicksort (array, q+1, r)

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(5

如歌彻婉言 2024-09-06 02:26:10

你为什么要编写自己的快速排序例程?这是作业吗?

如果没有,我建议使用内置的排序机制;它们对于绝大多数情况都很好,并且不会遇到递归深度问题。如果您正在查看非常大的数据集,我建议您查看 scipy 和 numpy 中可用的各种容器和算法。

如果纯粹是出于对实现例程的好奇,正如马塞洛在评论中建议的那样,我们将需要查看代码。

Why are you writing your own quicksort routine? Is this homework?

If not, I'd suggest using the built in sorting mechanisms; they're quite good for the vast majority of cases and don't suffer from a recursion depth problem. If you're looking at extremely large datasets, I'd suggest looking at the various containers and algorithms available from scipy and numpy.

If it's purely for curiosity of implementing the routine, as Marcelo suggests in the comments, we will need to see code.

若无相欠,怎会相见 2024-09-06 02:26:10

您遇到的问题是递归函数使用内存,并且由于存在大量元素,因此存在大量递归,因此内存不足。这解释了为什么提高递归限制会导致程序崩溃——您要求的内存比您拥有的内存多。

如果您确实想对大量元素实现快速排序,则需要阅读此内容sorted() 函数。除非这是家庭作业或好奇心,否则我强烈建议使用它。

The problem you are having is a recursive function uses memory, and with a large number of elements and thus a large number of recursions, you're running out of memory. This explains why raising the recursion limit crashes your program – you're asking for more memory than you have.

If you really want to implement quicksort for a large number of elements, you will want to read this article on Wikipedia on memory usage specifically using quicksort. Otherwise as Nathan suggested, Python already has a built in sorted() function. Unless this is homework or curiosity, I would highly suggest using that.

迷离° 2024-09-06 02:26:10

这是快速排序的迭代代码

    import time
    import random

    stack = []

    def partition(data,p,q):
        global stack
        pivot = p
        pivotvalue = data[q]
        for index in range(p,q+1):
            if data[index] < pivotvalue:
                temp = data[index]
                data[index] = data[pivot]
                data[pivot] = temp
                pivot = pivot + 1
        temp = data[q]
        data[q] = data[pivot]
        data[pivot] = temp
        return pivot

    def qSort(data,p,q):
        global stack
        push(stack,p,q)
        while isEmpty(stack) == False:
            q = pop(stack)
            p = pop(stack)
            pivot = partition(data,p,q)
            if pivot-1 > p:
                push(stack,p,pivot-1)
            if pivot+1 < q:
                push(stack,pivot+1,q)


    def push(stack,p,q):
        stack.append(p)
        stack.append(q)

    def pop(stack):
        global top
        if(len(stack)==0):
            return -1
        element = stack.pop()
        return element

    def isEmpty(stack):
        return len(stack) == 0

    if __name__ == '__main__':
        start_time = time.time()
        data = (range(1000000,0,-1))
        random.shuffle(data)
        #print data
        qSort(data,0,len(data)-1)
        #print data
        print time.time() - start_time, "seconds"

Here is the Iterative code for QuickSort

    import time
    import random

    stack = []

    def partition(data,p,q):
        global stack
        pivot = p
        pivotvalue = data[q]
        for index in range(p,q+1):
            if data[index] < pivotvalue:
                temp = data[index]
                data[index] = data[pivot]
                data[pivot] = temp
                pivot = pivot + 1
        temp = data[q]
        data[q] = data[pivot]
        data[pivot] = temp
        return pivot

    def qSort(data,p,q):
        global stack
        push(stack,p,q)
        while isEmpty(stack) == False:
            q = pop(stack)
            p = pop(stack)
            pivot = partition(data,p,q)
            if pivot-1 > p:
                push(stack,p,pivot-1)
            if pivot+1 < q:
                push(stack,pivot+1,q)


    def push(stack,p,q):
        stack.append(p)
        stack.append(q)

    def pop(stack):
        global top
        if(len(stack)==0):
            return -1
        element = stack.pop()
        return element

    def isEmpty(stack):
        return len(stack) == 0

    if __name__ == '__main__':
        start_time = time.time()
        data = (range(1000000,0,-1))
        random.shuffle(data)
        #print data
        qSort(data,0,len(data)-1)
        #print data
        print time.time() - start_time, "seconds"
佼人 2024-09-06 02:26:09

听起来你真正的问题是“为什么使用线程时递归深度更短”?我将尝试回答这个问题。

首先,背景。每级递归都存储在称为堆栈的内存区域中。不幸的是,系统必须提前分配堆栈空间,并且它事先并不知道您的程序可能需要多少堆栈空间。这就是为什么过多的递归会导致“最大递归深度”错误:您的程序已用完所有堆栈空间。

每个线程都需要自己的堆栈来存储当前正在该线程中执行的函数列表。在单线程程序中,系统可以为该线程的堆栈提供一大块内存。在多线程程序中,系统必须更加保守一点,它只为每个线程提供一个小的堆栈。否则,具有多个线程的程序可能会很快用完所有系统内存,仅包括堆栈空间(其中大部分不会被使用)。

所有这些都是由操作系统和/或 C 库完成的,Python(更准确地说,CPython)在 C 库上运行。 Python 尽力阻止您使用整个 C 堆栈,因为这会导致严重崩溃而不仅仅是异常。您可以告诉 Python 如何使用 setrecursionlimit 函数进行操作,但这不会改变实际可用的堆栈空间量。

在具有 bash shell 的 UNIX 系统上,您可以使用 ulimit -s 命令更改堆栈大小。在 bash shell 提示符处输入 help ulimit 以获取更多信息。

It sounds like your real question is "Why is the recursion depth shorter when using threads"? I will try to answer that question.

First, background. Each level of recursion is stored an area of memory known as the stack. Unfortunately, the system has to allocate the stack space in advance and it doesn't know in advance how much stack space your program might need. That's why too much recursion causes a "maximum recursion depth" error: your program has used up all of that stack space.

Each thread needs its own stack to store the list of functions that are currently executing in that thread. In a single threaded program, the system can afford to give a big chunk of memory to the stack for that one thread. In a multi-threaded program, the system has to be a bit more conservative and it gives only a small stack to each thread. Otherwise, a program with many threads could quickly use up all system memory just with stack space (most of which won't be used).

All of this is done by the operating system and/or by the C library, which Python (more precisely, CPython) runs on top of. Python tries hard to prevent you from using the entire C stack, because that would cause a hard crash rather than simply an exception. You can tell Python how to behave with the setrecursionlimit function, but that doesn't change the actual amount of stack space available.

On a unix-ish system with a bash shell, you may be able to change the stack size with the ulimit -s command. Type help ulimit at your bash shell prompt for more information.

﹏半生如梦愿梦如真 2024-09-06 02:26:09
  • 您正在使用快速排序的递归实现。 您想使用迭代来实现快速排序。

    递归在 Python 中不可扩展(至少在 CPython 中),因此对于较大的输入它将失败。您可以增加递归限制,但这只会让您在更大的范围内扩展,而不会使您的实现真正扩展。如果你有太多的递归,它也会以允许出现段错误的可能性为代价。这种方法对于多线程代码也有效(或者说实际上不起作用),您只需做得更多,因为每个线程的递归限制会更低。总而言之,这是一个失败的主张:改用迭代。

  • 您正在使用线程(或计划使用线程),这通常是一个不好的迹象。线程是混乱的、危险的、困难的。此外,Python 中的线程不会为您提供并行执行(如果这是您所期望的)。使用线程进行快速排序实现,尤其是在 Python 中,可能不太理想。 (如果您被要求这样做,您至少应该退一步并了解这可能不是最好的方法。)

  • You are using a recursive implementation of quicksort. You want to implement quicksort using iteration instead.

    Recursion isn't scalable in Python (in CPython at least), so for larger input it will fail. You can increase the recursion limit, but this will only let you scale over a larger range, not make your implementation really scale. It also comes at the cost of allowing the possibility of a segfault if you have too much recursion. This approach works (or rather doesn't really work) as well for multithreaded code, you just have to do it more because the recursion limit for each thread will be lower. All in all, it's a losing proposition: use iteration instead.

  • You are using threads (or planning to), which is usually a bad sign. Threads are confusing and dangerous and hard. What's more, threads in Python do not give you parallel execution, if that's what you were expecting. Using threads for a quicksort implementation, especially in Python, will probably prove less than ideal. (If you are required to do it, you should at least step back and understand that it might not be the best approach.)

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文