I'm a bit puzzled about how to write asynchronous code in Python/Twisted. Suppose (for argument's sake) I am exposing a function to the world that will take a number and return True/False if it is prime/non-prime, so it looks vaguely like this:

    def IsPrime(numberin):
        # Naive trial division: try every candidate divisor below numberin.
        for n in range(2, numberin):
            if numberin % n == 0:
                return False
        return True

(just to illustrate).
Now let's say there is a webserver which needs to call IsPrime based on a submitted value. This will take a long time for large numberin.
If in the meantime another user asks for the primality of a small number, is there a way to run the two function calls asynchronously using the reactor/deferreds architecture so that the result of the short calc gets returned before the result of the long calc?
I understand how to do this if the IsPrime functionality came from some other webserver to which my webserver would do a deferred getPage, but what if it's just a local function?
i.e., can Twisted somehow time-share between the two calls to IsPrime, or would that require an explicit invocation of a new thread?
Or, would the IsPrime loop need to be chunked into a series of smaller loops so that control can be passed back to the reactor rapidly?
Or something else?
I think your current understanding is basically correct. Twisted is just a Python library and the Python code you write to use it executes normally as you would expect Python code to: if you have only a single thread (and a single process), then only one thing happens at a time. Almost no APIs provided by Twisted create new threads or processes, so in the normal course of things your code runs sequentially; isPrime cannot execute a second time until after it has finished executing the first time.
Still considering just a single thread (and a single process), all of the "concurrency" or "parallelism" of Twisted comes from the fact that instead of doing blocking network I/O (and certain other blocking operations), Twisted provides tools for performing the operation in a non-blocking way. This lets your program continue on to perform other work when it might otherwise have been stuck doing nothing waiting for a blocking I/O operation (such as reading from or writing to a socket) to complete.
It is possible to make things "asynchronous" by splitting them into small chunks and letting event handlers run in between these chunks. This is sometimes a useful approach, if the transformation doesn't make the code too much more difficult to understand and maintain. Twisted provides a helper for scheduling these chunks of work, cooperate. It is beneficial to use this helper since it can make scheduling decisions based on all of the different sources of work and ensure that there is time left over to service event sources without significant additional latency (in other words, the more jobs you add to it, the less time each job will get, so that the reactor can keep doing its job).
Twisted does also provide several APIs for dealing with threads and processes. These can be useful if it is not obvious how to break a job into chunks. You can use deferToThread to run a (thread-safe!) function in a thread pool. Conveniently, this API returns a Deferred which will eventually fire with the return value of the function (or with a Failure if the function raises an exception). These Deferreds look like any other, and as far as the code using them is concerned, it could just as well come back from a call like getPage - a function that uses no extra threads, just non-blocking I/O and event handlers.
Since Python isn't ideally suited for running multiple CPU-bound threads in a single process, Twisted also provides a non-blocking API for launching and communicating with child processes. You can offload calculations to such processes to take advantage of additional CPUs or cores without worrying about the GIL slowing you down, something that neither the chunking strategy nor the threading approach offers. The lowest level API for dealing with such processes is reactor.spawnProcess. There is also Ampoule, a package which will manage a process pool for you and provides an analog to deferToThread for processes, deferToAMPProcess.
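
As a rough sketch of the two in-process options described above (chunking with cooperate and offloading with deferToThread), something like the following might work. The names is_prime_steps, main, the result dictionary and the chunk size of 10000 are made up for illustration and are not part of Twisted; the only Twisted calls assumed are task.cooperate, CooperativeTask.whenDone, threads.deferToThread, defer.gatherResults and task.react. Note that is_prime here also adds a guard for numbers below 2, which the version in the question does not have.

```python
def is_prime(numberin):
    """Blocking trial division, as in the question, plus a guard for n < 2."""
    if numberin < 2:
        return False
    for n in range(2, numberin):
        if numberin % n == 0:
            return False
    return True


def is_prime_steps(numberin, result, chunk=10000):
    """Generator version of the same loop (hypothetical helper).

    Tests up to `chunk` divisors, then yields so the caller can do other work.
    The verdict is left in result["prime"], a convention invented for this sketch.
    """
    result["prime"] = numberin >= 2
    for start in range(2, numberin, chunk):
        for n in range(start, min(start + chunk, numberin)):
            if numberin % n == 0:
                result["prime"] = False
                return
        yield  # hand control back between chunks


def main(reactor):
    # Twisted is imported here (a choice made for this sketch) so that the
    # pure-Python helpers above can be used without Twisted installed.
    from twisted.internet import defer, task, threads

    def cooperative(numberin):
        # cooperate() steps the generator in between other reactor events;
        # whenDone() gives a Deferred that fires once it is exhausted.
        result = {}
        done = task.cooperate(is_prime_steps(numberin, result)).whenDone()
        return done.addCallback(lambda _: result["prime"])

    def report(answer, label):
        print(label, answer)

    slow = cooperative(15485863)  # large input: many trial divisions
    slow.addCallback(report, "chunked, large:")
    fast = cooperative(97)
    fast.addCallback(report, "chunked, small:")
    # Same blocking function, run in the reactor's thread pool instead.
    threaded = threads.deferToThread(is_prime, 97)
    threaded.addCallback(report, "thread pool:")
    return defer.gatherResults([slow, fast, threaded])
```

To try it, pass main to twisted.internet.task.react, i.e. task.react(main). The two small requests should be reported before the large one, since the large computation gives up control after every chunk. Neither approach uses a second CPU core; for that, the child-process route mentioned above would be needed.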