Scheduling issues in Python
I'm using python to interface a hardware usb sniffer device with the python API provided by the vendor and I'm trying to read (usb packets) from the device in a separate thread in an infinite loop (which works fine). The problem is that my main loop does not seem to ever get scheduled again (my read loop gets all the attention).
The code looks much like this:
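from threading import Thread
import sys
import time

# ReadUSBDevice, OpenUSBDevice and CloseUSBDevice come from the vendor's Python API.

usb_device = 0

def usb_dump(usb_device):
    # Read packets forever and print each packet's PID.
    while True:
        #time.sleep(0.001)
        packet = ReadUSBDevice(usb_device)
        print "packet pid: %s" % packet.pid

class DumpThread(Thread):
    def run(self):
        # Pass the device handle opened by the main thread.
        usb_dump(usb_device)

usb_device = OpenUSBDevice()
t = DumpThread()
t.start()
print "Sleep 1"
time.sleep(1)
print "End"
CloseUSBDevice(usb_device)
sys.exit(0)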
(I could paste actual code, but since you need the hardware device I figure it won't help much).
I'm expecting this code to start dumping usb packets for about a second before the main thread terminates the entire program. However, all I see is "Sleep 1" and then the usb_dump() procedure runs forever. If I uncomment the "time.sleep(0.001)" statement in the inner loop of the usb_dump() procedure, things start working the way I expect, but then the python code becomes unable to keep up with all the packets coming in :-(
The vendor tells me that this is a Python scheduler problem and not their API's fault, and therefore won't help me:
«However, it seems like you are experiencing some nuances when using threading in Python. By putting the time.sleep in the DumpThread thread, you are explicitly signaling to the Python threading system to give up control. Otherwise, it is up the Python interpreter to determine when to switch threads and it usually does that after a certain number of byte code instructions have been executed.»
Can somebody confirm that python is the problem here? Is there another way to make the DumpThread release control? Any other ideas?
Comments (3)
Your vendor would be right if yours was pure Python code; however, C extensions may release the GIL and therefore allow actual multithreading.
In particular, time.sleep does release the GIL (you can check it directly in the source code; look at the floatsleep implementation), so your code should not have any problem. As further proof, I also made a simple test, just removing the calls to USB, and it actually works as expected:
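A minimal sketch of such a test, assuming the USB read is replaced by plain Python work (a counter increment). The names dump and run_demo and the stop event are hypothetical and exist only so the demo can end cleanly; it is written for Python 3, whereas the question's code is Python 2.

```python
import threading
import time

def dump(counter, stop):
    # Tight pure-Python loop standing in for the ReadUSBDevice loop.
    while not stop.is_set():
        counter[0] += 1

def run_demo(duration=0.2):
    counter = [0]
    stop = threading.Event()
    reader = threading.Thread(target=dump, args=(counter, stop))
    reader.start()
    started = time.monotonic()
    time.sleep(duration)  # the main thread sleeps, as in the question
    elapsed = time.monotonic() - started
    stop.set()
    reader.join()
    return counter[0], elapsed

if __name__ == "__main__":
    count, elapsed = run_demo()
    print("main thread resumed after %.2f s; reader looped %d times" % (elapsed, count))
```

If the main thread resumes after roughly the requested sleep, the interpreter is switching threads on its own, which points at the native read call rather than at Python's scheduling.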
Finally, just a couple of notes on the code you posted:
[Update] About the last point: as mentioned in the comment, I think a Timer would better fit the semantics of your function (a periodic poll) and would automatically avoid issues with the GIL not being released by the vendor code.
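A rough sketch of the Timer idea, assuming the poll callback would wrap a read that returns quickly. PeriodicPoll is a hypothetical helper, not part of the vendor API or the standard library; it simply re-arms a threading.Timer after each poll.

```python
import threading
import time

class PeriodicPoll:
    """Calls poll() every `interval` seconds by re-arming a threading.Timer."""

    def __init__(self, interval, poll):
        self.interval = interval
        self.poll = poll
        self._timer = None
        self._running = False
        self._lock = threading.Lock()

    def _arm(self):
        # Caller must hold self._lock.
        self._timer = threading.Timer(self.interval, self._tick)
        self._timer.daemon = True
        self._timer.start()

    def _tick(self):
        self.poll()
        with self._lock:
            if self._running:
                self._arm()

    def start(self):
        with self._lock:
            self._running = True
            self._arm()

    def stop(self):
        with self._lock:
            self._running = False
            if self._timer is not None:
                self._timer.cancel()

def run_demo(duration=0.3, interval=0.05):
    samples = []
    poller = PeriodicPoll(interval, lambda: samples.append(time.monotonic()))
    poller.start()
    time.sleep(duration)
    poller.stop()
    return len(samples)

if __name__ == "__main__":
    print("polls completed:", run_demo())
```

Between polls no reader code is running, so other threads are free to run. A single poll that blocks in native code without releasing the GIL would still hold up the interpreter for the duration of that call.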
I'm assuming you wrote a Python C module that exposes the ReadUSBDevice function, and that it's intended to block until a USB packet is received, then return it.
The native ReadUSBDevice implementation needs to release the Python GIL while it's waiting for a USB packet, and then reacquire it when it receives one. This allows other Python threads to run while you're executing native code.
http://docs.python.org/c-api/init.html#thread-state-and-the-global-interpreter-lock
While you've unlocked the GIL, you can't access Python. Release the GIL, run the blocking function, then when you know you have something to return back to Python, re-acquire it.
If you don't do this, then no other Python threads can execute while your native blocking is going on. If this is a vendor-supplied Python module, failing to release the GIL during native blocking activity is a bug.
Note that if you're receiving many packets, and actually processing them in Python, then other threads should still run. Multiple threads which are actually running Python code won't run in parallel, but it'll frequently switch between threads, giving them all a chance to run. This doesn't work if native code is blocking without releasing the GIL.
edit: I see you mentioned this is a vendor-supplied library. If you don't have source, a quick way to see if they're releasing the GIL: start the ReadUSBDevice thread while no USB activity is happening, so ReadUSBDevice simply sits around waiting for data. If they're releasing the GIL, the other threads should run unimpeded. If they're not, it'll block the whole interpreter. That would be a serious bug.
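The check described above could be sketched roughly as follows. heartbeats_while_blocked is a hypothetical helper; in a real run blocking_call would wrap ReadUSBDevice on an idle device, while the demo below substitutes time.sleep, which is known to release the GIL.

```python
import threading
import time

def heartbeats_while_blocked(blocking_call, duration=1.0, interval=0.01):
    """Start blocking_call in a daemon thread and count how often the
    calling thread manages to wake up during `duration` seconds."""
    worker = threading.Thread(target=blocking_call)
    worker.daemon = True  # a call that never returns must not prevent exit
    worker.start()
    beats = 0
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        time.sleep(interval)
        beats += 1
    return beats

if __name__ == "__main__":
    # Stand-in for a blocking native call that releases the GIL.
    beats = heartbeats_while_blocked(lambda: time.sleep(0.5), duration=0.5)
    print("heartbeats while worker was blocked:", beats)
```

A count close to duration / interval suggests the blocking call releases the GIL. If the call holds the GIL, the calling thread cannot wake up at all, so the function appears to hang until the call returns.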
I think the vendor is correct. Assuming this is CPython, there is no true parallel threading; only one thread can execute at a time. This is because of the implementation of the global interpreter lock.
You may be able to achieve an acceptable solution by using the multiprocessing module, which effectively sidesteps the global interpreter lock by spawning true sub-processes.
Another possibility that may help is to modify the scheduler's switching behaviour.
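As a small illustration of adjusting the switching behaviour: in Python 3 the relevant knob is sys.setswitchinterval, which takes a time in seconds; in Python 2, which the question's code uses, the counterpart was sys.setcheckinterval, which counts bytecode instructions. The helper name set_switch_interval and the value 0.0005 are arbitrary choices for this sketch.

```python
import sys

def set_switch_interval(seconds):
    """Set the thread switch interval and return the previous value."""
    previous = sys.getswitchinterval()
    sys.setswitchinterval(seconds)
    return previous

if __name__ == "__main__":
    old = set_switch_interval(0.0005)
    print("switch interval: %.4f s -> %.4f s" % (old, sys.getswitchinterval()))
    sys.setswitchinterval(old)  # restore the previous setting
```

This only changes how often the interpreter asks a thread running Python bytecode to give up the GIL. It has no effect on a native call that blocks while holding the GIL.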