Is this a sane implementation of a producer/consumer type thing?

    # file1.py
    import subprocess
    import threading


    class _Producer(object):

        def __init__(self):
            self.chunksize = 6220800
            # Seed `thing` with a full-sized chunk of zero bytes so consumers never see an empty value
            with open('/dev/zero', 'rb') as f:
                self.thing = f.read(self.chunksize)
            self.n = 0
            self.start()

        def start(self):
            def produce():
                self._proc = subprocess.Popen(['producer_proc'], stdout=subprocess.PIPE)
                while True:
                    # Rebinding self.thing publishes the newest chunk to consumers
                    self.thing = self._proc.stdout.read(self.chunksize)
                    if len(self.thing) != self.chunksize:
                        msg = 'Expected {0} bytes. Read {1} bytes'.format(self.chunksize, len(self.thing))
                        raise Exception(msg)
                    self.n += 1

            t = threading.Thread(target=produce)
            t.daemon = True
            t.start()
            self._thread = t

        def stop(self):
            if self._thread.is_alive():
                self._proc.terminate()
                self._thread.join(1)


    # __init__ already calls start(); a second call would launch another thread and subprocess
    producer = _Producer()

I have written some code more or less like the above design, and now I want to be able to consume the output of producer_proc in other files like this:

    # some_other_file.py
    import file1
    my_thing = file1.producer.thing

Multiple other consumers might be grabbing a reference to file1.producer.thing, and they all need to read from the same producer_proc. The producer_proc should never be blocked. Is this a sane implementation? Does the Python GIL make it thread safe, or do I need to reimplement it using a Queue to get data from the worker thread? Do consumers need to explicitly make a copy of the thing?
I guess I am trying to implement something like the Producer/Consumer pattern or the Observer pattern, but I'm not really clear on all the technical details of design patterns.
- A single producer is constantly making things
- Multiple consumers use things at arbitrary times
- producer.thing should be replaced by a fresh thing as soon as the new one is available; most things will go unused, but that's OK
- It's OK for multiple consumers to read the same thing, or to read the same thing twice in succession. They only want to be sure they have the most recent thing when they ask for it, not some stale old thing.
- A consumer should be able to keep using a thing as long as they have it in scope, even though the producer may have already overwritten its self.thing with a fresh new thing.
Comments (1)
Given your (unusual!) requirements, your implementation seems correct. In particular:
- self.thing and self.n in this code are updated in separate bytecode instructions. The GIL could be released and reacquired between them, so you can't get a consistent view of the two unless you add locking. If you're not going to do that, I'd suggest removing self.n, as it's an "attractive nuisance" (easily misused), or at least adding a comment/docstring with this caveat.
- The code never mutates the object that self.thing refers to (and couldn't with string objects; they're immutable), and Python is garbage-collected, so as long as a consumer has grabbed a reference to it, it can keep accessing it without worrying too much about what other threads are doing. The worst that could happen is your program using a lot of memory because several generations of self.thing are kept alive.
I'm a bit curious where your requirements came from. In particular, that you don't care if a thing is never used or used many times.
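
If a consistent view of the chunk and its counter is wanted, one way to add the locking mentioned above is sketched here. This is only an illustration under stated assumptions: the `LatestValue` class and its `publish`/`snapshot` methods are hypothetical names that do not appear in the original code, and small byte strings stand in for the real output of producer_proc.

```python
import threading


class LatestValue:
    """Hypothetical holder: keeps the newest chunk and its counter under one lock."""

    def __init__(self, initial=b""):
        self._lock = threading.Lock()
        self._thing = initial
        self._n = 0

    def publish(self, thing):
        # Producer side: replace the chunk and bump the counter as one step.
        with self._lock:
            self._thing = thing
            self._n += 1

    def snapshot(self):
        # Consumer side: return a counter and chunk that belong together.
        with self._lock:
            return self._n, self._thing


latest = LatestValue(b"\0" * 4)


def produce(count):
    # Stand-in for the read loop; each "chunk" is 4 identical bytes.
    for i in range(1, count + 1):
        latest.publish(bytes([i % 256]) * 4)


# A consumer grabs a reference before the producer runs...
n_before, held = latest.snapshot()

t = threading.Thread(target=produce, args=(100,))
t.start()
t.join()

# ...and that reference is still usable after being replaced many times.
n_after, newest = latest.snapshot()
```

The lock is held only while two references are swapped, so the producer is never blocked for longer than that. The chunk itself is not copied, and `held` stays valid because bytes objects are immutable and remain alive while referenced.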