What profiling do I need to do to optimize a multi-step producer-consumer model?
I have a 3-step producer/consumer setup.
Client
creates JSON-encoded dictionaries and sends them to PipeServer via a named pipe
Here are my threading.Thread subclasses:
PipeServer
creates a named pipe and places messages into a queue, unprocessed_messages
Processor
gets items from unprocessed_messages, processes them (via a lambda function argument), and puts them into a queue, processed_messages
Printers
get items from processed_messages, acquire a lock, print the message, and release the lock
In the test script, I have one PipeServer, one Processor, and 4 Printers:
pipe_name = '\\\\.\\pipe\\testpipe'
pipe_server = pipetools.PipeServer(pipe_name, unprocessed_messages)
json_loader = lambda x: json.loads(x.decode('utf-8'))
processor = threadedtools.Processor(unprocessed_messages,
                                    processed_messages,
                                    json_loader)
print_servers = []
for i in range(4):
    print_servers.append(threadedtools.Printer(processed_messages,
                                               output_lock,
                                               'PRINTER {0}'.format(i)))
pipe_server.start()
processor.start()
for print_server in print_servers:
    print_server.start()
Question: in this kind of multi-step setup, how do I think through optimizing the number of Printer vs. Processor threads I should have? For example, how do I know if 4 is the optimal number of Printer threads to have? Should I have more processors?
I read through the Python Profilers docs, but didn't see anything that would help me think through these kinds of tradeoffs.
Comments (1)
Generally speaking, you want to optimize for the maximum throughput of your slowest component. In this case, that sounds like either the Client or the Printers. If it's the Client, you want just enough Printers and Processors to keep up with new messages (maybe that's just one!). Otherwise you'll be wasting resources on threads you don't need.
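One rough way to find the slowest component is to time each stage on its own before tuning thread counts. The sketch below is only an illustration: `measure_rate`, `process`, and `render` are hypothetical names, and `render` merely formats a string as a stand-in for whatever the real Printer does.

```python
import json
import time


def measure_rate(stage, items):
    """Return how many items per second `stage` handles when run alone."""
    start = time.perf_counter()
    for item in items:
        stage(item)
    elapsed = time.perf_counter() - start
    return len(items) / elapsed if elapsed > 0 else float("inf")


# Hypothetical stand-ins for the work done by the Processor and a Printer.
def process(raw):
    return json.loads(raw.decode("utf-8"))


def render(message):
    return "PRINTER: {0}".format(message)


# Synthetic input; real messages from the Client would be more representative.
raw_items = [json.dumps({"id": i}).encode("utf-8") for i in range(10000)]
parsed_items = [process(raw) for raw in raw_items]

rates = {
    "process": measure_rate(process, raw_items),
    "print": measure_rate(render, parsed_items),
}
bottleneck = min(rates, key=rates.get)
print(rates, "slowest stage:", bottleneck)
```

The stage with the lowest rate caps the throughput of the whole pipeline, so that is the one worth giving more threads (or optimizing) first.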
If it's Printers, then you need to optimize for the IO that's occurring. A few variables to take into account:
If you can only have one lock, then you should only have one thread, and so on.
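The point about a single lock can be checked with a small experiment. This is a minimal sketch under stated assumptions: `run` and `WORK_SECONDS` are hypothetical, and `time.sleep` stands in for the print call made while the lock is held.

```python
import threading
import time

WORK_SECONDS = 0.005  # assumed cost of one print while holding the lock


def run(n_threads, total_messages):
    """Split print-like work across n_threads sharing one lock.

    Returns the elapsed wall-clock time in seconds.
    """
    output_lock = threading.Lock()
    per_thread = total_messages // n_threads

    def printer():
        for _ in range(per_thread):
            with output_lock:
                time.sleep(WORK_SECONDS)  # stand-in for the actual print call

    threads = [threading.Thread(target=printer) for _ in range(n_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


one = run(1, 40)
four = run(4, 40)
print("1 thread: {0:.3f}s, 4 threads: {1:.3f}s".format(one, four))
```

Because all of the work happens inside the lock, the four-thread run cannot finish faster than the total time spent holding it, so extra Printer threads only help if some of their work happens outside the lock.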
You then want to test with real-world operation (it's difficult to predict what combination of RAM, disk and network activity will slow you down). Instrument your code so you can see how many threads are idle at any given time. Then create a test case that pushes data into the system at maximum throughput. Start with an arbitrary number of threads for each component. If Client, Processor, or Printer threads are always busy, add more threads. If some threads are always idle, take some away.
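One possible way to instrument the threads is to record how long each worker spends blocked on its queue versus doing work. The sketch below is illustrative only: `InstrumentedWorker` and `trial` are hypothetical names, the `None` sentinel is an assumed shutdown convention, and the sleep durations are arbitrary stand-ins for real work and arrival rates.

```python
import queue
import threading
import time


class InstrumentedWorker(threading.Thread):
    """Worker that tracks time spent waiting on its queue vs. working."""

    def __init__(self, inbox, work):
        super().__init__()
        self.inbox = inbox
        self.work = work
        self.idle_seconds = 0.0
        self.busy_seconds = 0.0

    def run(self):
        while True:
            t0 = time.perf_counter()
            item = self.inbox.get()  # blocks while the queue is empty
            t1 = time.perf_counter()
            self.idle_seconds += t1 - t0
            if item is None:  # assumed sentinel: stop this worker
                break
            self.work(item)
            self.busy_seconds += time.perf_counter() - t1

    def idle_fraction(self):
        total = self.idle_seconds + self.busy_seconds
        return self.idle_seconds / total if total else 0.0


def trial(n_workers, n_items=40, work_seconds=0.002, arrival_seconds=0.001):
    """Feed n_items to n_workers and return each worker's idle fraction."""
    inbox = queue.Queue()
    workers = [
        InstrumentedWorker(inbox, lambda _item: time.sleep(work_seconds))
        for _ in range(n_workers)
    ]
    for w in workers:
        w.start()
    for i in range(n_items):
        inbox.put(i)
        time.sleep(arrival_seconds)  # stand-in for the upstream arrival rate
    for _ in workers:
        inbox.put(None)  # one sentinel per worker
    for w in workers:
        w.join()
    return [w.idle_fraction() for w in workers]


for n in (1, 2, 4):
    print(n, "workers, idle fractions:", trial(n))
```

An idle fraction near zero suggests that stage is saturated and may benefit from another thread; fractions near one suggest there are more threads than the upstream stage can keep busy.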
You may need to retune if you move the code to a different hardware environment: a different number of processors, more memory, or a different disk can all have an effect.