Running as many instances of a program as possible
I'm trying to implement some code to import users' data from another service via that service's API. The plan is to keep all the request jobs in a queue that my simple importer program draws from. Handling one task at a time won't come anywhere close to maxing out the computer's resources, so I'm wondering: what is the standard way to structure a program to run multiple "jobs" at once? Should I be looking into threading, or possibly a program that pulls jobs from the queue and launches instances of the importer program? Thanks for the help.
EDIT: What I have right now is in Python, although I'm open to rewriting it in another language if need be.
Comments (1)
Use a Producer-Consumer queue, with as many Consumer threads as you need to optimize resource usage on the host (sorry - that's very vague advice, but the "right number" is problem-dependent).
If requests are lightweight you may well only need one Producer thread to handle them.
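To make the Producer-Consumer idea concrete, here is a minimal sketch in Python using only the standard library's `queue` and `threading` modules. The names `import_user`, `run_import`, and `NUM_CONSUMERS` are made up for illustration, and `import_user` is just a stand-in for whatever call the real importer makes against the remote API.

```python
import queue
import threading

NUM_CONSUMERS = 4   # assumption: tune this for your own workload
_SENTINEL = None    # special job value that tells a consumer to stop


def import_user(user_id):
    """Hypothetical stand-in for the real API call that imports one user."""
    return f"imported:{user_id}"


def consumer(jobs, results):
    """Pull jobs off the queue until the stop sentinel arrives."""
    while True:
        user_id = jobs.get()
        if user_id is _SENTINEL:
            break
        results.put(import_user(user_id))


def run_import(user_ids, num_consumers=NUM_CONSUMERS):
    jobs = queue.Queue()
    results = queue.Queue()

    workers = [
        threading.Thread(target=consumer, args=(jobs, results))
        for _ in range(num_consumers)
    ]
    for worker in workers:
        worker.start()

    # A single producer (this thread) is enough when enqueueing is cheap.
    for user_id in user_ids:
        jobs.put(user_id)

    # One sentinel per consumer so every thread shuts down cleanly.
    for _ in workers:
        jobs.put(_SENTINEL)
    for worker in workers:
        worker.join()

    # All consumers have finished, so draining the queue here is safe.
    collected = []
    while not results.empty():
        collected.append(results.get())
    return collected
```

Calling `run_import(range(100))` would process the 100 jobs with four consumer threads. Threads are a reasonable fit here because the work is mostly waiting on network I/O; results come back in completion order, not submission order.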
Launching multiple processes could work too - best choice depends on your requirements. Do you need the Producer to know whether the operation worked, or is it 'fire-and-forget'? Do you need retry logic in the event of failure? How do you keep count of concurrent Consumers in this model? And so on.
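Those questions (does the caller learn the outcome, is there retry, how many run at once) can be sketched with the standard library's `concurrent.futures`. This is only one possible arrangement, not the definitive one: `run_jobs`, `import_with_retry`, `MAX_WORKERS`, and `MAX_ATTEMPTS` are hypothetical names, and `do_import` is whatever callable performs a single import.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

MAX_WORKERS = 4    # assumption: upper bound on concurrent consumers
MAX_ATTEMPTS = 3   # assumption: how many times to try each job


def import_with_retry(job, do_import, max_attempts=MAX_ATTEMPTS):
    """Call do_import(job), retrying on any exception up to max_attempts."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return do_import(job)
        except Exception as exc:  # a real importer would catch narrower errors
            last_error = exc
    raise last_error


def run_jobs(jobs, do_import, max_workers=MAX_WORKERS):
    """Run all jobs concurrently and report which succeeded and which failed."""
    succeeded, failed = {}, {}
    # The pool size caps the number of jobs running at the same time.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            pool.submit(import_with_retry, job, do_import): job
            for job in jobs
        }
        for future in as_completed(futures):
            job = futures[future]
            try:
                succeeded[job] = future.result()
            except Exception as exc:
                failed[job] = exc
    return succeeded, failed
```

Because each job returns a future, the submitting code finds out whether the operation worked instead of firing and forgetting. `ProcessPoolExecutor` exposes the same `submit` interface, so swapping it in is one way to try the multiple-process variant, provided the job arguments and the import function can be pickled.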
For Python, take a look at this.