Python 2.6: Process-local storage while using multiprocessing.Pool
I'm attempting to build a Python script that has a pool of worker processes (using multiprocessing.Pool) working across a large set of data.
I want each process to have a unique object that gets reused across multiple executions in that process.
Pseudocode:
from multiprocessing import Pool

def work(data):
    # connection should be unique per process
    connection.put(data)
    print 'work done with connection:', connection

if __name__ == '__main__':
    pPool = Pool(4)  # pool of 4 processes
    datas = range(1, 1001)
    for process in pPool:
        # this is the part I'm asking about: how do I really do this?
        process.connection = Connection(conargs)
    for data in datas:
        pPool.apply_async(work, (data,))
Comments (4)
I think something like this should work (not tested).
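The code that accompanied this answer is not preserved here. As a minimal sketch of one way to get a per-process object with a pool, the `initializer` and `initargs` arguments of `multiprocessing.Pool` can build the object once in each worker. The `Connection` class, the `init_worker` name, and the connection arguments below are hypothetical stand-ins, not taken from the original post.

```python
import multiprocessing
import os


class Connection(object):
    """Hypothetical stand-in for the question's Connection class."""

    def __init__(self, *conargs):
        self.conargs = conargs
        self.pid = os.getpid()  # remembers which process created this object

    def put(self, data):
        pass  # a real connection would send the data somewhere


connection = None  # module-level slot; every worker process has its own copy


def init_worker(*conargs):
    """Called once in each worker process when the pool starts it."""
    global connection
    connection = Connection(*conargs)


def work(data):
    """Uses the connection that belongs to the current worker process."""
    connection.put(data)
    return (os.getpid(), connection.pid, data)


def main():
    conargs = ('example-host', 1234)  # hypothetical connection arguments
    pool = multiprocessing.Pool(processes=4, initializer=init_worker,
                                initargs=conargs)
    results = pool.map(work, range(1, 1001))
    pool.close()
    pool.join()
    # Each task should have run on a connection created in its own process.
    matched = all(worker == owner for worker, owner, _ in results)
    print('every task used its own process connection: %s' % matched)


if __name__ == '__main__':
    main()
```

The global is assigned inside each worker, so nothing is shared between processes; every worker simply reuses its own `connection` for all the tasks it receives.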
It may be easiest to create the mp.Process objects directly (without mp.Pool). In the resulting output, notice that each Proc number corresponds to the same Conn number every time.
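The code and output from the answer that suggests creating mp.Process objects directly are not preserved here. What follows is only a rough sketch of that idea, assuming a hypothetical `Connection` class, a `None` sentinel to stop the workers, and `Proc-N` / `Conn-N` names chosen to echo the answer's wording.

```python
import multiprocessing as mp


class Connection(object):
    """Hypothetical stand-in for the question's Connection class."""

    def __init__(self, name):
        self.name = name

    def put(self, data):
        pass  # a real connection would send the data somewhere


def work(inqueue, outqueue, conn):
    """Handle items until a None sentinel arrives, always through `conn`."""
    name = mp.current_process().name
    while True:
        data = inqueue.get()
        if data is None:
            break
        conn.put(data)
        outqueue.put((name, conn.name, data))


def main():
    inqueue, outqueue = mp.Queue(), mp.Queue()
    procs = []
    for i in range(4):
        conn = Connection('Conn-%d' % i)  # one connection per process
        proc = mp.Process(target=work, name='Proc-%d' % i,
                          args=(inqueue, outqueue, conn))
        proc.start()
        procs.append(proc)

    datas = range(1, 11)
    for data in datas:
        inqueue.put(data)
    for _ in procs:
        inqueue.put(None)  # one stop sentinel per worker

    # Drain the results before joining so no worker blocks on a full pipe.
    results = [outqueue.get() for _ in datas]
    for proc in procs:
        proc.join()

    for proc_name, conn_name, data in sorted(results, key=lambda r: r[2]):
        print('%s: work done with connection %s on data %s'
              % (proc_name, conn_name, data))


if __name__ == '__main__':
    main()
```

Here the connection is built in the parent and handed to the process as an argument, which only works when the object survives that hand-off. For a real network connection it is probably safer to construct it at the top of `work` instead.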
Process-local storage is pretty easy to implement as a mapping container, for anyone else getting here from Google looking for something similar. (Note this is Py3, but it is easily converted to Python 2 syntax: just inherit from object.)
See more at https://github.com/akatrevorjay/pytutils/blob/develop/pytutils/mappings.py
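The container code from this answer is not preserved here, and the following is not the pytutils implementation. It is a minimal sketch of the general idea, assuming a mapping that keeps one private dict per process id; the names `ProcessLocal`, `process_local`, and `get_connection` are hypothetical.

```python
import os

try:
    from collections.abc import MutableMapping  # Python 3
except ImportError:
    from collections import MutableMapping  # Python 2.6+


class ProcessLocal(MutableMapping):
    """Mapping that keeps a separate underlying mapping per process id."""

    def __init__(self, mapping_factory=dict):
        self._factory = mapping_factory
        self._stores = {}  # pid -> that process's private mapping

    def _store(self):
        # A forked child inherits the parent's entries, but it looks up its
        # own pid, so it starts from a fresh, empty mapping.
        pid = os.getpid()
        if pid not in self._stores:
            self._stores[pid] = self._factory()
        return self._stores[pid]

    def __getitem__(self, key):
        return self._store()[key]

    def __setitem__(self, key, value):
        self._store()[key] = value

    def __delitem__(self, key):
        del self._store()[key]

    def __iter__(self):
        return iter(self._store())

    def __len__(self):
        return len(self._store())


process_local = ProcessLocal()


def get_connection(factory):
    """Return this process's connection, creating it on first use."""
    if 'connection' not in process_local:
        process_local['connection'] = factory()
    return process_local['connection']
```

A pool worker could then call something like `get_connection(lambda: Connection(conargs))` at the start of each task; the first call in a process creates the object and later calls in that process reuse it.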
You want to have an object residing in shared memory, right?
Python has some support for that in its standard library, but it's kind of poor. As far as I recall, only integers and some other primitive types can be stored.
Try POSH (Python Object Sharing): http://poshmodule.sourceforge.net/
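For reference, the standard-library support mentioned in this answer can be illustrated with `multiprocessing.Value` and `multiprocessing.Array`, which hold ctypes-style primitives in shared memory. This is a small sketch; the `add_to_shared` helper and the numbers used are made up for illustration.

```python
import multiprocessing


def add_to_shared(counter, samples, index, amount):
    """Update a shared integer and one slot of a shared double array."""
    with counter.get_lock():  # guard the read-modify-write on the integer
        counter.value += amount
    samples[index] = float(amount)


def main():
    counter = multiprocessing.Value('i', 0)   # shared C int, starts at 0
    samples = multiprocessing.Array('d', 4)   # four shared doubles, zeroed
    procs = [
        multiprocessing.Process(target=add_to_shared,
                                args=(counter, samples, i, i + 1))
        for i in range(4)
    ]
    for proc in procs:
        proc.start()
    for proc in procs:
        proc.join()
    print(counter.value)  # 1 + 2 + 3 + 4 = 10
    print(samples[:])     # [1.0, 2.0, 3.0, 4.0]


if __name__ == '__main__':
    main()
```

Note that this shares one piece of state among all processes, which is a different goal from giving each process its own private object.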