Threading to create a large number of charts quickly
I have been trying to find ways to make the following piece of code perform faster:
import cStringIO  # Python 2 module, as used in the original code
import pylab

def do_chart(target="IMG_BACK", xlabel="xlabel", ylabel="ylabel", title="title", ydata=pylab.arange(1961, 2031, 1)):
    global MYRAMDICT
    MYRAMDICT = {}  # maps chart index -> in-memory PNG buffer
    print "here"
    for i in range(70):
        MYRAMDICT[i] = cStringIO.StringIO()
        xdata = pylab.arange(1961, 2031, 1)
        pylab.figure(num=None, figsize=(10.24, 5.12), dpi=1, facecolor='w', edgecolor='k')
        pylab.plot(xdata, ydata, linewidth=3.0)
        pylab.xlabel(xlabel); pylab.ylabel(ylabel); pylab.title(i)
        pylab.grid(True)
        pylab.savefig(MYRAMDICT[i], format='png')  # write the PNG into the buffer
        pylab.close()
This function (please ignore the pylab commands, they are here just for illustration) creates a dictionary (MYRAMDICT) which I populate with cStringIO.StringIO objects that store the charts in memory. These charts are later presented to the user dynamically.
Would somebody please help me to make use of threading so that I can use all of my cores and make this function perform faster? Or point me towards ideas to improve it?
Comments (2)
From your description, you'd be far better off using multiprocessing than threading. You have an "embarrassingly parallel" problem and no disk I/O constraints (you're writing to memory). Of course, passing large objects back and forth between the processes will get expensive, but returning a string representing a .png shouldn't be too bad.
It can be done quite simply:
Without using multiprocessing, this takes ~250 secs on my machine. With multiprocessing (8 cores), it takes ~40 secs.
Hope that helps a bit...
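The code from this answer did not survive on this page, so here is only a minimal sketch of the multiprocessing approach it describes, not the answerer's original listing (and the timings quoted above do not apply to it). It is written for Python 3. `render_chart` and `render_all` are hypothetical names, and the body of `render_chart` is a stand-in that does some CPU-bound work with `zlib`; in real use it would hold the pylab calls, saving each figure into an `io.BytesIO` buffer and returning its bytes.

```python
import multiprocessing
import zlib


def render_chart(index):
    """Hypothetical stand-in for drawing one chart.

    Real code would plot with pylab, save into an io.BytesIO buffer and
    return buffer.getvalue(). Here we only do CPU-bound work and return bytes.
    """
    payload = (b"chart %d " % index) * 50000
    return index, zlib.compress(payload, 9)


def render_all(count, processes=None):
    """Render `count` charts in a pool of worker processes.

    Returns a dict mapping chart index to its bytes, playing the role of
    MYRAMDICT in the question. processes=None uses one worker per core.
    """
    with multiprocessing.Pool(processes=processes) as pool:
        # The charts are independent, so map() can spread them over the workers.
        return dict(pool.map(render_chart, range(count)))


if __name__ == "__main__":
    # The guard stops worker processes from re-running this block
    # when they import the main module.
    charts = render_all(8)
    print(len(charts), "charts rendered")
```

Each worker returns only the finished bytes, which keeps the data passed between processes small, as the answer suggests.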
Threading will help you if and only if pylab releases the GIL while executing.
Moreover, pylab must be thread-safe, and your code must use it in a thread-safe way, which may not always be the case.
That said, if you are going to use threads, I think this is a classic case for a job queue; therefore, I would use a Queue object, which takes care of this pattern nicely.
Here is an example I put together just by meddling with your code and the example given in the queue documentation. I have not even checked it thoroughly, so it WILL have bugs; it is meant to give an idea more than anything else.
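The example from this answer is also missing on this page. What follows is a rough sketch of the job-queue pattern it describes, not the answerer's code, written for Python 3 (where the module is `queue`). `render_chart`, `worker`, `render_all_threaded` and `NUM_WORKERS` are hypothetical names, and `render_chart` is again a stand-in for the pylab calls. Whether threads actually speed up the real plotting code depends on the GIL and thread-safety caveats above.

```python
import queue
import threading
import zlib

NUM_WORKERS = 4  # assumed value; tune to the machine


def render_chart(index):
    """Hypothetical stand-in for drawing one chart; returns bytes."""
    payload = (b"chart %d " % index) * 50000
    return zlib.compress(payload, 9)


def worker(jobs, results, lock):
    """Pull chart indices off the queue until a None sentinel arrives."""
    while True:
        index = jobs.get()
        if index is None:
            break
        data = render_chart(index)
        with lock:  # guard the shared results dict
            results[index] = data


def render_all_threaded(count, num_workers=NUM_WORKERS):
    """Render `count` charts using worker threads fed from a job queue."""
    jobs = queue.Queue()
    results = {}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(jobs, results, lock))
        for _ in range(num_workers)
    ]
    for thread in threads:
        thread.start()
    for index in range(count):
        jobs.put(index)
    for _ in threads:
        jobs.put(None)  # one sentinel per worker so every thread exits
    for thread in threads:
        thread.join()
    return results
```

Calling `render_all_threaded(70)` would return a dict keyed by chart index, like MYRAMDICT in the question.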