Matplotlib: is there an alternative to savefig() that improves performance when saving to a CString object?

Posted 2024-10-24 08:28:39

I am trying to speed up the process of saving my charts to images. Right now I create a cStringIO object and save the chart into it with savefig, but I would really appreciate any help improving this method of saving the image. I have to do this operation dozens of times, and the savefig command is very slow; there must be a better way of doing it. I read something about saving it as an uncompressed raw image, but I have no clue how to do that. I don't really care about agg if I can switch to another, faster backend.

i.e.:

RAM = cStringIO.StringIO()

CHART = plt.figure(.... 
**code for creating my chart**

CHART.savefig(RAM, format='png')

I have been using matplotlib with the FigureCanvasAgg backend.

Thanks!

绝不服输 2024-10-31 08:28:39

If you just want a raw buffer, try fig.canvas.print_rgb, fig.canvas.print_raw, etc. The difference between the two is that raw is RGBA, whereas rgb is RGB. There's also print_png, print_ps, etc.

This will use fig.dpi instead of the default dpi value for savefig (100 dpi). Still, even comparing fig.canvas.print_raw(f) and fig.savefig(f, format='raw', dpi=fig.dpi), the print_raw version is only insignificantly faster, since it doesn't bother resetting the color of the axis patch, etc., which savefig does by default.

Regardless, though, most of the time spent saving a figure in a raw format is just drawing the figure, which there's no way to get around.
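For what it's worth, here is a minimal sketch of the raw-buffer route on a current setup. It assumes Python 3 (so io.BytesIO stands in for cStringIO) and the non-interactive Agg backend; the plotted data and the names buf and rgba are made up for illustration. It dumps the RGBA bytes with print_raw and then views them as a NumPy array:

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive Agg backend
import matplotlib.pyplot as plt
import numpy as np

fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot([0, 1, 2], [0, 1, 4])  # placeholder chart, not from the question

# print_raw draws the figure and writes uncompressed RGBA bytes.
buf = io.BytesIO()
fig.canvas.print_raw(buf)

# The buffer is height x width pixels with 4 bytes (R, G, B, A) each.
width, height = fig.canvas.get_width_height()
rgba = np.frombuffer(buf.getvalue(), dtype=np.uint8).reshape(height, width, 4)
plt.close(fig)
```

The array view is handy if the pixels are going to be handed to another library rather than written to disk.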

At any rate, as a pointless-but-fun example, consider the following:

import matplotlib.pyplot as plt
import numpy as np
import cStringIO

plt.ion()
fig = plt.figure()
ax = fig.add_subplot(111)
num = 50
max_dim = 10
x = max_dim / 2 * np.ones(num)
s, c = 100 * np.random.random(num), np.random.random(num)
scat = ax.scatter(x,x,s,c)
ax.axis([0,max_dim,0,max_dim])
ax.set_autoscale_on(False)

for i in xrange(1000):
    xy = np.random.random(2*num).reshape(num,2) - 0.5
    offsets = scat.get_offsets() + 0.3 * xy
    offsets.clip(0, max_dim, offsets)
    scat.set_offsets(offsets)
    scat._sizes += 30 * (np.random.random(num) - 0.5)
    scat._sizes.clip(1, 300, scat._sizes)
    fig.canvas.draw()

[Figure: Brownian walk animation]

If we look at the raw draw time:

import matplotlib.pyplot as plt
import numpy as np
import cStringIO

fig = plt.figure()
ax = fig.add_subplot(111)
num = 50
max_dim = 10
x = max_dim / 2 * np.ones(num)
s, c = 100 * np.random.random(num), np.random.random(num)
scat = ax.scatter(x,x,s,c)
ax.axis([0,max_dim,0,max_dim])
ax.set_autoscale_on(False)

for i in xrange(1000):
    xy = np.random.random(2*num).reshape(num,2) - 0.5
    offsets = scat.get_offsets() + 0.3 * xy
    offsets.clip(0, max_dim, offsets)
    scat.set_offsets(offsets)
    scat._sizes += 30 * (np.random.random(num) - 0.5)
    scat._sizes.clip(1, 300, scat._sizes)
    fig.canvas.draw()

This takes ~25 seconds on my machine.

If we instead dump a raw RGBA buffer to a cStringIO buffer, it's actually marginally faster at ~22 seconds (This is only true because I'm using an interactive backend! Otherwise it would be equivalent.):

import matplotlib.pyplot as plt
import numpy as np
import cStringIO

fig = plt.figure()
ax = fig.add_subplot(111)
num = 50
max_dim = 10
x = max_dim / 2 * np.ones(num)
s, c = 100 * np.random.random(num), np.random.random(num)
scat = ax.scatter(x,x,s,c)
ax.axis([0,max_dim,0,max_dim])
ax.set_autoscale_on(False)

for i in xrange(1000):
    xy = np.random.random(2*num).reshape(num,2) - 0.5
    offsets = scat.get_offsets() + 0.3 * xy
    offsets.clip(0, max_dim, offsets)
    scat.set_offsets(offsets)
    scat._sizes += 30 * (np.random.random(num) - 0.5)
    scat._sizes.clip(1, 300, scat._sizes)
    ram = cStringIO.StringIO()
    fig.canvas.print_raw(ram)
    ram.close()

If we compare this to using savefig, with a comparably set dpi:

import matplotlib.pyplot as plt
import numpy as np
import cStringIO

fig = plt.figure()
ax = fig.add_subplot(111)
num = 50
max_dim = 10
x = max_dim / 2 * np.ones(num)
s, c = 100 * np.random.random(num), np.random.random(num)
scat = ax.scatter(x,x,s,c)
ax.axis([0,max_dim,0,max_dim])
ax.set_autoscale_on(False)

for i in xrange(1000):
    xy = np.random.random(2*num).reshape(num,2) - 0.5
    offsets = scat.get_offsets() + 0.3 * xy
    offsets.clip(0, max_dim, offsets)
    scat.set_offsets(offsets)
    scat._sizes += 30 * (np.random.random(num) - 0.5)
    scat._sizes.clip(1, 300, scat._sizes)
    ram = cStringIO.StringIO()
    fig.savefig(ram, format='raw', dpi=fig.dpi)
    ram.close()

This takes ~23.5 seconds. In this case, savefig basically just sets some default parameters and calls print_raw, so there's very little difference.
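To check this on your own machine, a rough timing sketch could look like the following. It is a Python 3 port under assumptions: the Agg backend, io.BytesIO instead of cStringIO, an arbitrary placeholder plot, and a hypothetical helper named time_call. The numbers will depend on your figure and hardware.

```python
import io
import time

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive Agg backend
import matplotlib.pyplot as plt


def time_call(func, repeats=5):
    """Call func(buffer) `repeats` times; return (elapsed seconds, last bytes)."""
    start = time.perf_counter()
    for _ in range(repeats):
        buf = io.BytesIO()
        func(buf)
        data = buf.getvalue()
    elapsed = time.perf_counter() - start
    return elapsed, data


fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(range(10))  # placeholder chart

# Same raw RGBA output, once via the canvas and once via savefig.
t_print, raw_print = time_call(fig.canvas.print_raw)
t_save, raw_save = time_call(
    lambda f: fig.savefig(f, format='raw', dpi=fig.dpi))

print("print_raw: %.3f s, savefig: %.3f s" % (t_print, t_save))
plt.close(fig)
```

Both calls should produce buffers of the same size, since the dpi is matched.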

Now, if we compare a raw image format with a compressed image format (png), we see a much more significant difference:

import matplotlib.pyplot as plt
import numpy as np
import cStringIO

fig = plt.figure()
ax = fig.add_subplot(111)
num = 50
max_dim = 10
x = max_dim / 2 * np.ones(num)
s, c = 100 * np.random.random(num), np.random.random(num)
scat = ax.scatter(x,x,s,c)
ax.axis([0,max_dim,0,max_dim])
ax.set_autoscale_on(False)

for i in xrange(1000):
    xy = np.random.random(2*num).reshape(num,2) - 0.5
    offsets = scat.get_offsets() + 0.3 * xy
    offsets.clip(0, max_dim, offsets)
    scat.set_offsets(offsets)
    scat._sizes += 30 * (np.random.random(num) - 0.5)
    scat._sizes.clip(1, 300, scat._sizes)
    ram = cStringIO.StringIO()
    fig.canvas.print_png(ram)
    ram.close()

This takes ~52 seconds! Obviously, there's a lot of overhead in compressing an image.
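PNG data is DEFLATE-compressed, so you can get a rough feel for that overhead without matplotlib at all. The sketch below is only an illustration under assumptions: a synthetic 640x480 RGBA buffer stands in for a rendered figure, and zlib.compress stands in for the PNG encoder (which also does filtering and chunking, so this is not exactly what print_png does).

```python
import io
import os
import time
import zlib

width, height = 640, 480

# Synthetic "image": a flat white background with a band of random noise.
pixels = bytearray(b"\xff" * (width * height * 4))
noise_len = width * 4 * 50
pixels[:noise_len] = os.urandom(noise_len)
raw = bytes(pixels)

# Writing the raw bytes is just a memory copy.
start = time.perf_counter()
plain = io.BytesIO()
plain.write(raw)
t_raw = time.perf_counter() - start

# Compressing first costs CPU time but shrinks the output.
start = time.perf_counter()
packed = io.BytesIO()
packed.write(zlib.compress(raw))
t_zlib = time.perf_counter() - start

print("raw copy: %.4f s, zlib: %.4f s" % (t_raw, t_zlib))
print("raw bytes: %d, compressed bytes: %d" % (len(raw), packed.tell()))
```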

At any rate, this is probably a needlessly complex example... I think I just wanted to avoid actual work...
