Reducing redundancy when numerically calculating a large number of integrals
I need to calculate the following integral on a 2D-grid (x,y positions):

Z(r) = integral from 0 to r of sqrt(1 - exp(-t^2)) dt

with r = sqrt(x^2 + y^2) and the 2D-grid centered at x = y = 0.
The implementation is straightforward:
import numpy as np
from scipy import integrate

def integralFunction(x):
    def squareSaturation(y):
        return np.sqrt(1 - np.exp(-y**2))
    return integrate.quad(squareSaturation, 0, x)[0]

# vectorize so the scalar integral can be applied element-wise to an array
integralFunctionVec = np.vectorize(integralFunction)
xmax = ymax = 5
Nx = Ny = 1024
X, Y = np.linspace(-xmax, xmax, Nx), np.linspace(-ymax, ymax, Ny)
X, Y = np.meshgrid(X, Y)
R = np.sqrt(X**2+Y**2)
Z = integralFunctionVec(R)
However, I'm currently working on a 1024x1024 grid and the calculation takes ~1.5 minutes. Now there is some redundancy in those calculations that I want to reduce to speed up the calculation. Namely:
As the grid is centered around r = 0, many values of r on the grid are identical; due to symmetry, only ~1/8 of all values are unique (for a square grid). One idea was to calculate the integral only for the unique values (found via np.unique) and then store them in a look-up table (a hashmap?). Alternatively, I could cache the function values so that only new values are calculated (via @lru_cache). But does that actually work when I vectorize the function afterwards?
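For the look-up idea, something like the following untested sketch is what I have in mind (it reuses R and integralFunctionVec from the code above; return_inverse provides the indices needed to scatter the unique results back onto the grid):

# evaluate the integral once per unique radius, then map back to the grid
R_unique, inverse = np.unique(R, return_inverse=True)
Z_unique = integralFunctionVec(R_unique)   # one quad call per unique r
Z = Z_unique[inverse].reshape(R.shape)     # reshape handles flat or shaped inverse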
As the integral goes from 0 to r, quad repeatedly integrates over intervals it has already covered. E.g. if you integrate from 0 to 1 and afterwards from 0 to 2, only the interval from 1 to 2 is "new". But what would be the best way to utilize that? And would that even be a real performance boost with scipy.integrate.quad?
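Concretely, I picture sorting the unique radii, integrating each consecutive sub-interval exactly once, and accumulating the pieces, roughly like this untested sketch (squareSaturation hoisted to module level, R taken from the code above):

def squareSaturation(y):
    return np.sqrt(1 - np.exp(-y**2))

r_sorted, inverse = np.unique(R, return_inverse=True)  # sorted ascending
edges = np.concatenate(([0.0], r_sorted))
# integrate each consecutive sub-interval [r_i, r_(i+1)] exactly once
pieces = [integrate.quad(squareSaturation, a, b)[0]
          for a, b in zip(edges[:-1], edges[1:])]
# cumulative sums yield the integrals from 0 to every sorted radius
Z = np.cumsum(pieces)[inverse].reshape(R.shape)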
Do you have any feedback or other ideas to optimize this calculation?
1 Answer
You can use Numba to speed up the computation of quad; an example of the approach is sketched below. This is about 25 times faster on my machine. The code is still suboptimal, since the squareSaturation calls introduce a big overhead, but it seems SciPy does not provide a way to vectorize quad efficiently for your case. Note that using nb.cfunc + scipy.LowLevelCallable speeds up the execution significantly, as pointed out by @max9111.
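A minimal sketch of that nb.cfunc + scipy.LowLevelCallable variant (an assumption of what such code looks like; the names squareSaturation_c and squareSaturation_ll are illustrative):

import numpy as np
import numba as nb
import scipy
from scipy import integrate

# compile the integrand to a C callback so quad avoids Python-call
# overhead on every function evaluation
@nb.cfunc(nb.types.float64(nb.types.float64))
def squareSaturation_c(y):
    return np.sqrt(1 - np.exp(-y**2))

# wrap the compiled callback for SciPy
squareSaturation_ll = scipy.LowLevelCallable(squareSaturation_c.ctypes)

def integralFunction(x):
    return integrate.quad(squareSaturation_ll, 0, x)[0]

integralFunctionVec = np.vectorize(integralFunction)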
I do not expect this approach to be significantly faster, although not recomputing the values is indeed a good idea. Note that hashmaps are pretty slow, and so is np.unique. I suggest you just select a quarter of the input array R, something like R[0:R.shape[0]//2, 0:R.shape[1]//2]. Be careful if the shape is odd.

This could help, since the domain of the integral is smaller and the function should be smoother, which means SciPy should be faster to compute it. Even if it does not do that automatically, you can reduce the precision of the computed sub-intervals using the optional parameters of quad.
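Combining the quarter-of-the-grid suggestion with relaxed tolerances, a sketch could look as follows (assumes the even-sized grid from the question, symmetric about x = y = 0, so Z is mirror-symmetric across both axes; integralFunctionFast and the tolerance values are illustrative):

import numpy as np
from scipy import integrate

def squareSaturation(y):
    return np.sqrt(1 - np.exp(-y**2))

def integralFunctionFast(x):
    # looser tolerances than quad's defaults (~1.49e-8) trade accuracy for speed
    return integrate.quad(squareSaturation, 0, x, epsabs=1e-6, epsrel=1e-6)[0]

h, w = R.shape[0] // 2, R.shape[1] // 2             # R from the question's code
Zq = np.vectorize(integralFunctionFast)(R[:h, :w])  # top-left quadrant only
# mirror the quadrant into the other three to rebuild the full grid
Z = np.block([[Zq,          Zq[:, ::-1]],
              [Zq[::-1, :], Zq[::-1, ::-1]]])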