Spline representation with scipy.interpolate: poor interpolation for low-amplitude, rapidly oscillating functions
I need to numerically calculate the first and second derivatives of a function. To do this I have tried both splrep and UnivariateSpline to build a spline that interpolates the function, so that I can take derivatives of the spline.
However, there seems to be an inherent problem in the spline representation itself for (rapidly) oscillating functions whose magnitude is of order 10^-1 or lower.
As an example, consider the following code, which creates a spline representation of the sine function over the interval (0, 6*pi), so the function oscillates only three times:
import numpy as np
from scipy import interpolate

k = np.linspace(0, 6.0 * np.pi, num=10000)  # interval (0, 6*pi) in 10,000 steps
A = 1.0e0  # amplitude of the sine function
y = A * np.sin(k)  # vectorized; replaces the element-by-element loop
# The call as posted passed `x`, which is never defined; `k` is assumed to be the intended abscissa.
tck = interpolate.UnivariateSpline(k, y, w=None, bbox=[None, None], k=5, s=2)
M = tck(k)  # evaluate the spline at the original sample points
Below are the results for M for A = 1.e0 and A = 1.e-2
https://i.sstatic.net/HOmtf.png Amplitude = 1
https://i.sstatic.net/W8r3l.png Amplitude = 1/100
Clearly the interpolated function created by the splines is totally incorrect! The second graph does not even oscillate at the correct frequency.
Does anyone have any insight into this problem? Or know of another way to create splines within numpy/scipy?
Cheers,
Rory
Comments (2)
I'm guessing that your problem is due to aliasing.
What is x in your example?
If the x values that you're interpolating at are less closely spaced than your original points, you'll inherently lose frequency information. This is completely independent of any type of interpolation. It's inherent in downsampling.
Never mind the above bit about aliasing. It doesn't apply in this case (though I still have no idea what x is in your example...).
I just realized that you're evaluating your points at the original input points while using a non-zero smoothing factor (s).
By definition, smoothing won't fit the data exactly. Try putting s=0 in instead.
As a quick example:
The reason that you're only clearly seeing the effects of smoothing with a low amplitude is due to the way the smoothing factor is defined. See the documentation for scipy.interpolate.UnivariateSpline for more details.
Even with a higher amplitude, the interpolated data won't match the original data if you use smoothing.
For example, if we just change the amplitude (A) to 1.0 in the code example above, we'll still see the effects of smoothing...
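To make the s=0 suggestion concrete, here is a minimal sketch (not the answerer's own example). It assumes the question's grid and the low amplitude A = 1e-2, and uses x as the abscissa name. The smoothing factor s is an absolute bound on the sum of squared residuals, so with s=2 and sum(y**2) of roughly 0.5 even a flat curve already satisfies the condition. With s=0 the spline passes through every sample, and its derivatives can be compared with the analytic ones.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

x = np.linspace(0.0, 6.0 * np.pi, num=10000)  # same grid as in the question
A = 1.0e-2                                     # the low-amplitude case
y = A * np.sin(x)

# s bounds the sum of squared residuals in absolute terms.
# Here the data itself has a smaller sum of squares than s=2.
print("sum(y**2) =", np.sum(y**2))

smoothed = UnivariateSpline(x, y, k=5, s=2)  # the question's settings
exact = UnivariateSpline(x, y, k=5, s=0)     # s=0: interpolate every sample

print("max |error|, s=2:", np.max(np.abs(smoothed(x) - y)))
print("max |error|, s=0:", np.max(np.abs(exact(x) - y)))

# With s=0, the first and second derivatives of the spline
# can be checked against A*cos(x) and -A*sin(x).
d1 = exact.derivative(n=1)(x)
d2 = exact.derivative(n=2)(x)
print("max |d1 - A*cos(x)|:", np.max(np.abs(d1 - A * np.cos(x))))
print("max |d2 + A*sin(x)|:", np.max(np.abs(d2 + A * np.sin(x))))
```

If the data were noisy, s=0 would reproduce the noise as well, so some smoothing may still be wanted; the point here is only that a fixed s=2 is far too loose for data of this scale.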
The problem is in choosing suitable values for the s parameter. Its values depend on the scaling of the data.
Reading the documentation carefully, one can deduce that the parameter should be chosen around s = len(y) * np.var(y), i.e. number of data points * variance. Taking for example s = 0.05 * len(y) * np.var(y) gives a smoothing spline that does not depend on the scaling of the data or the number of data points.
EDIT: sensible values for s of course also depend on the noise level in the data. The docs seem to recommend choosing s in the range (m - sqrt(2*m)) * std**2 <= s <= (m + sqrt(2*m)) * std**2, where m is the number of data points and std is the standard deviation associated with the "noise" you want to smooth over.
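A rough sketch of the scale-aware choice of s, under some assumptions: the helper name relative_residual, the 1000-point grid, and the noise level std = 1e-3 are made up for illustration, while the factor 0.05 comes from the answer. The helper fits a sine of a given amplitude with s = 0.05 * len(y) * np.var(y) and reports the residual sum of squares relative to len(y) * np.var(y), which should come out similar for different amplitudes if the choice really is scale-independent.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

def relative_residual(amplitude, num=1000, fraction=0.05):
    """Fit with a data-scaled s and return RSS / (m * var(y)). Hypothetical helper."""
    x = np.linspace(0.0, 6.0 * np.pi, num=num)
    y = amplitude * np.sin(x)
    s = fraction * len(y) * np.var(y)   # smoothing factor scaled to the data
    spline = UnivariateSpline(x, y, k=5, s=s)
    rss = spline.get_residual()         # weighted sum of squared residuals
    return rss / (len(y) * np.var(y))

for amplitude in (1.0, 1.0e-2):
    print("A =", amplitude, "relative residual =", relative_residual(amplitude))

# Noise-based range for s quoted from the docs, for an assumed noise level.
m, std = 1000, 1.0e-3                   # hypothetical values
s_low = (m - np.sqrt(2 * m)) * std**2
s_high = (m + np.sqrt(2 * m)) * std**2
print("suggested range for s:", s_low, "to", s_high)
```

The first rule ties s to the total variance of the signal, the second to the variance of the noise alone; for clean data like the sine in the question, the noise-based range shrinks toward zero, which is consistent with simply using s=0.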