Problem with scaling the model to get the reduced chi-square around 1
I have a problem with finding the goodness of fit. As far as I know, the reduced chi-square represents the goodness of fit and it should be around 1, but in my case it is 0.033.
I have 3 questions:
1- Is my model correct in the way it takes the experimental uncertainties of the data into account in the fit?
2- How can I scale the reduced chi-square so that it comes out around 1? From what I have read, I need a better scaling of the uncertainties, but if I am dealing with several fit models, should I scale each one independently, or should all of them use the same scaling?
3- Are there any other statistical parameters, besides the reduced chi-square, for picking the best fit model among several, such as the coefficient of determination?
Here are my code and data:
import numpy as np
from lmfit import Minimizer, Parameters, fit_report
import scipy.constants as sc
x=np.array([-0.55140294, -0.53524022, -0.51157048, -0.46087502, -0.36867469,
-0.36463724, -0.33394184, -0.31759441, -0.24301006, -0.19108247,
-0.17992331, -0.15978266, -0.12565861, -0.06353306, -0.03416749,
0. , 0.05381462, 0.05741924, 0.12256307, 0.13431629,
0.15070565, 0.24975708, 0.30714032, 0.36017666, 0.36062918,
0.44028426, 0.46225328, 0.49792282, 0.51374084, 0.54654304,
0.5681194 ])
Data=np.array([3.10022439e-07, 5.28936553e-07, 1.08389150e-06, 4.51866520e-06,
4.03761662e-05, 4.51393005e-05, 8.40351313e-05, 1.11528073e-04,
3.84782855e-04, 7.40352481e-04, 8.34990761e-04, 1.01873368e-03,
1.31826299e-03, 1.85815838e-03, 2.02222472e-03, 2.09575003e-03,
1.98385694e-03, 1.91161400e-03, 1.40295968e-03, 1.28660846e-03,
1.12519421e-03, 3.50500171e-04, 1.43286865e-04, 5.16806731e-05,
4.98973395e-05, 8.13812022e-06, 4.47760445e-06, 1.66754498e-06,
1.07843394e-06, 3.95451986e-07, 1.97595319e-07])
uncertainities=np.array([3.45420178e-06, 3.69177339e-06, 4.15460196e-06, 5.79355068e-06,
1.25738539e-05, 1.30260341e-05, 1.69780223e-05, 1.94687535e-05,
3.41877881e-05, 4.68776601e-05, 4.97061736e-05, 5.47758901e-05,
6.29249004e-05, 7.43568853e-05, 7.73802945e-05, 7.85165928e-05,
7.49519891e-05, 7.44931618e-05, 6.24757861e-05, 5.97435073e-05,
5.57673389e-05, 3.15179837e-05, 2.03474708e-05, 1.29556740e-05,
1.29044986e-05, 6.53164560e-06, 5.52338978e-06, 4.36260354e-06,
3.99640819e-06, 3.45133957e-06, 3.21225716e-06])
def fGaussian(x, sig, A, xc, y0):
    # Area-normalized Gaussian with area A, width sig, center xc, plus constant offset y0
    y = A / ((2 * sc.pi) ** 0.5 * sig) * np.exp(-(x - xc) ** 2 / (2 * sig ** 2)) + y0
    return y

def fcn2min_fGauss(params, x, data, uncertainities):
    # Objective function for lmfit: residuals weighted by the measurement uncertainties
    sig = params['sig']
    A = params['A']
    xc = params['xc']
    y0 = params['y0']
    model = fGaussian(x, sig, A, xc, y0)
    return (model - data) / uncertainities
# =============================================================================
# create a set of Parameters for fGauss
params_fGauss= Parameters()
params_fGauss.add('sig', value=10**-3, min=0)
params_fGauss.add('A', value=10**-4, min=0)
params_fGauss.add('xc', value=0, min=0)
params_fGauss.add('y0', value=0, min=0)
print ('******************** Gauss Parameters **********************')
params_fGauss.pretty_print()
print ('************************************************************')
# =============================================================================
print('fitters initiation')
fcn_args=(x, Data,uncertainities)
fitter_fGauss = Minimizer(fcn2min_fGauss, params_fGauss, fcn_args=fcn_args,nan_policy='propagate')
# =============================================================================
# Least-Squares minimization, using Trust Region Reflective method
result_least_squares_Gauss = fitter_fGauss.minimize(method='least_squares',jac='3-point',loss='soft_l1')
print('fGauss fit report ---------------------------------')
print(fit_report(result_least_squares_Gauss))
The fit report is:
[[Fit Statistics]]
# fitting method = least_squares
# function evals = 37
# data points = 31
# variables = 4
chi-square = 0.91407466
reduced chi-square = 0.03385462
Akaike info crit = -101.238737
Bayesian info crit = -95.5027884
[[Variables]]
sig: 0.13202573 +/- 2.6268e-04 (0.20%) (init = 0.001)
A: 6.9992e-04 +/- 1.6989e-06 (0.24%) (init = 0.0001)
xc: 0.00130681 +/- 3.3337e-04 (25.51%) (init = 0)
y0: 1.4545e-24 +/- 2.4519e-07 (16857606207555559424.00%) (init = 0)
[[Correlations]] (unreported correlations are < 0.100)
C(sig, y0) = -0.314
C(A, y0) = -0.149
C(sig, xc) = 0.124
C(sig, A) = 0.115
Comments (1)
It sort of looks like your uncertainties are overestimated. Where your data (which resembles a Gaussian to high precision) is near 0, the uncertainty is about 10x the value. Near the peak, the uncertainty is about 3% of the value. But if the data really had such uncertainty, you would expect a lot more variation in the actual data. You could test that by simulating a Gaussian with your expected amplitude and width and adding fake noise with a size of 5e-5 or so (e.g. using numpy.random.normal(scale=5.e-5, size=len(x))). I think you'll see that your actual data is not that noisy. Just based on your value of reduced chi-square, you might guess that the uncertainties should be scaled by a factor of ~ sqrt(0.03385) ~ 0.18, that is, they should be about 1/5 to 1/6 of what you have now.
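A minimal sketch of that check, reusing the x, Data, and uncertainities arrays and the fGaussian function from the question, with roughly the fitted values from the report plugged in (those numbers and the 5e-5 noise scale are illustrative assumptions, not exact):
# Roughly the fitted values from the report above (illustrative only)
sig_fit, A_fit, xc_fit, y0_fit = 0.132, 7.0e-4, 0.0013, 0.0
model_best = fGaussian(x, sig_fit, A_fit, xc_fit, y0_fit)
# Simulated data: the fitted Gaussian plus fake noise of about 5e-5
y_sim = model_best + np.random.normal(scale=5.0e-5, size=len(x))
# Compare the real scatter about the model with the simulated scatter
# and with the stated uncertainties
print('std of real residuals     :', np.std(Data - model_best))
print('std of simulated residuals:', np.std(y_sim - model_best))
print('median stated uncertainty :', np.median(uncertainities))
If the stated uncertainties were realistic, the real residuals would be comparable to them; the reduced chi-square of 0.034 suggests they are actually several times smaller.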
FWIW, that won't change the estimated uncertainties in the parameters: those are automatically (well, by default) scaled by the sqrt of reduced chi-square, basically to avoid this problem of needing very well-scaled uncertainties.
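If you instead want parameter uncertainties based strictly on the stated measurement uncertainties, I believe that automatic rescaling is controlled by lmfit's scale_covar option; a sketch, assuming the objects defined in the question:
# With scale_covar=False the covariance matrix (and so the reported parameter
# uncertainties) is not rescaled by the reduced chi-square; it then reflects
# the uncertainties passed to the residual function directly.
fitter_unscaled = Minimizer(fcn2min_fGauss, params_fGauss, fcn_args=fcn_args,
                            nan_policy='propagate', scale_covar=False)
result_unscaled = fitter_unscaled.minimize(method='least_squares')
print(fit_report(result_unscaled))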
Yes, I think so. I think the scale is just off; scaling your uncertainties by a factor of 1/5 to 1/6 will help.
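A sketch of that rescaling, using the reduced chi-square of the fit above as the scale factor; after refitting with the rescaled uncertainties, the reduced chi-square should come out close to 1 (names like result_scaled are mine, not from the original code):
# sqrt(reduced chi-square) ~ sqrt(0.0339) ~ 0.18, i.e. roughly 1/5 to 1/6
scale = np.sqrt(result_least_squares_Gauss.redchi)
scaled_uncertainities = uncertainities * scale
fitter_scaled = Minimizer(fcn2min_fGauss, params_fGauss,
                          fcn_args=(x, Data, scaled_uncertainities),
                          nan_policy='propagate')
result_scaled = fitter_scaled.minimize(method='least_squares')
print(fit_report(result_scaled))  # reduced chi-square should now be ~1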
Yes, the statistics "Akaike info crit" and "Bayesian info crit" are alternative statistics that serve similar roles to "reduced chi-square". These are all particularly useful for comparing fits with different numbers of parameters.
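Those values are printed in the fit report and are also available on each result object, so comparing several candidate models comes down to comparing their aic/bic attributes; a small sketch, where result_least_squares_Lorentz stands for a hypothetical second fit of another line shape to the same data:
# Lower AIC/BIC indicates the preferred model; both penalize extra parameters.
for name, res in [('Gaussian', result_least_squares_Gauss),
                  ('Lorentzian', result_least_squares_Lorentz)]:  # hypothetical second fit
    print(name, 'redchi =', res.redchi, 'aic =', res.aic, 'bic =', res.bic)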