Python Scipy 2-D内核密度估计误差

发布于 2025-02-05 03:02:39 字数 2951 浏览 3 评论 0原文

我试图通过风暴模式以圆锥状态完成龙卷风的内核密度估计。我正在尝试从本文中重新创建一些数字(Smith等。 ,2012年)这是每十年在40公里网格上通过对流模式进行的龙卷风事件的KDE。每个事件都是2003年至2011年散点图标记的一系列LATS和LONS。与作者交谈,它们的估计是使用ArcGIS密度函数进行的。查看该功能的信息表,我试图重新创建它,尽管我只能找到高斯内核(而不是四分之一),并且由于Arcgis会自动每个单位区域输出事件,因此我不得不开发自己的比例因子。

我的kde尝试

”我的第一个KDE估计主要是正确的“

我能够复制第一个数字,但是以下数字似乎有一个奇怪的高估了吗?密度估计中间有一个大孔,远高于应有的孔。我可能会缺少带宽或实际估算值的东西吗?我对这些类型的统计数据并不熟悉。对于比例因子,我计算了每个网格电池的面积(1600 km2),并除以数十年的数量(0.9)。有什么原因我缺少为什么估计会以这种方式出现?

这是我的第二个KDE估计和下图。

”图2从纸上”


#Load 40km RAP Grid
f = np.load("/Users/andrewlyons/Downloads/pperf_grid_template.npz")
lon,lat = f['lon'], f['lat']
f.close()
#grid = np.zeros_like(lon)
#NDFD = pj.Proj("+proj=lcc +lat_1=25 +lat_2=25 +lon0=-95 +x_0=0 +y_0=0 +ellps=WGS84 #+datum=WGS84 +units=m +no_defs")

#X,Y = NDFD(lon,lat)
#data_crs= ccrs.LambertConformal(central_longitude=-95.0, central_latitude=25.0)
proj = ccrs.PlateCarree()
fig = plt.figure(figsize=(12, 10))
ax = fig.add_subplot(1, 1, 1, projection=proj)


# the extents of any plotted data

ax.set_extent([-125,-61,22, 49])


states_provinces = cfeature.NaturalEarthFeature(
        category='cultural',
        name='admin_1_states_provinces_lines',
        scale='50m',
        facecolor='none')

data = qlcs03


k = kde.gaussian_kde([data['slon'],data['slat']])


#set 40km grid
xi,yi =lon,lat

zi = k(np.vstack([xi.flatten(), yi.flatten()]))

# Make the plot

#Scale factor
zi=(zi*(1600*(10/9)))

c =ax.contourf(xi, yi, zi.reshape(xi.shape),colors='k',levels=[0.5,1,2,3,4,5,6,7,8],alpha=0.17,transform=proj)
cs = ax.contour(xi, yi, zi.reshape(xi.shape),colors='k',levels=[0.5,1,2,3,4,5,6,7,8],linewidths=2.5,transform=proj,zorder=9)
ax.clabel(cs, fontsize=16, inline=False, colors ='r')
ax.scatter(data['slon'],data['slat'],color ='k',marker = '.',s=0.1)
ax.add_feature(cfeature.BORDERS)
ax.add_feature(states_provinces)
ax.add_feature(cfeature.LAND,facecolor ='white')
ax.add_feature(cfeature.COASTLINE,zorder=11)
ax.add_feature(cfeature.OCEAN,facecolor='white',zorder=10)


#plt.colorbar(c)
#plt.colorbar(cs)

plt.title("Convective Mode KDE Test QLCS Tor 2003-2011 events per decade 40km grid                      N="+str(len(data)))

plt.show()

I'm attempting to complete a kernel density estimate of tornadoes in the CONUS by storm mode. I'm trying to recreate some figures from this paper (Smith et al., 2012) which is a KDE of tornado events by convective mode per decade on a 40km grid. Each event is a series of lats and lons marked by the scatter plot from 2003 to 2011. Talking with the authors, their estimates were made using an ARCGIS Density function. Looking at the info sheet for that function, I tried to recreate it, though I was only able to find a gaussian kernel (instead of quartic) and I had to develop my own scale factor as ArcGis automatically outputs events per unit area.

Sample KDE

My KDE attempt

my first kde estimate mostly correct

I was able to replicate the first figure but following figures appear to have a strange overestimate? There is a large hole in the middle of the density estimate much higher than it should be. Am I perhaps missing something with the bandwidth or the actual estimate? Im not super familiar with these types of statistics. For the scale factor, I calculated the area of each grid cell (1600 km2) and divided by the number of decades (0.9). Is there any reason im missing why the estimate would appear this way?

Here is my second KDE estimate and the following figure.
My QLCS KDE attempt

Figure 2 from paper


#Load 40km RAP Grid
f = np.load("/Users/andrewlyons/Downloads/pperf_grid_template.npz")
lon,lat = f['lon'], f['lat']
f.close()
#grid = np.zeros_like(lon)
#NDFD = pj.Proj("+proj=lcc +lat_1=25 +lat_2=25 +lon0=-95 +x_0=0 +y_0=0 +ellps=WGS84 #+datum=WGS84 +units=m +no_defs")

#X,Y = NDFD(lon,lat)
#data_crs= ccrs.LambertConformal(central_longitude=-95.0, central_latitude=25.0)
proj = ccrs.PlateCarree()
fig = plt.figure(figsize=(12, 10))
ax = fig.add_subplot(1, 1, 1, projection=proj)


# the extents of any plotted data

ax.set_extent([-125,-61,22, 49])


states_provinces = cfeature.NaturalEarthFeature(
        category='cultural',
        name='admin_1_states_provinces_lines',
        scale='50m',
        facecolor='none')

data = qlcs03


k = kde.gaussian_kde([data['slon'],data['slat']])


#set 40km grid
xi,yi =lon,lat

zi = k(np.vstack([xi.flatten(), yi.flatten()]))

# Make the plot

#Scale factor
zi=(zi*(1600*(10/9)))

c =ax.contourf(xi, yi, zi.reshape(xi.shape),colors='k',levels=[0.5,1,2,3,4,5,6,7,8],alpha=0.17,transform=proj)
cs = ax.contour(xi, yi, zi.reshape(xi.shape),colors='k',levels=[0.5,1,2,3,4,5,6,7,8],linewidths=2.5,transform=proj,zorder=9)
ax.clabel(cs, fontsize=16, inline=False, colors ='r')
ax.scatter(data['slon'],data['slat'],color ='k',marker = '.',s=0.1)
ax.add_feature(cfeature.BORDERS)
ax.add_feature(states_provinces)
ax.add_feature(cfeature.LAND,facecolor ='white')
ax.add_feature(cfeature.COASTLINE,zorder=11)
ax.add_feature(cfeature.OCEAN,facecolor='white',zorder=10)


#plt.colorbar(c)
#plt.colorbar(cs)

plt.title("Convective Mode KDE Test QLCS Tor 2003-2011 events per decade 40km grid                      N="+str(len(data)))

plt.show()

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

水溶 2025-02-12 03:02:39

对于任何绊倒这个问题的人,我相信我找到了答案。与作者交谈后,我认为他们的原始手稿有错误。最初的估计值乘以错误的比例因子,这导致每年的事件密度不是每十年。因此,我的情节很可能是正确的。我将关闭此线程。

For anyone stumbling across this, I believe I found the answer. After talking with the authors, I think their original manuscript had an error. The original estimate was multiplied by the wrong scale factor which resulted in a density of events per year not per decade. Thus, my plot is most likely correct. I will close this thread.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文