Python Scipy 2-D内核密度估计误差

发布于 2025-02-05 03:02:39 字数 2951 浏览 3 评论 0原文

我试图通过风暴模式以圆锥状态完成龙卷风的内核密度估计。我正在尝试从本文中重新创建一些数字（Smith等。，2012年）这是每十年在40公里网格上通过对流模式进行的龙卷风事件的KDE。每个事件都是2003年至2011年散点图标记的一系列LATS和LONS。与作者交谈，它们的估计是使用ArcGIS密度函数进行的。查看该功能的信息表，我试图重新创建它，尽管我只能找到高斯内核（而不是四分之一），并且由于Arcgis会自动每个单位区域输出事件，因此我不得不开发自己的比例因子。

我的kde尝试

我能够复制第一个数字，但是以下数字似乎有一个奇怪的高估了吗？密度估计中间有一个大孔，远高于应有的孔。我可能会缺少带宽或实际估算值的东西吗？我对这些类型的统计数据并不熟悉。对于比例因子，我计算了每个网格电池的面积（1600 km2），并除以数十年的数量（0.9）。有什么原因我缺少为什么估计会以这种方式出现？

这是我的第二个KDE估计和下图。


#Load 40km RAP Grid
f = np.load("/Users/andrewlyons/Downloads/pperf_grid_template.npz")
lon,lat = f['lon'], f['lat']
f.close()
#grid = np.zeros_like(lon)
#NDFD = pj.Proj("+proj=lcc +lat_1=25 +lat_2=25 +lon0=-95 +x_0=0 +y_0=0 +ellps=WGS84 #+datum=WGS84 +units=m +no_defs")

#X,Y = NDFD(lon,lat)
#data_crs= ccrs.LambertConformal(central_longitude=-95.0, central_latitude=25.0)
proj = ccrs.PlateCarree()
fig = plt.figure(figsize=(12, 10))
ax = fig.add_subplot(1, 1, 1, projection=proj)


# the extents of any plotted data

ax.set_extent([-125,-61,22, 49])


states_provinces = cfeature.NaturalEarthFeature(
        category='cultural',
        name='admin_1_states_provinces_lines',
        scale='50m',
        facecolor='none')

data = qlcs03


k = kde.gaussian_kde([data['slon'],data['slat']])


#set 40km grid
xi,yi =lon,lat

zi = k(np.vstack([xi.flatten(), yi.flatten()]))

# Make the plot

#Scale factor
zi=(zi*(1600*(10/9)))

c =ax.contourf(xi, yi, zi.reshape(xi.shape),colors='k',levels=[0.5,1,2,3,4,5,6,7,8],alpha=0.17,transform=proj)
cs = ax.contour(xi, yi, zi.reshape(xi.shape),colors='k',levels=[0.5,1,2,3,4,5,6,7,8],linewidths=2.5,transform=proj,zorder=9)
ax.clabel(cs, fontsize=16, inline=False, colors ='r')
ax.scatter(data['slon'],data['slat'],color ='k',marker = '.',s=0.1)
ax.add_feature(cfeature.BORDERS)
ax.add_feature(states_provinces)
ax.add_feature(cfeature.LAND,facecolor ='white')
ax.add_feature(cfeature.COASTLINE,zorder=11)
ax.add_feature(cfeature.OCEAN,facecolor='white',zorder=10)


#plt.colorbar(c)
#plt.colorbar(cs)

plt.title("Convective Mode KDE Test QLCS Tor 2003-2011 events per decade 40km grid                      N="+str(len(data)))

plt.show()

原文

I'm attempting to complete a kernel density estimate of tornadoes in the CONUS by storm mode. I'm trying to recreate some figures from this paper (Smith et al., 2012) which is a KDE of tornado events by convective mode per decade on a 40km grid. Each event is a series of lats and lons marked by the scatter plot from 2003 to 2011. Talking with the authors, their estimates were made using an ARCGIS Density function. Looking at the info sheet for that function, I tried to recreate it, though I was only able to find a gaussian kernel (instead of quartic) and I had to develop my own scale factor as ArcGis automatically outputs events per unit area.

My KDE attempt

I was able to replicate the first figure but following figures appear to have a strange overestimate? There is a large hole in the middle of the density estimate much higher than it should be. Am I perhaps missing something with the bandwidth or the actual estimate? Im not super familiar with these types of statistics. For the scale factor, I calculated the area of each grid cell (1600 km2) and divided by the number of decades (0.9). Is there any reason im missing why the estimate would appear this way?

Here is my second KDE estimate and the following figure.


#Load 40km RAP Grid
f = np.load("/Users/andrewlyons/Downloads/pperf_grid_template.npz")
lon,lat = f['lon'], f['lat']
f.close()
#grid = np.zeros_like(lon)
#NDFD = pj.Proj("+proj=lcc +lat_1=25 +lat_2=25 +lon0=-95 +x_0=0 +y_0=0 +ellps=WGS84 #+datum=WGS84 +units=m +no_defs")

#X,Y = NDFD(lon,lat)
#data_crs= ccrs.LambertConformal(central_longitude=-95.0, central_latitude=25.0)
proj = ccrs.PlateCarree()
fig = plt.figure(figsize=(12, 10))
ax = fig.add_subplot(1, 1, 1, projection=proj)


# the extents of any plotted data

ax.set_extent([-125,-61,22, 49])


states_provinces = cfeature.NaturalEarthFeature(
        category='cultural',
        name='admin_1_states_provinces_lines',
        scale='50m',
        facecolor='none')

data = qlcs03


k = kde.gaussian_kde([data['slon'],data['slat']])


#set 40km grid
xi,yi =lon,lat

zi = k(np.vstack([xi.flatten(), yi.flatten()]))

# Make the plot

#Scale factor
zi=(zi*(1600*(10/9)))

c =ax.contourf(xi, yi, zi.reshape(xi.shape),colors='k',levels=[0.5,1,2,3,4,5,6,7,8],alpha=0.17,transform=proj)
cs = ax.contour(xi, yi, zi.reshape(xi.shape),colors='k',levels=[0.5,1,2,3,4,5,6,7,8],linewidths=2.5,transform=proj,zorder=9)
ax.clabel(cs, fontsize=16, inline=False, colors ='r')
ax.scatter(data['slon'],data['slat'],color ='k',marker = '.',s=0.1)
ax.add_feature(cfeature.BORDERS)
ax.add_feature(states_provinces)
ax.add_feature(cfeature.LAND,facecolor ='white')
ax.add_feature(cfeature.COASTLINE,zorder=11)
ax.add_feature(cfeature.OCEAN,facecolor='white',zorder=10)


#plt.colorbar(c)
#plt.colorbar(cs)

plt.title("Convective Mode KDE Test QLCS Tor 2003-2011 events per decade 40km grid                      N="+str(len(data)))

plt.show()

分享到QQ

分享到微博