Adding a projection to a rioxarray dataset in Python

Posted 2025-01-28 16:58:44, 1573 characters, 1 view, 0 comments

I've downloaded a NetCDF file from the Climate Data Store and would like to write a CRS to it so that I can clip it with a shapefile. However, I get an error when assigning a CRS.
Below is my script and what it prints. I receive this error after trying to write a CRS:

MissingSpatialDimensionError: y dimension (lat) not found. Data variable: lon_bnds

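import rioxarray  # registers the .rio accessor on xarray objects
import xarray as xr
from shapely.geometry import mapping

# load netcdf with xarray
dset = xr.open_dataset(netcdf_fn)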
print(dset)

# add projection system to nc
dset = dset.rio.write_crs("EPSG:4326", inplace=True)

# mask CMIP6 data with shapefile
dset_shp = dset.rio.clip(shp.geometry.apply(mapping), shp.crs)

dset
Out[44]: 
<xarray.Dataset>
Dimensions:      (time: 1825, bnds: 2, lat: 2, lon: 1)
Coordinates:
  * time         (time) object 2021-01-01 12:00:00 ... 2025-12-31 12:00:00
  * lat          (lat) float64 0.4712 1.414
  * lon          (lon) float64 31.25
    spatial_ref  int32 0
Dimensions without coordinates: bnds
Data variables:
    time_bnds    (time, bnds) object ...
    lat_bnds     (lat, bnds) float64 0.0 0.9424 0.9424 1.885
    lon_bnds     (lon, bnds) float64 ...
    pr           (time, lat, lon) float32 ...
Attributes: (12/48)
    Conventions:            CF-1.7 CMIP-6.2
    activity_id:            ScenarioMIP
    branch_method:          standard
    branch_time_in_child:   60225.0
    branch_time_in_parent:  60225.0
    comment:                none
                    ...
    title:                  CMCC-ESM2 output prepared for CMIP6
    variable_id:            pr
    variant_label:          r1i1p1f1
    license:                CMIP6 model data produced by CMCC is licensed und...
    cmor_version:           3.6.0
    tracking_id:            hdl:21.14100/0c6732f7-2cdd-4296-99a0-7952b7ca911e

Comments (1)

时光病人 2025-02-04 16:58:44

When you call the rioxarray accessor ds.rio.clip on an xr.Dataset rather than an xr.DataArray, rioxarray needs to guess which variables in the dataset should be clipped. The method docstring gives the following warning:

Warning:

Clips variables that have dimensions 'x'/'y'. Others are appended as is.

So the issue you're running into is that rioxarray sees four variables in your dataset:

Data variables:
    time_bnds    (time, bnds) object ...
    lat_bnds     (lat, bnds) float64 0.0 0.9424 0.9424 1.885
    lon_bnds     (lon, bnds) float64 ...
    pr           (time, lat, lon) float32 ...

Of these, lat_bnds, lon_bnds, and pr all have x or y dimensions which could conceivably be clipped. Rather than making some arbitrary choice about what to do in this situation, rioxarray is raising an error with the message MissingSpatialDimensionError: y dimension (lat) not found. Data variable: lon_bnds. This indicates that when processing the variable lon_bnds, it's not sure what to do, because it can find an x dimension but not a y dimension.
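
To make that concrete, here is a toy sketch of the kind of check being described. It is not rioxarray's actual code: the function name spatial_status and the three labels are made up for illustration, and the dimension tuples are copied from the printed dataset above.

```python
# Toy model only: this is NOT rioxarray's implementation.
# X_DIM / Y_DIM match the dimension names in the printed dataset.
X_DIM, Y_DIM = "lon", "lat"

# Dimensions of each data variable, copied from the dataset repr.
data_vars = {
    "time_bnds": ("time", "bnds"),
    "lat_bnds": ("lat", "bnds"),
    "lon_bnds": ("lon", "bnds"),
    "pr": ("time", "lat", "lon"),
}


def spatial_status(dims):
    """Classify a variable by which spatial dimensions it carries."""
    has_x = X_DIM in dims
    has_y = Y_DIM in dims
    if has_x and has_y:
        return "clippable"      # both spatial dims present
    if has_x or has_y:
        return "ambiguous"      # only one spatial dim: the error case
    return "non-spatial"        # no spatial dims at all


status = {name: spatial_status(dims) for name, dims in data_vars.items()}
for name, label in status.items():
    print(f"{name}: {label}")
```

Only pr carries both lat and lon. lat_bnds and lon_bnds each carry a single spatial dimension, which is the situation the error message is reporting for lon_bnds.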

To address this, you have two options. The first is to call clip on the pr array only. This is probably the right call: generally I'd recommend doing data processing with DataArrays (not Datasets) whenever possible, unless you really know you want to map an operation across all variables in the dataset. Calling clip on pr would look like this:

clipped = dset.pr.rio.clip(shp.geometry.apply(mapping), shp.crs)

Alternatively, you could resolve the issue of having data_variables that really should be coordinates. You can use the method set_coords to reclassify the non-data data_variables as non-dimension coordinates. In this case:

dset = dset.set_coords(['time_bnds', 'lat_bnds', 'lon_bnds'])

I'm not sure if this will completely resolve your issue - it's possible that rioxarray will still raise this error when processing coordinates. You could always drop the bounds, too. But the first method of only calling this on a single variable will work.
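
If you want to see what those two alternatives do to the dataset structure, here is a minimal sketch on a synthetic dataset. It only mimics the layout of the file above: all values are placeholder zeros and time is a plain integer index, both of which are assumptions. It uses xarray alone and does not call rioxarray, so it says nothing about whether clip will then succeed.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the CMIP6 file: same variable names and
# dimension layout, but placeholder values (assumption, not real data).
ds = xr.Dataset(
    data_vars={
        "time_bnds": (("time", "bnds"), np.zeros((3, 2))),
        "lat_bnds": (("lat", "bnds"), np.zeros((2, 2))),
        "lon_bnds": (("lon", "bnds"), np.zeros((1, 2))),
        "pr": (("time", "lat", "lon"), np.zeros((3, 2, 1), dtype="float32")),
    },
    coords={
        "time": np.arange(3),
        "lat": [0.4712, 1.414],
        "lon": [31.25],
    },
)

bounds = ["time_bnds", "lat_bnds", "lon_bnds"]

# Option A: keep the bounds, but reclassify them as non-dimension coordinates.
as_coords = ds.set_coords(bounds)

# Option B: drop the bounds variables entirely.
dropped = ds.drop_vars(bounds)

print(list(ds.data_vars))         # before: bounds are listed as data variables
print(list(as_coords.data_vars))  # after set_coords
print(list(dropped.data_vars))    # after drop_vars
```

Either way, pr ends up as the only data variable. With set_coords the bounds stay attached to the dataset as coordinates, while drop_vars discards them.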
