Extracting NetCDF4 data at a point using Python

Posted on 2025-02-09 21:53:29 · 2851 words · 2 views · 0 comments

I am working on climate data. I extracted data at my desired lat/lon from ERA5 data, and the code works perfectly. Now I am working with another dataset, and it is not working. The code is written so that leap years should not affect it, but with this dataset it stops on the 366th day of a leap year with the error "index exceeds dimension bounds". In Python, temp (the temperature variable in my data, which I access with temp = data.variables['t2']) has a shape of [365, 1800, 3600], where the first dimension should be 366 for a leap year. When I check the file in cdo, it has all 366 timesteps. The code runs fine for non-leap years but fails in leap years.

[Image: the data shown by Python]
[Image: the data shown by cdo]

Error message on Dec 31: "index exceeds dimension bounds"

I need some help with this. Thanks.

I am also attaching some more metadata, which I think might help:
[Image: metadata info]

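A quick way to compare what Python sees with what cdo reports is to print each file's time-dimension length next to the number of days that year should have. The sketch below is only an illustration: it assumes one file per year named `<year>.nc` and a dimension called `time`, and the helper names (`expected_daily_steps`, `time_steps_in_file`, `report`) are made up for this example.

```python
import calendar


def expected_daily_steps(year):
    """Number of daily time steps a full calendar year should contain."""
    return 366 if calendar.isleap(int(year)) else 365


def time_steps_in_file(path, time_dim="time"):
    """Length of the time dimension in one NetCDF file.

    Assumes the dimension is named 'time'; adjust time_dim otherwise.
    """
    # Imported here so expected_daily_steps works without netCDF4 installed.
    from netCDF4 import Dataset

    with Dataset(path, "r") as ds:
        return len(ds.dimensions[time_dim])


def report(years):
    """Print the file length and the calendar length for each year."""
    for yr in sorted(years):
        # Assumed naming scheme: one file per year, e.g. '2020.nc'.
        found = time_steps_in_file(f"{yr}.nc")
        wanted = expected_daily_steps(yr)
        flag = "" if found == wanted else "  <-- mismatch"
        print(f"{yr}: file has {found} steps, calendar expects {wanted}{flag}")
```

Calling `report(all_years)` after the file-scanning loop should show whether any file is really short. If every file has the expected length, the files themselves are probably fine and the question becomes which file the extraction loop is actually opening.
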
# Import libraries
from netCDF4 import Dataset
import numpy as np
import glob
import pandas as pd

all_years = []
# Read the NetCDF files and collect the year of each one
for file in glob.glob('*.nc'):
    print(file)
    data = Dataset(file, 'r')
    time = data.variables['time']
    year = time.units[11:15]
    temp = data.variables['t2']
    all_years.append(year)
year_start = min(all_years)
year_end = max(all_years)
# Create the date range for the date/time column
date_range = pd.date_range(start=str(year_start) + '-01-01',
                           end=str(year_end) + '-12-31',
                           freq='D')
# Create a data frame with a date index and a 't2' column
df = pd.DataFrame(0.0, columns=['t2'], index=date_range)
locations = pd.read_csv('Locations.csv')
for index, row in locations.iterrows():
    # Get the location data from my CSV file
    locations = row['Name']
    loc_lat = row['Latitude']
    loc_long = row['Longitude']
    all_years.sort()
    for yr in all_years:
        data = Dataset(year+'.nc', 'r')
        # Store lat/lon data in variables
        lati = data.variables['lat'][:]
        long = data.variables['lon'][:]
        # Squared difference of lat and lon
        sq_diff_lat = (lati - loc_lat)**2
        sq_diff_lon = (long - loc_long)**2
        # Identify the index of the minimum value for lat and lon
        min_index_lat = sq_diff_lat.argmin()
        min_index_lon = sq_diff_lon.argmin()
        temp = data.variables['t2']
        start = str(yr) + '-01-01'
        end = str(yr) + '-12-31'
        d_range = pd.date_range(start=start,
                                end=end,
                                freq='D')
        for t_index in np.arange(0, len(d_range)):
            df.loc[d_range[t_index]]['t2'] = temp[t_index,
                                                  min_index_lat, min_index_lon]

    df.to_csv(locations + '.csv')
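One thing that stands out in the code as posted: inside `for yr in all_years`, the filename is built from `year`, not from the loop variable `yr`. After the first loop finishes, `year` holds the year of whichever file `glob` returned last, so every iteration may be opening that same file. If that file is a non-leap year, `temp` would have 365 steps even while the loop is filling a leap year, which would match the shape reported above. This is a guess from reading the code, not a confirmed diagnosis. A minimal sketch of the extraction that opens the file for `yr` and takes the series length from the file itself is shown below; the function names are hypothetical, and it assumes files named `<year>.nc`, daily data starting on Jan 1, and `t2` dimensions ordered (time, lat, lon).

```python
import numpy as np


def nearest_index(coords, value):
    """Index of the grid coordinate closest to value."""
    return int(np.abs(np.asarray(coords, dtype=float) - value).argmin())


def extract_point_series(path, lat, lon, var_name="t2"):
    """Read the full time series at the grid point nearest (lat, lon)."""
    # Imported here so nearest_index can be used without netCDF4 installed.
    from netCDF4 import Dataset

    with Dataset(path, "r") as ds:
        i = nearest_index(ds.variables["lat"][:], lat)
        j = nearest_index(ds.variables["lon"][:], lon)
        # Slice the whole time axis, so 365 vs 366 comes from the file.
        values = ds.variables[var_name][:, i, j]
        # Masked (missing) values become NaN.
        return np.ma.filled(np.ma.asarray(values, dtype=float), np.nan)


def extract_all_years(years, lat, lon):
    """Concatenate the point series of every year into one pandas Series."""
    import pandas as pd

    pieces = []
    for yr in sorted(years):
        # Open the file for *this* year (yr), assuming '<year>.nc' naming.
        values = extract_point_series(f"{yr}.nc", lat, lon)
        # Assumes daily data starting on Jan 1 of that year.
        dates = pd.date_range(start=f"{yr}-01-01", periods=len(values), freq="D")
        pieces.append(pd.Series(values, index=dates, name="t2"))
    return pd.concat(pieces)
```

With this shape, each row of `Locations.csv` could be handled with something like `extract_all_years(all_years, loc_lat, loc_long).to_csv(name + '.csv')`. Building the dates from `len(values)` rather than from a fixed Dec 31 end date means a length mismatch shows up as a short or long date range instead of an indexing error.
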
