如何在特定纬度范围内的地理框架中选择线链

发布于 2025-01-24 06:37:05 字数 3085 浏览 3 评论 0原文

我正在使用以下数据集进行研究工作。

https://xfer.services.ncdot.gov/gisdot/DistDOTData/NCRoutes_SHP. zip

我正在尝试从上面数据集提取衬里。线条必须在纬度-80.86和-80.83之间。

首先,我使用地理器加载了数据集。

import geopandas as gpd
graph = gpd.read_file("NCRoutes.shp")
graph = graph.to_crs(4236)
print(len(graph['geometry'])) # number of rows in geodataframe
graph.head()

输出 -

373370
Division    MaintCntyC  RouteID RouteClass  RouteNumbe  RouteQuali  RouteInven  RouteName   BeginMP EndMP   RouteMaint  Shape_Leng  geometry
0   13  011 10000026011 1   26  0   0   I-26    0.000063    29.164804   System  56252.469013    MULTILINESTRING Z ((-82.53991 35.79167 0.00000...
1   14  045 10000026045 1   26  0   0   I-26    0.000000    17.459971   System  89044.485587    MULTILINESTRING Z ((-82.53591 35.44044 664.953...
2   13  057 10000026057 1   26  0   0   I-26    0.000000    12.606171   System  66456.002756    MULTILINESTRING Z ((-82.56037 35.95480 1145.01...
3   14  075 10000026075 1   26  0   0   I-26    0.000000    13.121029   System  69207.803704    LINESTRING Z (-82.34883 35.25453 565.22225, -8...
4   7   001 10000040001 1   40  0   0   I-40    0.000000    16.013000   System  84492.598738    LINESTRING Z (-79.53887 36.06290 179.46660, -7...

如上所述,Graph geodataframe具有373370行。 几何列的每一行中都有一个多线或统一性。因此,我编写了下面的代码以提取所需的线条,并将其保留在GeodataFrame 区域中。

from IPython.display import clear_output
area = gpd.GeoDataFrame(columns=['geometry'], geometry='geometry') # geodataframe for saving roads of interest
itr = 0 # to keep track of which row I'm going through
for String in graph['geometry']: # iterating through each row in geometry column of geodataframe
    print("iterating through",itr,"th row in geodataframe")
    itr += 1
    if String.geom_type=='MultiLineString': # if the current row contains a multilinestring
        for ls in String: # iterating through linestrings in the multilinestring
            for i in range(len(ls.xy[0])): # iterating through points in the linestring
                #print("longitude:",ls.xy[0][i],"latitude:",ls.xy[1][i])
                if(-80.86<=ls.xy[0][i]<=-80.83):
                    temp = gpd.GeoDataFrame({'geometry':[ls]},geometry='geometry',crs=4236) #creating a temporary geodataframe for storing the road of interest
                    area = area.append(temp)
                    break # no need to keep looking if linestring is already added to area geodataframe
    else: # if the current row contains a linestring
        for i in range(len(String.xy[0])):
            if(-80.86<=String.xy[0][i]<=-80.83):
                temp = gpd.GeoDataFrame({'geometry':[String]},geometry='geometry',crs=4236)
                area = area.append(temp)
                break
    clear_output(wait=True)

但是,我的笔记本电脑中经过150行的迭代大约需要12分钟。数据框架中有373370行。因此,通过整个Graph GeodataFrame在我的计算机中迭代大约需要500个小时。有更快的方法吗?

I'm using the following dataset for my research work.

https://xfer.services.ncdot.gov/gisdot/DistDOTData/NCRoutes_SHP.zip

I'm trying to extract linestrings from the above dataset. The linestrings have to be between the latitudes -80.86 and -80.83.

First I loaded the dataset using geopandas.

import geopandas as gpd
graph = gpd.read_file("NCRoutes.shp")
graph = graph.to_crs(4236)
print(len(graph['geometry'])) # number of rows in geodataframe
graph.head()

The output-

373370
Division    MaintCntyC  RouteID RouteClass  RouteNumbe  RouteQuali  RouteInven  RouteName   BeginMP EndMP   RouteMaint  Shape_Leng  geometry
0   13  011 10000026011 1   26  0   0   I-26    0.000063    29.164804   System  56252.469013    MULTILINESTRING Z ((-82.53991 35.79167 0.00000...
1   14  045 10000026045 1   26  0   0   I-26    0.000000    17.459971   System  89044.485587    MULTILINESTRING Z ((-82.53591 35.44044 664.953...
2   13  057 10000026057 1   26  0   0   I-26    0.000000    12.606171   System  66456.002756    MULTILINESTRING Z ((-82.56037 35.95480 1145.01...
3   14  075 10000026075 1   26  0   0   I-26    0.000000    13.121029   System  69207.803704    LINESTRING Z (-82.34883 35.25453 565.22225, -8...
4   7   001 10000040001 1   40  0   0   I-40    0.000000    16.013000   System  84492.598738    LINESTRING Z (-79.53887 36.06290 179.46660, -7...

As you can see above, The graph geodataframe has 373370 rows. There is either a multilinestring or a linestring in each row of the geometry column. So I wrote the code below to extract the linestrings I need and keep them in a geodataframe area.

from IPython.display import clear_output
area = gpd.GeoDataFrame(columns=['geometry'], geometry='geometry') # geodataframe for saving roads of interest
itr = 0 # to keep track of which row I'm going through
for String in graph['geometry']: # iterating through each row in geometry column of geodataframe
    print("iterating through",itr,"th row in geodataframe")
    itr += 1
    if String.geom_type=='MultiLineString': # if the current row contains a multilinestring
        for ls in String: # iterating through linestrings in the multilinestring
            for i in range(len(ls.xy[0])): # iterating through points in the linestring
                #print("longitude:",ls.xy[0][i],"latitude:",ls.xy[1][i])
                if(-80.86<=ls.xy[0][i]<=-80.83):
                    temp = gpd.GeoDataFrame({'geometry':[ls]},geometry='geometry',crs=4236) #creating a temporary geodataframe for storing the road of interest
                    area = area.append(temp)
                    break # no need to keep looking if linestring is already added to area geodataframe
    else: # if the current row contains a linestring
        for i in range(len(String.xy[0])):
            if(-80.86<=String.xy[0][i]<=-80.83):
                temp = gpd.GeoDataFrame({'geometry':[String]},geometry='geometry',crs=4236)
                area = area.append(temp)
                break
    clear_output(wait=True)

But it took around 12 minutes to iterate through 150 rows in my laptop; whereas there are 373370 rows in the dataframe. So it'll take around 500 hours to iterate through the whole graph geodataframe in my machine. Is there a faster way to do this?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

你没皮卡萌 2025-01-31 06:37:05

您要使用cx,坐标索引器。

subset = graph.cx[:, -80.86:-80.83]

请参阅文档中的更多信息 htttps://geopandas.org/geopandas.org/en/stable/docs /user_guide/indexing.html

关于多线链vs linestring,您可以在索引之前使用爆炸>爆炸以确保使用各个零件,而不是整个几何形状。

subset = graph.explode().cx[:, -80.86:-80.83]

You want to use cx, the coordinate indexer.

subset = graph.cx[:, -80.86:-80.83]

See more in the documentation https://geopandas.org/en/stable/docs/user_guide/indexing.html

Regarding MultiLinestring vs LineString, you can use explode before the indexing to ensure individual parts are used, not whole geometries.

subset = graph.explode().cx[:, -80.86:-80.83]
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文