geopandas: matching accident points to the nearest road

Posted on 2025-01-16 20:42:22

I have the following GeoDataFrames:
accidents_c = all accidents in Austria, as points
streets = roads in Austria, as LineStrings (OpenStreetMap data)
Both use the CRS EPSG:3310.

Now I want to match every accident to the nearest road. My first attempt was this:

def nearest_street(accident_point, streets):
  # Brute force: measure the distance from this accident to every road.
  row_candidates = streets.copy()
  row_candidates["distance_road"] = row_candidates.apply(lambda row: accident_point["geometry"].distance(row.geometry), axis=1)
  min_distance = row_candidates["distance_road"].min()
  # Keep the road(s) at the minimum distance and report the first one.
  min_road = row_candidates.loc[row_candidates["distance_road"] == min_distance]
  nearest_road = min_road["osm_id"].values[0]
  return nearest_road, min_distance

# One full pass over all roads for every accident row.
accidents_c["nearest_road"], accidents_c["distance_road"] = zip(*accidents_c.apply(nearest_street, streets=roads, axis=1))

This works but takes forever. So I looked for a way to make it faster by only considering roads that are no more than 3000 meters away from the accident point.
For this I used the buffer method:

def nearest_street(accident_point, streets):
  row_candidates = streets.copy()
  # Pre-filter: keep only roads that touch the bounding box of a 2000 m buffer.
  buffered_accident = accident_point["geometry"].buffer(2000)
  x_min, x_max, y_min, y_max = buffered_accident.bounds
  row_candidates = row_candidates.cx[x_min:x_max, y_min:y_max]

  # Same distance search as before, now on the reduced candidate set.
  row_candidates["distance_road"] = row_candidates.apply(lambda row: accident_point["geometry"].distance(row.geometry), axis=1)
  min_distance = row_candidates["distance_road"].min()
  min_road = row_candidates.loc[row_candidates["distance_road"] == min_distance]
  nearest_road = min_road["osm_id"].values[0]
  return nearest_road, min_distance

accidents_c["nearest_road"], accidents_c["distance_road"] = zip(*accidents_c.apply(nearest_street, streets=roads, axis=1))

This runs much faster, but the results are worse: the distances returned by the first version and the second version sometimes differ by more than 1000 meters. Where is the problem in the code?
Is there a better way to limit the search to the nearby surroundings?

Comments (2)

你的笑 2025-01-23 20:42:22

I would expect osmnx's nearest_edges (https://osmnx.readthedocs.io/en/stable/osmnx.html#osmnx.distance.nearest_edges) to perform better. Below is a full example using UK accident data and a UK city. The same approach would work for Austria: you just need the accident data and a polygon for the area you want to cover.

import osmnx as ox
import pandas as pd
import geopandas as gpd

place = "Gloucester"

# get bounding polygon of investigated location
gdf_poly = ox.geocode_to_gdf({"city": place}).loc[
    :, ["geometry", "display_name"]
]

# all uk accidents
df = pd.read_csv(
    "https://data.dft.gov.uk/road-accidents-safety-data/dft-road-casualty-statistics-accident-provisional-mid-year-unvalidated-2021.csv"
)
# accidents within investigated location
gdf = gpd.GeoDataFrame(
    data=df,
    geometry=gpd.points_from_xy(
        df["location_easting_osgr"], df["location_northing_osgr"]
    ),
    crs="EPSG:27700",
).to_crs("epsg:4326").sjoin(gdf_poly).reset_index(drop=True)

# OSMNX graph for investigated location
G = ox.graph_from_polygon(gdf_poly.iloc[0,0], network_type="drive")

# for speed project everything to UTM CRS
G_proj = ox.project_graph(G)
gdf = gdf.to_crs(G_proj.graph["crs"])

# get nodes and edges associated with investigated location
gdf_nodes, gdf_edges = ox.graph_to_gdfs(G_proj)

# find nearest edges (road) to accident points
ne, d = ox.nearest_edges(
    G_proj, X=gdf.geometry.x.values, Y=gdf.geometry.y.values, return_dist=True
)

# reindex accidents by OSM nearest edge
gdf = (
    gdf.set_index(pd.MultiIndex.from_tuples(ne, names=["u", "v", "key"]))
    .assign(distance=d)
    .sort_index()
)

# join accidents to nearest edge, now we have road name etc
gdf.join(gdf_edges.loc[:,["ref","name","highway","maxspeed"]]).loc[:,[
 'accident_year',
 'accident_reference',
 'date',
 'first_road_class',
 'first_road_number',
 'road_type',
 'speed_limit',
 'trunk_road_flag',
 'geometry',
 'ref',
 'name',
 'highway',
 'maxspeed']]
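
If the roads are already loaded as a GeoDataFrame, as in the question, the match can probably also be done without building a graph. The following is a minimal sketch, not a tested solution for the Austrian data, using geopandas.sjoin_nearest, which relies on a spatial index instead of comparing every accident with every road. It needs a reasonably recent geopandas (the function was added in 0.10). The tiny frames and the accident_id column are made up for illustration; only the names accidents_c, roads, osm_id, nearest_road and distance_road, and the CRS string, come from the question.

```python
import geopandas as gpd
from shapely.geometry import LineString, Point

# Made-up stand-ins for the question's data (coordinates in metres).
roads = gpd.GeoDataFrame(
    {"osm_id": [101, 102, 103]},
    geometry=[
        LineString([(0, 0), (1000, 0)]),
        LineString([(0, 500), (1000, 500)]),
        LineString([(0, 90000), (1000, 90000)]),
    ],
    crs="EPSG:3310",  # CRS as stated in the question
)
accidents_c = gpd.GeoDataFrame(
    {"accident_id": [1, 2, 3]},  # hypothetical identifier column
    geometry=[Point(100, 100), Point(900, 450), Point(500, 50000)],
    crs="EPSG:3310",
)

# One indexed nearest-neighbour join instead of a distance scan per accident.
matched = gpd.sjoin_nearest(
    accidents_c,
    roads[["osm_id", "geometry"]],
    how="left",                    # keep accidents with no road in range
    max_distance=2000,             # search radius in CRS units (metres here)
    distance_col="distance_road",  # store the distance to the matched road
).rename(columns={"osm_id": "nearest_road"})

print(matched[["accident_id", "nearest_road", "distance_road"]])
```

With how="left", an accident that has no road within max_distance is kept with missing values rather than raising an error. If two roads are exactly equally near, sjoin_nearest returns one row per road, so duplicates may need to be dropped afterwards.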

剑心龙吟 2025-01-23 20:42:22

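I think I found the mistake. This line

x_min, x_max, y_min, y_max = buffered_accident.bounds

needs to be

x_min, y_min, x_max, y_max = buffered_accident.bounds

because bounds returns the values in the order (minx, miny, maxx, maxy). With the wrong order the roads were filtered against the wrong box, which is why the results were so bad.
But thanks for your help, Rob Raymond.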
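
To make the effect of the ordering concrete, here is a small sketch with made-up coordinates (the two roads and the accident location are hypothetical, not taken from the real data). It applies the same .cx filter once with the swapped unpacking and once with the corrected one, then compares the nearest distance found in each candidate set.

```python
import geopandas as gpd
from shapely.geometry import LineString, Point

# Made-up accident and roads, coordinates in metres.
accident = Point(10_000, 11_000)
roads = gpd.GeoDataFrame(
    {"osm_id": [1, 2]},
    geometry=[
        LineString([(10_500, 10_000), (10_500, 12_000)]),  # 500 m from the accident
        LineString([(8_500, 12_500), (8_900, 12_900)]),    # more than 2 km away
    ],
)

bounds = accident.buffer(2000).bounds
print(bounds)  # ordered as (minx, miny, maxx, maxy)

# Swapped unpacking from the question: names no longer match the values.
x_min, x_max, y_min, y_max = bounds
wrong = roads.cx[x_min:x_max, y_min:y_max]

# Corrected unpacking.
x_min, y_min, x_max, y_max = bounds
right = roads.cx[x_min:x_max, y_min:y_max]

print("swapped  :", list(wrong["osm_id"]), wrong.distance(accident).min())
print("corrected:", list(right["osm_id"]), right.distance(accident).min())
```

In this example the swapped version drops the road that is actually nearest and reports the farther one, which is the kind of discrepancy described in the question. Note that even with the correct order, .cx keeps every road that touches the bounding box, so an accident with no road inside the box leaves no candidates and that case still needs handling.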
