GeoPandas + MySQL - GIS data / invalid geometries

Posted 2025-02-03 18:55:30, 1818 characters, 2 views, 0 comments

I have a geopandas DataFrame that I insert into a MySQL database. When I try to set the geometry in the MySQL table, I run into an error. I check for valid geometries in Python and then insert into SQL. Here's my code and sample data:

Sample geo-data: https://www.dropbox.com/s/14ni2mfppt5dn7x/gdf%20%281%29.csv?dl=0

import geopandas as gpd
import numpy as np
from shapely.geometry import Point, Polygon
from shapely import wkt

# mysql connection
import pymysql
from sqlalchemy import create_engine
user = 'user'
pwd = 'pwd'
host =  'host1'
port = 3306
database = 'db'

engine = create_engine("mysql+pymysql://{}:{}@{}:{}/{}".format(user, pwd, host, port, database))

gdf = gpd.read_file()  # file path omitted in the original post

# To GeoPandas
gdf['geometry'] = gdf['zip_code_geom'].apply(wkt.loads)

# Fix Bad Geometries
def valid_geom(geom):
    # Return the parsed geometry, or NaN when the WKT cannot be parsed
    try:
        return wkt.loads(geom)
    except Exception:
        return np.nan


gdf['geometry'] = gdf.zip_code_geom.apply(valid_geom)

# Set Geometry
gdf = gpd.GeoDataFrame(gdf, geometry='geometry')

# MultiPolygon to Polygon
gdf = gdf.explode(column='geometry', ignore_index=True, index_parts=False)

# Insert into SQL
con = engine.connect()
gdf.to_sql(name="gdf_sql", con=con, if_exists = 'replace')
con.close()

Now, the geometry column is stored as text in MySQL and we need to convert it to Polygon.

I attempted the following, but keep getting Error Code: 3037. Invalid GIS data provided to function st_polygonfromtext.

UPDATE db1.gdf_sql SET geometry = ST_PolygonFromText(geometry) WHERE geometry!='' AND geometry IS NOT NULL;

ALTER TABLE db1.gdf_sql
ADD COLUMN geom_poly Polygon;

# insert column into table
UPDATE db1.gdf_sql
SET geometry = (
    SELECT ST_GeomFromText(t1.geometry)
    FROM geom as t1
    WHERE t1.geometry = db1.gdf_sql.geometry
);

I need to store the geometries in MySQL as a spatial type so that I can use spatial functions in my queries.

Comments (1)

潇烟暮雨 2025-02-10 18:55:30

First, NaN isn't a valid geometry. So you can either drop NaNs before sending to SQL or handle them in a way that ST_GeomFromText will parse correctly. For this, you can use an empty GeometryCollection, e.g. shapely.geometry.GeometryCollection().
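
A minimal sketch of both options, assuming plain shapely and a few made-up WKT strings (parse_or_empty and the sample values are illustrative and not from the original post):

```python
from shapely import wkt
from shapely.geometry import GeometryCollection


def parse_or_empty(text):
    """Parse WKT; fall back to an empty GeometryCollection instead of NaN."""
    if not isinstance(text, str):
        # Missing values (None, float NaN) are not WKT at all
        return GeometryCollection()
    try:
        return wkt.loads(text)
    except Exception:
        # Unparseable text also becomes an empty geometry
        return GeometryCollection()


# Hypothetical inputs: one good polygon, one broken string, one missing value
raw = ["POLYGON ((0 0, 1 0, 1 1, 0 1, 0 0))", "not wkt", float("nan")]
geoms = [parse_or_empty(value) for value in raw]

# Option 1: drop the placeholders before sending anything to SQL
kept = [g for g in geoms if not g.is_empty]

# Option 2: keep every row and serialize the placeholder as WKT
as_text = [g.wkt for g in geoms]
print(len(kept), as_text[1])
```

Whether an empty collection can be stored depends on the target column type. A column declared as Polygon would likely reject it, so dropping those rows may be the simpler route.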

Second, when converting to any string format, geopandas runs to_wkt under the hood. This in turn calls pygeos.to_wkt, and all keyword arguments (and their defaults) are passed through to that function. By default, coordinates are rounded to 6 decimal places when written to WKT (and then SQL), and this can cause valid shapes to become invalid, especially for geospatial data in degrees. Excerpted from the pygeos docs:

to_wkt (geometry, rounding_precision=6, trim=True, output_dimension=3, old_3d=False, **kwargs)

Converts to the Well-Known Text (WKT) representation of a Geometry.

Parameters

  • rounding_precision int, default 6

    The rounding precision when writing the WKT string. Set to a value of -1 to indicate the full precision.

To get around this, you can do the conversion yourself before sending to SQL:

import pandas as pd

# Serialize geometries at full precision before handing the frame to to_sql
df = pd.DataFrame(gdf).assign(
    geometry=gdf.geometry.to_wkt(rounding_precision=-1),
)
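
To see why the rounding matters, here is a small pure-Python illustration. It mimics 6-decimal rounding with round() rather than calling pygeos, and the coordinates are invented for the example:

```python
# A tiny triangle in degrees whose vertices differ only at the 7th decimal
ring = [
    (10.0000001, 50.0000000),
    (10.0000002, 50.0000003),
    (10.0000004, 50.0000000),
    (10.0000001, 50.0000000),  # closing vertex repeats the first
]

# Round every coordinate to 6 decimals, roughly what a 6-decimal WKT export does
rounded = [(round(x, 6), round(y, 6)) for x, y in ring]

# Count distinct vertices, ignoring the repeated closing vertex
distinct_before = len(set(ring[:-1]))
distinct_after = len(set(rounded[:-1]))

# If the second number is smaller, vertices have merged and the ring may degenerate
print(distinct_before, distinct_after)
```

Real zip code polygons are far larger than this, but narrow slivers and nearly touching rings can be affected in the same way.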

Finally, a "valid geometry" is not the same thing as one that is "parseable by wkt.loads", so your test may not be working as intended. While it is possible to define polygons with collinear interior rings and other "invalid" shapes in shapely, spatial operations may fail or behave unexpectedly, depending on the language and operation. I'm not sure how MySQL's geometry engine deals with this. It may just throw an error, so this is another place where errors could be creeping in. You can check that all shapes are valid with gdf.geometry.is_valid.all(), or find the invalid ones with gdf[~gdf.geometry.is_valid].
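
As a rough illustration of the difference between "parseable" and "valid", the sketch below uses plain shapely and two made-up polygons. The same is_valid property is what gdf.geometry.is_valid evaluates per row:

```python
from shapely import wkt
from shapely.validation import explain_validity

# Both strings parse without error, but only the first is a valid polygon
square = wkt.loads("POLYGON ((0 0, 1 0, 1 1, 0 1, 0 0))")
bowtie = wkt.loads("POLYGON ((0 0, 1 1, 1 0, 0 1, 0 0))")  # edges cross

print(square.is_valid, bowtie.is_valid)

# explain_validity describes what is wrong with an invalid shape
reason = explain_validity(bowtie)
print(reason)
```

A parse-only check like valid_geom in the question would accept both shapes, so it may be worth filtering on is_valid as well before the insert.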
