Merging GeoDataFrames - TypeError: float() argument must be a string or a number, not 'Point'

Posted on 2025-01-23 04:34:19

I have a dataframe in which one column holds a Series of shapely Points, and a second dataframe in which I have a Series of Polygons.

df.head()

                    

     hash number                               street unit  \
2024459  283e04eca5c4932a     SN  AVENIDA DOUTOR SEVERIANO DE ALMEIDA  NaN   
2024460  1a92a1c3cba7941a    485  AVENIDA DOUTOR SEVERIANO DE ALMEIDA  NaN   
2024461  837341c45de519a3    475  AVENIDA DOUTOR SEVERIANO DE ALMEIDA  NaN   

            city  district region   postcode  id                     geometry  
2024459  Jaguari       NaN     RS  97760-000 NaN  POINT (-54.69445 -29.49421)  
2024460  Jaguari       NaN     RS  97760-000 NaN  POINT (-54.69445 -29.49421)  
2024461  Jaguari       NaN     RS  97760-000 NaN  POINT (-54.69445 -29.49421)

poly_df.head()
                                          centroids                                           geometry
0   POINT (-29.31067315122428 -54.64176359828149)  POLYGON ((-54.64069 -29.31161, -54.64069 -29.3...
1   POINT (-29.31067315122428 -54.63961783106958)  POLYGON ((-54.63854 -29.31161, -54.63854 -29.3...
2  POINT (-29.31067315122428 -54.637472063857665)  POLYGON ((-54.63640 -29.31161, -54.63640 -29.3...

I'm checking if the Point belongs to the Polygon and inserting the Point object into the cell of the second dataframe. However, I'm getting the following error:

Traceback (most recent call last):
   
  File "/tmp/ipykernel_4771/1967309101.py", line 1, in <module>
    df.loc[idx, 'centroids'] = poly_mun.loc[ix, 'centroids']

  File ".local/lib/python3.8/site-packages/pandas/core/indexing.py", line 692, in __setitem__
    iloc._setitem_with_indexer(indexer, value, self.name)

  File ".local/lib/python3.8/site-packages/pandas/core/indexing.py", line 1599, in _setitem_with_indexer
    self.obj[key] = infer_fill_value(value)

  File ".local/lib/python3.8/site-packages/pandas/core/dtypes/missing.py", line 516, in infer_fill_value
    val = np.array(val, copy=False)

TypeError: float() argument must be a string or a number, not 'Point'

I'm using the following line of code:

df.loc[idx, 'centroids'] = poly_df.loc[ix, 'centroids']

I have already tried at instead of loc as well, with the same result.

Thanks

Comments (1)

逆流 2025-01-30 04:34:20

You can't create a new column in pandas with a shapely geometry using loc:

In [1]: import pandas as pd, shapely.geometry

In [2]: df = pd.DataFrame({'mycol': [1, 2, 3]})

In [3]: df.loc[0, "centroid"] = shapely.geometry.Point([0, 0])
/Users/mikedelgado/opt/miniconda3/envs/rhodium-env/lib/python3.10/site-packages/pandas/core/indexing.py:1642: ShapelyDeprecationWarning: The array interface is deprecated and will no longer work in Shapely 2.0. Convert the '.coords' to a numpy array instead.
  self.obj[key] = infer_fill_value(value)
/Users/mikedelgado/opt/miniconda3/envs/rhodium-env/lib/python3.10/site-packages/pandas/core/dtypes/missing.py:550: FutureWarning: The input object of type 'Point' is an array-like implementing one of the corresponding protocols (`__array__`, `__array_interface__` or `__array_struct__`); but not a sequence (or 0-D). In the future, this object will be coerced as if it was first converted using `np.array(obj)`. To retain the old behaviour, you have to either modify the type 'Point', or assign to an empty array created with `np.empty(correct_shape, dtype=object)`.
  val = np.array(val, copy=False)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [3], in <cell line: 1>()
----> 1 df.loc[0, "centroid"] = shapely.geometry.Point([0, 0])

File ~/opt/miniconda3/envs/rhodium-env/lib/python3.10/site-packages/pandas/core/indexing.py:716, in _LocationIndexer.__setitem__(self, key, value)
    713 self._has_valid_setitem_indexer(key)
    715 iloc = self if self.name == "iloc" else self.obj.iloc
--> 716 iloc._setitem_with_indexer(indexer, value, self.name)

File ~/opt/miniconda3/envs/rhodium-env/lib/python3.10/site-packages/pandas/core/indexing.py:1642, in _iLocIndexer._setitem_with_indexer(self, indexer, value, name)
   1639     self.obj[key] = empty_value
   1641 else:
-> 1642     self.obj[key] = infer_fill_value(value)
   1644 new_indexer = convert_from_missing_indexer_tuple(
   1645     indexer, self.obj.axes
   1646 )
   1647 self._setitem_with_indexer(new_indexer, value, name)

File ~/opt/miniconda3/envs/rhodium-env/lib/python3.10/site-packages/pandas/core/dtypes/missing.py:550, in infer_fill_value(val)
    548 if not is_list_like(val):
    549     val = [val]
--> 550 val = np.array(val, copy=False)
    551 if needs_i8_conversion(val.dtype):
    552     return np.array("NaT", dtype=val.dtype)

TypeError: float() argument must be a string or a real number, not 'Point'

Essentially, pandas doesn't know how to interpret a point object, and so creates a float column with NaNs, and then can't handle the point. This might get fixed in the future, but you're best off explicitly defining the column as object dtype:

In [27]: df['centroid'] = None

In [28]: df['centroid'] = df['centroid'].astype(object)

In [29]: df
Out[29]:
   mycol centroid
0      1     None
1      2     None
2      3     None

In [30]: df.loc[0, "centroid"] = shapely.geometry.Point([0, 0])
/Users/mikedelgado/opt/miniconda3/envs/rhodium-env/lib/python3.10/site-packages/pandas/core/internals/managers.py:304: ShapelyDeprecationWarning: The array interface is deprecated and will no longer work in Shapely 2.0. Convert the '.coords' to a numpy array instead.
  applied = getattr(b, f)(**kwargs)

In [31]: df
Out[31]:
   mycol     centroid
0      1  POINT (0 0)
1      2         None
2      3         None
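
The same workaround can be written as a short standalone script. This is only a sketch: the column name centroid and the sample coordinates are illustrative, and it assumes pandas and shapely are installed. The idea is to build the column with object dtype in one step, so pandas never has to infer a dtype from the first geometry assigned.

```python
import pandas as pd
from shapely.geometry import Point

df = pd.DataFrame({"mycol": [1, 2, 3]})

# Create the target column up front with an explicit object dtype.
# Without this, df.loc[...] = Point(...) asks pandas to create the
# column itself and guess a dtype from the Point, which is the step
# that raised the TypeError in the traceback above.
df["centroid"] = pd.Series([None] * len(df), index=df.index, dtype=object)

# Cell-by-cell assignment now stores the geometry objects as-is.
df.loc[0, "centroid"] = Point(0, 0)
df.loc[2, "centroid"] = Point(1.5, -2.0)

print(df["centroid"].dtype)
print(df)
```

Rows that are never assigned keep their missing value, so they can be filtered afterwards with df["centroid"].notna().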

That said, joining two GeoDataFrames with polygons and points based on whether the points are in the polygons certainly sounds like a job for geopandas.sjoin:

# predicate= replaces the op= keyword, which was deprecated in geopandas 0.10
union = gpd.sjoin(polygon_df, points_df, predicate='contains')
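
To make the spatial join concrete, here is a minimal sketch on made-up data. The two square polygons, the three points, and the cell column are all hypothetical stand-ins for the real poly_df and df. It assumes geopandas 0.10 or newer, where the keyword is predicate. Joining from the points side with how="left" keeps every point, including those that fall in no polygon.

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Hypothetical polygons: two adjacent unit squares.
polygon_df = gpd.GeoDataFrame(
    {"cell": ["a", "b"]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1)],
)

# Hypothetical points: one inside each square, one outside both.
points_df = gpd.GeoDataFrame(
    {"number": ["SN", "485", "475"]},
    geometry=[Point(0.5, 0.5), Point(1.5, 0.5), Point(5, 5)],
)

# For each point, find the polygon it lies within. Columns of the
# matching polygon row (here "cell") are attached to the point row;
# points with no match get a missing value instead.
joined = gpd.sjoin(points_df, polygon_df, how="left", predicate="within")

print(joined[["number", "cell"]])
```

This replaces the row-by-row loop entirely: any attribute stored on the polygon rows is carried over to the matching points in one call.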