Reading shapefiles from nested zip archives
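I have a large zip archive, "Polska_SHP.zip", which contains other zip archives (named "02_SHP.zip", "04_SHP.zip", and so on). Each of these contains further zip archives (for example, "02_SHP.zip" holds "0201_SHP.zip", "0202_SHP.zip", etc.). Finally, those innermost archives contain many shapefiles, and I need to read the shapefiles from all of them. This is what I have so far: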
import zipfile
from io import BytesIO
import geopandas as gpd

with zipfile.ZipFile("Polska_SHP.zip", "r") as main_zfile:
    for name in main_zfile.namelist():  # list of archives in the main folder
        print("name: ", name)
        if ".zip" in name:
            zfiledata = BytesIO(main_zfile.read(name))
            with zipfile.ZipFile(zfiledata) as zfile2:
                for name2 in zfile2.namelist():
                    print("name2: ", name2)
                    if ".zip" in name2:
                        zfiledata2 = BytesIO(zfile2.read(name2))
                        with zipfile.ZipFile(zfiledata2) as zfile3:
                            for name3 in zfile3.namelist():
                                if "SWRS" in name3 and ".shp" in name3:
                                    print("name3: ", name3)
                                    gdf = gpd.read_file(name3)
                                    gdf.head()
name: 32_SHP.zip
name2: 32/3209_SHP.zip
name3: PL.PZGiK.339.3209__OT_SWRS_L.shp
Reading the shapefile fails with:

CPLE_OpenFailedError                      Traceback (most recent call last)
fiona/_shim.pyx in fiona._shim.gdal_open_vector()
fiona/_err.pyx in fiona._err.exc_wrap_pointer()
CPLE_OpenFailedError: PL.PZGiK.339.3209__OT_SWRS_L.shp: No such file or directory
Comments (2)
As of the time of this answer, geopandas supports paths inside a zip file. You just need to separate the archive path from the path inside the archive with "!".
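A minimal sketch of that path syntax, assuming the archive that directly contains the shapefile is a regular file on disk (an archive nested inside another archive would have to be extracted first). The helper name zip_member_uri and the file names in the usage comment are hypothetical; the "zip://<archive>!<member>" form follows the geopandas documentation on reading from ZIP files, and the sketch assumes POSIX-style paths.

```python
import os


def zip_member_uri(archive_path, member):
    """Return a 'zip://<archive>!<member>' string for gpd.read_file().

    Hypothetical helper: it only builds the string and does not check
    that the archive or the member actually exists.
    """
    # Documented form: zip:///abs/path/file.zip!folder/file.shp
    return f"zip://{os.path.abspath(archive_path)}!{member}"


# Usage (requires geopandas; the file names are placeholders):
#   import geopandas as gpd
#   uri = zip_member_uri("3209_SHP.zip", "PL.PZGiK.339.3209__OT_SWRS_L.shp")
#   gdf = gpd.read_file(uri)
```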
The name3 variable you are passing to gpd.read_file() is just the name of the file inside the ZIP; for this to work you would first have to extract the ZIP. Another option is to pass the file-like object of the archive, though this assumes that there is only one dataset in the zip and that the .shp file and all its companion files are in the top-level directory. Please note that my sample had just 2 levels of nested archives, and its shapefiles had different attributes, so a list of GeoDataFrames, gdfs, is used to collect all the data. In your case you probably want to use pandas.concat(). (BTW, your current loop overwrites gdf on each iteration.)
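The "extract first" option can be sketched with the standard library alone. This is an illustration, not the code from the answer: the function names extract_nested_shapefiles, read_all and load_nested_shapefiles, the keyword parameter, and the one-subfolder-per-archive layout are all assumptions made for the example. Recursion handles any nesting depth, and geopandas and pandas are imported lazily so that the extraction part runs without them.

```python
import io
import tempfile
import zipfile
from pathlib import Path


def extract_nested_shapefiles(source, dest, keyword="", trail=()):
    """Extract matching shapefiles from arbitrarily nested zip archives.

    source  -- path or file-like object of a zip archive
    dest    -- directory that receives the extracted files
    keyword -- substring the .shp member name must contain ("" matches all)
    Returns the list of extracted .shp paths.
    """
    shp_paths = []
    # One subfolder per archive chain, so equal member names cannot collide.
    target = Path(dest).joinpath(*(Path(t).stem for t in trail))
    with zipfile.ZipFile(source) as zf:
        names = zf.namelist()
        for name in names:
            lower = name.lower()
            if lower.endswith(".zip"):
                # Nested archive: load it into memory and recurse.
                inner = io.BytesIO(zf.read(name))
                shp_paths += extract_nested_shapefiles(
                    inner, dest, keyword, trail + (name,)
                )
            elif lower.endswith(".shp") and keyword in name:
                stem = name[: -len(".shp")]
                # Extract the companion files (.shx, .dbf, .prj, ...) as well.
                for member in names:
                    if member.startswith(stem + ".") and member != name:
                        zf.extract(member, target)
                shp_paths.append(Path(zf.extract(name, target)))
    return shp_paths


def read_all(shp_paths):
    """Read every path and concatenate the results (needs geopandas)."""
    import geopandas as gpd  # third-party, imported lazily
    import pandas as pd

    return pd.concat([gpd.read_file(p) for p in shp_paths], ignore_index=True)


def load_nested_shapefiles(archive, keyword=""):
    """Extract into a temporary directory and read before it is deleted."""
    with tempfile.TemporaryDirectory() as tmp:
        return read_all(extract_nested_shapefiles(archive, tmp, keyword))


# Usage (archive name and keyword taken from the question):
#   gdf = load_nested_shapefiles("Polska_SHP.zip", keyword="SWRS")
```

If the shapefiles have different attributes, pandas.concat() keeps the union of the columns and fills the missing values with NaN; collecting the frames in a list instead, as the answer does with gdfs, keeps them separate.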