I want to write/append to an xls/xlsx file in an S3 bucket from SageMaker. Before running the code, I upload an empty file of each type (csv, xls, xlsx) to the bucket. Writing/appending a DataFrame to the empty csv file in the S3 bucket works without problems, but it does not work for xls/xlsx. Here is the code I am using for the csv:
df.to_csv('s3://bucket_name/temp/Database.csv', index=False, mode = 'w', header = False)
Here is the code I am using for the xlsx file:
with pd.ExcelWriter('s3://bucket_name/project/Database.xlsx', mode='w', engine="xlsxwriter") as writer:
    df.to_excel(writer, "Sheet 1")
    writer.save()
Note: For xls, I just change the engine to openpyxl and change the file path to the xls one.
Running the code above for xlsx/xls gives this error:
FileCreateError: [Errno 2] No such file or directory: 's3://bucket_name/project/Database.xlsx'
This happens even though the file is in exactly the same location as the csv one.
I am not sure what the problem is, and I have not found a solution. I have tried adding 'r' to make the path a raw string and I have tried changing the slashes around, but nothing seems to work. Does anyone with xlsxwriter/openpyxl experience know what the problem could be?
Comments (1)
df.to_csv can handle S3 paths since v0.20.0, but pd.ExcelWriter can't. You'll need to use s3fs or boto, as explained here: Store Excel file exported from Pandas in AWS.
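As a rough illustration of the boto route, here is a minimal sketch that writes the workbook into an in-memory buffer first and then uploads the bytes, so the Excel engine never sees the s3:// path. The helper names dataframe_to_xlsx_bytes and upload_xlsx are hypothetical, and the sketch assumes boto3 plus one xlsx writer engine (xlsxwriter or openpyxl) are installed and that AWS credentials are already configured, as they usually are on SageMaker.

```python
import io

import pandas as pd


def dataframe_to_xlsx_bytes(df, sheet_name="Sheet 1"):
    """Serialize a DataFrame to an in-memory .xlsx workbook (hypothetical helper)."""
    buffer = io.BytesIO()
    # No engine is given, so pandas uses whichever xlsx writer is installed.
    # Leaving the with-block saves the workbook, so writer.save() is not needed.
    with pd.ExcelWriter(buffer) as writer:
        df.to_excel(writer, sheet_name=sheet_name, index=False)
    return buffer.getvalue()


def upload_xlsx(df, bucket, key, s3_client=None):
    """Upload the DataFrame as an .xlsx object and return the byte count."""
    if s3_client is None:
        # Imported lazily so the serialization helper works without boto3.
        import boto3

        s3_client = boto3.client("s3")
    data = dataframe_to_xlsx_bytes(df)
    # put_object replaces any existing object stored under the same key.
    s3_client.put_object(Bucket=bucket, Key=key, Body=data)
    return len(data)
```

With the names from the question, the call would look like upload_xlsx(df, "bucket_name", "project/Database.xlsx"). Note that this overwrites the object rather than appending to it; to append, you would presumably need to download the existing workbook, add the new rows locally, and upload the result again.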