How can we write new data to an existing Excel spreadsheet?
I have a process that runs each week and creates a dataframe of almost 1,000 rows. I would like to append it to an existing sheet without having to re-read the spreadsheet, because that will take a long time as the file grows. I saw this answer: Append existing excel sheet with new dataframe using python pandas. Unfortunately, it doesn't seem to work correctly for me. Below is some dummy code that tries to append to the existing file. It currently has two problems. First, it does not append, but overwrites the data. Second, when I open the file after the program has run, I can only open it in read-only mode. I have confirmed that I am using pandas 1.4.
import pandas as pd
data = {'Name': ['Tom', 'Joseph', 'Krish', 'John'], 'Age': [20, 21, 19, 18]}
df = pd.DataFrame(data)
filename = "Testing Append Process.xlsx"
writer = pd.ExcelWriter(filename, engine="openpyxl", mode="a", if_sheet_exists="overlay")
df.to_excel(writer, index=False)
writer.save()
Comments (3)
Please note that the file Testing Append Process.xlsx has to be created before running this code. Running the code twice returns the following.
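A minimal sketch of appending with pandas under that precondition, not a definitive implementation. It assumes pandas 1.4 or later with openpyxl installed and the default sheet name "Sheet1"; the helper name append_df_to_excel is hypothetical. As a convenience that goes beyond what this answer states, the sketch creates the file on the first call instead of requiring it to exist.

```python
from pathlib import Path

import pandas as pd


def append_df_to_excel(df, filename, sheet_name="Sheet1"):
    """Append df below the existing rows of sheet_name (hypothetical helper)."""
    if not Path(filename).exists():
        # First run: create the workbook, including the header row.
        df.to_excel(filename, sheet_name=sheet_name, index=False)
        return

    # mode="a" keeps the existing workbook; if_sheet_exists="overlay"
    # (pandas >= 1.4) writes into the existing sheet instead of replacing it.
    with pd.ExcelWriter(
        filename, engine="openpyxl", mode="a", if_sheet_exists="overlay"
    ) as writer:
        # max_row is the last used row (1-based), so used as a 0-based
        # startrow it points at the first free row.
        # Assumption: sheet_name already exists in the workbook.
        startrow = writer.sheets[sheet_name].max_row
        df.to_excel(
            writer,
            sheet_name=sheet_name,
            startrow=startrow,
            header=False,
            index=False,
        )
```

Calling append_df_to_excel(df, filename) twice with the DataFrame from the question should leave one header row followed by eight data rows. The with block closes the file when it exits, which may also help with the read-only symptom described in the question.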
The alternative solution above now raises a FutureWarning for the writer.book attribute, so here is the new solution.
Running the code twice returns the following.
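One way to avoid touching writer.book at all, sketched here as an assumption and not as this answer's own code, is to skip pd.ExcelWriter and let openpyxl append the rows. The helper name append_with_openpyxl and the sheet name "Sheet1" are hypothetical, and the file must already exist.

```python
import pandas as pd
from openpyxl import load_workbook


def append_with_openpyxl(df, filename, sheet_name="Sheet1"):
    """Append the rows of df to the bottom of sheet_name (hypothetical helper)."""
    wb = load_workbook(filename)  # the file must already exist
    ws = wb[sheet_name]           # assumed sheet name (the pandas default)
    for row in df.itertuples(index=False, name=None):
        # Worksheet.append writes one row directly below the last used row.
        ws.append(list(row))
    wb.save(filename)
```

No header is written, so the sheet is expected to contain the header row already. Note that load_workbook still parses the existing workbook, so this avoids the warning but not the cost of opening a growing file.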
Here is a way to limit how much of the spreadsheet is read, but with openpyxl.
Getting set up: openpyxl documentation
max_rows = max((c.row for c in active_ws['A'] if c.value is not None)) + 1
This returns the number of the last row in column A that contains a value, plus one. That is the row where you can start placing the data you need to add.
Here is an example of placing that data with openpyxl:
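A minimal sketch of that placement step, assuming the target is the workbook's active sheet and that each new record is a plain list of cell values. The helper name append_rows is hypothetical; the max_rows expression is the one shown above, with default=0 added so that an empty column A does not raise an error.

```python
from openpyxl import load_workbook


def append_rows(filename, rows):
    """Write rows (lists of cell values) below the last filled cell of column A."""
    wb = load_workbook(filename)
    active_ws = wb.active
    # Last row in column A that holds a value, plus one (1 if column A is empty).
    max_rows = max(
        (c.row for c in active_ws["A"] if c.value is not None), default=0
    ) + 1
    for r_offset, row in enumerate(rows):
        for c_offset, value in enumerate(row):
            # Rows and columns are 1-based in openpyxl.
            active_ws.cell(
                row=max_rows + r_offset, column=1 + c_offset, value=value
            )
    wb.save(filename)
    return max_rows  # the first row that was written
```

For a DataFrame, the rows could be passed as df.values.tolist().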
Edit: The max_rows variable can be pointed at any column you want checked (this one checks column "A"). The column should be filled for every row of data. In particular, if its last rows are empty, you will get an incorrect max_rows.