如何在两个不同的Excel表中搜索两个列中的相等产品编号,并从匹配行到新表格复制某些单元
我有两个Excel表:
old_data.xlsx
产品编号名称当前价格其他列
1000产品名称1 10
AB23104产品名称2 5
430267产品名称3 20
new_data.xlsx
产品编号名称新价格其他列
AB23104更名为产品名称2 20
1000重命名的产品名称1 5
345LKT10023产品名称4 100
预期结果:下表 + 2个反馈消息
消息1)产品ID 430267在新数据文件中缺少
消息2 )产品ID 345LKT10023是新添加的
Product ID Name of product New price Old price
AB23104 Product name 2 20 5
1000 Product name 1 5 10
345LKT10023 Product name 4 100 100
我现在有此代码,但是由于我缺乏知识,它尚未工作且无法完成:
import openpyxl
import pandas as pd
new_datacols = [0, 1, 2]
old_datacols = [0, 1, 2]
new_data = pd.read_excel('new_data.xlsx', skiprows=1, usecols=new_datacols, index_col=0)
old_data = pd.read_excel('old_data.xlsx', skiprows=1, usecols=old_datacols, index_col=0)
def format_data():
# combine_type = inner, left, right, outer
df = pd.merge(new_data, old_data, on='Product number', how='outer')
df = df.rename(columns={"Product number": "Product ID",
"Name": "Name of product",
"Current price": "Old price"})
nan_value = float("NaN")
df.replace("", nan_value, inplace=True)
df.dropna(subset=["Name of product"], inplace=True)
df = df[['Product ID', 'Name of product',
'New price', 'Old price']]
print(df.columns)
# df.to_excel('updated_table.xlsx')
if __name__ == "__main__":
format_data()
I have two excel tables:
old_data.xlsx
Product number Name Current price Other columns
1000 Product name 1 10
AB23104 Product name 2 5
430267 Product name 3 20
new_data.xlsx
Product number Name New price Other columns
AB23104 Renamed product name 2 20
1000 Renamed product name 1 5
345LKT10023 Product name 4 100
Expected result: table below + 2 feedback messages somewhere
Message 1) Product ID 430267 is missing in new data file
Message 2) Product ID 345LKT10023 is newly added
Product ID Name of product New price Old price
AB23104 Product name 2 20 5
1000 Product name 1 5 10
345LKT10023 Product name 4 100 100
I have this code for now, but it is not working and not finished due to lack of knowledge on my part:
import openpyxl
import pandas as pd
new_datacols = [0, 1, 2]
old_datacols = [0, 1, 2]
new_data = pd.read_excel('new_data.xlsx', skiprows=1, usecols=new_datacols, index_col=0)
old_data = pd.read_excel('old_data.xlsx', skiprows=1, usecols=old_datacols, index_col=0)
def format_data():
# combine_type = inner, left, right, outer
df = pd.merge(new_data, old_data, on='Product number', how='outer')
df = df.rename(columns={"Product number": "Product ID",
"Name": "Name of product",
"Current price": "Old price"})
nan_value = float("NaN")
df.replace("", nan_value, inplace=True)
df.dropna(subset=["Name of product"], inplace=True)
df = df[['Product ID', 'Name of product',
'New price', 'Old price']]
print(df.columns)
# df.to_excel('updated_table.xlsx')
if __name__ == "__main__":
format_data()
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
这是我的尝试。它将消息放在同一文件中的另一张纸上。最后的电子表格看起来像这样:
编辑:与实际电子表格一起使用的代码:
This is my attempt. It puts the messages in another sheet in the same file. The final spreadsheet looks like this:
EDIT: Code that works with the actual spreadsheets: