How do I compare columns from 2 different dataframes and add a column where they are equal, in Python?

Posted 2025-01-19 19:19:01

I have 2 dataframes Table1 & Table2

Table1 example output:

CustomerID
CUST_3849502
CUST_3935123

Table2 example output:

CustomerID    AccountID        TimeCreated
CUST_3849502  3823479@store    2022-04-07T21:38:13.195641Z
CUST_3935123  343950347@store  2022-04-07T21:38:13.647964Z
CUST_4566768  876876465@store  2022-02-08T15:55:13.857347Z

I'm trying to add Table2["AccountID"] & Table2["TimeCreated"] to Table1 if Table1["CustomerID"] is in Table2["CustomerID"]

So desired output of Table1 is:

CustomerID    AccountID        TimeCreated
CUST_3849502  3823479@store    2022-04-07T21:38:13.195641Z
CUST_3935123  343950347@store  2022-04-07T21:38:13.647964Z

I've tried:

for x in Table1["CustomerID"]:
   if Table2["CustomerID"] in x:
     Table1["AccountID"] = Table2["AccountID"]
     Table1["TimeCreated"] = Table2["TimeCreated"]

But I keep getting TypeError: 'in <string>' requires string as left operand, not Series on the 2nd line.

Both columns are of type pandas.core.series.Series, so I'm not sure what the issue is here. Please help.

Comments (1)

甜宝宝 2025-01-26 19:19:01

You'll find the code below accomplishes your intended goal.

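import pandas as pd

# Table1: only the customer IDs
df = pd.DataFrame({'CustomerID': ['CUST_3849502', 'CUST_3935123']})

# Table2: customer IDs plus the columns to bring over (the last row has no match in df)
df2 = pd.DataFrame({
    'CustomerID': ['CUST_3849502', 'CUST_3935123', 'trash'],
    'AccountID': ['3823479@store', '343950347@store', 'yeet'],
    'TimeCreated': ['2022-04-07T21:38:13.195641Z', '2022-04-07T21:38:13.647964Z', 'yams'],
})

# Left join on CustomerID: keeps every row of df and pulls in the matching df2 columns
df3 = pd.merge(df, df2, how='left', on='CustomerID')
print(df3)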
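One thing to be aware of with the merge above: a left join keeps every row of the first dataframe, so a customer with no match in the second one would come back with NaN in the added columns. If you only want the rows that match, an inner join is one option. Here is a rough sketch of the difference; the names table1/table2 mirror the question, and CUST_0000000 is a made-up ID I added just to show the unmatched case.

```python
import pandas as pd

# Table1 from the question, plus one hypothetical ID that is not in Table2
table1 = pd.DataFrame(
    {"CustomerID": ["CUST_3849502", "CUST_3935123", "CUST_0000000"]}
)

table2 = pd.DataFrame({
    "CustomerID": ["CUST_3849502", "CUST_3935123", "CUST_4566768"],
    "AccountID": ["3823479@store", "343950347@store", "876876465@store"],
    "TimeCreated": [
        "2022-04-07T21:38:13.195641Z",
        "2022-04-07T21:38:13.647964Z",
        "2022-02-08T15:55:13.857347Z",
    ],
})

# Left join: all three rows of table1 survive, the unmatched one gets NaN
left = table1.merge(table2, how="left", on="CustomerID")

# Inner join: only customers present in both tables survive
inner = table1.merge(table2, how="inner", on="CustomerID")

print(left)
print(inner)
```

With the exact data in the question both joins give the same two rows, so the choice only matters once Table1 contains IDs that Table2 does not.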

I noticed you were trying to parse the dataframe column data as a list. This can be accomplished using

df["Your Column Name"].to_list() 
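To tie that back to the error in the question: in the loop, x is a single string and Table2["CustomerID"] is a whole Series, so `Table2["CustomerID"] in x` asks Python whether a Series is a substring of a string, which is what triggers the TypeError. A small sketch of the membership check written the other way round, using the question's data (the variable names known_ids and mask are mine):

```python
import pandas as pd

table1 = pd.DataFrame({"CustomerID": ["CUST_3849502", "CUST_3935123"]})
table2 = pd.DataFrame({
    "CustomerID": ["CUST_3849502", "CUST_3935123", "CUST_4566768"],
    "AccountID": ["3823479@store", "343950347@store", "876876465@store"],
})

# Plain Python list of the IDs in table2
known_ids = table2["CustomerID"].to_list()

# Element-by-element check: is each table1 ID present in table2?
for x in table1["CustomerID"]:
    print(x, x in known_ids)

# Vectorised equivalent: a boolean Series, one entry per table1 row
mask = table1["CustomerID"].isin(table2["CustomerID"])
print(mask.to_list())
```

Note that `x in some_series` (without to_list) checks the index labels of the Series rather than its values, so converting to a list or using isin is the safer route here.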

If you didn't want to use "pd.merge", you can 'build out' your lists and add them onto your dataframe (SO LONG AS THE LENGTH OF EACH LIST EQUALS THE NUMBER OF ROWS IN THE DATAFRAME... Otherwise pandas will raise a ValueError).
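For what it's worth, here is one way the 'build out your lists' approach could look. This is only a sketch under my own assumptions: it uses plain dicts as lookups (which assumes each CustomerID appears once in Table2) and fills None for IDs with no match. The last part shows what happens when a list has the wrong length; in the pandas versions I'm aware of this surfaces as a ValueError.

```python
import pandas as pd

table1 = pd.DataFrame({"CustomerID": ["CUST_3849502", "CUST_3935123"]})
table2 = pd.DataFrame({
    "CustomerID": ["CUST_3849502", "CUST_3935123", "CUST_4566768"],
    "AccountID": ["3823479@store", "343950347@store", "876876465@store"],
    "TimeCreated": [
        "2022-04-07T21:38:13.195641Z",
        "2022-04-07T21:38:13.647964Z",
        "2022-02-08T15:55:13.857347Z",
    ],
})

# Lookup dicts keyed by CustomerID (assumes IDs are unique in table2)
account_by_id = dict(zip(table2["CustomerID"], table2["AccountID"]))
created_by_id = dict(zip(table2["CustomerID"], table2["TimeCreated"]))

# Build one value per table1 row, so the lists match the row count
account_ids = [account_by_id.get(c) for c in table1["CustomerID"]]
time_created = [created_by_id.get(c) for c in table1["CustomerID"]]

table1["AccountID"] = account_ids
table1["TimeCreated"] = time_created
print(table1)

# A list with the wrong number of items is rejected
length_error = None
try:
    table1["TooShort"] = ["only one value"]
except ValueError as err:
    length_error = err
    print("ValueError:", err)
```

For anything beyond a toy example, merge is likely to be both shorter and faster than building the lists by hand.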
