How to get a substring from a list of strings under a condition in a pandas DataFrame

Posted on 2025-02-01 05:51:37 · 1013 characters · 2 views · 0 comments

This is what I have:

df = pd.DataFrame({'Name': {0: 'Mark', 1: 'John', 2: 'Rick'},
 'Location': {0: ['Mark lives in UK',
   'Rick lives in France',
   'John Lives in US'],
  1: ['Mark lives in UK', 'Rick lives in France', 'John Lives in US'],
  2: ['Mark lives in UK', 'Rick lives in France', 'John Lives in US']}})

This is what I'd like to get:

desired_output = pd.DataFrame({'Name': ['Mark', 'John', 'Rick'],
                   'Location':[['Mark lives in UK', 'Rick lives in France', 'John Lives in US'], ['Mark lives in UK', 'Rick lives in France', 'John Lives in US'], ['Mark lives in UK', 'Rick lives in France', 'John Lives in US']],
              'Outcome': ['Mark lives in UK', 'John Lives in US', 'Rick lives in France']
            })

Here is what I tried:

df['Sorted'] = df['Location'].str.split(',')
df.apply(lambda x: [idx for idx,s in enumerate(x.sorted) if x.Name in x.sorted])

Thank you in advance!

Comments (2)

别靠近我心 2025-02-08 05:51:37

If you don't need to keep the original Location column (the full list), you can use this:

df = pd.DataFrame({'Name': {0: 'Mark', 1: 'John', 2: 'Rick'},
 'Location': {0: ['Mark lives in UK',
   'Rick lives in France',
   'John Lives in US'],
  1: ['Mark lives in UK', 'Rick lives in France', 'John Lives in US'],
  2: ['Mark lives in UK', 'Rick lives in France', 'John Lives in US']}})
df = df.explode('Location')
df['Person_IND'] = df['Location'].apply(lambda x : x.split(' ')[0])
df = df.loc[df['Name'] == df['Person_IND']]
df[['Name', 'Location']]

If you really need to keep that middle column, you can do this and then rename the columns:

df = pd.DataFrame({'Name': {0: 'Mark', 1: 'John', 2: 'Rick'},
 'Location': {0: ['Mark lives in UK',
   'Rick lives in France',
   'John Lives in US'],
  1: ['Mark lives in UK', 'Rick lives in France', 'John Lives in US'],
  2: ['Mark lives in UK', 'Rick lives in France', 'John Lives in US']}})
df1 = df.explode('Location')
df1['Person_IND'] = df1['Location'].apply(lambda x : x.split(' ')[0])
df1 = df1.loc[df1['Name'] == df1['Person_IND']]
df1 = df1[['Name', 'Location']]
df_merge = pd.merge(df, df1, on = 'Name')
df_merge
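
The rename step mentioned above is not shown in the answer. A minimal sketch of it follows, assuming the default `_x` / `_y` suffixes that `pd.merge` adds when both frames carry a `Location` column; the target name `Outcome` is taken from the question's desired output, and the shared `locations` list is only a shorthand for the same sample data.

```python
import pandas as pd

# Same sample data as in the question, written more compactly.
locations = ['Mark lives in UK', 'Rick lives in France', 'John Lives in US']
df = pd.DataFrame({'Name': ['Mark', 'John', 'Rick'],
                   'Location': [locations, locations, locations]})

# One row per (Name, sentence) pair, then keep the sentence whose first word is the name.
df1 = df.explode('Location')
df1['Person_IND'] = df1['Location'].apply(lambda x: x.split(' ')[0])
df1 = df1.loc[df1['Name'] == df1['Person_IND'], ['Name', 'Location']]

# Both frames have a 'Location' column, so merge suffixes them with _x and _y.
df_merge = pd.merge(df, df1, on='Name')
df_merge = df_merge.rename(columns={'Location_x': 'Location',
                                    'Location_y': 'Outcome'})
print(df_merge)
```

Note that this approach matches on the first word only, so it would miss a sentence where the name appears later in the string.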

客…行舟 2025-02-08 05:51:37

You can try apply on rows:

df['Outcome'] = df.apply(lambda row: [loc for loc in row['Location'] if row['Name'] in loc], axis=1)
print(df)

   Name                                                    Location  \
0  Mark  [Mark lives in UK, Rick lives in France, John Lives in US]
1  John  [Mark lives in UK, Rick lives in France, John Lives in US]
2  Rick  [Mark lives in UK, Rick lives in France, John Lives in US]

                  Outcome
0      [Mark lives in UK]
1      [John Lives in US]
2  [Rick lives in France]
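
The list comprehension above yields one-element lists, while the question asks for plain strings. One possible variation, offered as a sketch rather than as part of the original answer, is to take only the first match with `next`; the `None` default for rows without any match is an assumption about the desired behaviour.

```python
import pandas as pd

locations = ['Mark lives in UK', 'Rick lives in France', 'John Lives in US']
df = pd.DataFrame({'Name': ['Mark', 'John', 'Rick'],
                   'Location': [locations, locations, locations]})

# next() returns the first sentence containing the name, or None if there is none.
df['Outcome'] = df.apply(
    lambda row: next((loc for loc in row['Location'] if row['Name'] in loc), None),
    axis=1,
)
print(df[['Name', 'Outcome']])
```

The substring test `row['Name'] in loc` is case-sensitive and would also match a name embedded in a longer word.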

Or you can try explode:

df['Outcome'] = (df.explode('Location')
                 .loc[lambda df: df.apply(lambda row: row['Name'] in row['Location'], axis=1), 'Location'])
print(df)

   Name                                                    Location  \
0  Mark  [Mark lives in UK, Rick lives in France, John Lives in US]
1  John  [Mark lives in UK, Rick lives in France, John Lives in US]
2  Rick  [Mark lives in UK, Rick lives in France, John Lives in US]

                Outcome
0      Mark lives in UK
1      John Lives in US
2  Rick lives in France
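
The explode version relies on index alignment: the filtered Series keeps the original row labels, so assigning it back works as long as each row has at most one match. If a name could match several sentences, the labels would repeat and the assignment would fail. A hedged sketch of one way around that, keeping only the first match per original row with `groupby(level=0).first()` (an addition not found in the answer):

```python
import pandas as pd

locations = ['Mark lives in UK', 'Rick lives in France', 'John Lives in US']
df = pd.DataFrame({'Name': ['Mark', 'John', 'Rick'],
                   'Location': [locations, locations, locations]})

exploded = df.explode('Location')

# Boolean mask: does the sentence on this exploded row contain the row's name?
mask = [name in loc for name, loc in zip(exploded['Name'], exploded['Location'])]

# Collapse back to one value per original index label; rows without a match become NaN.
df['Outcome'] = exploded.loc[mask, 'Location'].groupby(level=0).first()
print(df[['Name', 'Outcome']])
```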