
How to merge 2 dataframes by sequence

Posted on 2025-01-23 10:59:05

I have 2 dataframes.

df_My, which looks like this:

Seq    Grp
-----------
1      1
2      1
3      2
4      1
5      2
6      2
7      3
8      3
9      1
.
.
135    3
136    1
137    1
138    2
139    3

The other dataframe is df_Filter:

df_Filter

Seq    Grp    Alg
-------------------
1      1      J
3      2      F
7      3      Z
136    1      M
137    1      R
138    2      T
139    3      Y

I want to merge both dataframes into one, where:

1- If a df_My Seq number is already in df_Filter, it should be merged with that row, matching on Seq in both df_My and df_Filter.

2- If a df_My Seq number is not in df_Filter, it should be joined with the row that has the smallest Seq in the same Grp.

So the end result should be as follows:

Seq    Grp    Alg
------------------
1      1      J
2      1      J
3      2      F
4      1      J
5      2      F
6      2      F
7      3      Z
8      3      Z
9      1      J
.
.
135    3      Z
136    1      M
137    1      R
138    2      T
139    3      Y

I tried this, but it does not give me the expected result:

df_Out = df_My.merge(df_Filter, on='Grp')

Any idea how to achieve this?

怕倦 2025-01-30 10:59:05

As @sammywemmy suggested, you could use merge_asof (here df1 stands for df_My and df2 for df_Filter):

out = pd.merge_asof(df1, df2, on='Seq', by='Grp')
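
For anyone who wants to run the one-liner above, here is a minimal, self-contained sketch. It assumes only the rows actually shown in the question (the elided rows are left out), and the names df_my and df_filter are just illustrative stand-ins for df1 and df2. Note that merge_asof expects both frames to be sorted by the `on` key, and that the default direction="backward" picks the nearest earlier Seq within the same Grp, which happens to coincide with the smallest Seq per Grp for this sample.

```python
import pandas as pd

# Sample data: only the rows shown in the question (elided rows omitted).
df_my = pd.DataFrame({
    "Seq": [1, 2, 3, 4, 5, 6, 7, 8, 9, 135, 136, 137, 138, 139],
    "Grp": [1, 1, 2, 1, 2, 2, 3, 3, 1, 3, 1, 1, 2, 3],
})
df_filter = pd.DataFrame({
    "Seq": [1, 3, 7, 136, 137, 138, 139],
    "Grp": [1, 2, 3, 1, 1, 2, 3],
    "Alg": ["J", "F", "Z", "M", "R", "T", "Y"],
})

# merge_asof requires both frames to be sorted by the `on` key.
out = pd.merge_asof(
    df_my.sort_values("Seq"),
    df_filter.sort_values("Seq"),
    on="Seq",   # ordered key: match the closest earlier-or-equal Seq
    by="Grp",   # only consider rows from the same group
)  # direction="backward" is the default

print(out)
```

On this sample the Alg column comes out as J, J, F, J, F, F, Z, Z, J, Z, M, R, T, Y, which matches the result the question asks for.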

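Another option is to left-merge on Seq, then use groupby + idxmin to build a mapping from Grp to Alg for the rows that found no match, and fill the missing values from it:

# Left-merge on Seq; Alg is NaN where Seq has no match in df2
df_merged = df1.merge(df2, on='Seq', how='left', suffixes=('','_')).drop(columns=['Grp_'])
# Keep only the rows that found an exact Seq match
no_nan = df_merged.dropna()
# For every row, look up the Alg of the smallest matched Seq in its Grp
mapping = df_merged['Grp'].map(no_nan.loc[no_nan.groupby('Grp')['Seq'].idxmin(), ['Grp','Alg']].set_index('Grp')['Alg'])
df_merged['Alg'] = df_merged['Alg'].fillna(mapping)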

Output:

    Seq  Grp Alg
0     1    1   J
1     2    1   J
2     3    2   F
3     4    1   J
4     5    2   F
5     6    2   F
6     7    3   Z
7     8    3   Z
8     9    1   J
9   135    3   Z
10  136    1   M
11  137    1   R
12  138    2   T
13  139    3   Y
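
The two approaches agree on the sample data, but they do not encode quite the same rule. The sketch below is an illustration under assumptions, not part of the original answer: the Seq value 140 is a hypothetical extra row, and the fill step is a slightly simplified variant that merges on both Seq and Grp and builds the fallback mapping directly from the filter frame. It shows that merge_asof takes the nearest earlier Seq in the group, while the fill approach always falls back to the group's smallest Seq.

```python
import pandas as pd

# Hypothetical data: Seq 140 is not in the question; it is added only to
# expose the difference between the two approaches.
df_my = pd.DataFrame({"Seq": [1, 2, 136, 140], "Grp": [1, 1, 1, 1]})
df_filter = pd.DataFrame({"Seq": [1, 136], "Grp": [1, 1], "Alg": ["J", "M"]})

# Approach 1: nearest earlier-or-equal Seq within the same Grp.
asof = pd.merge_asof(df_my, df_filter, on="Seq", by="Grp")

# Approach 2 (variant): exact match first, then fall back to the Alg that
# belongs to the smallest Seq of each Grp in the filter frame.
fallback = (
    df_filter.loc[df_filter.groupby("Grp")["Seq"].idxmin()]
    .set_index("Grp")["Alg"]
)
filled = df_my.merge(df_filter, on=["Seq", "Grp"], how="left")
filled["Alg"] = filled["Alg"].fillna(filled["Grp"].map(fallback))

print(asof["Alg"].tolist())    # ['J', 'J', 'M', 'M']
print(filled["Alg"].tolist())  # ['J', 'J', 'M', 'J']
```

If rule 2 in the question is meant literally (always the smallest Seq in the group), the fill approach is the safer choice; if "most recent earlier entry" is acceptable, merge_asof is shorter.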
