How to loop over random column combinations of a pandas DataFrame without repeating a combination?

Posted on 2025-01-24 14:08:38

I have a pandas dataframe that has 6 columns (A, B, D, E, F, G). I want to pick random combinations of 4 of these columns (e.g. ABDE, ADEF, AEFG) and then join the chosen columns to my existing dataframe, which contains column 'C' (so the output would be, for example, CABDE). But I want to generate all of the combinations, join each one to the dataframe that contains column 'C', and save each result as its own dataframe.
This is my dfC (the dataframe with column C in it):

             C
0     0.439024
1     0.429268
2     0.429268
3     0.434146
4     0.439024
...
2203  0.346341
2204  0.341463
2205  0.331707
2206  0.312195
2207  0.390244

This is my df6 (the dataframe with columns A, B, D, E, F, G):

             A         B         D         E         F         G
0     0.043902  0.014634  0.356098  0.253659  0.117073  0.112195
1     0.043902  0.058537  0.375610  0.229268  0.141463  0.082927
2     0.058537  0.087805  0.400000  0.234146  0.141463  0.053659
3     0.068293  0.102439  0.429268  0.239024  0.146341  0.034146
4     0.082927  0.102439  0.468293  0.248780  0.151220  0.029268
...
2203  0.063415  0.068293  0.204878  0.312195  0.019512  0.053659
2204  0.053659  0.073171  0.195122  0.307317  0.019512  0.053659
2205  0.063415  0.073171  0.180488  0.302439  0.024390  0.043902
2206  0.073171  0.073171  0.160976  0.302439  0.034146  0.043902
2207  0.092683  0.087805  0.097561  0.287805  0.043902  0.053659

This is my code to get a random combination of 4 columns from df6:

dfR = df6.sample(n=4, axis='columns')

This is how I join the dataframe containing column C with the sampled columns:

dfC.join(dfR)

This is the sample output:

             C         D         A         F         B
0     0.439024  0.356098  0.043902  0.117073  0.014634
1     0.429268  0.375610  0.043902  0.141463  0.058537
2     0.429268  0.400000  0.058537  0.141463  0.087805
3     0.434146  0.429268  0.068293  0.146341  0.102439
4     0.439024  0.468293  0.082927  0.151220  0.102439
...
2203  0.346341  0.204878  0.063415  0.019512  0.068293
2204  0.341463  0.195122  0.053659  0.019512  0.073171
2205  0.331707  0.180488  0.063415  0.024390  0.073171
2206  0.312195  0.160976  0.073171  0.034146  0.073171
2207  0.390244  0.097561  0.092683  0.043902  0.087805

But I want to get all of the combinations and save each one as a dataframe. There are 15 combinations, which means 15 new dataframes.

獨角戲 2025-01-31 14:08:38

You can create a dictionary of DataFrames, one for each 4-column combination of the column names:

from itertools import combinations

cols = ['A', 'B', 'D', 'E', 'F', 'G']
# or take the column names from the DataFrame
cols = df6.columns

# Key: 'C' plus the joined column names, e.g. 'CABDE'.
# The tuple is converted to a list so pandas selects several columns
# instead of looking up a single column labelled by the tuple.
d = {"C" + "".join(tup): dfC.join(df6[list(tup)]) for tup in combinations(cols, 4)}

print(d['CABDE'])
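The title also asks for random order without repeats. One way to get that, sketched below, is to enumerate every combination once and then shuffle the list, rather than calling `sample` repeatedly (which can return the same column set twice). The small `dfC` and `df6` frames here are synthetic stand-ins with made-up values, and the name `frames` is only illustrative.

```python
from itertools import combinations
import random

import numpy as np
import pandas as pd

# Synthetic stand-ins for the frames in the question (values are made up).
rng = np.random.default_rng(0)
dfC = pd.DataFrame({"C": rng.random(5)})
df6 = pd.DataFrame(rng.random((5, 6)), columns=list("ABDEFG"))

# Enumerate every 4-column subset exactly once.
combos = list(combinations(df6.columns, 4))

# Shuffle so the subsets are visited in random order, still without repeats.
random.seed(0)
random.shuffle(combos)

# Build one joined frame per subset, keyed by 'C' plus the column letters.
frames = {"C" + "".join(c): dfC.join(df6[list(c)]) for c in combos}

print(len(frames))                        # 15
print(frames["CABDE"].columns.tolist())   # ['C', 'A', 'B', 'D', 'E']
```

If each result should also be written to disk, something like `frames[name].to_csv(name + ".csv")` inside a loop over `frames` would probably do, with the file naming being an assumption rather than anything the question specifies.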
