Find all unique combinations of pandas dataframe columns
I have a data balancing problem in which each image can belong to one class or to several. My label file has the columns fn (the image name) and A to G (the classes). Each class column holds 0 or 1: 0 means the class is absent from the image and 1 means it is present. I want to split the dataframe into separate dataframes, one for each combination of classes.
Screenshot of the labels dataframe: https://i.sstatic.net/woux2.png
The issue is that filtering with multiple conditions looks like this (here pp denotes the dataframe):
pp_A_B = pp[(pp['A'] == 1) & (pp['B'] == 1) & (pp['C'] == 0) & (pp['D'] == 0) & (pp['E'] == 0) & (pp['F'] == 0) & (pp['G'] == 0)]
Here, pp_A_B is the dataframe of images that have only classes A and B.
I would have to write one such variable for every combination. How can I automate this to get all possible combinations quickly?
Hi, you should use the groupby and get_group methods to extract the desired elements. Here is an example for getting the rows where A = 0 and B = 0:
UPDATE:
And now the use of the mentioned methods:
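The snippet that belonged here is not preserved in this copy. As a minimal sketch of what groupby plus get_group looks like, assuming a small made-up label table (the rows and image names below are hypothetical, not the asker's data):

```python
import pandas as pd

# Hypothetical label table: one row per image, 0/1 flags for classes A to G.
pp = pd.DataFrame(
    {
        "fn": ["img1", "img2", "img3", "img4", "img5"],
        "A": [1, 1, 0, 0, 1],
        "B": [1, 1, 0, 0, 0],
        "C": [0, 0, 1, 0, 0],
        "D": [0, 0, 0, 0, 0],
        "E": [0, 0, 0, 0, 0],
        "F": [0, 0, 0, 0, 0],
        "G": [0, 0, 1, 0, 1],
    }
)

# Group on the two columns of interest, then pull out one (A, B) value pair.
grouped = pp.groupby(["A", "B"])
a0_b0 = grouped.get_group((0, 0))  # rows where A == 0 and B == 0
print(a0_b0)
```

Note that this groups on A and B only, so the other class columns are not constrained in the result.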
This easily finds all the rows where A = 0 and B = 0.
Now you can iterate through all of your targeted column combinations this way:
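The loop from the original answer is missing here as well. One possible way to write it, assuming the same hypothetical table as before and using itertools.product to enumerate every 0/1 pattern over the class columns (the names class_cols and subsets are mine):

```python
from itertools import product

import pandas as pd

# Hypothetical label table: one row per image, 0/1 flags for classes A to G.
pp = pd.DataFrame(
    {
        "fn": ["img1", "img2", "img3", "img4", "img5"],
        "A": [1, 1, 0, 0, 1],
        "B": [1, 1, 0, 0, 0],
        "C": [0, 0, 1, 0, 0],
        "D": [0, 0, 0, 0, 0],
        "E": [0, 0, 0, 0, 0],
        "F": [0, 0, 0, 0, 0],
        "G": [0, 0, 1, 0, 1],
    }
)

class_cols = list("ABCDEFG")
grouped = pp.groupby(class_cols)

# Try every possible 0/1 pattern; keep only the ones present in the data.
subsets = {}
for key in product([0, 1], repeat=len(class_cols)):
    try:
        subsets[key] = grouped.get_group(key)
    except KeyError:
        # No image has exactly this combination of classes.
        continue

for key, frame in subsets.items():
    print(key, list(frame["fn"]))
```

With 7 class columns there are only 128 patterns, so trying them all is cheap. For many more columns it would be better to loop over the groups that actually exist instead.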
Let us suppose you have the following data frame:
and that you want to store all data frame combinations in a list. Then you can write the following function to get all indices that correspond to the same combination of 0 and 1:
Finally, you can decompose the data frame into chunks by iterating over all index combinations:
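The data frame and function from this answer did not survive in this copy. A rough sketch of the two steps described above, assuming a made-up data frame and a hypothetical helper named get_index_combinations (not the answerer's actual code):

```python
import pandas as pd

# Hypothetical data frame: 0/1 flags for classes A to G, one row per image.
df = pd.DataFrame(
    {
        "fn": ["img1", "img2", "img3", "img4", "img5"],
        "A": [1, 1, 0, 0, 1],
        "B": [1, 1, 0, 0, 0],
        "C": [0, 0, 1, 0, 0],
        "D": [0, 0, 0, 0, 0],
        "E": [0, 0, 0, 0, 0],
        "F": [0, 0, 0, 0, 0],
        "G": [1, 1, 0, 0, 0],
    }
)
class_cols = list("ABCDEFG")


def get_index_combinations(frame, cols):
    """Map each observed 0/1 pattern to the row labels that share it."""
    combos = {}
    for idx, row in frame[cols].iterrows():
        pattern = tuple(int(v) for v in row)
        combos.setdefault(pattern, []).append(idx)
    return combos


index_combinations = get_index_combinations(df, class_cols)

# One chunk (sub data frame) per combination, stored in a list.
chunks = [df.loc[idx] for idx in index_combinations.values()]
print(chunks[0])
```

Every row lands in exactly one chunk, so the chunk sizes can be read directly as class-combination counts for balancing.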
If you then, for instance, run
you will get the result for one possible combination.
Note:
You can even go further and create separate data frames (not stored in a list) for all possible combinations. If you go dirty and use something like this:
It creates a dictionary named final_output, which stores the names of all created data frames. For example:
You can then print any of the frames in all_names, for example df_ABG, which returns:
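The "dirty" snippet and its output are not preserved here, so the exact structure of final_output is unknown. As a cleaner alternative rather than a reconstruction, a plain dictionary keyed by names such as df_ABG gives the same lookup by name without creating variables dynamically. The sketch below assumes a made-up data frame, and frames_by_name is a hypothetical name of mine:

```python
import pandas as pd

# Hypothetical data frame: 0/1 flags for classes A to G, one row per image.
df = pd.DataFrame(
    {
        "fn": ["img1", "img2", "img3", "img4", "img5"],
        "A": [1, 1, 0, 0, 1],
        "B": [1, 1, 0, 0, 0],
        "C": [0, 0, 1, 0, 0],
        "D": [0, 0, 0, 0, 0],
        "E": [0, 0, 0, 0, 0],
        "F": [0, 0, 0, 0, 0],
        "G": [1, 1, 0, 0, 0],
    }
)
class_cols = list("ABCDEFG")

# Build a name such as "df_ABG" from the classes that are set to 1.
frames_by_name = {}
for key, frame in df.groupby(class_cols):
    suffix = "".join(c for c, flag in zip(class_cols, key) if flag == 1)
    frames_by_name["df_" + (suffix or "none")] = frame

all_names = sorted(frames_by_name)
print(all_names)
print(frames_by_name["df_ABG"])
```

Looking a frame up by key keeps everything in one place and avoids the pitfalls of generating variable names at run time.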