Convert tuples into grouped rows in a DataFrame without changing the order
I have a list of tuples that I need to convert to a DataFrame.
res1_ = [
('z1', '1'),
('z1', '2'),
('x1', '1'),
('x2', '1'),
('x1', '3'),
('z1', '1')]
My expected DataFrame should look like this:
docid secid
z1 [1,2]
x1 [1]
x2 [1]
x1 [3]
z1 [1]
Note that the order is unchanged: when the same docid repeats on consecutive rows, the secids of those rows are merged into a single list.
Although x1 occurs twice, secids 1 and 3 are not merged into one list, because docid x2 appears between the two x1 rows.
I tried:
import pandas as pd
df = pd.DataFrame(res1_, columns=['docid', 'secid'])
df.groupby('docid')['secid'].apply(list)
But this did not work: the original order is lost, and the two non-adjacent x1 rows are grouped together as well.
Comments (2)
You can use the DataFrame constructor, then GroupBy.agg.
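The code for this answer is missing here, so the following is only a minimal sketch of one way the DataFrame constructor and GroupBy.agg could be combined. It assumes the grouping key is a run counter built with shift and cumsum; the names run_id and out are illustrative and not taken from the answer.

```python
import pandas as pd

res1_ = [
    ('z1', '1'),
    ('z1', '2'),
    ('x1', '1'),
    ('x2', '1'),
    ('x1', '3'),
    ('z1', '1')]

df = pd.DataFrame(res1_, columns=['docid', 'secid'])

# Assumption: start a new group whenever docid differs from the previous row,
# so only consecutive duplicates end up sharing a group id.
run_id = df['docid'].ne(df['docid'].shift()).cumsum().rename('run')

out = (
    df.groupby(run_id, sort=False)
      .agg(docid=('docid', 'first'), secid=('secid', list))
      .reset_index(drop=True)
)
print(out)
```

Because the group id only increases from top to bottom, the rows keep their original order, and the two separate x1 runs stay on separate rows. The secid values remain strings, as in the input.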
You could use itertools.groupby to group the data, and then convert to a DataFrame.
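The code for this answer is also missing, so here is a hedged sketch of how itertools.groupby might be applied. It relies on the fact that itertools.groupby only merges adjacent items with equal keys; the names rows and out are illustrative.

```python
from itertools import groupby
from operator import itemgetter

import pandas as pd

res1_ = [
    ('z1', '1'),
    ('z1', '2'),
    ('x1', '1'),
    ('x2', '1'),
    ('x1', '3'),
    ('z1', '1')]

# itertools.groupby yields one group per run of adjacent equal keys,
# so the non-adjacent x1 entries stay in separate groups.
rows = [
    (docid, [secid for _, secid in group])
    for docid, group in groupby(res1_, key=itemgetter(0))
]

out = pd.DataFrame(rows, columns=['docid', 'secid'])
print(out)
```

This variant does the grouping in plain Python before pandas is involved, which may be simpler when the data starts out as a list of tuples anyway.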