Transpose one column with Pandas
I'm trying to learn Pandas and I'm having difficulty achieving a simple goal.
I have a dataset in which I want to transpose the rows of one column into multiple columns (check img).
The column "Maximum take-off weight and type of power plant" has 12 different values, and the goal is for these values to become columns filled with "VALUE".
It's also important to keep the other columns. Later in the process I will eliminate some of them, but I was wondering if it's possible to achieve the goal anyway. I know some of the values in these rows will be lost, like 'COORDINATE'.
I have read the documentation about pivot and pivot_table, but I haven't achieved anything. Since I want to keep all the columns and just transform the 'Maximum take-off weight and type of power plant' values into columns filled with the values in 'VALUE', I thought I would need to add to 'index' all the columns I want to keep.
df = df.pivot_table(
    index=['REF_DATE', 'GEO', 'DGUID', 'Airports', 'UOM', 'UOM_ID', 'SCALAR_FACTOR', 'SCALAR_ID', 'VECTOR', 'COORDINATE', 'STATUS', 'SYMBOL', 'TERMINATED', 'DECIMALS'],
    columns='Maximum take-off weight and type of power plant',
    values='VALUE')
The output should be something like this:
Input sample:
"REF_DATE","GEO","DGUID","Airports","Maximum take-off weight and type of power plant","UOM","UOM_ID","SCALAR_FACTOR","SCALAR_ID","VECTOR","COORDINATE","VALUE","STATUS","SYMBOL","TERMINATED","DECIMALS"
"1997-01","Canada","2016A000011124","Total, all airports","Maximum take-off weight, 2,000 kilograms and under","Number","223","units ","0","v41840825","1.1.1","3551","","","","0"
"1997-01","Canada","2016A000011124","Total, all airports","Maximum take-off weight, 2,001 to 4,000 kilograms","Number","223","units ","0","v41840829","1.1.2","4702","","","","0"
"1997-01","Canada","2016A000011124","Total, all airports","Maximum take-off weight, 4,001 to 5,670 kilograms","Number","223","units ","0","v41840830","1.1.3","4293","","","","0"
"1997-01","Canada","2016A000011124","Total, all airports","Maximum take-off weight, 5,671 to 9,000 kilograms","Number","223","units ","0","v41840831","1.1.4","1498","","","","0"
"1997-01","Canada","2016A000011124","Total, all airports","Maximum take-off weight, 9,001 to 18,000 kilograms","Number","223","units ","0","v41840832","1.1.5","1704","","","","0"
"1997-01","Canada","2016A000011124","Total, all airports","Maximum take-off weight, 18,001 to 35,000 kilograms","Number","223","units ","0","v41840833","1.1.6","1790","","","","0"
"1997-01","Canada","2016A000011124","Total, all airports","Maximum take-off weight, 35,001 kilograms and over","Number","223","units ","0","v41840834","1.1.7","202","","","","0"
"1997-01","Canada","2016A000011124","Total, all airports","Power plant, jet engines","Number","223","units ","0","v41840835","1.1.8","475","","","","0"
"1997-01","Canada","2016A000011124","Total, all airports","Power plant, turbo-propellers","Number","223","units ","0","v41840836","1.1.9","9645","","","","0"
"1997-01","Canada","2016A000011124","Total, all airports","Power plant, piston engines","Number","223","units ","0","v41840826","1.1.10","6398","","","","0"
"1997-01","Canada","2016A000011124","Total, all airports","Power plant, helicopters","Number","223","units ","0","v41840827","1.1.11","1203","","","","0"
"1997-01","Canada","2016A000011124","Total, all airports","Power plant, gliders","Number","223","units ","0","v41840828","1.1.12","19","","","","0"
Comments (3)
Based on the example in your question, it looks like you only need the 'Value' entry in the first row matching each unique 'Maximum take-off weight and type of power plant' entry.
Here is a way to do what you've asked (for simplicity, I have shortened the column name to 'Max weight and type', reduced the number of columns and used arbitrary values):
Sample input:
Output:
Explanation:
Use drop_duplicates() to eliminate all but the first row for each unique value in the 'Max weight and type' column.
Create a dataframe whose columns are the unique values of 'Max weight and type' and whose only row contains the 'Value' entry in the first row of the original dataframe matching each unique 'Max weight and type' entry.
Create a list of column labels that replaces 'Max weight and type' with its unique values and eliminates 'Value' from the original dataframe's column labels.
After dropping the two original columns ('Max weight and type', 'Value'), use reindex() to add the 'Max weight and type' values as column labels (with NaN values initially), and use assign() to overwrite the NaN values in these new columns with the corresponding 'Value' entries from the original dataframe.
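The steps in this explanation could be sketched roughly as follows. This is only an illustration under assumptions, not the commenter's original code: the sample frame, its values, and the variable names (key, val, first, wide, labels, result) are all made up here, and the one-row frame is built with set_index() and a transpose, which is one possible choice among several.

```python
import pandas as pd

# Hypothetical, shortened stand-in for the question's data (arbitrary values).
df = pd.DataFrame({
    "REF_DATE": ["1997-01"] * 4,
    "GEO": ["Canada"] * 4,
    "Max weight and type": ["2,000 kg and under", "2,001 to 4,000 kg",
                            "Power plant, jet engines", "2,000 kg and under"],
    "COORDINATE": ["1.1.1", "1.1.2", "1.1.8", "1.1.1"],
    "Value": [3551, 4702, 475, 9999],
})

key, val = "Max weight and type", "Value"

# 1. Keep only the first row for each unique value in the key column.
first = df.drop_duplicates(subset=key)

# 2. One-row frame: its columns are the unique key values,
#    its single row holds the matching Value entries.
wide = first.set_index(key)[[val]].T.reset_index(drop=True)

# 3. Column labels: the original labels with the key column replaced
#    by its unique values and with Value removed.
labels = []
for col in df.columns:
    if col == key:
        labels.extend(wide.columns)
    elif col != val:
        labels.append(col)

# 4. Take the first original row without the key and Value columns,
#    add the new labels (NaN at first), then overwrite them with the values.
result = (
    df.drop(columns=[key, val])
      .iloc[[0]]
      .reindex(columns=labels)
      .assign(**wide.iloc[0].to_dict())
)

print(result)
```

Note that the remaining columns (here REF_DATE, GEO and COORDINATE) simply carry the values of the first original row, so per-row information such as 'COORDINATE' is lost for the other rows, as the question anticipates.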
In case you want to keep the original df and just add some columns, with names and values taken from two different existing columns, you can try this, with the remark that the original values populate just the first row of the new columns. Regards...
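A minimal sketch of that idea, assuming a small made-up frame and assuming the entries of the name column are unique; the variable names (names, values, out, first_row) are hypothetical and this is not necessarily the code the commenter posted. Each new column is assigned a one-element Series indexed by the first row label, so pandas aligns it on the index and leaves every other row as NaN.

```python
import pandas as pd

# Hypothetical, shortened stand-in for the question's data (arbitrary values).
df = pd.DataFrame({
    "Airports": ["Total, all airports"] * 3,
    "Max weight and type": ["2,000 kg and under", "2,001 to 4,000 kg",
                            "Power plant, jet engines"],
    "VALUE": [3551, 4702, 475],
})

names, values = "Max weight and type", "VALUE"

out = df.copy()               # keep the original df untouched
first_row = out.index[:1]     # index label of the first row only

# Add one new column per (name, value) pair. The one-element Series is
# aligned on the index, so only the first row receives the value and
# the remaining rows become NaN.
for name, value in zip(df[names], df[values]):
    out[name] = pd.Series([value], index=first_row)

print(out)
```

Because the new columns contain NaN in all but the first row, they end up with a float dtype even though the source values are integers.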
you can try:
Result: