How to pivot a specific dataframe?

Posted on 2025-01-14 01:13:16

I have a dataframe df with mixed data (floats and text) that looks like this when printed (this is only a small excerpt of the output):

              0           1
0   Le Maurepas         NaN
1       CODE_90     AREA_HA
2           112      194.97
3           121       70.37
4           211      113.86
5    La Rolande         NaN
6       CODE_90     AREA_HA
7           112      176.52
8           211       97.28

If needed, this output can be reproduced with the following code (for example):

import pandas as pd

fst_col = ['Le Maurepas', 'CODE_90', 112, 121, 211, 'La Rolande', 'CODE_90', 112, 211]
snd_col = ['NaN', 'AREA_HA', 194.97, 70.37, 113.86, 'NaN', 'AREA_HA', 176.52, 97.28]

df = pd.DataFrame({'0' : fst_col, '1' : snd_col})

df
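
Note that the snippet above stores the literal string 'NaN' rather than a real missing value, so `isna()` does not flag those cells even though the printout looks the same. A minimal check, assuming the same `df` as above (the names `string_nans`, `cleaned`, and `real_nans` are illustrative only):

```python
import pandas as pd

fst_col = ['Le Maurepas', 'CODE_90', 112, 121, 211, 'La Rolande', 'CODE_90', 112, 211]
snd_col = ['NaN', 'AREA_HA', 194.97, 70.37, 113.86, 'NaN', 'AREA_HA', 176.52, 97.28]
df = pd.DataFrame({'0': fst_col, '1': snd_col})

# The string 'NaN' is ordinary text, so pandas does not count it as missing.
string_nans = int(df['1'].isna().sum())

# Masking the cells equal to 'NaN' turns them into real missing values.
cleaned = df['1'].mask(df['1'] == 'NaN')
real_nans = int(cleaned.isna().sum())

print(string_nans, real_nans)
```

Any approach that relies on `isna()` or `isnull()` therefore needs this kind of conversion first.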

I would like to restructure my dataframe df so that it looks like this when printed:

           Name     Code      Area
0   Le Maurepas      112    194.97
1   Le Maurepas      121     70.37
2   Le Maurepas      211    113.86
3    La Rolande      112    176.52
4    La Rolande      211     97.28

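I browsed SO and I know that a function like pivot(index='', columns='', values='') might do the job, but I don't know whether it applies to my case, and in fact I don't know how to apply it.

Should I keep trying with this function by adjusting the index, columns, and values parameters, or is there a more specific approach that better matches the structure of my initial dataframe df?

Any help is welcome.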
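
For context, `pivot` reshapes an already tidy frame from long to wide, which is the opposite direction from what is asked here. A small sketch, using the target layout from above as an assumed input (the names `long_df` and `wide_df` are illustrative only):

```python
import pandas as pd

# The tidy (long) layout the question is aiming for.
long_df = pd.DataFrame({
    'Name': ['Le Maurepas', 'Le Maurepas', 'Le Maurepas', 'La Rolande', 'La Rolande'],
    'Code': [112, 121, 211, 112, 211],
    'Area': [194.97, 70.37, 113.86, 176.52, 97.28],
})

# pivot spreads the Code values across columns, one row per Name.
wide_df = long_df.pivot(index='Name', columns='Code', values='Area')

print(wide_df)
```

Because the initial frame stacks name, header, and data rows in the same two columns, it has no column that could serve as `index` or `columns`, so some cleanup is needed before `pivot` would even apply.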

远昼 2025-01-21 01:13:16

IIUC, try:

# Replace the string "NaN" with a real missing value
df["1"] = df["1"].replace("NaN", None)

output = pd.DataFrame()
output["Name"] = df.loc[df["1"].isnull(), "0"].reindex(df.index, method="ffill")
output["Code"] = pd.to_numeric(df["0"], errors="coerce")
output["Area"] = pd.to_numeric(df["1"], errors="coerce")
output = output.dropna().reset_index(drop=True)

>>> output

          Name   Code    Area
0  Le Maurepas  112.0  194.97
1  Le Maurepas  121.0   70.37
2  Le Maurepas  211.0  113.86
3   La Rolande  112.0  176.52
4   La Rolande  211.0   97.28
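
The output above shows `Code` as floats (112.0) because the column temporarily holds NaN. If integer codes are wanted, as in the question's target output, a variation might look like the sketch below. It assumes the layout from the question (a name row whose second cell is the string 'NaN' or a real NaN, then a header row, then data rows); `reshape_blocks` is a hypothetical helper name, not part of pandas.

```python
import pandas as pd

def reshape_blocks(df: pd.DataFrame) -> pd.DataFrame:
    """Turn stacked name/header/data blocks into Name, Code, Area columns."""
    # Name rows have the string 'NaN' (or a real NaN) in column '1'.
    is_name_row = df['1'].isna() | df['1'].eq('NaN')
    out = pd.DataFrame({
        # Keep names on name rows only, then forward-fill down each block.
        'Name': df['0'].where(is_name_row).ffill(),
        # Non-numeric cells (names, headers) become NaN here.
        'Code': pd.to_numeric(df['0'], errors='coerce'),
        'Area': pd.to_numeric(df['1'], errors='coerce'),
    })
    # Name rows and header rows have no numeric Code, so dropna removes them.
    out = out.dropna().reset_index(drop=True)
    # No NaN is left, so the codes can be cast back to integers.
    out['Code'] = out['Code'].astype(int)
    return out

fst_col = ['Le Maurepas', 'CODE_90', 112, 121, 211, 'La Rolande', 'CODE_90', 112, 211]
snd_col = ['NaN', 'AREA_HA', 194.97, 70.37, 113.86, 'NaN', 'AREA_HA', 176.52, 97.28]
df = pd.DataFrame({'0': fst_col, '1': snd_col})

result = reshape_blocks(df)
print(result)
```

This version does not modify `df` in place, which may be convenient if the raw frame is still needed afterwards.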

相守太难 2025-01-21 01:13:16

You can use:

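# Each block starts one row above its 'CODE_90' header row (the name row).
indexes = (df[df['0'].eq('CODE_90')].index - 1).to_list()
indexes.append(len(df))  # sentinel marking the end of the last block
all_dfs = []
for idx in range(len(indexes) - 1):
    # .loc slicing is inclusive, hence the -1; copy to avoid SettingWithCopyWarning.
    df_temp = df.loc[indexes[idx]:indexes[idx + 1] - 1].copy()
    df_temp['Name'] = df_temp['0'].iloc[0]
    df_temp = df_temp.rename(columns={'0': 'Code', '1': 'Area'})
    # Skip the name row and the header row, keeping only the data rows.
    all_dfs.append(df_temp.iloc[2:])

df = pd.concat(all_dfs, ignore_index=True)
print(df)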
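
The frame produced by the loop keeps `Code` and `Area` as object columns and places `Name` last. A possible follow-up step is sketched below, under the assumption that the result has exactly the columns `Code`, `Area`, and `Name`; the frame built at the top is only a stand-in for the loop's output, and `tidy` is an illustrative name.

```python
import pandas as pd

# Stand-in for the frame the loop produces (object dtype, Name last).
df = pd.DataFrame({
    'Code': pd.Series([112, 121, 211, 112, 211], dtype=object),
    'Area': pd.Series([194.97, 70.37, 113.86, 176.52, 97.28], dtype=object),
    'Name': ['Le Maurepas'] * 3 + ['La Rolande'] * 2,
})

# Reorder the columns to match the layout requested in the question.
tidy = df[['Name', 'Code', 'Area']].copy()
# Convert the object columns to proper numeric dtypes.
tidy['Code'] = tidy['Code'].astype(int)
tidy['Area'] = tidy['Area'].astype(float)

print(tidy.dtypes)
```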
