Pandas: sorting a file and binning values into groups

Posted on 2025-01-19 09:11:03

I'm learning pandas but having some trouble.
I import the data as a DataFrame and want to bin the 2017 population values into four equal-size groups, then count the number of rows in group 4.

However, the system prints:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-52-05d9f2e7ffc8> in <module>
      2 
      3 df=pd.read_excel('C:/Users/Sam/Desktop/商業分析/Python_Jabbia1e/Chapter 2/jaggia_ba_1e_ch02_Data_Files.xlsx',sheet_name='Population')
----> 4 df=df.sort_values('2017',ascending=True)
      5 df['Group'] = pd.qcut(df['2017'], q = 4, labels = range(1, 5))
      6 splitData = [group for _, group in df.groupby('Group')]

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\frame.py in sort_values(self, by, axis, ascending, inplace, kind, na_position, ignore_index, key)
   5453 
   5454             by = by[0]
-> 5455             k = self._get_label_or_level_values(by, axis=axis)
   5456 
   5457             # need to rewrap column in Series to apply key function

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\generic.py in _get_label_or_level_values(self, key, axis)
   1682             values = self.axes[axis].get_level_values(key)._values
   1683         else:
-> 1684             raise KeyError(key)
   1685 
   1686         # Check for duplicates

KeyError: '2017'

What's wrong with it?
Thanks~

Here's the dataframe:

And I tried:

df=pd.read_excel('C:/Users/Sam/Desktop/商業分析/Python_Jabbia1e/Chapter 2/jaggia_ba_1e_ch02_Data_Files.xlsx',sheet_name='Population')
df=df.sort_values('2017',ascending=True)
df['Group'] = pd.qcut(df['2017'], q = 4, labels = range(1, 5))
splitData = [group for _, group in df.groupby('Group')]
print('The number of group4 is :',splitData[3].shape[0])

我偏爱纯白色 2025-01-26 09:11:03

You are passing the key to df.sort_values() as a str. You can pass it either as an element of a list or as a bare string:

df = df.sort_values(by=['2017'], ascending=True)

or

df = df.sort_values(by='2017', ascending=True)

This only works if the column label exactly matches the string you pass. If the label is not a string, or if it contains extra whitespace, it won't work. You can strip leading and trailing whitespace from the column labels before sorting:

df.columns = df.columns.str.strip()

If the label is not a string (for example, the integer 2017), use:

df = df.sort_values(by=[2017], ascending=True)
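
To tell which case applies, it can help to look at the column labels and their types before sorting. The sketch below uses a small made-up frame (the country names and numbers are invented for illustration, not taken from the asker's workbook) whose year columns are integers, which is one plausible way the Excel sheet could have been read in.

```python
import pandas as pd

# Hypothetical stand-in for the 'Population' sheet; all values are invented.
df = pd.DataFrame({
    "Country": ["A", "B", "C", "D"],
    2016: [10, 40, 20, 30],
    2017: [12, 41, 19, 33],
})

# Map each label to the name of its Python type (str vs. int).
label_types = {label: type(label).__name__ for label in df.columns}
print(label_types)

# Check which form of the label actually exists in the frame.
has_str_2017 = "2017" in df.columns
has_int_2017 = 2017 in df.columns

# Sort with whichever label is present.
key = 2017 if has_int_2017 else "2017"
sorted_df = df.sort_values(by=key, ascending=True)
print(sorted_df)
```

If the labels turn out to be a mix of strings and integers, df.columns.str.strip() is probably not a safe fix on its own, because the .str methods generally return NaN for non-string entries; converting each label explicitly (for example with str(label).strip()) avoids that.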

清风疏影 2025-01-26 09:11:03

First, the problem is on line 4, in the sort: you tell sort_values to look for the string '2017', but the column label is an integer. Try this, then carry on with the rest of your code:

df=df.sort_values([2017],ascending=True)
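
For reference, here is a minimal end-to-end sketch of the sort, bin, and count steps. It assumes the year label may be either the integer 2017 or a string, so it first normalizes every label to a stripped string; the frame itself is hypothetical, with invented region names and values.

```python
import pandas as pd

# Hypothetical data: eight regions with invented 2017 population values.
df = pd.DataFrame({
    "Region": list("ABCDEFGH"),
    2017: [5, 80, 15, 60, 25, 40, 70, 35],
})

# Normalize every label to a stripped string so '2017' works as a key
# whether the original label was the int 2017 or a padded string.
df.columns = [str(col).strip() for col in df.columns]

# Sort by the 2017 column, then cut it into four quantile-based groups.
df = df.sort_values(by="2017", ascending=True)
df["Group"] = pd.qcut(df["2017"], q=4, labels=[1, 2, 3, 4])

# Count the rows that fell into the top quartile (group 4).
group4_count = int((df["Group"] == 4).sum())
print("The number of rows in group 4 is:", group4_count)
```

Counting with a boolean mask avoids building a list of sub-frames just to read one shape; df["Group"].value_counts() would give the size of all four groups at once.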
