Python count column
I'm trying to create a new column holding the count of each attribute (number), but it gives me the total row count instead, as shown below:
POM = pd.read_excel('POM.xlsx' , sheet_name='Sheet1')
POM['Merged']=POM['ACD ID'].astype(str)+PG['Date'].astype(str)
POM['Count']=POM['Merged'].count()
print(POM.head(2))
POM.to_excel('Test.xlsx' , index=False)
Result
Date ACD ID Merged Count
0 2022-03-08 14004 140042022-03-08 986
1 2022-03-09 14004 140042022-03-09 986
How can I count how many times each unique code is repeated in the Merged column, like this?
# Date ACD ID Merged Count
2022-03-08 00:00:00 14004 140042022-03-08 10
2022-03-09 00:00:00 14004 140042022-03-09 49
2022-03-10 00:00:00 14004 140042022-03-10 62
I tried the code below, as suggested in the answer, and it gives accurate counts. However, the new Merged column that I build from ACD ID + Date is now inaccurate, as shown below: the date in Merged does not match the Date column.
POM = pd.read_excel('POM.xlsx' , sheet_name='Sheet1')
POM['Merged']=POM['ACD ID'].astype(str)+PG['Date'].astype(str)
#POM['Count'] = POM['Merged'].count()
POM['Count'] = POM['Merged'].copy().replace(POM['Merged'].value_counts().to_dict())
#POM['Count']=POM.groupby(["Merged"])["Merged"].transform("count")
#POM['counts'] = POM['Merged'].map(POM['Merged'].value_counts())
print(POM.head(6))
#POM.to_excel('Test.xlsx' , index=False)
Result
Date ACD ID Merged Count
0 2022-03-08 14004 140042022-03-08 1
1 2022-03-08 14005 140052022-03-09 1
2 2022-03-08 14006 140062022-03-10 1
3 2022-03-08 14007 140072022-03-15 1
Comments (2)
Try groupby() and then transform("count"). Check out this thread.
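A minimal sketch of the groupby() plus transform("count") suggestion, assuming a small made-up frame in place of POM.xlsx (the sample rows are hypothetical). It also assumes Merged is built from POM's own Date column; the question's code reads PG['Date'], and if PG is a different frame, that may be why the dates in Merged do not match.

```python
import pandas as pd

# Hypothetical sample data standing in for POM.xlsx (not from the question).
POM = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2022-03-08", "2022-03-08", "2022-03-09", "2022-03-09", "2022-03-09"]
    ),
    "ACD ID": [14004, 14004, 14004, 14004, 14005],
})

# Assumption: build the key from the same frame (POM), not from PG,
# so each row's Merged value uses that row's own Date.
POM["Merged"] = POM["ACD ID"].astype(str) + POM["Date"].dt.strftime("%Y-%m-%d")

# transform("count") returns one value per original row, so the result
# lines up with POM's index and can be assigned directly as a column.
POM["Count"] = POM.groupby("Merged")["Merged"].transform("count")

print(POM)
```

Because transform returns a result aligned to the original rows, the frame keeps its length and index; only the new Count column is added.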
You could do something like this:
Create a copy of Merged, count the values in Merged and convert the result to a dict of {id: count}, then use that dictionary to replace the id values in the copied Merged column.
Compared with a plain groupby aggregation, this keeps the original index and shape of the DataFrame while still filling in the count for each id.
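The dictionary approach described above might look like the sketch below. The sample values are hypothetical, and map() is used here in place of the copy().replace() call that appears in the question's code; both look up each Merged value in the {id: count} dict, but map() is swapped in as an assumption because it returns a plain integer column.

```python
import pandas as pd

# Hypothetical Merged values (not taken from the question's spreadsheet).
POM = pd.DataFrame({
    "Merged": [
        "140042022-03-08",
        "140042022-03-08",
        "140042022-03-09",
        "140052022-03-09",
        "140042022-03-08",
    ]
})

# Count each distinct Merged value and turn the result into {id: count}.
counts = POM["Merged"].value_counts().to_dict()

# Look up every row's Merged value in the dict; the original index
# and row order are left untouched.
POM["Count"] = POM["Merged"].map(counts)

print(POM)
```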