'SeriesGroupBy' object has no attribute 'is_unique'

Posted on 2025-01-10 05:51:54

Operating System: Windows 10
Python: 3.7.11
IDE: Jupyter Notebook

I have a dataset with the following four columns: bug_report_number, class_id, time_stamp, label. The dataset looks like this:

41737   120098  1583149803  0
41737   120116  1583149803  0
41737   120136  1583149803  0
41748   120179  1583135020  0
41748   120177  1583135020  -1
41748   120177  1583135020  -1
41754   120177  1583135020  1
41754   120200  1583135020  0
41754   120188  1583135020  0

I want to group by bug_report_number and then check whether the class_id values are unique within each bug report.

For example, for 41748 bug_report_number I expect to get False, and for 41754 I expect to get True.

The code I wrote is as follows:

import pandas as pd
train_file_path = "dataset_hbase - v.03.csv"
columns_name = ["bug_report_number", "class_id", "time_stamp", "label"]
columns_dtype = {0: "int64", 1: "int64", 2: "int64", 3:"int64"}
df = pd.read_csv(train_file_path, header=None, names=columns_name, dtype=columns_dtype)

temp = df.groupby(["bug_report_number"])
temp["class_id"].is_unique

But when I use .is_unique, it raises the following error:

AttributeError: 'SeriesGroupBy' object has no attribute 'is_unique'

Question:

How can I group by bug_report_number and then check whether the class_id values are unique within each bug report?

风月客 2025-01-17 05:51:54

Use:

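import pandas as pd

# Small illustrative frame; 'id' stands in for class_id
df = pd.DataFrame({'bug_report_number': [1,2,1,2,1], 'id': [50,35,50,30,50]})
# 0 if every id in the group is distinct, 1 if any id repeats
df.groupby('bug_report_number')['id'].apply(lambda x: 0 if len(x)==len(set(x)) else 1)

Output:

bug_report_number
1    1
2    0
Name: id, dtype: int64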

缱绻入梦 2025-01-17 05:51:54

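IIUC, you could use groupby + nunique + eq(1). The idea is to count the number of unique "class_id"s for each "bug_report_number" and return True if it equals 1, and False otherwise. Note that this tests whether a bug report contains only a single distinct class_id, so every report in the sample data yields False.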

s = df.groupby('bug_report_number')['class_id'].nunique()
out =  s.eq(1)

Output:

bug_report_number
41737    False
41748    False
41754    False
Name: class_id, dtype: bool
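
For comparison, here is a minimal sketch that reproduces the result the question asks for (False for 41748, True for 41754). It rebuilds the sample rows inline instead of reading the CSV, and the variable names unique_per_report and unique_vectorized are illustrative, not from the original post. The second variant assumes class_id contains no missing values.

```python
import pandas as pd

# Sample rows copied from the question (only the two relevant columns)
df = pd.DataFrame(
    {
        "bug_report_number": [41737, 41737, 41737, 41748, 41748, 41748, 41754, 41754, 41754],
        "class_id": [120098, 120116, 120136, 120179, 120177, 120177, 120177, 120200, 120188],
    }
)

grouped = df.groupby("bug_report_number")["class_id"]

# Option 1: evaluate Series.is_unique on each group's Series
unique_per_report = grouped.apply(lambda s: s.is_unique)

# Option 2 (vectorized): a group has no repeats when its distinct count
# equals its row count. Assumption: no NaN in class_id, because
# nunique() skips missing values by default while size() counts them.
unique_vectorized = grouped.nunique().eq(grouped.size())

print(unique_per_report)
# 41737 -> True, 41748 -> False, 41754 -> True
```

The first form stays closest to the original attempt: is_unique is a property of a Series, so it has to be evaluated on each group inside apply rather than on the SeriesGroupBy object itself.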

About the author

御守
