Is there a groupby in pandas?
I'm having some trouble with my DataFrame, shown below. I'm trying to group it so that one column is collapsed into ranges joined by "-" and the other is simply joined by \n. The problem is that I only want ranges that contain a certain number of consecutive values (minimum 4).
   a      b  c
0  a  Num_1  0
1  a  Num_1  1
2  a  Num_1  2
3  a  Num_2  5
4  a  Num_2  6
5  a  Num_2  7
6  a  Num_2  8
7  a  Num_2  9
And I wrote the following code:
    # Split a list of numbers into runs of consecutive values (step of 1).
    def split_by_threshold(li):
        inds = [0] + [ind for ind, (i, j) in enumerate(zip(li, li[1:]), 1) if j - i != 1] + [len(li) + 1]
        rez = [li[i:j] for i, j in zip(inds, inds[1:])]
        return rez

    # Join the unique values of a Series with newlines, keeping first-seen order.
    def dropst(serie):
        serie = serie.to_numpy().tolist()
        serie = list(dict.fromkeys(serie))
        return '\n'.join(serie)

    # Format each run as "first-last", skipping runs whose first and last values are equal.
    def joining_(series):
        series = series.to_numpy().tolist()
        if series:
            split_li = split_by_threshold(series)
            a = []
            for x in split_li:
                if x[-1] - x[0]:
                    a.append(str(x[0]) + '-' + str(x[-1]))
            return '\n'.join(a)
        else:
            return 'None'

    col_1, col_2, col_3 = d.columns
    final = d.groupby([col_1], as_index=False).agg(
        {col_1: 'first',
         col_2: dropst,
         col_3: joining_}
    )
    print(final)
The output I get is:
   a             b         c
0  a  Num_1\nNum_2  0-2\n5-9
and what I need is:
   a      b    c
0  a  Num_2  5-9
Comments (1)
IIUC, you can groupby a, b, and, if needed, a new group that identifies runs of consecutive values, then agg with a custom function.
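A minimal sketch of that idea, not the code from the original comment. It rebuilds the sample frame from the question as `d`, and assumes the rows are already ordered so that consecutive values of c sit next to each other. The names `MIN_RUN`, `run_id`, `runs` and `long_runs` are illustrative only.

```python
import pandas as pd

# Sample frame from the question; the variable name `d` follows the question's code.
d = pd.DataFrame({
    "a": ["a"] * 8,
    "b": ["Num_1"] * 3 + ["Num_2"] * 5,
    "c": [0, 1, 2, 5, 6, 7, 8, 9],
})

MIN_RUN = 4  # minimum run length, taken from the question ("minimum 4")

# A new run starts wherever c does not increase by exactly 1 from the previous row.
# The cumulative sum of those break points gives every run its own id.
run_id = d["c"].diff().ne(1).cumsum().rename("run")

# One row per (a, b, run): first value, last value and number of rows in the run.
runs = (
    d.groupby(["a", "b", run_id])
     .agg(start=("c", "first"), end=("c", "last"), n=("c", "count"))
     .reset_index()
)

# Keep only the runs that are long enough, then format them as "start-end".
long_runs = runs[runs["n"] >= MIN_RUN]
long_runs = long_runs.assign(
    c=long_runs["start"].astype(str) + "-" + long_runs["end"].astype(str)
)

# Collapse back to one row per value of a, joining the surviving labels and ranges.
final = (
    long_runs.groupby("a", as_index=False)
             .agg(b=("b", lambda s: "\n".join(dict.fromkeys(s))),
                  c=("c", lambda s: "\n".join(s)))
)
print(final)
```

On the sample data this leaves a single row with a = a, b = Num_2 and c = 5-9, because the Num_1 run (0 to 2) has only three values and is filtered out before the labels are joined.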