Get the longest consecutive run of a specific value in each row with pandas
I have the following dataframe:
   1  2  3  4  5  6  7  8  9
0  0  0  1  0  0  0  0  0  1
1  0  0  0  0  1  1  0  1  0
2  1  1  0  1  1  0  0  1  1
...
For each row, I want to get the length of the longest sequence of consecutive 0 values in that row.
So the expected result for this dataframe is an array that looks like this:
[5,4,2,...]
since in the first row the longest sequence of 0 values has length 5, etc.
I have seen this post and tried, as a start, to get this for the first row only (though I would like to do it for the whole dataframe at once), but I got an error:
s=df_day.iloc[0]
(~s).cumsum()[s].value_counts().max()
TypeError: ufunc 'invert' not supported for the input types, and the
inputs could not be safely coerced to any supported types according to
the casting rule ''safe''
When I entered the values manually like this:
s=pd.Series([0,0,1,0,0,0,0,0,1])
(~s).cumsum()[s].value_counts().max()
>>>7
I got 7, which is the total number of 0 values in the row, not the longest sequence.
I don't understand why it raises the error in the first place and, more importantly, I would like to end up running it on the whole dataframe, row by row.
My end goal: get the maximum uninterrupted occurrence of the value 0 in each row.
Comments (3)
Vectorized solution that counts consecutive 0 values per row; for the longest run, take the max of the DataFrame c:
Use:
Output:
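The code for the vectorized approach is not shown here, so the following is only a minimal sketch of one way it could work, not necessarily the answerer's exact code. It assumes the question's three visible rows as sample data; the names m and b are hypothetical helpers, and c is the per-cell run counter mentioned above. Comparing with eq(0) yields a boolean mask directly, which also sidesteps applying ~ to a numeric row (the TypeError in the question is consistent with the row having a float dtype, though that is an assumption).

```python
import pandas as pd

# Sample data: the three rows shown in the question.
df = pd.DataFrame(
    [[0, 0, 1, 0, 0, 0, 0, 0, 1],
     [0, 0, 0, 0, 1, 1, 0, 1, 0],
     [1, 1, 0, 1, 1, 0, 0, 1, 1]],
    columns=range(1, 10),
)

# True where the cell equals 0.
m = df.eq(0)

# Running count of zeros along each row.
b = m.cumsum(axis=1)

# At every non-zero cell, remember the running count reached so far and
# carry it forward; subtracting it resets the counter after each break.
c = b.sub(b.mask(m).ffill(axis=1).fillna(0)).astype(int)

# c holds the length of the current run of zeros at each cell,
# so the row-wise maximum is the longest run.
longest = c.max(axis=1).tolist()
print(longest)
```

Because every step is a whole-frame operation, no Python-level loop over rows is needed, which tends to matter on large frames.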
The following code should do the job.
The function longest_streak counts the number of consecutive zeros and returns the maximum, and you can use apply on your df.
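The original code for longest_streak is not shown here, so this is a hedged reconstruction of what such a function might look like, assuming a plain loop over the row. The value parameter and the sample data are illustrative assumptions, not part of the original answer.

```python
import pandas as pd


def longest_streak(row, value=0):
    """Return the length of the longest uninterrupted run of `value` in row."""
    best = 0
    current = 0
    for x in row:
        if x == value:
            current += 1
            best = max(best, current)
        else:
            # Any other value breaks the run.
            current = 0
    return best


# Sample data: the three rows shown in the question.
df = pd.DataFrame(
    [[0, 0, 1, 0, 0, 0, 0, 0, 1],
     [0, 0, 0, 0, 1, 1, 0, 1, 0],
     [1, 1, 0, 1, 1, 0, 0, 1, 1]],
    columns=range(1, 10),
)

# axis=1 passes each row to the function as a Series.
result = df.apply(longest_streak, axis=1)
print(result.tolist())
```

This version is easier to read than a vectorized one, but apply with axis=1 calls the function once per row in Python, so it is likely to be slower on large dataframes.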