Ignore the last value in a groupby() clause
I was wondering if this is possible. I currently have a line of code that cumulatively sums the values in my Total Time (s) column, grouped by the value in the CycleNumber column, and stores the result in cycle_times. I'm achieving this right now as follows:
cycle_times = raw_data['Total Time (s)'].diff().fillna(0).groupby(interim_output['CycleNumber']).cumsum()
At the boundary between two groups, this produces output like the following:
print(interim_output['CycleNumber'][328:334])
328 1
329 1
330 1
331 2
332 2
333 2
print(cycle_times[328:334])
328 65.643
329 65.673
330 65.994
331 66.008
332 0.0
333 0.251
This is almost what I want. However, as you can see, the first row where CycleNumber is 2 is still being added to the previous total (it is the short time the machine takes to reset its reading). Is there any way of using groupby and telling it to ignore this value, or of forcing the sum to reset when CycleNumber changes? If so, my desired output would be this:
print(cycle_times[328:334])
328 65.643
329 65.673
330 65.994
331 0.0
332 0.0
333 0.251
Any help would be most appreciated!
Comments (1)
I think one .groupby(df['CycleNumber']) is missing to get what you want; see "cycle_times_V1". However, the resulting code is quite hard to read, so I added a version that gives the same output but is much more explicit; see "cycle_times_V2".
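The code this answer refers to is not reproduced on this page. What follows is only a minimal sketch of what the two versions could look like, assuming the missing groupby belongs in front of diff() so that the difference is also taken per cycle. The sample values, the single DataFrame df, and the variable names are made up for illustration and are not the asker's data.

```python
import pandas as pd

# Hypothetical sample data (not the asker's real readings).
df = pd.DataFrame({
    "Total Time (s)": [0.0, 0.5, 1.25, 1.5, 1.5, 2.0],
    "CycleNumber": [1, 1, 1, 2, 2, 2],
})

time = df["Total Time (s)"]
cycle = df["CycleNumber"]

# Approach from the question: diff() runs over the whole column, so the
# first row of cycle 2 still carries the step from the last row of cycle 1.
original = time.diff().fillna(0).groupby(cycle).cumsum()

# Assumed "V1": take the diff per cycle as well, so the first row of every
# cycle becomes NaN (then 0) before the per-cycle cumulative sum.
cycle_times_v1 = time.groupby(cycle).diff().fillna(0).groupby(cycle).cumsum()

# Assumed "V2": the same idea written explicitly, as elapsed time since
# the first reading of each cycle.
cycle_times_v2 = time - time.groupby(cycle).transform("first")

print(pd.DataFrame({
    "CycleNumber": cycle,
    "original": original,
    "v1": cycle_times_v1,
    "v2": cycle_times_v2,
}))
```

In this sketch the first row of cycle 2 is 0.25 with the original approach and 0.0 with both alternatives. On real measurements the two alternatives may differ by floating-point rounding, since one sums many small differences and the other performs a single subtraction.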