熊猫在两个分位数之间的组中挑选值
我想通过选择每个组之间的两个值(dinamsin定义为分位数)之间的行来过滤我的数据集 。具体而言,我有一个数据集,例如
import pandas as pd
df = pd.DataFrame({'day': ['one', 'one', 'one', 'one', 'one', 'one', 'two', 'two', 'two', 'two', 'two'],
'weather': ['rain', 'rain', 'rain', 'sun', 'sun', 'sun', 'sun', 'rain', 'rain', 'sun', 'rain'],
'value': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]})
我想选择每天的值在0.1和0.9分位数之间的行,每个we仪。我可以通过计算分位数,
df.groupby(['day', 'weather']).quantile([0.1, .9])
但是我感到卡住了。将结果数据集加入原始数据集(原始数据集可能很大),我想知道是否有一些东西
df.groupby(['day', 'weather']).select('value', between=[0.1, 0.9])
I'd like to filter my dataset by picking rows that are between two values (dinamically defined as quantiles) per each group. Concretely, I have a dataset like
import pandas as pd
df = pd.DataFrame({'day': ['one', 'one', 'one', 'one', 'one', 'one', 'two', 'two', 'two', 'two', 'two'],
'weather': ['rain', 'rain', 'rain', 'sun', 'sun', 'sun', 'sun', 'rain', 'rain', 'sun', 'rain'],
'value': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]})
I'd like to select the rows where the values are between the 0.1 and 0.9 quantile per each day and per each weater. I can calculate the quantiles via
df.groupby(['day', 'weather']).quantile([0.1, .9])
But then I feel stuck. Joining the resulting dataset with the original one it's a waste (the original dataset can be quite big), and I am wondering if there is something along the lines of
df.groupby(['day', 'weather']).select('value', between=[0.1, 0.9])
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
变换
value
用Quantile
Transform
value
withquantile