Using group_by to find the median of a second column after sorting/filtering for a specific value in the first column?
I have a huge dataset which has been difficult to work with.
I want to find the median of a second column but only based on one value in the first column. I have used this formula to find general medians without specifying/sorting by the specific values in the first column:
df %>% group_by(column1) %>% summarise(Median = median(column2))
However, there is a specific value in column1 I am hoping to sort by and I only want the medians of the second column based on this first value. Would I do something similar to the below?
df %>% group_by(column1, specificvalue) %>% summarise(Median = median(column2))
Is there an easier way to do this? Would it be easier to make a new dataframe based on the specific value in the first column? How would that be done so that I could have column 1 only include the specific value I want but the rest of the rows included so I can easily determine the median of column2?
Thanks!!
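One possible approach, sketched under assumptions: rather than adding the value to group_by, filter the rows first with dplyr's filter() and then summarise. The toy data and the placeholder specific_value below are hypothetical and only illustrate the idea; the column names column1 and column2 follow the question.

```r
library(dplyr)

# Hypothetical toy data standing in for the real dataset.
df <- data.frame(
  column1 = c("a", "a", "a", "b", "b"),
  column2 = c(1, 5, 3, 10, 20)
)

# Hypothetical placeholder for the value of interest in column1.
specific_value <- "a"

# Option 1: a new data frame holding only the matching rows
# (all other columns are kept).
subset_df <- df %>%
  filter(column1 == specific_value)

# Option 2: filter and summarise in one pipeline.
result <- df %>%
  filter(column1 == specific_value) %>%
  summarise(Median = median(column2))

# Option 3: base R, without creating an intermediate data frame.
base_median <- median(df$column2[df$column1 == specific_value])
```

If column2 may contain missing values, median(column2, na.rm = TRUE) skips them; without it, median returns NA whenever an NA is present.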