How do I calculate a conditional mode in R?
I have a large data set with 11 columns and 100000 rows (for example) whose values are 1, 2, 3 and 4, where 4 denotes a missing value. I need to compute the mode. I am using the following data and function:
ac<-matrix(c("4","4","4","4","4","4","4","3","3","4","4"), nrow=1, ncol=11)
m<-as.matrix(apply(ac, 1, Mode))
If I use the above command, it gives me "4" as the mode, which is not what I need. I want the mode to omit 4 and return "3", because 4 is a missing value.
Thanks in advance.
Comments (2)
R has a powerful mechanism for working with missing values. You can represent a missing value with `NA`, and many R functions have support for dealing with `NA` values.
Create a small matrix with random numbers:
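The original code for this step did not survive on this page, so the following is only a sketch of what such a matrix could look like. The seed, the 5 x 4 shape and the name `x` are assumptions chosen for illustration.

```r
# Reproducible 5 x 4 matrix with values drawn from 1..4
# (seed, dimensions and the name `x` are illustrative assumptions)
set.seed(1)
x <- matrix(sample(1:4, 20, replace = TRUE), ncol = 4)
x
```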
Since you represent missingness by the value 4, you can replace each occurrence with `NA`. Then, to calculate the mean, for example:
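A minimal sketch of these two steps, assuming a small random matrix `x` (the seed and shape are made up for illustration) and using base R's `rowMeans` with `na.rm = TRUE` to skip the missing values:

```r
# Illustrative data: 5 x 4 matrix with values 1..4 (assumed, not from the original answer)
set.seed(1)
x <- matrix(sample(1:4, 20, replace = TRUE), ncol = 4)

# Recode the sentinel value 4 as NA
x[x == 4] <- NA

# Row means, ignoring NA; a row that is entirely NA yields NaN
row_means <- rowMeans(x, na.rm = TRUE)
row_means
```

Once the sentinel is recoded, any function with an `na.rm` argument can skip the missing entries in the same way.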
To calculate the mode, you can use the function `Mode` in package `prettyR`. (Note that in this very small set of data, only the 4th row has a unique modal value.)
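In case `prettyR` is not installed, the same idea can be sketched in base R. The helper name `stat_mode` is hypothetical and is not the `prettyR` function; it returns the most frequent non-missing value and, unlike a check for a unique mode, simply takes the first value when there is a tie. The data here is the single row from the question, not the random matrix mentioned above.

```r
# Hypothetical helper: most frequent non-missing value of a vector.
# Ties are broken by taking the first value in table() order.
stat_mode <- function(v) {
  tab <- table(v, useNA = "no")                 # counts of non-NA values only
  if (length(tab) == 0) return(NA_character_)   # every value was missing
  names(tab)[which.max(tab)]                    # table() names are character
}

# Data from the question, with the missing-value code "4" recoded as NA
ac <- matrix(c("4","4","4","4","4","4","4","3","3","4","4"), nrow = 1, ncol = 11)
ac[ac == "4"] <- NA
m <- as.matrix(apply(ac, 1, stat_mode))
m
```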
One way of doing it (though I'm not too sure on its performance):
This is code for looking for the overall mode, but it's easily adapted to look within rows.
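The answer's own code is missing from this page, so what follows is a guess at the general shape of such an approach rather than the original: tabulate everything except the missing-value code and keep the most frequent value(s). The names `overall_mode` and `row_modes` are illustrative.

```r
# Data from the question
ac <- matrix(c("4","4","4","4","4","4","4","3","3","4","4"), nrow = 1, ncol = 11)

# Overall mode: tabulate everything except the missing-value code "4"
tab <- table(ac[ac != "4"])
overall_mode <- names(tab)[tab == max(tab)]   # all tied values are returned
overall_mode

# Row-wise adaptation; a list is used because ties can give several values per row
row_modes <- lapply(seq_len(nrow(ac)), function(i) {
  r <- ac[i, ]
  counts <- table(r[r != "4"])
  if (length(counts) == 0) return(NA_character_)  # row contains only "4"
  names(counts)[counts == max(counts)]
})
row_modes
```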
Or, based on an answer by Thomas Lumley to an old question on the R mailing list, a one-liner:
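The one-liner itself is not preserved here. One possible base-R form, which is not necessarily the one from the mailing list, drops the "4" entries and takes the name of the largest count (the first one if there are ties):

```r
# Data from the question
ac <- matrix(c("4","4","4","4","4","4","4","3","3","4","4"), nrow = 1, ncol = 11)

# Most frequent value after dropping the missing-value code "4"
mode_value <- names(which.max(table(ac[ac != "4"])))
mode_value
```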