Easy way of computing precision, recall and F1 score in R
I am using an rpart classifier in R. I want to test the trained classifier on test data. This part is fine: I can use the predict.rpart function.
But I also want to calculate precision, recall and F1 score.
My question is: do I have to write functions for those myself, or is there a function for that in R or in any of the CRAN libraries?
Comments (8)
Using the caret package:
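A minimal sketch of this approach, assuming the predictions and the true labels are factors with the same levels. The toy vectors and the level name "pos" are illustrative assumptions; posPredValue() and sensitivity() are the caret functions that return precision and recall for the chosen positive level.

```r
library(caret)

# Toy data (assumed for illustration): true classes and predicted classes
expected  <- factor(c("pos", "pos", "pos", "neg", "neg", "neg", "pos", "neg"),
                    levels = c("pos", "neg"))
predicted <- factor(c("pos", "pos", "neg", "neg", "pos", "neg", "pos", "neg"),
                    levels = c("pos", "neg"))

# Precision is the positive predictive value; recall is the sensitivity
precision <- posPredValue(predicted, expected, positive = "pos")
recall    <- sensitivity(predicted, expected, positive = "pos")

# F1 is the harmonic mean of precision and recall
F1 <- (2 * precision * recall) / (precision + recall)
```

With an rpart model, the predicted factor would typically come from something like predict(fit, test_data, type = "class"), where fit and test_data are hypothetical names for the trained model and the held-out data.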
A generic function that works for binary and multi-class classification without using any package is:
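One way such a function could look is sketched below in base R. It is not necessarily the code the answer refers to: the name f1_score and the argument names follow the comments in this answer, while the body is an assumption. It returns the F1 of positive.class for two classes and the macro-averaged F1 (the unweighted mean of the per-class F1 values) otherwise.

```r
# Sketch of a package-free F1 function (the body is an assumption)
f1_score <- function(predicted, expected, positive.class = "1") {
  # Use the levels of `expected` for both vectors so the table is square;
  # predictions outside those levels become NA and are dropped by table()
  lv <- levels(factor(expected))
  expected  <- factor(as.character(expected), levels = lv)
  predicted <- factor(as.character(predicted), levels = lv)

  cm <- table(expected, predicted)   # rows: expected, columns: predicted
  tp <- diag(cm)                     # correctly classified cases per class
  precision <- tp / colSums(cm)
  recall    <- tp / rowSums(cm)

  f1 <- 2 * precision * recall / (precision + recall)
  names(f1) <- lv
  f1[is.na(f1)] <- 0                 # treat an undefined F1 (0/0) as zero

  if (length(lv) == 2) f1[[positive.class]] else mean(f1)
}

# Binary example with toy data
binary_f1 <- f1_score(
  predicted = c("pos", "pos", "neg", "neg", "pos", "neg", "pos", "neg"),
  expected  = c("pos", "pos", "pos", "neg", "neg", "neg", "pos", "neg"),
  positive.class = "pos"
)

# Multi-class example: macro-averaged F1 over classes a, b, c
macro_f1 <- f1_score(
  predicted = c("a", "b", "b", "b", "c", "a"),
  expected  = c("a", "a", "b", "b", "c", "c")
)
```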
Some comments about the function:
- positive.class is used only in binary F1
- If predicted and expected had different levels, predicted will receive the expected levels
The ROCR library calculates all these and more (see also http://rocr.bioinf.mpi-sb.mpg.de):
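A minimal sketch of the ROCR workflow, assuming numeric scores (for example, predicted probabilities of the positive class) and 0/1 labels. The toy vectors are illustrative assumptions; prediction() and performance() with the measures "prec", "rec", "f" and "auc" are part of ROCR.

```r
library(ROCR)

# Toy data (assumed for illustration): scores for the positive class and true labels
scores <- c(0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1)
truth  <- c(1, 1, 0, 1, 0, 1, 0, 0)

pred <- prediction(scores, truth)

# Precision as a function of recall, evaluated at every score cutoff
pr_perf <- performance(pred, measure = "prec", x.measure = "rec")

# F measure at every cutoff; x.values holds the cutoffs, y.values the F values
f_perf   <- performance(pred, measure = "f")
cutoffs  <- f_perf@x.values[[1]]
f_values <- f_perf@y.values[[1]]
best     <- which.max(f_values)    # which.max() skips undefined (NaN) entries
best_cutoff <- cutoffs[best]

# Area under the ROC curve
auc <- as.numeric(performance(pred, measure = "auc")@y.values)
```

Unlike the caret functions, ROCR works on scores rather than hard class predictions, so precision, recall and F are reported per cutoff. With rpart, such scores can be taken from one column of predict(fit, test_data, type = "prob"), where fit and test_data are hypothetical names.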
Just to update this, as I came across this thread now: the confusionMatrix function in caret computes all of these things for you automatically.
You can substitute any of the other metric names for "F1" to extract the relevant values as well.
I think this behaves slightly differently when you're only doing a binary classification problem, but in both cases, all of these values are computed for you when you look inside the confusionMatrix object, under $byClass.
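A short sketch of extracting values from $byClass, assuming two factors with the same levels (the toy data and the name cm are illustrative). For two classes $byClass is a named numeric vector; with more than two classes it is a matrix with one row per class, so a column such as "F1" would be selected with cm$byClass[, "F1"].

```r
library(caret)

# Toy data (assumed for illustration)
expected  <- factor(c("pos", "pos", "pos", "neg", "neg", "neg", "pos", "neg"),
                    levels = c("pos", "neg"))
predicted <- factor(c("pos", "pos", "neg", "neg", "pos", "neg", "pos", "neg"),
                    levels = c("pos", "neg"))

cm <- confusionMatrix(predicted, expected)

# Names of every per-class statistic that can be extracted
metric_names <- names(cm$byClass)

# Pull out individual values by name
f1        <- cm$byClass["F1"]
precision <- cm$byClass["Precision"]
recall    <- cm$byClass["Recall"]
```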
confusionMatrix() from the caret package can be used along with the optional argument positive, which specifies which factor level should be taken as the positive class.
The result also gives additional values such as F1, accuracy, etc.
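A sketch of such a call, assuming 0/1 labels where "1" is the class of interest (the toy data is an illustrative assumption). By default caret treats the first factor level as positive, which would be "0" here, so positive = "1" is passed explicitly; mode = "prec_recall" makes the printed summary report precision, recall and F1.

```r
library(caret)

# Toy data (assumed for illustration): "1" is the class of interest
reference <- factor(c(1, 1, 1, 0, 0, 0, 0, 0), levels = c(0, 1))
predicted <- factor(c(1, 1, 0, 0, 0, 0, 0, 1), levels = c(0, 1))

cm <- confusionMatrix(predicted, reference, positive = "1", mode = "prec_recall")

print(cm)                           # confusion table plus summary statistics
accuracy <- cm$overall["Accuracy"]  # overall statistics
recall   <- cm$byClass["Recall"]    # per-class statistics for the positive class
f1       <- cm$byClass["F1"]
```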
I noticed the comment about F1 score being needed for binary classes. I suspect that it usually is. But a while ago I wrote this in which I was doing classification into several groups denoted by number. This may be of use to you...
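For the several-groups case, a per-class table can be sketched in base R as follows. This is an illustration under assumptions, not the code the answer refers to: the function name per_class_metrics and the toy labels are hypothetical, and the groups are taken to be denoted by integers.

```r
# Per-class precision, recall and F1 for groups denoted by number (sketch)
per_class_metrics <- function(predicted, actual) {
  classes <- sort(unique(c(actual, predicted)))
  cm <- table(factor(actual, levels = classes),
              factor(predicted, levels = classes))  # rows: actual, cols: predicted
  tp <- diag(cm)
  precision <- as.numeric(tp / colSums(cm))
  recall    <- as.numeric(tp / rowSums(cm))
  # A class that is never predicted or never occurs yields NaN here
  f1 <- 2 * precision * recall / (precision + recall)
  data.frame(class = classes, precision = precision, recall = recall, f1 = f1)
}

# Toy data (assumed for illustration): three groups labelled 1, 2, 3
actual    <- c(1, 1, 2, 2, 3, 3)
predicted <- c(1, 2, 2, 2, 3, 1)
metrics   <- per_class_metrics(predicted, actual)
```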
We can simply get F1 value from caret's confusionMatrix function
You can also use the confusionMatrix() provided by the caret package. The output includes, among others, Sensitivity (also known as recall) and Pos Pred Value (also known as precision). Then F1 can be easily computed, as stated above, as:
F1 <- (2 * precision * recall) / (precision + recall)
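As a worked example of the formula, the sketch below starts from raw counts. The counts are made up for illustration: 40 true positives, 10 false positives and 20 false negatives.

```r
# Hypothetical counts for the positive class
TP <- 40   # predicted positive, actually positive
FP <- 10   # predicted positive, actually negative
FN <- 20   # predicted negative, actually positive

precision <- TP / (TP + FP)   # share of positive predictions that are correct
recall    <- TP / (TP + FN)   # share of actual positives that are found

F1 <- (2 * precision * recall) / (precision + recall)

# Equivalent form written directly in terms of the counts
F1_counts <- 2 * TP / (2 * TP + FP + FN)
```

Both expressions give the same value, since F1 is the harmonic mean of precision and recall.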
library(caret)
# Prediction: factor of predicted classes; label: factor of true classes (same levels)
result <- confusionMatrix(Prediction, label)
# This shows all the measures you need, including precision, recall and F1
result$byClass