I have a problem with the predict() function and $votes in R when I try to calculate the AUC and draw a ROC curve. My model is a randomForest, and the code looks like this:
rf_optimal <- randomForest(Revenue ~ ., data = train, ntree = 250, mtry = 4, do.trace = TRUE)
The ROC curve on the training set, obtained with the code below, looks like this:
roc(train$Revenue, rf_optimal$votes[, 2],
    plot = TRUE, legacy.axes = TRUE, percent = TRUE,
    xlab = "False Positive Percentage", ylab = "True positive percentage",
    col = "blue", lwd = 3, print.auc = TRUE, print.auc.y = 45)
[Figure: ROC curve using $votes]
When I use the predict() function instead, the AUC equals 100%:
roc(train$Revenue, predict(rf_optimal, newdata = train, type = "prob")[, 2],
    plot = TRUE, legacy.axes = TRUE, percent = TRUE,
    xlab = "False Positive Percentage", ylab = "True positive percentage",
    col = "blue", lwd = 3, print.auc = TRUE, print.auc.y = 45)
[Figure: ROC curve using predict()]
What is the difference between these two approaches? Is it possible to obtain such an AUC value at all, or does it mean that the model is overfitted? The AUC for the test set is about 90%.
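For context, the randomForest documentation describes $votes as out-of-bag (OOB) votes: each row is scored only by trees that did not see it during training. By contrast, predict() with newdata = train lets every tree vote, including the trees that were fitted on that row, and predict() without newdata returns the OOB predictions. Below is a minimal sketch of that comparison. It assumes a synthetic data frame (toy, with columns x1, x2, x3 and a noisy Revenue target) as a hypothetical stand-in for train, so the numbers it produces say nothing about the real data.

```r
library(randomForest)
library(pROC)

set.seed(42)

# Hypothetical stand-in for `train`: a noisy binary target, so no model
# should be able to separate the classes perfectly on unseen rows.
n <- 500
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
toy$Revenue <- factor(ifelse(toy$x1 + rnorm(n) > 0, "TRUE", "FALSE"))

rf <- randomForest(Revenue ~ ., data = toy, ntree = 250)

# OOB vote fractions: each row is scored only by trees that did not train on it.
oob_prob <- rf$votes[, 2]

# In-sample probabilities: all trees vote, including those trained on the row.
insample_prob <- predict(rf, newdata = toy, type = "prob")[, 2]

# predict() without newdata returns the OOB predictions again.
oob_prob_again <- predict(rf, type = "prob")[, 2]

auc_oob <- as.numeric(auc(roc(toy$Revenue, oob_prob, quiet = TRUE)))
auc_insample <- as.numeric(auc(roc(toy$Revenue, insample_prob, quiet = TRUE)))

cat("OOB AUC:      ", auc_oob, "\n")
cat("In-sample AUC:", auc_insample, "\n")
```

If this carries over to the real data, the $votes curve would be the more honest estimate of the two, and it is the one to compare against the test-set AUC of about 90%. A training AUC near 100% from predict(newdata = train) is expected for fully grown trees and does not by itself indicate a problem.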