Difference between AUC from $votes and predict() in R

Posted on 2025-01-30 17:53:30


I have a problem with the predict() function and $votes in R when I try to calculate the AUC and draw a ROC curve. My model is a randomForest, and the code looks like this:

rf_optimal = randomForest(Revenue~., data=train, ntree=250, mtry=4, do.trace=T)

The ROC curve on the training set, which I get using the code below, looks like this:

roc(train$Revenue, rf_optimal$votes[,2], 
    plot = TRUE, legacy.axes = TRUE, percent = TRUE,
    xlab = "False Positive Percentage", ylab = "True positive percentage",
    col = "blue", lwd = 3, print.auc = TRUE, print.auc.y = 45)

[Figure: ROC curve using $votes]

When I use the predict() function instead, the AUC equals 100%:

roc(train$Revenue, predict(rf_optimal,newdata = train, type = "prob")[,2], 
    plot = TRUE, legacy.axes = TRUE, percent = TRUE,
    xlab = "False Positive Percentage", ylab = "True positive percentage",
    col = "blue", lwd = 3, print.auc = TRUE, print.auc.y = 45)

[Figure: ROC curve using predict()]

What is the difference between these two approaches? Is an AUC of 100% plausible at all, or does it mean that the model is overfitted? The AUC on the test set is about 90%.
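
For anyone who wants to reproduce the comparison, here is a minimal sketch on made-up data. It assumes the randomForest and pROC packages are installed; the data frame `toy` and its columns are hypothetical stand-ins for `train`, not the real data. The randomForest documentation describes `$votes` as out-of-bag (OOB) votes, so each row is scored only by trees that did not see it during training, whereas predict() with newdata = train lets every tree vote on rows it may have been fitted on.

```r
library(randomForest)
library(pROC)

set.seed(42)

# Hypothetical stand-in for `train`: two predictors and a noisy binary outcome
n <- 500
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
toy$Revenue <- factor(ifelse(toy$x1 + rnorm(n) > 0, "TRUE", "FALSE"))

rf <- randomForest(Revenue ~ ., data = toy, ntree = 250)

# Out-of-bag vote fractions: each row is scored only by trees
# whose bootstrap sample did not contain that row
oob_prob <- rf$votes[, 2]

# In-sample probabilities: all trees vote, including those
# that were grown on the row being scored
insample_prob <- predict(rf, newdata = toy, type = "prob")[, 2]

# AUC on a 0-1 scale for both sets of scores
auc_oob <- as.numeric(auc(roc(toy$Revenue, oob_prob, quiet = TRUE)))
auc_insample <- as.numeric(auc(roc(toy$Revenue, insample_prob, quiet = TRUE)))

cat("OOB AUC:      ", round(auc_oob, 3), "\n")
cat("In-sample AUC:", round(auc_insample, 3), "\n")
```

On data like this the in-sample AUC should come out well above the OOB AUC, so a 100% AUC from predict(rf_optimal, newdata = train) says little about generalization on its own. The OOB figure from `$votes` and the test-set AUC are the ones to compare.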
