Error computing the confusion matrix for a logistic model
I created a null (intercept-only) logistic model in RStudio.
nullModel <- glm(bigFire ~ 1, data = train, family = binomial)
Then I ask the model to make predictions on the test set.
nullModel.pred <- predict(nullModel, test, type = "response")
At this point I want to compute the confusion matrix to evaluate the model's performance.
CM <- table(test$bigFire, nullModel.pred>0.5)
The resulting output is the following:
    TRUE
  0   58
  1   46
Even if I change the cutoff value (currently set to 0.5), the result is always the same. I don't understand why, since the model should perform differently with different cutoff values.
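For what it's worth, the behaviour can be reproduced without the real data. The sketch below is only an illustration under stated assumptions: `actual`, `pred` and `confusion_at` are made-up names, the class counts are copied from the output above, and the constant probability 0.542 is the value reported further down. When every prediction is the same number, all rows fall on the same side of any cutoff, so `table()` shows a single column; only which column appears changes once the cutoff crosses that number.

```r
# Hypothetical stand-ins for the post's objects: 104 test labels and a
# constant predicted probability, as an intercept-only model would give.
actual <- c(rep(0, 58), rep(1, 46))
pred   <- rep(0.542, length(actual))

# Wrapping the comparison in factor() with both levels keeps the FALSE
# and TRUE columns in the table even when one of them is empty.
confusion_at <- function(cutoff) {
  predicted <- factor(pred > cutoff, levels = c(FALSE, TRUE))
  table(actual = actual, predicted = predicted)
}

cm_low  <- confusion_at(0.5)  # 0.542 > 0.5: every row predicted TRUE
cm_high <- confusion_at(0.6)  # 0.542 < 0.6: every row predicted FALSE
print(cm_low)
print(cm_high)
```

Fixing the factor levels is a design choice that keeps the table 2 x 2, which makes it easier to compare cutoffs or to compute accuracy from the same cells each time.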
The first rows of the dataset are:
  month day FFMC  DMC    DC  ISI temp RH wind rain zone bigFire
1   mar fri 86.2 26.2  94.3  5.1  8.2 51  6.7  0.0   75       0
2   oct tue 90.6 35.4 669.1  6.7 18.0 33  0.9  0.0   74       0
3   oct sat 90.6 43.7 686.9  6.7 14.6 33  1.3  0.0   74       0
4   mar fri 91.7 33.3  77.5  9.0  8.3 97  4.0  0.2   86       0
5   mar sun 89.3 51.3 102.2  9.6 11.4 99  1.8  0.0   86       0
6   aug sun 92.3 85.3 488.0 14.7 22.2 29  5.4  0.0   86       0
It has 517 rows.
The train and test sets are generated from this data frame with an 80%/20% split (104 rows in the test set).
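The post does not show how the split was made. A minimal sketch of one common approach, assuming a simple random split, is given here. The data frame `fires`, the index vector `train_idx`, the seed and the simulated column values are all hypothetical; only the row count of 517 comes from the post.

```r
# Hypothetical stand-in for the full data: only the row count (517) comes
# from the post; the column values are simulated.
set.seed(1)  # arbitrary seed so the split is reproducible
fires <- data.frame(temp = rnorm(517), bigFire = rbinom(517, 1, 0.5))

# Draw 80% of the row indices for training (floor(0.8 * 517) = 413);
# the remaining 104 rows form the test set.
train_idx <- sample(nrow(fires), size = floor(0.8 * nrow(fires)))
train <- fires[train_idx, ]
test  <- fires[-train_idx, ]

c(train = nrow(train), test = nrow(test))
```

With 517 rows this leaves 104 rows for testing, which agrees with the test-set size quoted above.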
The length of the prediction vector is:
> length(nullModel.pred)
[1] 104
Every element of it has the same value, 0.542.
This is reasonable, since an intercept-only model can only estimate a single overall probability that the response is 1.
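To check that reading, here is a small sketch on simulated labels (an assumption: the real `train` data is not available, so the 413 labels and the 0.54 rate below are made up). It fits the same kind of intercept-only model and compares the back-transformed intercept with the proportion of 1s in the training response.

```r
# Simulated training labels; the 0.542 in the post comes from the real data.
set.seed(1)
train <- data.frame(bigFire = rbinom(413, 1, 0.54))

nullModel <- glm(bigFire ~ 1, data = train, family = binomial)

# The intercept is the log-odds of bigFire == 1; plogis() maps it back
# to a probability, which should match the observed proportion of 1s.
p_hat <- unname(plogis(coef(nullModel)))
all.equal(p_hat, mean(train$bigFire), tolerance = 1e-6)
```

If the same holds for the real data, 0.542 would simply be the share of rows with `bigFire == 1` in the training set.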