How should you test the significance of 2 classification accuracy scores: paired permutation test

Posted on 2025-01-10 06:36:29

I have a single trained classifier tested on 2 related multiclass classification tasks. As the trials of the two classification tasks are related, the 2 sets of predictions constitute paired data. I would like to run a paired permutation test to find out if the difference in classification accuracy between the 2 prediction sets is significant.

So my data consists of 2 lists of predicted classes, where each prediction is related to the prediction in the other test set at the same index.

Example:

actual_classes = [1, 3, 6, 1, 22, 1, 11, 12, 9, 2]
predictions1 = [1, 3, 6, 1, 22, 1, 11, 12, 9, 10] # 90% acc.
predictions2 = [1, 3, 7, 10, 22, 1, 7, 12, 2, 10] # 50% acc.
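
For reference, the stated accuracies can be checked with a few lines of plain Python (the variable names mirror the example above; no external libraries are assumed):

actual_classes = [1, 3, 6, 1, 22, 1, 11, 12, 9, 2]
predictions1 = [1, 3, 6, 1, 22, 1, 11, 12, 9, 10]
predictions2 = [1, 3, 7, 10, 22, 1, 7, 12, 2, 10]

# Accuracy = proportion of indices where the prediction matches the actual class
acc1 = sum(p == a for p, a in zip(predictions1, actual_classes)) / len(actual_classes)  # 0.9
acc2 = sum(p == a for p, a in zip(predictions2, actual_classes)) / len(actual_classes)  # 0.5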

H0: There is no significant difference in classification accuracy.

How do I go about running a paired permutation test to test significance of the difference in classification accuracy?

Comments (1)

无戏配角 2025-01-17 06:36:29

I have been thinking about this and I'm going to post a proposed solution and see if someone approves or explains why I'm wrong.

from random import shuffle

actual_classes = [1, 3, 6, 1, 22, 1, 11, 12, 9, 2]
predictions1 = [1, 3, 6, 1, 22, 1, 11, 12, 9, 10]   # 90% acc.
predictions2 = [1, 3, 7, 10, 22, 1, 7, 12, 2, 10]   # 50% acc.
paired_predictions = [[1, 1], [3, 3], [6, 7], [1, 10], [22, 22], [1, 1], [11, 7], [12, 12], [9, 2], [10, 10]]

def accuracy(predictions, actual):
    # Proportion of trials where the prediction equals the actual class
    return sum(p == a for p, a in zip(predictions, actual)) / len(actual)

# Observed test statistic: difference in accuracy, 0.9 - 0.5 = 0.4
actual_test_statistic = accuracy(predictions1, actual_classes) - accuracy(predictions2, actual_classes)

number_of_iterations = 10000
all_simulations = []  # empty list
for _ in range(number_of_iterations):
    shuffle(paired_predictions)  # only shuffle between pairs, not within
    simulated_predictions1 = [pair[0] for pair in paired_predictions]  # first prediction of each pair
    simulated_predictions2 = [pair[1] for pair in paired_predictions]  # second prediction of each pair
    simulated_accuracy1 = accuracy(simulated_predictions1, actual_classes)
    simulated_accuracy2 = accuracy(simulated_predictions2, actual_classes)
    all_simulations.append(simulated_accuracy1 - simulated_accuracy2)  # store the simulated difference

# Two-sided p-value: proportion of simulated differences at least as extreme as the observed one
p = sum(abs(s) >= abs(actual_test_statistic) for s in all_simulations) / number_of_iterations
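
For comparison, the more common formulation of a paired permutation test keeps every prediction aligned with its true class and instead randomly swaps the two predictions within each pair, since under H0 the two results for a trial are exchangeable. Below is a minimal runnable sketch of that variant; the function name paired_permutation_test and the iterations/seed parameters are purely illustrative and not part of the proposal above.

import random

def paired_permutation_test(actual, preds_a, preds_b, iterations=10000, seed=0):
    # Per-trial correctness indicators; the pairing is preserved by shared index.
    correct_a = [p == a for p, a in zip(preds_a, actual)]
    correct_b = [p == a for p, a in zip(preds_b, actual)]
    observed = (sum(correct_a) - sum(correct_b)) / len(actual)

    rng = random.Random(seed)
    count_extreme = 0
    for _ in range(iterations):
        diff = 0
        for ca, cb in zip(correct_a, correct_b):
            # Under H0 the two results are exchangeable within a pair,
            # so swap them with probability 0.5.
            if rng.random() < 0.5:
                ca, cb = cb, ca
            diff += int(ca) - int(cb)
        simulated = diff / len(actual)
        if abs(simulated) >= abs(observed):
            count_extreme += 1
    return observed, count_extreme / iterations

# Hypothetical usage with the lists defined above:
# observed_diff, p_value = paired_permutation_test(actual_classes, predictions1, predictions2)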

If you have any thoughts, let me know in the comments. Or better still, provide your own corrected version in your own answer. Thank you!
