What does the y-axis of a partial dependence plot (PDP) mean in binary classification?

Posted 2025-02-01 13:27:03

I am not sure what the y-axis of my PDP implies. Is it the probability of my target being 1 (binary classification), or something else?



Comments (2)

一梦等七年七年为一梦 2025-02-08 13:27:03


If you do a partial dependence plot of column a and want to interpret the y value at x = 0.0, the y-axis value represents the average probability of class 1, computed by

  • changing the value of column a to 0.0 in every row of your dataset,
  • predicting all the changed rows with your fitted model,
  • averaging the probabilities given by the model.

I may not be good at explaining, but you can read more about PDPs at https://christophm.github.io/interpretable-ml-book/pdp.html. Hope this helps :)
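The three steps above can be sketched directly in scikit-learn. This is a minimal illustration on an invented toy dataset (the model, dataset, and the choice of feature index 0 as "column a" are all assumptions for the example):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Hypothetical setup: a toy binary-classification dataset where
# "column a" is feature index 0.
X, y = make_classification(n_samples=200, n_features=4, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

def pdp_value(model, X, feature, grid_value):
    """One y-axis point of a PDP: force `feature` to `grid_value` in
    every row, predict, and average the class-1 probabilities."""
    X_mod = X.copy()
    X_mod[:, feature] = grid_value            # step 1: overwrite the column
    probs = model.predict_proba(X_mod)[:, 1]  # step 2: predict changed rows
    return probs.mean()                       # step 3: average

# The y-axis value of the PDP for feature 0 at x = 0.0
print(pdp_value(model, X, feature=0, grid_value=0.0))
```

Repeating `pdp_value` over a grid of values for the feature reproduces the whole curve; scikit-learn's `sklearn.inspection.partial_dependence` does the same computation for you.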

压抑⊿情绪 2025-02-08 13:27:03


Generally speaking, we can produce a classifier from a function, f, producing a real-valued output, plus a threshold. We call the output an 'activation'. If the activation meets the threshold condition, we say the class is detected:

is_class := ( f(x0, x1, ...) > threshold )

and

activation = f(x0, x1, ...)

PDP plots simply show activation values as they change in response to changes in an input value (we ignore the threshold). That is, a PDP might plot
f(x0, x, x2, x3, ...)
as a single input x varies. Typically, we hold the others constant, although we can also plot in 2D and 3D.

Sometimes we're interested in:

  • how a single input changes the activation
  • how multiple inputs independently change the activation
  • how multiple activations change based on different inputs, and so on.

Strictly speaking, we need not even be talking about a classifier when looking at PDP plots. Any function that produces a real-valued output (an activation) in response to one or more real-valued feature inputs that we can vary allows us to produce PDP plots.

Classifier activations need not be, and often should not be, interpreted as probabilities, as others have written; in very many cases that interpretation is simply incorrect. Nevertheless, the analysis of activation levels is of interest to us, independently of whether the activations represent probabilities: in PDP plots we can see, for example, which feature values produce strong changes -- a more horizontal plot may indicate a worthless feature.
Similarly, in ROC plots, we explicitly examine the true-positive and false-positive detection rates that result from varying the threshold on the activation values.
In both cases, there is no necessity that the classifier produce probabilities as its activation.
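As an illustration of that last point, scikit-learn's `roc_curve` accepts arbitrary real-valued scores, not just probabilities (the labels and scores below are invented for the example):

```python
import numpy as np
from sklearn.metrics import roc_curve

# Made-up labels and raw activation scores; note the scores are not
# probabilities (they are not confined to [0, 1]).
y_true = np.array([0, 0, 1, 1, 0, 1])
scores = np.array([-3.2, -0.5, 1.7, 4.0, 0.9, -0.1])

# roc_curve happily accepts arbitrary real-valued scores: the ROC is
# built by sweeping the threshold over the activation values.
fpr, tpr, thresholds = roc_curve(y_true, scores)
print(fpr, tpr)
```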

Interpretation of PDP plots is fraught with dangers. At a minimum, you need to be clear about what is being held constant as an input feature is varied. Were the other features set to zero (a good choice for linear models)? Did we set them to their most common values in the test set? Or to the most common values for a known class in a sample? Without this information, the vertical axis may be less helpful.

Knowing that an activation is a probability also doesn't seem to be helpful in PDP plots -- you can't expect the area under the curve to sum to one. Perhaps the most useful thing you might find is error cases, where output probabilities fall outside the range 0..1.
