When using XGBRegressor, it's possible to use the base_score setting to set the initial prediction value for all data points. Typically that value would be set to the mean of the observed values in the training set.
Is it possible to achieve a similar thing using XGBClassifier, by specifying a value for every target class, when the objective parameter is set to multi:softprob?
E.g. counting the occurrences of each target class in the training set and normalizing by the total would give us:
class pct_total
--------------------
blue 0.57
red 0.22
green 0.16
black 0.05
So that when beginning its first iteration, XGBClassifier would start with these per-class values for every data point, instead of simply starting with 1 / num_classes for all classes.
Is it possible to achieve this?
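For concreteness, the percentages in the table above could be computed like this (a minimal sketch; the label list is made up to match the example counts):

```python
from collections import Counter

# Hypothetical training labels whose counts match the example table.
labels = ["blue"] * 57 + ["red"] * 22 + ["green"] * 16 + ["black"] * 5

counts = Counter(labels)
total = len(labels)

# Normalize each class count by the total number of observations.
pct_total = {cls: count / total for cls, count in counts.items()}
# pct_total == {"blue": 0.57, "red": 0.22, "green": 0.16, "black": 0.05}
```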
You can accomplish this using the parameter base_margin. Read about it in the docs; the referenced demo uses the native API and DMatrix, but as the docs say, you can also set base_margin in the XGBClassifier.fit method (with a new enough xgboost).

The shape of base_margin is expected to be (n_samples, n_classes); since xgboost fits multiclass models in a one-vs-rest fashion, you're providing, for each sample, its base score in each of the separate per-class GBMs. Note also that these values are in the log-odds space, so transform accordingly. Also don't forget to add base_margin to every prediction call (that would be nicer as a builtin saved to the class... see again the linked question earlier in this paragraph).