How to export SHAP local explanations to a dataframe?
I am working on a binary classification problem using a random forest and trying out SHAP to explain the model's predictions.
However, I would like to convert the SHAP local explanation plots, together with their values, into a pandas dataframe for each instance.
Can anyone here help me export SHAP local explanations to a pandas dataframe for each instance?
I know that SHAPASH has a .to_pandas() method, but I couldn't find anything like that in SHAP.
I tried something like the code below, based on an SO post, but it doesn't help:
import numpy as np
import pandas as pd
feature_names = shap_values.feature_names
shap_df = pd.DataFrame(shap_values.values, columns=feature_names)
# mean absolute SHAP value per feature, averaged over all instances
vals = np.abs(shap_df.values).mean(0)
shap_importance = pd.DataFrame(list(zip(feature_names, vals)), columns=['col_name', 'feature_importance_vals'])
shap_importance.sort_values(by=['feature_importance_vals'], ascending=False, inplace=True)
I expect my output to look something like the table below. Here, a negative value indicates a feature's contribution toward class 0, and a positive value indicates a contribution toward class 1.
subject_id  Feature  importance value (contribution)
1           F1        31
1           F2        27
1           F3        20
1           F5       -10
1           F9       -29
Comments (1)
If you have a model like this:
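The answer's original snippet is not preserved here, so the following is only a minimal sketch of what such a model might look like. The dataset (`load_breast_cancer`) and the hyperparameters are assumptions chosen for illustration, not taken from the question.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

# Assumed stand-in data: a built-in binary classification dataset.
X, y = load_breast_cancer(return_X_y=True, as_frame=True)

# Hyperparameters are arbitrary illustration values.
model = RandomForestClassifier(n_estimators=100, max_depth=5, random_state=0)
model.fit(X, y)

# One probability column per class (class 0, class 1).
proba = model.predict_proba(X)
print(proba.shape)
```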
You can decompose your results like this:
Which is exactly equal to:
If you want to put the results into a pandas DataFrame:
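As a rough sketch of this step, a 2-D array of per-instance SHAP values can be passed straight to `pd.DataFrame`. The array below is a hand-written stand-in with placeholder numbers, and the feature names and `subject_id` values are hypothetical. With a real model, the class-1 SHAP array would take its place.

```python
import numpy as np
import pandas as pd

# Stand-in for class-1 SHAP values, shape (n_samples, n_features).
# The numbers are illustrative placeholders, not output from a real model.
sv_class1 = np.array([
    [0.31, 0.27, -0.10],
    [-0.05, 0.12, 0.02],
])
feature_names = ["F1", "F2", "F3"]  # hypothetical feature names
subject_ids = [1, 2]                # hypothetical row identifiers

# One row per instance, one column per feature.
shap_df = pd.DataFrame(
    sv_class1,
    columns=feature_names,
    index=pd.Index(subject_ids, name="subject_id"),
)
print(shap_df)
```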
Alternatively, if you want everything arranged row-wise:
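One possible way to get the row-wise layout from the question (one row per subject and feature) is `DataFrame.melt`. The wide table below uses placeholder values and hypothetical names, and sorting by contribution within each subject is an assumption about the desired order.

```python
import pandas as pd

# Wide table of per-instance contributions; the values are illustrative placeholders.
shap_df = pd.DataFrame(
    {"F1": [0.31, -0.05], "F2": [0.27, 0.12], "F3": [-0.10, 0.02]},
    index=pd.Index([1, 2], name="subject_id"),
)

# Reshape to one row per (subject_id, feature) pair,
# ordered from the largest to the smallest contribution within each subject.
shap_long = (
    shap_df.reset_index()
    .melt(id_vars="subject_id", var_name="feature", value_name="contribution")
    .sort_values(["subject_id", "contribution"], ascending=[True, False])
    .reset_index(drop=True)
)
print(shap_long)
```

Keeping the sign of each contribution preserves the positive-toward-class-1, negative-toward-class-0 reading described in the question.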