Export SHAP waterfall plot to dataframe
I am working on a binary classification using a random forest model and neural networks, in which I am using SHAP to explain the model predictions. I followed a tutorial and wrote the code below to get the waterfall plot shown further down.
row_to_show = 20
data_for_prediction = ord_test_t.iloc[row_to_show]  # use 1 row of data here; could use multiple rows if desired
data_for_prediction_array = data_for_prediction.values.reshape(1, -1)
rf_boruta.predict_proba(data_for_prediction_array)

explainer = shap.TreeExplainer(rf_boruta)
# Calculate SHAP values for the selected row
shap_values = explainer.shap_values(data_for_prediction)

shap.plots._waterfall.waterfall_legacy(explainer.expected_value[0], shap_values[0], ord_test_t.iloc[row_to_show])
This generated the plot shown below:

[waterfall plot: https://i.sstatic.net/Ftxu7.png]

However, I want to export this to a dataframe. How can I do it? I expect my output to look as shown below, and I want to export this for the full dataframe. Can you help me please?
3 Answers
Let's do a small experiment:
What is explainer here? If you do dir(explainer), you'll find out it has some methods and attributes, among which is expected_value, which is of interest to you because this is the base on which SHAP values add up.
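A minimal sketch of that inspection, reusing the explainer from the question (the exact listing depends on your shap version):

print([name for name in dir(explainer) if not name.startswith('_')])
# the list includes 'expected_value' and 'shap_values', among others
print(explainer.expected_value)
# for a binary classifier this is typically an array with one base value per class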
Furthermore, computing the SHAP values will give a hint: sv is a list consisting of 2 objects, which are most probably the SHAP values for class 1 and class 0. They must be symmetric, because what moves towards 1 moves by exactly the same amount, but with opposite sign, towards 0. Hence:
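A sketch of that experiment, reusing the question's objects; the name sv follows the answer, and note that recent shap versions may return a single 3-D array instead of a list:

import numpy as np

sv = explainer.shap_values(ord_test_t)   # SHAP values for the whole test frame
print(type(sv), len(sv))                 # a list of 2 arrays, one per class
print(np.allclose(sv[0], -sv[1]))        # True: the two classes' values mirror each other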
Now you have everything to pack it into the desired format:
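One way to do that packing (a sketch rather than the answer's original code; the column naming is illustrative):

import pandas as pd

# one row per observation, one column per feature, values = contribution towards class 1
df_shap = pd.DataFrame(sv[1], columns=ord_test_t.columns, index=ord_test_t.index)
df_shap['base_value'] = explainer.expected_value[1]
print(df_shap.head())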
Q: How do I know?
A: Read docs and source code.
If I recall correctly, you can do something like this with pandas to get the feature names (if data_for_prediction is a dataframe):
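The original snippet was not preserved in this scrape; a minimal sketch of the idea, assuming the objects from the question (data_for_prediction is the single-row Series, so its index holds the feature names; use .columns if you pass a DataFrame instead):

import pandas as pd

feature_names = data_for_prediction.index

df_row = pd.DataFrame({
    'feature': feature_names,                     # feature names from the pandas object
    'feature_value': data_for_prediction.values,  # raw values for the explained row
    'shap_value': shap_values[0],                 # the values drawn in the waterfall above
})
print(df_row)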
I'm currently using the approach below. It first displays the SHAP values for the model, and for each prediction after that; finally it returns the dataframe for the positive class (I'm working in an imbalanced context). It is for a TreeExplainer and not a waterfall plot, but it is basically the same.
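The snippet itself was lost in the scrape; below is a sketch of what such a helper might look like, with illustrative names (shap_to_dataframe, X) and the plots described above:

import pandas as pd
import shap

def shap_to_dataframe(model, X):
    # Explain the fitted tree model
    explainer = shap.TreeExplainer(model)
    shap_values = explainer.shap_values(X)

    # Display the SHAP values for the model as a whole ...
    shap.summary_plot(shap_values, X)

    # ... and for each prediction after that
    for i in range(len(X)):
        shap.force_plot(explainer.expected_value[1], shap_values[1][i],
                        X.iloc[i], matplotlib=True)

    # Finally, return the values for the positive class as a dataframe
    return pd.DataFrame(shap_values[1], columns=X.columns, index=X.index)

shap_df = shap_to_dataframe(rf_boruta, ord_test_t)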