RandomForestClassifier Explainer dashboard output not rendered in Databricks notebook

Posted on 2025-01-11 20:31:58

I am trying to render a dashboard for a RandomForestClassifier model using the explainerdashboard package, but the dashboard is not rendered in the notebook.

Here is the code:

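from sklearn.ensemble import RandomForestClassifier
from explainerdashboard import ClassifierExplainer, ExplainerDashboard

model = RandomForestClassifier(n_estimators=50, max_depth=10).fit(X_train, y_train)
explainer = ClassifierExplainer(model, X_test, y_test)
ExplainerDashboard(explainer).run()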

I got the following output:

=========================================================================

Detected RandomForestClassifier model: Changing class type to RandomForestClassifierExplainer...

Note: model_output=='probability', so assuming that raw shap output of RandomForestClassifier is in probability space...

Generating self.shap_explainer = shap.TreeExplainer(model)
Building ExplainerDashboard..

Detected notebook environment, consider setting mode='external', mode='inline' or mode='jupyterlab' to keep the notebook interactive while the dashboard is running...

Warning: calculating shap interaction values can be slow! Pass shap_interaction=False to remove interactions tab.

Generating layout...

Calculating shap values...

Calculating prediction probabilities...

Calculating metrics...

Calculating confusion matrices...

Calculating classification_dfs...

Calculating roc auc curves...

Calculating pr auc curves...

Calculating liftcurve_dfs...

Calculating shap interaction values... (this may take a while)

Reminder: TreeShap computational complexity is O(TLD^2), where T is the number of trees, L is the maximum number of leaves in any tree and D the maximal depth of any tree. So reducing these will speed up the calculation.

Calculating dependencies...

Calculating permutation importances (if slow, try setting n_jobs parameter)...

Calculating pred_percentiles...

Calculating predictions...

Calculating ShadowDecTree for each individual decision tree...

Reminder: you can store the explainer (including calculated dependencies) with explainer.dump('explainer.joblib') and reload with e.g. ClassifierExplainer.from_file('explainer.joblib')

Registering callbacks...

Starting ExplainerDashboard on http://19.221.249.249:8055

Dash is running on http://0.0.0.0:8055/


 * Serving Flask app 'explainerdashboard.dashboards' (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on all addresses.
   WARNING: This is a development server. Do not use it in a production deployment.
 * Running on http://19.221.249.249:8055/

=========================================================================

But the dashboard is not rendered in the notebook. I also tried InlineExplainer; it only returned <IPython.lib.display.IFrame at 0x7f4eea3e1c70>.

Can you suggest a way to render the dashboard in a Databricks notebook?


Comments (2)

别再吹冷风 2025-01-18 20:31:58

To render the dashboard components in your notebook, use InlineExplainer.
With it you can plot, for instance, model performance or SHAP values, as explained in the documentation.

You can use the following code as a reference:

from sklearn.ensemble import RandomForestClassifier
from explainerdashboard.datasets import titanic_survive
from explainerdashboard import ClassifierExplainer, ExplainerDashboard
from explainerdashboard import InlineExplainer

X_train, y_train, X_test, y_test = titanic_survive()

model = RandomForestClassifier(n_estimators=50, max_depth=10).fit(X_train, y_train)
explainer = ClassifierExplainer(model, X_test, y_test) 

InlineExplainer(explainer).shap.overview()


剪不断理还乱 2025-01-18 20:31:58

Follow these steps to get the ExplainerDashboard view:

Step 1) Set the environment variable on the Databricks cluster

For example:

DASH_REQUEST_PATHNAME_PREFIX=/driver-proxy/o/40xxxxxxxx/1004xxxxx/8888

Workspace ID: 40xxxxxxx
Cluster ID: 1004xxxxxxxx
Port number: 8888
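
The prefix in Step 1 is just the three values above joined onto /driver-proxy/o/, so it can be assembled rather than typed by hand. Below is a minimal sketch under that assumption; the function name build_proxy_prefix and the sample IDs are hypothetical and are not part of explainerdashboard, Dash, or Databricks. One caveat worth checking yourself: Dash's own constructor argument is spelled requests_pathname_prefix, so if the variable above appears to have no effect, it may be worth trying DASH_REQUESTS_PATHNAME_PREFIX with your Dash version.

```python
def build_proxy_prefix(workspace_id: str, cluster_id: str, port: int) -> str:
    """Format the driver-proxy path prefix shown in Step 1.

    Hypothetical helper for illustration; it only joins the three parts.
    """
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    # Strip stray slashes so the parts join cleanly.
    parts = [workspace_id.strip("/"), cluster_id.strip("/"), str(port)]
    if not all(parts):
        raise ValueError("workspace_id and cluster_id must be non-empty")
    return "/driver-proxy/o/" + "/".join(parts)


# Placeholder IDs for illustration only; substitute your own values.
prefix = build_proxy_prefix("1234567890", "0101-000000-abcdefgh", 8888)
print(prefix)
```

The resulting string is the value to paste into the cluster's environment variable settings; the port used here has to be the same one later passed to run().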

Step 2) Install the explainerdashboard library

%pip install explainerdashboard

Step 3) Sample code to verify that the Dash app works

from sklearn.ensemble import RandomForestClassifier
from explainerdashboard import ClassifierExplainer, ExplainerDashboard
from explainerdashboard.datasets import titanic_survive, feature_descriptions

X_train, y_train, X_test, y_test = titanic_survive()

model = RandomForestClassifier(n_estimators=50, max_depth=10).fit(X_train, y_train)

explainer = ClassifierExplainer(model, X_test, y_test,
                                cats=['Deck', 'Embarked'],
                                descriptions=feature_descriptions,
                                labels=['Not survived', 'Survived'])

# Enable only the contributions tab; all other tabs are switched off.
ExplainerDashboard(explainer, mode='dash',
                   importances=False,
                   model_summary=False,
                   contributions=True,
                   whatif=False,
                   shap_dependence=False,
                   shap_interaction=False,
                   decision_trees=False).run(8888)  # same port as in the Step 1 prefix

Step 4) Dashboard URL

https://xxxxxxxx.databricks.com/driver-proxy/o/<workspaceID>/<clusterID>/8888
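
The URL in Step 4 is the workspace host followed by the prefix from Step 1, and its last path segment has to be the port passed to run() in Step 3. Here is a small sketch for building the URL and checking that the port matches; the helper names dashboard_url and port_from_url, the host example.databricks.com, and the IDs are all hypothetical placeholders.

```python
from urllib.parse import urlsplit


def dashboard_url(host: str, prefix: str) -> str:
    """Join a workspace host name and a driver-proxy prefix into an HTTPS URL."""
    return f"https://{host.strip('/')}/{prefix.strip('/')}"


def port_from_url(url: str) -> int:
    """Read the port back from the last path segment of a driver-proxy URL."""
    last_segment = urlsplit(url).path.rstrip("/").rsplit("/", 1)[-1]
    return int(last_segment)


RUN_PORT = 8888  # the value passed to ExplainerDashboard(...).run()

url = dashboard_url(
    "example.databricks.com",  # placeholder host
    "/driver-proxy/o/1234567890/0101-000000-abcdefgh/8888",  # placeholder prefix
)
print(url)

# A mismatch here means the URL points at a port the dashboard is not using.
if port_from_url(url) != RUN_PORT:
    raise ValueError("URL port and run() port differ")
```

Checking the port this way catches the easy mistake of changing the port in one of the three places (environment variable, run() call, URL) but not the others.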
