Using LIME to explain a deep neural network for fraud detection
I have built a deep neural network that classifies fraudulent transactions. I am trying to use LIME to explain its predictions, but I am getting an error from the interpretor.explain_instance() function.
The complete code is as follows:
import lime
from lime import lime_tabular

interpretor = lime_tabular.LimeTabularExplainer(
    training_data=x_train_scaled,
    feature_names=X_train.columns,
    mode='classification'
)
exp = interpretor.explain_instance(
    data_row=x_test_scaled[:1],  # new data
    predict_fn=model.predict,
    num_features=11
)
exp.show_in_notebook(show_table=True)
This throws the error:
--
IndexError Traceback (most recent call last)
/tmp/ipykernel_33/1730959582.py in <module>
1 exp = interpretor.explain_instance(
2 data_row=x_test_scaled[1], ##new data
----> 3 predict_fn=model.predict
4 )
5
/opt/conda/lib/python3.7/site-packages/lime/lime_tabular.py in explain_instance(self, data_row, predict_fn, labels, top_labels, num_features, num_samples, distance_metric, model_regressor)
457 num_features,
458 model_regressor=model_regressor,
--> 459 feature_selection=self.feature_selection)
460
461 if self.mode == "regression":
/opt/conda/lib/python3.7/site-packages/lime/lime_base.py in explain_instance_with_data(self, neighborhood_data, neighborhood_labels, distances, label, num_features, feature_selection, model_regressor)
180
181 weights = self.kernel_fn(distances)
--> 182 labels_column = neighborhood_labels[:, label]
183 used_features = self.feature_selection(neighborhood_data,
184 labels_column,
IndexError: index 1 is out of bounds for axis 1 with size 1
Comments (2)
Adding labels=(0,) to the call exp = explainer.explain_instance() might resolve your issue. I had a similar issue with breast cancer data when trying to predict benign or malignant tumors. The column that contains the recorded sample is titled benign_0_malignant_1, with either 0 or 1 placed in each row.
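A possible reason this helps: explain_instance() explains label 1 by default, and the traceback in the question shows LIME indexing that column of the prediction array. The sketch below uses a made-up one-column array as a stand-in for what model.predict appears to return in the question (an assumption based on the "axis 1 with size 1" message). It shows that column 0 exists while column 1 does not:

```python
import numpy as np

# Made-up stand-in for the model output LIME receives: one column per row.
neighborhood_labels = np.full((4, 1), 0.3)

# Column 0 exists, which is what labels=(0,) asks LIME to explain.
col0 = neighborhood_labels[:, 0]
print(col0.shape)

# Column 1 does not exist, which is what the default labels=(1,) asks for.
try:
    neighborhood_labels[:, 1]
except IndexError as err:
    message = str(err)
    print(message)
```

Note that with a single output column, LIME would report the explanation under label 0 even if that column actually holds the probability of the positive class, so the class names in the output may need care.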
I think the problem is that you are passing in a 2D array, but according to the docs explain_instance() expects the instance as a 1D array. Note that a single-row slice from a 2D array is itself 2D, whereas a plain integer index returns a 1D row.
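The difference between slicing and indexing can be checked with a small NumPy array. The array below is a hypothetical stand-in for x_test_scaled, not the asker's data:

```python
import numpy as np

# Hypothetical stand-in for x_test_scaled: 3 rows, 4 features.
x_test_scaled = np.arange(12, dtype=float).reshape(3, 4)

row_slice = x_test_scaled[:1]  # slicing keeps the row axis
row_index = x_test_scaled[0]   # integer indexing drops it

print(row_slice.shape)  # (1, 4)
print(row_index.shape)  # (4,)
```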
The other issue is the prediction function, which is expected to produce an array of probabilities, not a single prediction. Again, the docs explain this: for classifiers, predict_fn should be a function that takes a numpy array and outputs prediction probabilities.
To fix these things, use a simple index into x_test_scaled instead of a slice, and pass model.predict_proba as the prediction function.
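If model.predict_proba exists and already returns one column per class, it can be passed directly. The sketch below covers the other case, under the assumption that the network has a single sigmoid output, so that its prediction has shape (n_samples, 1). That would be consistent with the "axis 1 with size 1" message in the traceback, but it is not confirmed by the question. The helper as_two_class_proba and the stand-in fake_predict are hypothetical names used only for illustration:

```python
import numpy as np

def as_two_class_proba(predict_fn):
    """Wrap a predictor that returns P(class 1) with shape (n, 1) or (n,)
    so that it returns [P(class 0), P(class 1)] with shape (n, 2)."""
    def wrapped(x):
        p1 = np.asarray(predict_fn(x), dtype=float).reshape(-1, 1)
        return np.hstack([1.0 - p1, p1])
    return wrapped

# Hypothetical stand-in for model.predict: one sigmoid output per row.
def fake_predict(x):
    return 1.0 / (1.0 + np.exp(-x.sum(axis=1, keepdims=True)))

predict_fn = as_two_class_proba(fake_predict)
probs = predict_fn(np.zeros((5, 3)))
print(probs.shape)  # (5, 2)

# With the real objects from the question, the call might then look like:
# exp = interpretor.explain_instance(
#     data_row=x_test_scaled[0],                     # 1D row, not a slice
#     predict_fn=as_two_class_proba(model.predict),  # two probability columns
#     num_features=11,
# )
```

The wrapper leaves the model untouched and only reshapes its output, so the same trained network can be used for both prediction and explanation.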