Please help fix: TypeError: predict_proba() missing 1 required positional argument: 'X'
I am building a binary classifier with a random forest. Before that, I selected features based on the AUC score each one achieves on its own. However, when I try to compute the AUC for the final model, I get an error. The code is below. Sorry that I can't share the dataset.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.feature_selection import VarianceThreshold
df_process_label1 = 'AAA'
X = df_process.iloc[:,200:500]
y = df_process[df_process_label1].values
import sklearn
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X,y, test_size = 0.2, random_state = 0)
constant_filter = VarianceThreshold(threshold = 0.01)
constant_filter.fit(X_train)
X_train_filter = constant_filter.transform(X_train)
X_test_filter = constant_filter.transform(X_test)
roc_auc = []
for features in X_train.columns:
    clf = RandomForestClassifier(n_estimators = 100, random_state=0)
    clf.fit(X_train[features].to_frame(), y_train)
    y_pred = clf.predict(X_test[features].to_frame())
    roc_auc.append(roc_auc_score(y_test, y_pred))
roc_values = pd.Series(roc_auc)
roc_values.index = X_train.columns
roc_values.sort_values(ascending = False, inplace =True)
sel = roc_values[roc_values>0.5]
sel
X_train_roc = X_train[sel.index]
X_test_roc = X_test[sel.index]
def run_randomForest(X_train, X_test, y_train, y_test):
    clf = RandomForestClassifier(n_estimators=100, random_state=0, n_jobs=1)
    clf.fit(X_train, y_train)
    y_pred1 = clf.predict(X_test)
    print('Accuracy on test set: ', accuracy_score(y_test, y_pred))
    print(roc_auc_score(y_test, RandomForestClassifier.predict_proba(X_test)[:,1]))
%time
run_randomForest(X_train_roc, X_test_roc, y_train, y_test)
However, the same error keeps appearing:
TypeError: predict_proba() missing 1 required positional argument: 'X'
Do you know how to fix it?
Thanks in advance!
Comments (2)
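You should use
clf.predict_proba(X_test)
instead. I think you also need to fix this part: you assign the predictions to y_pred1 but then pass y_pred to accuracy_score. Inside the function, y_pred resolves to the global variable left over from the feature-selection loop, so the printed accuracy would not be the accuracy of this model.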
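For reference, here is a minimal sketch of the function with both fixes applied. The original dataset was not shared, so the synthetic data from make_classification is purely an assumption for illustration, and the name run_random_forest and the returned tuple are illustrative choices rather than part of the original code.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split


def run_random_forest(X_train, X_test, y_train, y_test):
    clf = RandomForestClassifier(n_estimators=100, random_state=0, n_jobs=1)
    clf.fit(X_train, y_train)
    # Use one consistent name for the hard class predictions.
    y_pred = clf.predict(X_test)
    # Call predict_proba on the fitted instance; column 1 holds the
    # probability of the class listed second in clf.classes_.
    y_score = clf.predict_proba(X_test)[:, 1]
    acc = accuracy_score(y_test, y_pred)
    auc = roc_auc_score(y_test, y_score)
    print('Accuracy on test set: ', acc)
    print('ROC AUC on test set: ', auc)
    return acc, auc


# Synthetic stand-in data (assumption: the real data frame is unavailable).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
acc, auc = run_random_forest(X_train, X_test, y_train, y_test)
```

With the real data, the call would stay as in the question: run_random_forest(X_train_roc, X_test_roc, y_train, y_test).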
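The problem is that you are calling predict_proba on the RandomForestClassifier class itself, but you need to call it on the fitted object you created from that class, which is named clf. When the method is called on the class, X_test is taken as self, so Python reports that the argument X is missing.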
So instead of
print(roc_auc_score(y_test, RandomForestClassifier.predict_proba(X_test)[:,1]))
you should write
print(roc_auc_score(y_test, clf.predict_proba(X_test)[:,1]))
Hope this helps
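To see why the error message mentions a missing 'X', here is a small sketch using a toy class. ToyClassifier is a hypothetical stand-in and not part of scikit-learn; it only mimics the method signature so the example runs without any dataset.

```python
class ToyClassifier:
    """Hypothetical stand-in for an estimator; not part of scikit-learn."""

    def predict_proba(self, X):
        # Return a fixed probability pair for every row in X.
        return [[0.5, 0.5] for _ in X]


X_test = [[1.0], [2.0]]

# Called on the class: X_test fills the `self` slot, so `X` is missing.
try:
    ToyClassifier.predict_proba(X_test)
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Called on an instance: `self` is bound automatically and X_test becomes `X`.
clf = ToyClassifier()
proba = clf.predict_proba(X_test)
print(proba)
```

The same binding rule applies to RandomForestClassifier, which is why the call has to go through the fitted clf.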