Please help fix it: TypeError: predict_proba() missing 1 required positional argument: 'X'

Posted on 2025-02-08 14:01:35

I am building a binary classifier with a random forest. Beforehand, I selected features by training on each one individually and keeping those with a high AUC score. However, when I try to compute the AUC for the final model, it fails. The code is below. Sorry that I cannot share the dataset.


import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.feature_selection import VarianceThreshold

df_process_label1 = 'AAA'
X = df_process.iloc[:,200:500]
y = df_process[df_process_label1].values

import sklearn
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X,y, test_size = 0.2, random_state = 0)

constant_filter = VarianceThreshold(threshold = 0.01)
constant_filter.fit(X_train)
X_train_filter = constant_filter.transform(X_train)
X_test_filter = constant_filter.transform(X_test)


roc_auc = []
for features in X_train.columns:
    clf = RandomForestClassifier(n_estimators = 100, random_state=0)
    clf.fit(X_train[features].to_frame(), y_train)
    y_pred = clf.predict(X_test[features].to_frame())
    roc_auc.append(roc_auc_score(y_test, y_pred))


roc_values = pd.Series(roc_auc)
roc_values.index = X_train.columns
roc_values.sort_values(ascending = False, inplace =True)


sel = roc_values[roc_values>0.5]
sel


X_train_roc = X_train[sel.index]
X_test_roc = X_test[sel.index]

def run_randomForest(X_train, X_test, y_train, y_test):
    clf = RandomForestClassifier(n_estimators=100, random_state=0, n_jobs=1)
    clf.fit(X_train, y_train)
    y_pred1 = clf.predict(X_test)
    print('Accuracy on test set: ', accuracy_score(y_test, y_pred))
    print(roc_auc_score(y_test, RandomForestClassifier.predict_proba(X_test)[:,1]))
%time
run_randomForest(X_train_roc, X_test_roc, y_train, y_test)

However, the same error keeps appearing:

TypeError: predict_proba() missing 1 required positional argument: 'X'

Do you know how to fix it?
Thanks in advance!

Comments (2)

虐人心 2025-02-15 14:01:35

You should use clf.predict_proba(X_test) instead. I think you also need to fix this part:

y_pred1 = clf.predict(X_test)
print('Accuracy on test set: ', accuracy_score(y_test, y_pred))

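You are assigning the predictions to y_pred1 but passing y_pred to accuracy_score. Inside the function, y_pred refers to the global variable left over from the feature-selection loop, so the reported accuracy would not be for this model.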
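
Putting both fixes together, the function could look roughly like the sketch below. Since the original dataset is not available, it assumes a synthetic dataset from make_classification; the name run_random_forest and the choice to return the two scores are also assumptions made for illustration, not the asker's code.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split


def run_random_forest(X_train, X_test, y_train, y_test):
    """Fit a random forest and return (accuracy, ROC AUC) on the test set."""
    clf = RandomForestClassifier(n_estimators=100, random_state=0, n_jobs=1)
    clf.fit(X_train, y_train)

    # Hard class labels, used for accuracy. The same name is used on both lines.
    y_pred = clf.predict(X_test)

    # Probabilities come from the fitted instance (clf), not from the class.
    # Column 1 holds the probability of clf.classes_[1].
    y_score = clf.predict_proba(X_test)[:, 1]

    return accuracy_score(y_test, y_pred), roc_auc_score(y_test, y_score)


# Synthetic stand-in data (assumption: the real df_process is not available).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

acc, auc = run_random_forest(X_train, X_test, y_train, y_test)
print("Accuracy on test set:", acc)
print("ROC AUC on test set:", auc)
```

Returning the scores instead of only printing them makes the function easier to reuse when comparing feature subsets.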

彻夜缠绵 2025-02-15 14:01:35

The problem is that you are calling predict_proba on the RandomForestClassifier class itself, but you need to call it on the fitted object you created from that class, which is named clf.
So instead of
print(roc_auc_score(y_test, RandomForestClassifier.predict_proba(X_test)[:,1]))
you should write
print(roc_auc_score(y_test, clf.predict_proba(X_test)[:,1]))

Hope this helps
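
As for why the message complains about a missing 'X' even though X_test was passed: when a method is called on the class rather than on an instance, Python binds the first argument to self, so nothing is left for X. The sketch below shows this with a toy class; ToyModel is a hypothetical stand-in and not part of scikit-learn.

```python
class ToyModel:
    """Hypothetical stand-in for a fitted estimator (not a scikit-learn class)."""

    def predict_proba(self, X):
        # Return a fixed two-class probability row for every sample.
        return [[0.5, 0.5] for _ in X]


X_test = [[1.0], [2.0]]

# Calling through the class: X_test is bound to `self`, so `X` is missing.
try:
    ToyModel.predict_proba(X_test)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)

# Calling through an instance: `self` is supplied automatically.
model = ToyModel()
probabilities = model.predict_proba(X_test)
print(probabilities)
```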
