Please help fix it: TypeError: predict_proba() missing 1 required positional argument: 'X'

Posted on 2025-02-08 14:01:35

I am building a binary classifier with a random forest classifier. Beforehand, I selected features based on their individual AUC scores. However, when I try to compute the AUC for the final model, it fails. The code is below. Sorry that I can't share the dataset.


import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.feature_selection import VarianceThreshold

df_process_label1 = 'AAA'
X = df_process.iloc[:,200:500]
y = df_process[df_process_label1].values

import sklearn
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X,y, test_size = 0.2, random_state = 0)

constant_filter = VarianceThreshold(threshold = 0.01)
constant_filter.fit(X_train)
X_train_filter = constant_filter.transform(X_train)
X_test_filter = constant_filter.transform(X_test)


roc_auc = []
for features in X_train.columns:
    clf = RandomForestClassifier(n_estimators = 100, random_state=0)
    clf.fit(X_train[features].to_frame(), y_train)
    y_pred = clf.predict(X_test[features].to_frame())
    roc_auc.append(roc_auc_score(y_test, y_pred))


roc_values = pd.Series(roc_auc)
roc_values.index = X_train.columns
roc_values.sort_values(ascending = False, inplace =True)


sel = roc_values[roc_values>0.5]
sel


X_train_roc = X_train[sel.index]
X_test_roc = X_test[sel.index]

def run_randomForest(X_train, X_test, y_train, y_test):
    clf = RandomForestClassifier(n_estimators=100, random_state=0, n_jobs=1)
    clf.fit(X_train, y_train)
    y_pred1 = clf.predict(X_test)
    print('Accuracy on test set: ', accuracy_score(y_test, y_pred))
    print(roc_auc_score(y_test, RandomForestClassifier.predict_proba(X_test)[:,1]))

%time
run_randomForest(X_train_roc, X_test_roc, y_train, y_test)

However, the same error keeps appearing:

TypeError: predict_proba() missing 1 required positional argument: 'X'

Do you know how to fix it?
Thanks in advance!

虐人心 2025-02-15 14:01:35

You should call clf.predict_proba(X_test) instead. I think you also need to fix this part:

y_pred1 = clf.predict(X_test)
print('Accuracy on test set: ', accuracy_score(y_test, y_pred))

You assign the predictions to y_pred1 but then pass y_pred to accuracy_score.
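Putting both fixes together, the function might look like the sketch below. It is only an illustration: the original dataset was not shared, so the data here is a synthetic stand-in generated with make_classification, and the name run_random_forest and the choice to return the two scores are assumptions made for this example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split


def run_random_forest(X_train, X_test, y_train, y_test):
    """Fit a random forest and return (accuracy, ROC AUC) on the test set."""
    clf = RandomForestClassifier(n_estimators=100, random_state=0, n_jobs=1)
    clf.fit(X_train, y_train)

    # Hard class labels are used for accuracy; the same name is used
    # when assigning and when scoring.
    y_pred = clf.predict(X_test)

    # predict_proba is called on the fitted instance, not on the class.
    # Column 1 holds the probability of clf.classes_[1], which is the
    # positive class when the labels are 0 and 1.
    y_score = clf.predict_proba(X_test)[:, 1]

    return accuracy_score(y_test, y_pred), roc_auc_score(y_test, y_score)


# Synthetic stand-in for the data (an assumption, not the asker's dataset).
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

acc, auc = run_random_forest(X_train, X_test, y_train, y_test)
print("Accuracy on test set:", acc)
print("ROC AUC on test set:", auc)
```

Passing probabilities rather than hard labels to roc_auc_score lets the metric rank the test samples by score instead of treating every prediction as exactly 0 or 1.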

彻夜缠绵 2025-02-15 14:01:35

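The problem is that you are calling predict_proba on the RandomForestClassifier class itself, but you need to call it on the object you created from RandomForestClassifier, which is named clf.
So instead of
print(roc_auc_score(y_test, RandomForestClassifier.predict_proba(X_test)[:,1]))
you should write
print(roc_auc_score(y_test, clf.predict_proba(X_test)[:,1]))

Hope this helps.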
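To see why the error message complains about a missing 'X', here is a minimal sketch using a toy class. ToyModel is hypothetical and is not part of scikit-learn. It only mimics the method signature predict_proba(self, X). When the method is called on the class, the single argument is taken as self, so nothing is left for X.

```python
class ToyModel:
    """Hypothetical stand-in for an estimator; not a scikit-learn class."""

    def predict_proba(self, X):
        # Return a fixed two-column "probability" row for each sample.
        return [[0.5, 0.5] for _ in X]


X_test = [[1.0], [2.0]]

# Called on the class: X_test is bound to `self`, so `X` is missing.
try:
    ToyModel.predict_proba(X_test)
    message = None
except TypeError as exc:
    message = str(exc)
print(message)

# Called on an instance: `self` is filled in automatically.
model = ToyModel()
probs = model.predict_proba(X_test)
print(probs)
```

The exact wording of the message can vary slightly between Python versions, but in each case it reports that the positional argument 'X' is missing.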
