How can I make sure make_column_transformer encodes objects correctly?
I built an XGBoost model, made predictions, and evaluated the model's accuracy; however, I'm running into issues with using the model on a new DataFrame.
New DataFrame code:
new_data = [['Academic', 'A', 'Male', 'Less Interested', 'Urban', 56, 6950000, 83.0, 84.09, False]]
new = pd.DataFrame(
    data=new_data,
    columns=['type_school', 'school_accreditation', 'gender', 'interest', 'residence',
             'parent_age', 'parent_salary', 'house_area', 'average_grades',
             'parent_was_in_college'])
column_trans = make_column_transformer(
    (OneHotEncoder(), ['type_school', 'school_accreditation', 'gender', 'interest',
                       'residence', 'parent_was_in_college']),
    remainder='passthrough')
X_new = column_trans.fit_transform(new)
preds = optimal_params.predict(X_new)
After running the above code, I get the following error:
"ValueError: feature_names mismatch: ['f0', 'f1', 'f2', 'f3', 'f4', 'f5', 'f6', 'f7', 'f8',
'f9', 'f10', 'f11', 'f12', 'f13', 'f14', 'f15', 'f16', 'f17', 'f18'] ['f0', 'f1', 'f2', 'f3',
'f4', 'f5', 'f6', 'f7', 'f8', 'f9']
expected f17, f13, f18, f15, f10, f12, f16, f14, f11 in input data"
However, column_trans is exactly the same one I used on the training DataFrame, so I'm not sure what's going on. Is there something wrong with my column_trans?
Comments (2)
As I understand it, you don't keep the column_trans that was fit on your training data. The mechanism here is that column_trans is fit once on the training data and is then only used to transform new data. You can find more information about this at this link.
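To make the mismatch concrete: in the question, the 10 columns produced for the new row are the 6 one-hot columns (one category each, because only one row was seen) plus the 4 passthrough columns, while the model expects the 19 columns produced from the training data. The sketch below reproduces the same effect on a smaller scale. The training rows and the helper build_transformer are made up for illustration; only the column names come from the question.

```python
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.preprocessing import OneHotEncoder

cat_cols = ["type_school", "gender"]

# Hypothetical training data: each categorical column has two levels.
train = pd.DataFrame({
    "type_school": ["Academic", "Vocational", "Academic"],
    "gender": ["Male", "Female", "Female"],
    "parent_age": [56, 48, 51],
})

# A single new row, similar to the one in the question.
new = pd.DataFrame({
    "type_school": ["Academic"],
    "gender": ["Male"],
    "parent_age": [56],
})


def build_transformer():
    """Illustrative helper: same transformer definition as in the question."""
    return make_column_transformer(
        (OneHotEncoder(), cat_cols),
        remainder="passthrough",
    )


# Fitted on the training data: 2 + 2 one-hot columns + 1 passthrough column.
fitted = build_transformer().fit(train)
n_train_cols = fitted.transform(train).shape[1]

# Refitted on the single row: 1 + 1 one-hot columns + 1 passthrough column.
n_refit_cols = build_transformer().fit_transform(new).shape[1]

# Reusing the already fitted transformer keeps the training layout.
n_reused_cols = fitted.transform(new).shape[1]

print(n_train_cols, n_refit_cols, n_reused_cols)  # 5 3 5
```

Defining the transformer the same way is not enough; the categories it learned during fit are what determine the output columns.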
When running prediction, the new data should only be transformed with .transform (not .fit_transform). Here's pseudocode:
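A minimal sketch of that fit-once, transform-later pattern follows. It is not the answerer's own code: the training rows and labels are invented, DecisionTreeClassifier is only a stand-in for the asker's XGBoost model, and handle_unknown="ignore" is an optional extra that is assumed here so that categories unseen during training do not raise an error.

```python
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.preprocessing import OneHotEncoder
from sklearn.tree import DecisionTreeClassifier  # stand-in for the XGBoost model

cat_cols = ["type_school", "residence"]

# Hypothetical training data and labels, for illustration only.
X_train = pd.DataFrame({
    "type_school": ["Academic", "Vocational", "Academic", "Vocational"],
    "residence": ["Urban", "Rural", "Rural", "Urban"],
    "average_grades": [84.09, 71.5, 90.2, 65.0],
})
y_train = [1, 0, 1, 0]

column_trans = make_column_transformer(
    # Assumption: unseen categories are encoded as all zeros instead of failing.
    (OneHotEncoder(handle_unknown="ignore"), cat_cols),
    remainder="passthrough",
)

# Training time: fit the transformer and the model once.
X_train_enc = column_trans.fit_transform(X_train)
model = DecisionTreeClassifier(random_state=0).fit(X_train_enc, y_train)

# Prediction time: reuse the fitted transformer and call transform only.
new = pd.DataFrame({
    "type_school": ["Academic"],
    "residence": ["Urban"],
    "average_grades": [83.0],
})
X_new = column_trans.transform(new)
preds = model.predict(X_new)

print(X_new.shape, preds)
```

If prediction happens in a separate script or process, the fitted column_trans has to be persisted together with the model (for example with joblib.dump) and loaded again before calling transform. Wrapping the transformer and the model in a single sklearn Pipeline is another way to keep the two in sync.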