How to fit a CatBoost classifier with embedding features inside an sklearn Pipeline
I have a CatBoost classifier that predicts on some embedding features, and AFAIK these embedding features can only be specified through Pools (meaning I have to create a Pool and then pass it to the CatBoost classifier's .fit method in order for the model to pick them up).
These embedding features are generated by a TfidfVectorizer, so I would like to wrap the TfidfVectorizer and the classifier as part of an sklearn Pipeline to tidy up my code and have a clear pipeline to train/predict.
Unfortunately, I cannot pass a CatBoost Pool to an sklearn Pipeline, because when I do, I get the following error:
Expected 2D array, got scalar array instead:
array=<catboost.core.Pool object at 0x7f98f0256820>.
Is there any way around this?
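One possible direction, sketched under assumptions: rather than handing a Pool to the Pipeline, let the Pipeline pass plain arrays and build the Pool inside the final step. PoolAdapter and make_pool below are hypothetical names, not part of CatBoost or scikit-learn, and FakePool / FakeModel are stand-ins so the sketch runs without CatBoost installed. In real use the factory would presumably build a catboost.Pool with its embedding_features argument, and the TfidfVectorizer output (a sparse matrix) would likely need converting into whatever layout the Pool expects.

```python
class PoolAdapter:
    """Wrap a model whose fit/predict expect a pool-like object.

    Hypothetical helper: it defers pool construction until fit/predict,
    so upstream pipeline steps only ever see plain array-like data.
    """

    def __init__(self, model, make_pool):
        self.model = model          # assumed: e.g. a CatBoostClassifier instance
        self.make_pool = make_pool  # callable (X, y_or_None) -> pool object

    def fit(self, X, y=None):
        # The pool is created here, not before the pipeline is called.
        self.model.fit(self.make_pool(X, y))
        return self

    def predict(self, X):
        return self.model.predict(self.make_pool(X, None))


# Stand-ins for illustration only (assumptions, not CatBoost classes).
class FakePool:
    def __init__(self, data, label=None):
        self.data, self.label = data, label


class FakeModel:
    def fit(self, pool):
        # Trivial "model": remember the most frequent training label.
        self.majority_ = max(set(pool.label), key=pool.label.count)
        return self

    def predict(self, pool):
        return [self.majority_ for _ in pool.data]


adapter = PoolAdapter(FakeModel(), lambda X, y: FakePool(X, y))
adapter.fit([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]], [1, 1, 0])
predictions = adapter.predict([[0.7, 0.8], [0.9, 1.0]])
```

To serve as the last step of a real sklearn Pipeline (with cloning, get_params, grid search), such a wrapper would most likely also need to inherit from sklearn.base.BaseEstimator and ClassifierMixin; that part is left out here to keep the sketch dependency-free.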