Custom column transformer: problem with training
I am running into a problem while trying to build a pipeline.
For the preprocessor, I want to combine a step that adds new columns with a step that processes all the other columns. On its own, the preprocessor works as it should:

    import numpy as np
    import pandas as pd
    from sklearn.base import BaseEstimator, TransformerMixin
    from sklearn.compose import ColumnTransformer, make_column_selector
    from sklearn.impute import SimpleImputer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import OneHotEncoder, StandardScaler

    # `data` is a DataFrame loaded earlier (not shown)
    features = ['Pclass', 'Sex', 'Age', 'Parch', 'SibSp', 'Embarked']
    target = ['Survived']
    num_features = data[features].select_dtypes(include=['int64', 'float64']).columns
    cat_features = data[features].select_dtypes(include=['object']).columns
    X_train = data[features]
    y_train = data['Survived']

    # Custom step: derive Family_size and FamilyType, then drop Parch and SibSp
    class Add_family(BaseEstimator, TransformerMixin):
        def __init__(self, add_family=True):
            self.ad_family = add_family

        def fit(self, X, y=None):
            return self

        def transform(self, X, y=None):
            df = pd.DataFrame(X).copy()
            if self.ad_family:
                df['Family_size'] = df.apply(lambda x: x.Parch + x.SibSp + 1, axis=1)

                def get_family_type(var):
                    if var == 1:
                        return 'alone'
                    elif var <= 4:
                        return 'small'
                    else:
                        return 'big'

                df['FamilyType'] = df.apply(lambda x: get_family_type(x.Family_size), axis=1)
                df = df.drop(columns=['Parch', 'SibSp'])
            return df

    # Numeric columns: scale, then impute; categorical columns: one-hot encode
    num_transformer = Pipeline([('scaler', StandardScaler()),
                                ('imputer', SimpleImputer(strategy='mean'))])
    cat_transformer = Pipeline([('onehot', OneHotEncoder(handle_unknown='ignore'))])
    col_transform = ColumnTransformer([
        ('cat', cat_transformer, make_column_selector(dtype_include=object)),
        ('num', num_transformer, make_column_selector(dtype_include=np.number))])
    preprocessor = Pipeline([('Adder_features', Add_family(add_family=True)),
                             ('transform', col_transform)])
    data_f = preprocessor.fit_transform(X_train)
    pd.DataFrame(data_f)

But when I try to train the model, I get the following error:

    lr = Pipeline([('prep', preprocessor),
                   ('clf', LogisticRegression())])
    lr.fit(X_train, y_train)

    TypeError: cannot unpack non-iterable NoneType object
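
For comparison, here is a minimal sketch of the same transformer written to two scikit-learn conventions: the constructor argument is stored under an attribute of the same name (the question stores `add_family` as `self.ad_family`, which `get_params()` and `clone()` cannot read back), and `transform` returns the frame on every code path. The class name `AddFamily`, the helper `family_type`, and the toy frame are hypothetical and only for illustration. Whether either point is the cause of the `TypeError` above has not been confirmed.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin, clone


def family_type(size):
    """Bucket a family size into the three labels used in the question."""
    if size == 1:
        return "alone"
    if size <= 4:
        return "small"
    return "big"


class AddFamily(BaseEstimator, TransformerMixin):
    """Hypothetical variant of Add_family; not the code from the question."""

    def __init__(self, add_family=True):
        # Same name as the argument, so get_params() and clone() can find it.
        self.add_family = add_family

    def fit(self, X, y=None):
        # Nothing to learn; the step is stateless.
        return self

    def transform(self, X):
        df = pd.DataFrame(X).copy()
        if self.add_family:
            df["Family_size"] = df["Parch"] + df["SibSp"] + 1
            df["FamilyType"] = df["Family_size"].map(family_type)
            df = df.drop(columns=["Parch", "SibSp"])
        # Returned whether or not the flag is set.
        return df


# Toy data (assumed values, not from the question's dataset)
toy = pd.DataFrame({"Parch": [0, 1, 3], "SibSp": [0, 1, 2], "Age": [22.0, 38.0, 4.0]})
out = AddFamily().fit_transform(toy)
cloned = clone(AddFamily(add_family=False))
```

With this layout, `clone(AddFamily(add_family=False))` keeps the flag, and a disabled step passes the input through unchanged instead of returning nothing.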