How to preserve column order when applying sklearn.compose.ColumnTransformer
I want to use the Pipeline and ColumnTransformer classes from the sklearn library to apply scaling to a NumPy array. The scaler is applied to only some of the columns, and I want the output to keep the same column order as the input.
Example:
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import MinMaxScaler

X = np.array([(25, 1, 2, 0),
              (30, 1, 5, 0),
              (25, 10, 2, 1),
              (25, 1, 2, 0),
              (np.nan, 10, 4, 1),
              (40, 1, 2, 1)])

column_trans = ColumnTransformer(
    [('scaler', MinMaxScaler(), [0, 2])],
    remainder='passthrough')

X_scaled = column_trans.fit_transform(X)
The problem is that ColumnTransformer changes the order of the columns. How can I preserve the original column order?
I am aware of this post, but it is for pandas DataFrames. For various reasons I cannot use a DataFrame and have to use NumPy arrays in my code.
Thanks.
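A small sketch of the reordering described above, assuming an identity FunctionTransformer in place of the scaler so that every value still shows which input column it came from. The toy array and the step name "picked" are illustrative assumptions, not part of the question's code.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import FunctionTransformer

# Toy data (assumption): the units digit of each value is its column index.
X_demo = np.array([[0, 1, 2, 3],
                   [10, 11, 12, 13]])

# Identity transformer on columns 0 and 2; the other columns are passed through.
demo_trans = ColumnTransformer(
    [("picked", FunctionTransformer(), [0, 2])],
    remainder="passthrough")

X_demo_out = demo_trans.fit_transform(X_demo)
print(X_demo_out)  # columns come out in the order 0, 2, 1, 3
```

The transformed columns are placed first, in the order they were listed, and the passthrough columns follow in ascending order.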
Comments (2)
Here is a solution that adds a transformer which applies the inverse column permutation after the column transform. It relies on parsing the column transformer's output feature names to read the initial column order from the numeric suffix, then computes and applies the inverse permutation.
To be used as:
An alternative solution does not rely on string parsing but reads the column slices of the column transformer:
ColumnTransformer can be used to reorder columns however you would like by passing it the column indices in the desired order. Pairing ColumnTransformer with an identity FunctionTransformer will make it do nothing but reorder the columns. (You can create an identity FunctionTransformer by not assigning func when initializing FunctionTransformer, in which case the data will be passed through without being transformed.)
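As a rough sketch of this approach applied to the question's data: a second ColumnTransformer with an identity FunctionTransformer is chained after the scaling step and given the indices that undo the reordering. The variable names and the use of np.argsort to derive those indices are illustrative assumptions; the sketch also assumes the scaled columns come first in the intermediate output, followed by the passthrough columns in ascending order.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer, MinMaxScaler

X = np.array([(25, 1, 2, 0), (30, 1, 5, 0), (25, 10, 2, 1),
              (25, 1, 2, 0), (np.nan, 10, 4, 1), (40, 1, 2, 1)])

scaled_cols = [0, 2]
remainder_cols = [c for c in range(X.shape[1]) if c not in scaled_cols]

# Column order after the first step: scaled columns, then the remainder.
order_after = scaled_cols + remainder_cols
# argsort of a permutation is its inverse: where each original column ended up.
restore_order = np.argsort(order_after).tolist()

scale = ColumnTransformer(
    [("scaler", MinMaxScaler(), scaled_cols)], remainder="passthrough")
# FunctionTransformer() without func is the identity, so this step only reorders.
reorder = ColumnTransformer(
    [("reorder", FunctionTransformer(), restore_order)])

pipeline = make_pipeline(scale, reorder)
X_restored = pipeline.fit_transform(X)
```

Because the reordering step only stores plain column indices, the whole pipeline can be cloned and refitted without any shared state between the two steps.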