Creating a polynomial feature matrix

Posted on 2025-02-13 01:52:56

I am trying to build a polynomial feature matrix in R, similar to what Python's sklearn PolynomialFeatures produces. Unfortunately, I could not find an existing package with a similar function. I don't understand the underlying statistics of such a feature matrix, so any help or pointers are very much appreciated!

The sklearn docs explain it as: Generate a new feature matrix consisting of all polynomial combinations of the features with degree less than or equal to the specified degree. For example, if an input sample is two dimensional and of the form [a, b], the degree-2 polynomial features are [1, a, b, a^2, ab, b^2].
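To make the definition concrete, here is a minimal sketch in plain Python (no sklearn) that lists the monomials for given feature names. The helper name `monomial_labels` is made up for illustration and is not part of any library; it only mirrors the ordering described in the quoted docs.

```python
from itertools import combinations_with_replacement

def monomial_labels(names, degree):
    """Label every monomial of total degree <= `degree`, lowest degree first."""
    labels = []
    for d in range(degree + 1):
        for combo in combinations_with_replacement(names, d):
            if not combo:
                labels.append("1")  # the empty product is the constant term
                continue
            # Collapse repeats such as ("a", "a") into "a^2".
            parts = []
            for name in dict.fromkeys(combo):
                power = combo.count(name)
                parts.append(name if power == 1 else f"{name}^{power}")
            labels.append("*".join(parts))
    return labels

print(monomial_labels(["a", "b"], 2))
# ['1', 'a', 'b', 'a^2', 'a*b', 'b^2']
```

Each monomial corresponds to one way of picking up to `degree` features with repetition allowed, which is why `combinations_with_replacement` fits here.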

The Python code I am trying to replicate is the following:

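import numpy as np
from sklearn.preprocessing import PolynomialFeatures

x1 = 298
x2 = 35
x3 = 0.05
x4 = 0.01

# One sample (row) with five features (columns): shape (1, 5)
X = np.vstack([x1, np.log(x1), x2, x3, x4]).T

poly = PolynomialFeatures(degree=3)
X_ = poly.fit_transform(X)

X_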

[24]:
array([[1.00000000e+00, 2.98000000e+02, 5.69709349e+00, 3.50000000e+01,
        5.00000000e-02, 1.00000000e-02, 8.88040000e+04, 1.69773386e+03,
        1.04300000e+04, 1.49000000e+01, 2.98000000e+00, 3.24568742e+01,
        1.99398272e+02, 2.84854674e-01, 5.69709349e-02, 1.22500000e+03,
        1.75000000e+00, 3.50000000e-01, 2.50000000e-03, 5.00000000e-04,
        1.00000000e-04, 2.64635920e+07, 5.05924690e+05, 3.10814000e+06,
        4.44020000e+03, 8.88040000e+02, 9.67214851e+03, 5.94206851e+04,
        8.48866929e+01, 1.69773386e+01, 3.65050000e+05, 5.21500000e+02,
        1.04300000e+02, 7.45000000e-01, 1.49000000e-01, 2.98000000e-02,
        1.84909847e+02, 1.13599060e+03, 1.62284371e+00, 3.24568742e-01,
        6.97893952e+03, 9.96991360e+00, 1.99398272e+00, 1.42427337e-02,
        2.84854674e-03, 5.69709349e-04, 4.28750000e+04, 6.12500000e+01,
        1.22500000e+01, 8.75000000e-02, 1.75000000e-02, 3.50000000e-03,
        1.25000000e-04, 2.50000000e-05, 5.00000000e-06, 1.00000000e-06]])
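The 56 values above can be reproduced without sklearn, which may help when porting the logic to another language. The following is only a sketch under the assumption that the input is one sample with five features; `poly_row` is a hypothetical helper, not a library function.

```python
import math
from itertools import combinations_with_replacement

def poly_row(values, degree):
    """Expand one sample into all monomials of total degree <= `degree`."""
    row = []
    for d in range(degree + 1):
        for combo in combinations_with_replacement(values, d):
            row.append(math.prod(combo))  # math.prod(()) is 1, the constant term
    return row

x1, x2, x3, x4 = 298, 35, 0.05, 0.01
sample = [x1, math.log(x1), x2, x3, x4]
row = poly_row(sample, 3)

# n features at degree d give C(n + d, d) columns: C(8, 3) = 56 here.
print(len(row), math.comb(len(sample) + 3, 3))
print(row[:8])  # constant, the five inputs, then x1*x1 and x1*log(x1)
```

The first degree-2 entries are x1*x1 = 88804 and x1*log(x1), which is about 1697.73, in line with the seventh and eighth values of the array above.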

骄兵必败 2025-02-20 01:52:56

Use poly, e.g.:

c(1, poly(t(X), degree = 3, raw = TRUE))

Note that the ordering of the columns will be different from sklearn's.
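One way to line up two orderings is to identify each column by its exponent tuple and build a permutation between them. The sketch below is in plain Python; the "graded" order follows the sklearn output in the question, while the alternative order (first variable's exponent varying fastest) is an assumption chosen for illustration and should be checked against the column names R actually reports. Both function names are hypothetical.

```python
from itertools import combinations_with_replacement, product

def graded_exponents(n_features, degree):
    """Exponent tuples ordered by total degree, as in the sklearn output."""
    out = []
    for d in range(degree + 1):
        for combo in combinations_with_replacement(range(n_features), d):
            out.append(tuple(combo.count(i) for i in range(n_features)))
    return out

def fastest_first_exponents(n_features, degree):
    """Assumed alternative order: the first variable's exponent varies fastest."""
    out = []
    for rev in product(range(degree + 1), repeat=n_features):
        exps = rev[::-1]  # product varies the last slot fastest, so reverse it
        if sum(exps) <= degree:
            out.append(exps)
    return out

graded = graded_exponents(2, 2)
other = fastest_first_exponents(2, 2)
# permutation[i] is the column of `other` holding the i-th graded monomial.
permutation = [other.index(e) for e in graded]
print(graded)       # [(0, 0), (1, 0), (0, 1), (2, 0), (1, 1), (0, 2)]
print(other)        # [(0, 0), (1, 0), (2, 0), (0, 1), (1, 1), (0, 2)]
print(permutation)  # [0, 1, 3, 2, 4, 5]
```

Both orderings contain the same set of monomials, so reordering columns with such a permutation is enough to compare results across tools.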

Also note that the Python code is incorrect if X is meant to be a column: in that case, do not transpose it. You will then get the same values from each language:

poly.fit_transform(X.T)  # X.T undoes the earlier transpose, restoring the 5x1 column
array([[1.00000000e+00, 2.98000000e+02, 8.88040000e+04, 2.64635920e+07],
       [1.00000000e+00, 5.69709349e+00, 3.24568742e+01, 1.84909847e+02],
       [1.00000000e+00, 3.50000000e+01, 1.22500000e+03, 4.28750000e+04],
       [1.00000000e+00, 5.00000000e-02, 2.50000000e-03, 1.25000000e-04],
       [1.00000000e+00, 1.00000000e-02, 1.00000000e-04, 1.00000000e-06]])

In R:

x1 <- 298; x2 <- 35; x3 <- 0.05; x4 <- 0.01
X <- c(x1, log(x1), x2, x3, x4)
cbind(intercept = 1, poly(X, 3, raw = TRUE))
     intercept          1           2            3
[1,]         1 298.000000 88804.00000 2.646359e+07
[2,]         1   5.697093    32.45687 1.849098e+02
[3,]         1  35.000000  1225.00000 4.287500e+04
[4,]         1   0.050000     0.00250 1.250000e-04
[5,]         1   0.010000     0.00010 1.000000e-06
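For the single-column case, each row is simply the powers 0 through 3 of one value, so the table can be checked by hand. A minimal plain-Python sketch, assuming the same five input values as in the question:

```python
import math

x1, x2, x3, x4 = 298, 35, 0.05, 0.01
column = [x1, math.log(x1), x2, x3, x4]

# One feature, five observations: each row is [1, x, x^2, x^3].
rows = [[x ** p for p in range(4)] for x in column]
for r in rows:
    print(r)
```

With only one feature there are no cross terms, which is why this matrix has 4 columns instead of the 56 produced when the five values are treated as separate features.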
