Aligning the truncated SVD from sklearn.decomposition with np.linalg.svd

Posted 2025-01-29 21:03:41


=========update==========

I came across this statement in a book:

The matrix that is actually returned by TruncatedSVD is the dot product of the U and S matrices.

Then I tried just multiplying U and Sigma:

US = U.dot(Sigma)
print("==>> US: ", US)

This time it produces the same result, up to sign flips. So why doesn't TruncatedSVD need to multiply by VT?
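A minimal self-contained sketch of the identity at work here (reusing the matrix A from the code below): since A = U Σ Vᵀ and the rows of VT are orthonormal, right-multiplying by V gives A V = U Σ, so projecting A onto the top-k right singular vectors is the same as taking the first k columns of U Σ. No extra multiplication by VT is involved:

```python
import numpy as np

A = np.array([
    [1, 1, 1, 0, 0],
    [3, 3, 3, 0, 0],
    [4, 4, 4, 0, 0],
    [5, 5, 5, 0, 0],
    [0, 2, 0, 4, 4],
    [0, 0, 0, 5, 5],
    [0, 1, 0, 2, 2]], dtype=float)

U, s, VT = np.linalg.svd(A)
k = 2
proj = A @ VT.T[:, :k]        # A @ V_k: project onto top-k right singular vectors
us = U[:, :k] * s[:k]         # U_k @ Sigma_k (diagonal Sigma, so just scale columns)
print(np.allclose(proj, us))  # True
```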

==========previous question===========

I am learning SVD. I found that NumPy and sklearn both provide related APIs, so I tried to use them for dimensionality reduction. Below is the code:

import numpy as np
np.set_printoptions(precision=2, suppress=True)
A = np.array([
    [1,1,1,0,0],
    [3,3,3,0,0],
    [4,4,4,0,0],
    [5,5,5,0,0],
    [0,2,0,4,4],
    [0,0,0,5,5],
    [0,1,0,2,2]])
U, s, VT = np.linalg.svd(A)
print("==>> U: ", U)
print("==>> VT: ", VT)

# create m x n Sigma matrix
Sigma = np.zeros((A.shape[0], A.shape[1]))
# populate Sigma with n x n diagonal matrix
square_len = min((A.shape[0], A.shape[1]))
Sigma[:square_len, :square_len] = np.diag(s)

print("==>> Sigma: ", Sigma)

n_elements = 2
U = U[:, :n_elements]
Sigma = Sigma[:n_elements, :n_elements]
VT = VT[:n_elements, :n_elements]

# reconstruct
B = U.dot(Sigma.dot(VT))
print("==>> B: ", B)

The output B is:

==>> B:  [[ 0.99  1.01]
 [ 2.98  3.04]
 [ 3.98  4.05]
 [ 4.97  5.06]
 [ 0.36  1.29]
 [-0.37  0.73]
 [ 0.18  0.65]]

Then this is the sklearn code:

import numpy as np
from sklearn.decomposition import TruncatedSVD

A = np.array([
    [1,1,1,0,0],
    [3,3,3,0,0],
    [4,4,4,0,0],
    [5,5,5,0,0],
    [0,2,0,4,4],
    [0,0,0,5,5],
    [0,1,0,2,2]]).astype(float)
svd = TruncatedSVD(n_components=2)
svd.fit(A)  # Fit model on training data A
print("==>> right singular vectors: ", svd.components_)
print("==>> svd.singular_values_: ", svd.singular_values_)
B = svd.transform(A)  # Perform dimensionality reduction on A.
print("==>> B: ", B)

Its final output is:

==>> B:  [[ 1.72 -0.22]
 [ 5.15 -0.67]
 [ 6.87 -0.9 ]
 [ 8.59 -1.12]
 [ 1.91  5.62]
 [ 0.9   6.95]
 [ 0.95  2.81]]

As we can see, they produce different results (although I notice their singular values are the same: both are 12.48 and 9.51). How can I make them the same? Am I misunderstanding something?


Comments (1)

爱本泡沫多脆弱 2025-02-05 21:03:41


I think the correct way to perform a dimensionality reduction of the array A with np.linalg.svd is:

U, s, Vh = np.linalg.svd(A)  # the third return value is V transposed
V = Vh.T
B = A @ V[:, :n_elements]    # project A onto the first n_elements right singular vectors

Now B is:

array([[-1.72,  0.22],
       [-5.15,  0.67],
       [-6.87,  0.9 ],
       [-8.59,  1.12],
       [-1.91, -5.62],
       [-0.9 , -6.95],
       [-0.95, -2.81]])

That is exactly what you get from TruncatedSVD, just with the signs flipped.
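The sign difference is expected: singular vectors are only determined up to sign, so the two libraries can legitimately differ by a per-column sign flip. A minimal sketch of how one might verify the two projections agree (the column-wise sign alignment here is my own comparison step, not part of either API):

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD

A = np.array([
    [1, 1, 1, 0, 0],
    [3, 3, 3, 0, 0],
    [4, 4, 4, 0, 0],
    [5, 5, 5, 0, 0],
    [0, 2, 0, 4, 4],
    [0, 0, 0, 5, 5],
    [0, 1, 0, 2, 2]], dtype=float)

# Projection via the full SVD: A @ V_k.
U, s, Vh = np.linalg.svd(A)
B_np = A @ Vh.T[:, :2]

# Projection via TruncatedSVD (fit_transform returns U_k * Sigma_k).
B_sk = TruncatedSVD(n_components=2, random_state=0).fit_transform(A)

# Each column is determined only up to sign: flip the NumPy columns so
# their sign convention matches sklearn's, then compare.
signs = np.sign(np.sum(B_np * B_sk, axis=0))
print(np.allclose(B_np * signs, B_sk, atol=1e-6))
```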
