当前位置：文江博客话题详情

通过优化求矩阵

发布于 2024-08-07 08:27:23 字数 240 浏览 13 评论 0原文

我正在寻找解决以下问题的算法：

我有两组向量，并且我想找到最接近从输入向量到输出向量的变换的矩阵。

向量是 3x1，所以矩阵是 3x3。

这是普遍的问题。我的特殊问题是我有一组 RGB 颜色，以及另一组包含所需颜色的颜色。我正在尝试找到一种 RGB 到 RGB 的转换，使我的颜色更接近所需的颜色。

输入和输出向量之间存在对应关系，因此计算应最小化的误差函数是容易的部分。但我怎样才能最小化这个功能呢？

原文

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

静谧 2024-08-14 08:27:23

这是一个经典的线性代数问题，搜索的关键词是“多元线性回归”。

多年来我不得不多次编写这种变体的代码。例如，校准数字化仪平板电脑或手写笔触摸屏的代码使用相同的数学。

数学公式如下：

设 p 为输入向量，q 为相应的输出向量。

你想要的变换是一个3x3矩阵；称之为A。

对于单个输入和输出向量p和q，存在一个误差向量e

e = q - A x p

误差大小的平方是一个标量值：

eT x e = (q - A x p)T x (q - A< /strong> x p)

（其中 T 运算符是转置）。

您真正想要最小化的是集合上的 e 值的总和：

E = sum (e)

该最小值满足矩阵方程 < strong>D = 0 其中

D(i,j) = E 对 A 的偏导数(i, j)

假设您有 N 个输入和输出向量。

您的输入 3 向量集是一个 3xN 矩阵；将此矩阵称为P。
P 的第 i 列是第 i 个输入向量。

输出 3 向量集也是如此；将此矩阵称为Q。

当您研究完所有代数时，解决方案是

A = Q x PT x (P x PT) ^-1

（其中 ^-1 是逆运算符 - 抱歉没有上标或下标）

算法如下：

从集合中创建 3xN 矩阵 P输入向量。

从输出向量集创建 3xN 矩阵 Q。

矩阵乘法R = P x 转置(P)

计算R的逆

矩阵乘法A = Q x 转置(P) x 逆(R)

使用您选择的线性代数库的矩阵乘法和矩阵求逆例程。

但是，3x3 仿射变换矩阵能够缩放和旋转输入向量，但不能进行任何平移！对于您的问题来说，这可能不够普遍。通常最好在每个 3 向量的末尾附加一个“1”以形成 4 向量，并寻找能够最小化误差的最佳 3x4 变换矩阵。这不会造成伤害；它只会导致更好的数据拟合。

回复收藏 0 原文

暖风昔人 2024-08-14 08:27:23

您没有指定语言，但以下是我在 Matlab 中解决该问题的方法。

v1 是一个 3xn 矩阵，包含垂直向量中的输入颜色
v2 也是一个包含输出颜色的 3xn 矩阵

您想要求解该系统

M*v1 = v2
M = v2*inv(v1)

但是，v1 不能直接可逆，因为它不是方阵。 Matlab 将使用 mrdivide 运算自动求解该问题 (M = v2/v1)，其中 M 是最佳拟合解。

eg: 
>> v1 = rand(3,10);
>> M = rand(3,3);
>> v2 = M * v1;
>> v2/v1 - M

ans =

   1.0e-15 *

    0.4510    0.4441   -0.5551
    0.2220    0.1388   -0.3331
    0.4441    0.2220   -0.4441

>> (v2 + randn(size(v2))*0.1)/v1 - M
ans =

    0.0598   -0.1961    0.0931
   -0.1684    0.0509    0.1465
   -0.0931   -0.0009    0.0213

这提供了一个与语言无关的解决方案解决问题。

You don't specify a language, but here's how I would approach the problem in Matlab.

v1 is a 3xn matrix, containing your input colors in vertical vectors
v2 is also a 3xn matrix containing your output colors

You want to solve the system

M*v1 = v2
M = v2*inv(v1)

However, v1 is not directly invertible, since it's not a square matrix. Matlab will solve this automatically with the mrdivide operation (M = v2/v1), where M is the best fit solution.

eg: 
>> v1 = rand(3,10);
>> M = rand(3,3);
>> v2 = M * v1;
>> v2/v1 - M

ans =

   1.0e-15 *

    0.4510    0.4441   -0.5551
    0.2220    0.1388   -0.3331
    0.4441    0.2220   -0.4441

>> (v2 + randn(size(v2))*0.1)/v1 - M
ans =

    0.0598   -0.1961    0.0931
   -0.1684    0.0509    0.1465
   -0.0931   -0.0009    0.0213

This gives a more language-agnostic solution on how to solve the problem.

回复收藏 0 原文

韵柒 2024-08-14 08:27:23

一些线性代数应该足够了：

写出输入和输出之间的平均平方差（每个输入和输出值之间的每个差的平方和）。我假设这是“最佳近似”的定义，

这是 9 个未知矩阵系数的二次函数。

为了最小化它，请针对它们中的每一个导出它。

您将得到一个由 9 个方程组成的线性系统，您必须求解才能获得解（唯一的或空间变化，具体取决于输入集）

当差函数不是二次时，您可以执行相同的操作，但必须使用迭代方法求解方程组。

回复收藏 0 原文

深爱成瘾 2024-08-14 08:27:23

我认为这个答案更适合初学者：

有以下场景：

我们不知道矩阵 M，但我们知道向量 I_n 和相应的输出向量O_n。 n 的范围可以是 3 及以上。

如果我们有 3 个输入向量和 3 个输出向量（对于 3x3 矩阵），我们可以精确计算系数 α_r;c。这样我们就拥有了一个完全指定的系统。

但我们有超过 3 个向量，因此我们有一个超定方程组。

让我们写下这些方程。假设我们有这些向量：

我们知道，要得到向量O_n，我们必须与向量I_n进行矩阵乘法。
换句话说： M · I̅_n = O̅_n
如果我们扩展这个操作，我们得到（正规方程）：

我们不知道阿尔法，但我们知道其余的一切。事实上，有 9 个未知数，但有 12 个方程。这就是系统超定的原因。方程多于未知数。我们将使用所有方程来近似未知数，并且我们将使用平方和将更多方程聚合为更少的未知数。

所以我们将上面的方程组合成矩阵形式：

通过一些最小二乘代数魔法（回归），我们可以求解 b̅：

这就是该公式背后发生的事情：
转置矩阵并将其与其非转置部分相乘会创建一个方阵，并降低至较低维度 ([12x9] · [9x12] = [9x9])。
这个结果的倒数使我们能够求解 b̅。
将向量 y̅ 与转置的 x 相乘可将 y̅ 向量降低到较低的 [1x9] 维度。然后，通过将 [9x9] 逆乘以 [1x9] 向量，我们求解了 b̅ 的系统。

现在，我们采用 [1x9] 结果向量并从中创建一个矩阵。这是我们的近似变换矩阵。

python 代码：

import numpy as np
import numpy.linalg

INPUTS  = [[5,6,2],[1,7,3],[2,6,5],[1,7,5]]
OUTPUTS = [[3,7,1],[3,7,1],[3,7,2],[3,7,2]]

def get_mat(inputs, outputs, entry_len):
    n_of_vectors = inputs.__len__()
    noe = n_of_vectors*entry_len# Number of equations
    #We need to construct the input matrix.
    #We need to linearize the matrix. SO we will flatten the matrix array such as [a11, a12, a21, a22]
    #So for each row we combine the row's variables with each input vector.
    X_mat = []
    for in_n in range(0, n_of_vectors): #For each input vector
        #populate all matrix flattened variables. for 2x2 matrix - 4 variables, for 3x3 - 9 variables and so on.
        base = 0
        for col_n in range(0, entry_len): #Each original unknown matrix's row must be matched to all entries in the input vector
            row = [0 for i in range(0, entry_len ** 2)]
            for entry in inputs[in_n]:
                row[base] = entry
                base+=1
            X_mat.append(row)
    Y_mat = [item for sublist in outputs for item in sublist]

    X_np = np.array(X_mat)
    Y_np = np.array([Y_mat]).T
    solution = np.dot(np.dot(numpy.linalg.inv(np.dot(X_np.T,X_np)),X_np.T),Y_np)
    var_mat = solution.reshape(entry_len, entry_len) #create square matrix
    return var_mat


transf_mat = get_mat(INPUTS, OUTPUTS, 3) #3 means 3x3 matrix, and in/out vector size 3
print(transf_mat)
for i in range(0,INPUTS.__len__()):
    o = np.dot(transf_mat, np.array([INPUTS[i]]).T)
    print(f"{INPUTS[i]} x [M] = {o.T} ({OUTPUTS[i]})")

输出如下：

[[ 0.13654096  0.35890767  0.09530002]
 [ 0.31859558  0.83745124  0.22236671]
 [ 0.08322497 -0.0526658   0.4417611 ]]
[5, 6, 2] x [M] = [[3.02675088 7.06241873 0.98365224]] ([3, 7, 1])
[1, 7, 3] x [M] = [[2.93479472 6.84785436 1.03984767]] ([3, 7, 1])
[2, 6, 5] x [M] = [[2.90302805 6.77373212 2.05926064]] ([3, 7, 2])
[1, 7, 5] x [M] = [[3.12539476 7.29258778 1.92336987]] ([3, 7, 2])

您可以看到，它接受了所有指定的输入，获得了转换后的输出并将输出与参考向量进行匹配。结果并不精确，因为我们从过度指定的系统中得到了近似值。如果我们仅使用 3 个向量的 INPUT 和 OUTPUT，结果将是准确的。

This answer is better for beginners in my opinion:

Have the following scenario:

We don't know the matrix M, but we know the vector I_n and a corresponding output vector O_n. n can range from 3 and up.

If we had 3 input vectors and 3 output vectors (for 3x3 matrix), we could precisely compute the coefficients α_r;c. This way we would have a fully specified system.

But we have more than 3 vectors and thus we have an overdetermined system of equations.

Let's write down these equations. Say that we have these vectors:

We know, that to get the vector O_n, we must perform matrix multiplication with vector I_n.
In other words: M · I̅_n = O̅_n
If we expand this operation, we get (normal equations):

We do not know the alphas, but we know all the rest. In fact, there are 9 unknowns, but 12 equations. This is why the system is overdetermined. There are more equations than unknowns. We will approximate the unknowns using all the equations, and we will use the sum of squares to aggregate more equations into less unknowns.

So we will combine the above equations into a matrix form:

And with some least squares algebra magic (regression), we can solve for b̅:

This is what is happening behind that formula:
Transposing a matrix and multiplying it with its non-transposed part creates a square matrix, reduced to lower dimension ([12x9] · [9x12] = [9x9]).
Inverse of this result allows us to solve for b̅.
Multiplying vector y̅ with transposed x reduces the y̅ vector into lower [1x9] dimension. Then, by multiplying [9x9] inverse with [1x9] vector we solved the system for b̅.

Now, we take the [1x9] result vector and create a matrix from it. This is our approximated transformation matrix.

A python code:

import numpy as np
import numpy.linalg

INPUTS  = [[5,6,2],[1,7,3],[2,6,5],[1,7,5]]
OUTPUTS = [[3,7,1],[3,7,1],[3,7,2],[3,7,2]]

def get_mat(inputs, outputs, entry_len):
    n_of_vectors = inputs.__len__()
    noe = n_of_vectors*entry_len# Number of equations
    #We need to construct the input matrix.
    #We need to linearize the matrix. SO we will flatten the matrix array such as [a11, a12, a21, a22]
    #So for each row we combine the row's variables with each input vector.
    X_mat = []
    for in_n in range(0, n_of_vectors): #For each input vector
        #populate all matrix flattened variables. for 2x2 matrix - 4 variables, for 3x3 - 9 variables and so on.
        base = 0
        for col_n in range(0, entry_len): #Each original unknown matrix's row must be matched to all entries in the input vector
            row = [0 for i in range(0, entry_len ** 2)]
            for entry in inputs[in_n]:
                row[base] = entry
                base+=1
            X_mat.append(row)
    Y_mat = [item for sublist in outputs for item in sublist]

    X_np = np.array(X_mat)
    Y_np = np.array([Y_mat]).T
    solution = np.dot(np.dot(numpy.linalg.inv(np.dot(X_np.T,X_np)),X_np.T),Y_np)
    var_mat = solution.reshape(entry_len, entry_len) #create square matrix
    return var_mat


transf_mat = get_mat(INPUTS, OUTPUTS, 3) #3 means 3x3 matrix, and in/out vector size 3
print(transf_mat)
for i in range(0,INPUTS.__len__()):
    o = np.dot(transf_mat, np.array([INPUTS[i]]).T)
    print(f"{INPUTS[i]} x [M] = {o.T} ({OUTPUTS[i]})")

The output is as such:

[[ 0.13654096  0.35890767  0.09530002]
 [ 0.31859558  0.83745124  0.22236671]
 [ 0.08322497 -0.0526658   0.4417611 ]]
[5, 6, 2] x [M] = [[3.02675088 7.06241873 0.98365224]] ([3, 7, 1])
[1, 7, 3] x [M] = [[2.93479472 6.84785436 1.03984767]] ([3, 7, 1])
[2, 6, 5] x [M] = [[2.90302805 6.77373212 2.05926064]] ([3, 7, 2])
[1, 7, 5] x [M] = [[3.12539476 7.29258778 1.92336987]] ([3, 7, 2])

You can see, that it took all the specified inputs, got the transformed outputs and matched the outputs to the reference vectors. The results are not precise, since we have an approximation from the overspecified system. If we used INPUT and OUTPUT with only 3 vectors, the result would be exact.

回复收藏 0 原文

~没有更多了~