通过优化求矩阵

发布于 2024-08-07 08:27:23 字数 240 浏览 13 评论 0原文

我正在寻找解决以下问题的算法:

我有两组向量,并且我想找到最接近从输入向量到输出向量的变换的矩阵。

向量是 3x1,所以矩阵是 3x3。

这是普遍的问题。我的特殊问题是我有一组 RGB 颜色,以及另一组包含所需颜色的颜色。我正在尝试找到一种 RGB 到 RGB 的转换,使我的颜色更接近所需的颜色。

输入和输出向量之间存在对应关系,因此计算应最小化的误差函数是容易的部分。但我怎样才能最小化这个功能呢?

I am looking for algorithm to solve the following problem :

I have two sets of vectors, and I want to find the matrix that best approximate the transformation from the input vectors to the output vectors.

vectors are 3x1, so matrix is 3x3.

This is the general problem. My particular problem is I have a set of RGB colors, and another set that contains the desired color. I am trying to find an RGB to RGB transformation that would give me colors closer to the desired ones.

There is correspondence between the input and output vectors, so computing an error function that should be minimized is the easy part. But how can I minimize this function ?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

静谧 2024-08-14 08:27:23

这是一个经典的线性代数问题,搜索的关键词是“多元线性回归”。

多年来我不得不多次编写这种变体的代码。例如,校准数字化仪平板电脑或手写笔触摸屏的代码使用相同的数学。


数学公式如下:

p 为输入向量,q 为相应的输出向量。

你想要的变换是一个3x3矩阵;称之为A

对于单个输入和输出向量pq,存在一个误差向量e

e = q - A x p

误差大小的平方是一个标量值:

eT x e = (q - A x p)T x (q - A< /strong> x p)

(其中 T 运算符是转置)。

您真正想要最小化的是集合上的 e 值的总和:

E = sum (e)

该最小值满足矩阵方程 < strong>D = 0 其中

D(i,j) = EA 的偏导数(i, j)

假设您有 N 个输入和输出向量。

您的输入 3 向量集是一个 3xN 矩阵;将此矩阵称为P
P 的第 i 列是第 i 个输入向量。

输出 3 向量集也是如此;将此矩阵称为Q

当您研究完所有代数时,解决方案是

A = Q x PT x (P x PT) ^-1

(其中 ^-1 是逆运算符 - 抱歉没有上标或下标)


算法如下:

从集合中创建 3xN 矩阵 P输入向量。

从输出向量集创建 3xN 矩阵 Q

矩阵乘法R = P x 转置(P)

计算R的逆

矩阵乘法A = Q x 转置(P) x 逆(R)

使用您选择的线性代数库的矩阵乘法和矩阵求逆例程。


但是,3x3 仿射变换矩阵能够缩放和旋转输入向量,但不能进行任何平移!对于您的问题来说,这可能不够普遍。通常最好在每个 3 向量的末尾附加一个“1”以形成 4 向量,并寻找能够最小化误差的最佳 3x4 变换矩阵。这不会造成伤害;它只会导致更好的数据拟合。

This is a classic linear algebra problem, the key phrase to search on is "multiple linear regression".

I've had to code some variation of this many times over the years. For example, code to calibrate a digitizer tablet or stylus touch-screen uses the same math.


Here's the math:

Let p be an input vector and q the corresponding output vector.

The transformation you want is a 3x3 matrix; call it A.

For a single input and output vector p and q, there is an error vector e

e = q - A x p

The square of the magnitude of the error is a scalar value:

eT x e = (q - A x p)T x (q - A x p)

(where the T operator is transpose).

What you really want to minimize is the sum of e values over the sets:

E = sum (e)

This minimum satisfies the matrix equation D = 0 where

D(i,j) = the partial derivative of E with respect to A(i,j)

Say you have N input and output vectors.

Your set of input 3-vectors is a 3xN matrix; call this matrix P.
The ith column of P is the ith input vector.

So is the set of output 3-vectors; call this matrix Q.

When you grind thru all of the algebra, the solution is

A = Q x PT x (P x PT) ^-1

(where ^-1 is the inverse operator -- sorry about no superscripts or subscripts)


Here's the algorithm:

Create the 3xN matrix P from the set of input vectors.

Create the 3xN matrix Q from the set of output vectors.

Matrix Multiply R = P x transpose (P)

Compute the inverseof R

Matrix Multiply A = Q x transpose(P) x inverse (R)

using the matrix multiplication and matrix inversion routines of your linear algebra library of choice.


However, a 3x3 affine transform matrix is capable of scaling and rotating the input vectors, but not doing any translation! This might not be general enough for your problem. It's usually a good idea to append a "1" on the end of each of the 3-vectors to make then a 4-vector, and look for the best 3x4 transform matrix that minimizes the error. This can't hurt; it can only lead to a better fit of the data.

暖风昔人 2024-08-14 08:27:23

您没有指定语言,但以下是我在 Matlab 中解决该问题的方法。

  • v1 是一个 3xn 矩阵,包含垂直向量中的输入颜色
  • v2 也是一个包含输出颜色的 3xn 矩阵

您想要求解该系统

M*v1 = v2
M = v2*inv(v1)

但是,v1 不能直接可逆,因为它不是方阵。 Matlab 将使用 mrdivide 运算自动求解该问题 (M = v2/v1),其中 M 是最佳拟合解。

eg: 
>> v1 = rand(3,10);
>> M = rand(3,3);
>> v2 = M * v1;
>> v2/v1 - M

ans =

   1.0e-15 *

    0.4510    0.4441   -0.5551
    0.2220    0.1388   -0.3331
    0.4441    0.2220   -0.4441

>> (v2 + randn(size(v2))*0.1)/v1 - M
ans =

    0.0598   -0.1961    0.0931
   -0.1684    0.0509    0.1465
   -0.0931   -0.0009    0.0213

提供了一个与语言无关的解决方案解决问题。

You don't specify a language, but here's how I would approach the problem in Matlab.

  • v1 is a 3xn matrix, containing your input colors in vertical vectors
  • v2 is also a 3xn matrix containing your output colors

You want to solve the system

M*v1 = v2
M = v2*inv(v1)

However, v1 is not directly invertible, since it's not a square matrix. Matlab will solve this automatically with the mrdivide operation (M = v2/v1), where M is the best fit solution.

eg: 
>> v1 = rand(3,10);
>> M = rand(3,3);
>> v2 = M * v1;
>> v2/v1 - M

ans =

   1.0e-15 *

    0.4510    0.4441   -0.5551
    0.2220    0.1388   -0.3331
    0.4441    0.2220   -0.4441

>> (v2 + randn(size(v2))*0.1)/v1 - M
ans =

    0.0598   -0.1961    0.0931
   -0.1684    0.0509    0.1465
   -0.0931   -0.0009    0.0213

This gives a more language-agnostic solution on how to solve the problem.

韵柒 2024-08-14 08:27:23

一些线性代数应该足够了:

写出输入和输出之间的平均平方差(每个输入和输出值之间的每个差的平方和)。我假设这是“最佳近似”的定义,

这是 9 个未知矩阵系数的二次函数。

为了最小化它,请针对它们中的每一个导出它。

您将得到一个由 9 个方程组成的线性系统,您必须求解才能获得解(唯一的或空间变化,具体取决于输入集)

当差函数不是二次时,您可以执行相同的操作,但必须使用迭代方法求解方程组。

Some linear algebra should be enough :

Write the average squared difference between inputs and outputs ( the sum of the squares of each difference between each input and output value ). I assume this as definition of "best approximate"

This is a quadratic function of your 9 unknown matrix coefficients.

To minimize it, derive it with respect to each of them.

You will get a linear system of 9 equations you have to solve to get the solution ( unique or a space variety depending on the input set )

When the difference function is not quadratic, you can do the same but you have to use an iterative method to solve the equation system.

深爱成瘾 2024-08-14 08:27:23

我认为这个答案更适合初学者:

有以下场景:

在此处输入图像描述

我们不知道矩阵 M,但我们知道向量 In 和相应的输出向量On。 n 的范围可以是 3 及以上。

如果我们有 3 个输入向量和 3 个输出向量(对于 3x3 矩阵),我们可以精确计算系数 αr;c。这样我们就拥有了一个完全指定的系统。

但我们有超过 3 个向量,因此我们有一个超定方程组。

让我们写下这些方程。假设我们有这些向量:
输入图片description here

我们知道,要得到向量On,我们必须与向量In进行矩阵乘法。
换句话说: M · I̅n = O̅n
如果我们扩展这个操作,我们得到(正规方程):
输入图片这里的描述

我们不知道阿尔法,但我们知道其余的一切。事实上,有 9 个未知数,但有 12 个方程。这就是系统超定的原因。方程多于未知数。我们将使用所有方程来近似未知数,并且我们将使用平方和将更多方程聚合为更少的未知数。

所以我们将上面的方程组合成矩阵形式:

在此处输入图像描述
通过一些最小二乘代数魔法(回归),我们可以求解 b̅:
输入图片此处描述
这就是该公式背后发生的事情:
转置矩阵并将其与其非转置部分相乘会创建一个方阵,并降低至较低维度 ([12x9] · [9x12] = [9x9])。
这个结果的倒数使我们能够求解 b̅。
将向量 y̅ 与转置的 x 相乘可将 y̅ 向量降低到较低的 [1x9] 维度。然后,通过将 [9x9] 逆乘以 [1x9] 向量,我们求解了 b̅ 的系统。

现在,我们采用 [1x9] 结果向量并从中创建一个矩阵。这是我们的近似变换矩阵。

python 代码:

import numpy as np
import numpy.linalg

INPUTS  = [[5,6,2],[1,7,3],[2,6,5],[1,7,5]]
OUTPUTS = [[3,7,1],[3,7,1],[3,7,2],[3,7,2]]

def get_mat(inputs, outputs, entry_len):
    n_of_vectors = inputs.__len__()
    noe = n_of_vectors*entry_len# Number of equations
    #We need to construct the input matrix.
    #We need to linearize the matrix. SO we will flatten the matrix array such as [a11, a12, a21, a22]
    #So for each row we combine the row's variables with each input vector.
    X_mat = []
    for in_n in range(0, n_of_vectors): #For each input vector
        #populate all matrix flattened variables. for 2x2 matrix - 4 variables, for 3x3 - 9 variables and so on.
        base = 0
        for col_n in range(0, entry_len): #Each original unknown matrix's row must be matched to all entries in the input vector
            row = [0 for i in range(0, entry_len ** 2)]
            for entry in inputs[in_n]:
                row[base] = entry
                base+=1
            X_mat.append(row)
    Y_mat = [item for sublist in outputs for item in sublist]

    X_np = np.array(X_mat)
    Y_np = np.array([Y_mat]).T
    solution = np.dot(np.dot(numpy.linalg.inv(np.dot(X_np.T,X_np)),X_np.T),Y_np)
    var_mat = solution.reshape(entry_len, entry_len) #create square matrix
    return var_mat


transf_mat = get_mat(INPUTS, OUTPUTS, 3) #3 means 3x3 matrix, and in/out vector size 3
print(transf_mat)
for i in range(0,INPUTS.__len__()):
    o = np.dot(transf_mat, np.array([INPUTS[i]]).T)
    print(f"{INPUTS[i]} x [M] = {o.T} ({OUTPUTS[i]})")

输出如下:

[[ 0.13654096  0.35890767  0.09530002]
 [ 0.31859558  0.83745124  0.22236671]
 [ 0.08322497 -0.0526658   0.4417611 ]]
[5, 6, 2] x [M] = [[3.02675088 7.06241873 0.98365224]] ([3, 7, 1])
[1, 7, 3] x [M] = [[2.93479472 6.84785436 1.03984767]] ([3, 7, 1])
[2, 6, 5] x [M] = [[2.90302805 6.77373212 2.05926064]] ([3, 7, 2])
[1, 7, 5] x [M] = [[3.12539476 7.29258778 1.92336987]] ([3, 7, 2])

您可以看到,它接受了所有指定的输入,获得了转换后的输出并将输出与参考向量进行匹配。结果并不精确,因为我们从过度指定的系统中得到了近似值。如果我们仅使用 3 个向量的 INPUT 和 OUTPUT,结果将是准确的。

This answer is better for beginners in my opinion:

Have the following scenario:

enter image description here

We don't know the matrix M, but we know the vector In and a corresponding output vector On. n can range from 3 and up.

If we had 3 input vectors and 3 output vectors (for 3x3 matrix), we could precisely compute the coefficients αr;c. This way we would have a fully specified system.

But we have more than 3 vectors and thus we have an overdetermined system of equations.

Let's write down these equations. Say that we have these vectors:
enter image description here

We know, that to get the vector On, we must perform matrix multiplication with vector In.
In other words: M · I̅n = O̅n
If we expand this operation, we get (normal equations):
enter image description here

We do not know the alphas, but we know all the rest. In fact, there are 9 unknowns, but 12 equations. This is why the system is overdetermined. There are more equations than unknowns. We will approximate the unknowns using all the equations, and we will use the sum of squares to aggregate more equations into less unknowns.

So we will combine the above equations into a matrix form:

enter image description here
And with some least squares algebra magic (regression), we can solve for b̅:
enter image description here
This is what is happening behind that formula:
Transposing a matrix and multiplying it with its non-transposed part creates a square matrix, reduced to lower dimension ([12x9] · [9x12] = [9x9]).
Inverse of this result allows us to solve for b̅.
Multiplying vector y̅ with transposed x reduces the y̅ vector into lower [1x9] dimension. Then, by multiplying [9x9] inverse with [1x9] vector we solved the system for b̅.

Now, we take the [1x9] result vector and create a matrix from it. This is our approximated transformation matrix.

A python code:

import numpy as np
import numpy.linalg

INPUTS  = [[5,6,2],[1,7,3],[2,6,5],[1,7,5]]
OUTPUTS = [[3,7,1],[3,7,1],[3,7,2],[3,7,2]]

def get_mat(inputs, outputs, entry_len):
    n_of_vectors = inputs.__len__()
    noe = n_of_vectors*entry_len# Number of equations
    #We need to construct the input matrix.
    #We need to linearize the matrix. SO we will flatten the matrix array such as [a11, a12, a21, a22]
    #So for each row we combine the row's variables with each input vector.
    X_mat = []
    for in_n in range(0, n_of_vectors): #For each input vector
        #populate all matrix flattened variables. for 2x2 matrix - 4 variables, for 3x3 - 9 variables and so on.
        base = 0
        for col_n in range(0, entry_len): #Each original unknown matrix's row must be matched to all entries in the input vector
            row = [0 for i in range(0, entry_len ** 2)]
            for entry in inputs[in_n]:
                row[base] = entry
                base+=1
            X_mat.append(row)
    Y_mat = [item for sublist in outputs for item in sublist]

    X_np = np.array(X_mat)
    Y_np = np.array([Y_mat]).T
    solution = np.dot(np.dot(numpy.linalg.inv(np.dot(X_np.T,X_np)),X_np.T),Y_np)
    var_mat = solution.reshape(entry_len, entry_len) #create square matrix
    return var_mat


transf_mat = get_mat(INPUTS, OUTPUTS, 3) #3 means 3x3 matrix, and in/out vector size 3
print(transf_mat)
for i in range(0,INPUTS.__len__()):
    o = np.dot(transf_mat, np.array([INPUTS[i]]).T)
    print(f"{INPUTS[i]} x [M] = {o.T} ({OUTPUTS[i]})")

The output is as such:

[[ 0.13654096  0.35890767  0.09530002]
 [ 0.31859558  0.83745124  0.22236671]
 [ 0.08322497 -0.0526658   0.4417611 ]]
[5, 6, 2] x [M] = [[3.02675088 7.06241873 0.98365224]] ([3, 7, 1])
[1, 7, 3] x [M] = [[2.93479472 6.84785436 1.03984767]] ([3, 7, 1])
[2, 6, 5] x [M] = [[2.90302805 6.77373212 2.05926064]] ([3, 7, 2])
[1, 7, 5] x [M] = [[3.12539476 7.29258778 1.92336987]] ([3, 7, 2])

You can see, that it took all the specified inputs, got the transformed outputs and matched the outputs to the reference vectors. The results are not precise, since we have an approximation from the overspecified system. If we used INPUT and OUTPUT with only 3 vectors, the result would be exact.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文