Limitations of the Levenberg-Marquardt algorithm

Posted 2024-10-07 08:35:38

I am using the Levenberg-Marquardt algorithm (LMA) to minimize a non-linear function of 6 parameters. I have about 50 data points for each minimization, but I do not get sufficiently accurate results. Can the fact that my parameters differ from each other by a few orders of magnitude be that significant? If yes, where should I look for a solution? If no, what kinds of limitations of LMA have you met in your work (it may help to find other problems with my application)?
Many thanks for your help.

Edit: The problem I am trying to solve is to determine the best transformation T:

typedef struct 
{
    double x_translation, y_translation, z_translation; 
    double x_rotation, y_rotation, z_rotation;
} transform_3D;

to fit a set of 3D points to a bunch of 3D lines. In detail, I have a set of coordinates of 3D points and the equations of the corresponding 3D lines, which should pass through those points (in the ideal situation). The LMA minimizes the sum of the distances from the transformed 3D points to the corresponding 3D lines.
The transform function is as follows:

cv::Point3d Geometry::transformation_3D(cv::Point3d point, transform_3D transformation)
{
    cv::Point3d p_odd,p_even;

    //rotation x
    p_odd.x=point.x;
    p_odd.y=point.y*cos(transformation.x_rotation)-point.z*sin(transformation.x_rotation); 
    p_odd.z=point.y*sin(transformation.x_rotation)+point.z*cos(transformation.x_rotation);

    //rotation y
    p_even.x=p_odd.z*sin(transformation.y_rotation)+p_odd.x*cos(transformation.y_rotation);
    p_even.y=p_odd.y;
    p_even.z=p_odd.z*cos(transformation.y_rotation)-p_odd.x*sin(transformation.y_rotation);

    //rotation z
    p_odd.x=p_even.x*cos(transformation.z_rotation)-p_even.y*sin(transformation.z_rotation);
    p_odd.y=p_even.x*sin(transformation.z_rotation)+p_even.y*cos(transformation.z_rotation);
    p_odd.z=p_even.z;

    //translation
    p_even.x=p_odd.x+transformation.x_translation;
    p_even.y=p_odd.y+transformation.y_translation;
    p_even.z=p_odd.z+transformation.z_translation;

    return p_even;
}

Hope this explanation will help a bit...
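
The quantity being minimized is built from point-to-line distances. A minimal sketch of that distance, assuming each line is given by a center point and a (not necessarily unit) direction vector as in the data description; `Vec3`, `cross`, `length` and `point_line_distance` are hypothetical names used only here, standing in for `cv::Point3d` and whatever the real code uses.

```cpp
#include <cmath>

// Minimal 3D vector type for this sketch (the original code uses cv::Point3d).
struct Vec3 { double x, y, z; };

// Cross product a x b.
Vec3 cross(const Vec3& a, const Vec3& b) {
    return {a.y * b.z - a.z * b.y,
            a.z * b.x - a.x * b.z,
            a.x * b.y - a.y * b.x};
}

// Euclidean length of v.
double length(const Vec3& v) {
    return std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
}

// Distance from point p to the line through `center` with direction `dir`.
// dir need not be normalized, but it must be non-zero.
double point_line_distance(const Vec3& p, const Vec3& center, const Vec3& dir) {
    const Vec3 d{p.x - center.x, p.y - center.y, p.z - center.z};
    return length(cross(d, dir)) / length(dir);
}
```

The cost for one parameter vector would then be the sum (or sum of squares) of `point_line_distance` over all transformed points and their lines.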

Edit2:

Some example data is pasted below. The 3D lines are described by a center point and a direction vector. The center point of every line is (0,0,0), and the 'uz' coordinate of each vector is equal to 1.
Set of 'ux' coordinates of the direction vectors:

-1.0986, -1.0986, -1.0986,
-1.0986, -1.0990, -1.0986,
-1.0986, -1.0986, -0.9995,
-0.9996, -0.9996, -0.9995,
-0.9995, -0.9995, -0.9996,
-0.9003, -0.9003, -0.9004,
-0.9003, -0.9003, -0.9003,
-0.9003, -0.9003, -0.8011,
-0.7020, -0.7019, -0.6028,
-0.5035, -0.5037, -0.4045,
-0.3052, -0.3053, -0.2062,
-0.1069, -0.1069, -0.1075,
-0.1070, -0.1070, -0.1069,
-0.1069, -0.1070, -0.0079,
-0.0079, -0.0079, -0.0078,
-0.0078, -0.0079, -0.0079,
 0.0914,  0.0914,  0.0913,
 0.0913,  0.0914,  0.0915,
 0.0914,  0.0914

Set of 'uy' coordinates of the direction vectors:

-0.2032,  -0.0047,    0.1936,
0.3919,    0.5901,    0.7885,
0.9869,    1.1852,    -0.1040,
0.0944,    0.2927,    0.4911,
0.6894,    0.8877,    1.0860,
-0.2032,  -0.0047,    0.1936,
0.3919,    0.5902,    0.7885,
0.9869,    1.1852,    1.0860,
0.9869,    1.1852,    1.0861,
0.9865,    1.1853,    1.0860,
0.9870,    1.1852,    1.0861,
-0.2032,  -0.0047,    0.1937,
0.3919,    0.5902,    0.7885,
0.9869,    1.1852,    -0.1039,
0.0944,    0.2927,    0.4911,
0.6894,    0.8877,    1.0860,
-0.2032,  -0.0047,    0.1935,
0.3919,    0.5902,    0.7885,
0.9869,    1.1852

and the set of 3D points, each in {x, y, z} form:

 {{0, 0, 0}, {0, 16, 0},   {0, 32, 0}, 
 {0, 48, 0}, {0, 64, 0},   {0, 80, 0},
 {0, 96, 0}, {0, 112,0},   {8, 8, 0},
 {8, 24, 0}, {8, 40, 0},   {8, 56, 0}, 
 {8, 72, 0}, {8, 88, 0},   {8, 104, 0}, 
 {16, 0, 0}, {16, 16,0},   {16, 32, 0}, 
{16, 48, 0}, {16, 64, 0},  {16, 80, 0}, 
{16, 96, 0}, {16, 112, 0}, {24, 104, 0}, 
{32, 96, 0}, {32, 112, 0}, {40, 104, 0},
{48, 96, 0}, {48, 112, 0}, {56, 104, 0},
{64, 96, 0}, {64, 112, 0}, {72, 104, 0}, 
{80, 0, 0},  {80, 16, 0},  {80, 32, 0},
{80,48, 0},  {80, 64, 0},  {80, 80, 0}, 
{80, 96, 0}, {80, 112, 0}, {88,  8, 0}, 
{88, 24, 0}, {88, 40, 0},  {88, 56, 0},
{88, 72, 0}, {88, 88, 0},  {88, 104, 0},
{96, 0, 0},  {96, 16, 0},  {96, 32, 0}, 
{96, 48,0},  {96, 64, 0},  {96, 80, 0}, 
{96, 96, 0}, {96, 112, 0}}

This is a kind of "easy" modelled data set with very small rotations.

段念尘 2024-10-14 08:35:38

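Well, the proper way to use Levenberg-Marquardt is that you need a good initial estimate (a "seed") for your parameters. Recall that LM is a variant of Newton-Raphson; as with such iterative algorithms, the quality of your starting point will make or break your iteration: it either converges to what you want, converges to something completely different (not that unlikely, especially if you have a lot of parameters), or shoots off into the wild blue yonder (diverges).

In any event, it would be more helpful if you could mention the model function you are fitting, and possibly a scatter plot of your data; that might go a long way towards finding a workable solution.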
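
To make the role of the seed concrete, here is a minimal sketch of a basic LM iteration. It assumes a toy two-parameter model y = a * exp(b * x), which is not the poster's model; `Fit`, `cost` and `levenberg_marquardt` are hypothetical names, and the damping schedule (multiply or divide lambda by 10) is just one simple choice.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical toy model for illustration only: y = a * exp(b * x).
struct Fit { double a, b; };

// Sum of squared residuals of the toy model over the data.
double cost(const Fit& p, const std::vector<double>& xs, const std::vector<double>& ys) {
    double s = 0.0;
    for (std::size_t i = 0; i < xs.size(); ++i) {
        const double r = p.a * std::exp(p.b * xs[i]) - ys[i];
        s += r * r;
    }
    return s;
}

// Basic Levenberg-Marquardt loop starting from the seed p.
Fit levenberg_marquardt(Fit p, const std::vector<double>& xs,
                        const std::vector<double>& ys, int max_iter = 200) {
    double lambda = 1e-3;  // damping factor
    for (int it = 0; it < max_iter; ++it) {
        // Accumulate J^T J (2x2, symmetric) and J^T r (2x1).
        double h11 = 0, h12 = 0, h22 = 0, g1 = 0, g2 = 0;
        for (std::size_t i = 0; i < xs.size(); ++i) {
            const double e = std::exp(p.b * xs[i]);
            const double r = p.a * e - ys[i];
            const double ja = e;                // d r / d a
            const double jb = p.a * xs[i] * e;  // d r / d b
            h11 += ja * ja; h12 += ja * jb; h22 += jb * jb;
            g1 += ja * r;   g2 += jb * r;
        }
        // Solve (J^T J + lambda * diag(J^T J)) * delta = -J^T r by Cramer's rule.
        const double a11 = h11 * (1 + lambda), a22 = h22 * (1 + lambda);
        const double det = a11 * a22 - h12 * h12;
        if (det == 0.0) break;
        const double da = (-g1 * a22 + g2 * h12) / det;
        const double db = (-g2 * a11 + g1 * h12) / det;
        const Fit trial{p.a + da, p.b + db};
        if (cost(trial, xs, ys) < cost(p, xs, ys)) {
            p = trial;       // accept the step and reduce damping (closer to Gauss-Newton)
            lambda *= 0.1;
        } else {
            lambda *= 10.0;  // reject the step and increase damping (closer to gradient descent)
        }
        if (std::fabs(da) + std::fabs(db) < 1e-12) break;
    }
    return p;
}
```

Every step is computed from the Jacobian at the current point only, so where the loop ends up depends on where it starts; running it from a few different seeds and comparing the final costs is a cheap sanity check.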

Hello爱情风 2024-10-14 08:35:38

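I would suggest you try a different approach and find the rotation parameters indirectly, namely by using a 4x4 affine transformation matrix that incorporates both the translation and the rotation parameters.

This gets rid of the nonlinearity of the sine and cosine functions (you can work the angles out after the fact).

The hard part is constraining the transformation matrix so that it does not shear or scale, which you don't want.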
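
As a sketch of what such a matrix looks like, the code below builds the 4x4 homogeneous matrix from the six parameters, assuming the same rotation order and sign conventions as `transformation_3D` in the question. The names `Mat4`, `Vec3`, `mul`, `make_transform` and `apply` are hypothetical. It only shows the mapping from parameters to matrix; estimating the matrix entries directly and then removing shear and scale is not shown.

```cpp
#include <array>
#include <cmath>

using Mat4 = std::array<std::array<double, 4>, 4>;
using Vec3 = std::array<double, 3>;

// Multiply two 4x4 matrices.
Mat4 mul(const Mat4& a, const Mat4& b) {
    Mat4 c{};  // zero-initialized
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                c[i][j] += a[i][k] * b[k][j];
    return c;
}

// Build M = T * Rz * Ry * Rx: rotate about x, then y, then z, then translate.
Mat4 make_transform(double rx, double ry, double rz,
                    double tx, double ty, double tz) {
    const double cx = std::cos(rx), sx = std::sin(rx);
    const double cy = std::cos(ry), sy = std::sin(ry);
    const double cz = std::cos(rz), sz = std::sin(rz);
    const Mat4 Rx{{{1, 0, 0, 0}, {0, cx, -sx, 0}, {0, sx, cx, 0}, {0, 0, 0, 1}}};
    const Mat4 Ry{{{cy, 0, sy, 0}, {0, 1, 0, 0}, {-sy, 0, cy, 0}, {0, 0, 0, 1}}};
    const Mat4 Rz{{{cz, -sz, 0, 0}, {sz, cz, 0, 0}, {0, 0, 1, 0}, {0, 0, 0, 1}}};
    const Mat4 T{{{1, 0, 0, tx}, {0, 1, 0, ty}, {0, 0, 1, tz}, {0, 0, 0, 1}}};
    return mul(T, mul(Rz, mul(Ry, Rx)));
}

// Apply M to the point p, treated as the homogeneous vector (x, y, z, 1).
Vec3 apply(const Mat4& m, const Vec3& p) {
    Vec3 out{};
    for (int i = 0; i < 3; ++i)
        out[i] = m[i][0] * p[0] + m[i][1] * p[1] + m[i][2] * p[2] + m[i][3];
    return out;
}
```

Once the transform is a single matrix, applying it is linear in the matrix entries, which is what makes a direct linear estimate of those entries possible.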

起风了 2024-10-14 08:35:38

Here you have your problem modeled and running with Mathematica.

I used the "Levenberg-Marquardt" method.

This is why I asked for your data. With MY data, YOUR problems are always going to be easier:)

xnew[x_, y_, z_] := 
  RotationMatrix[rx, {1, 0, 0}].RotationMatrix[
     ry, {0, 1, 0}].RotationMatrix[rz, {0, 0, 1}].{x, y, z} + {tx, ty, tz};

(* Generate Sample Data*)
(* Angles 1/2,1/3,1/5 *)
(* translation -> {1,2,3} *)
(* Minimum mean Noise 5% *)

data = Table[{{x, y, z},
  RotationMatrix[1/2, {1, 0, 0}].
  RotationMatrix[1/3, {0, 1, 0}].
  RotationMatrix[1/5, {0, 0, 1}].{x, y, z} +{1, 2, 3} +RandomReal[{-.05, .05}, 3]},
  {x, 0, 1, .1}, {y, 0, 1, .1}, {z, 0, 1, .1}];

data = Flatten[data, 2];

(* Now find the parameters*)
FindMinimum[
 Sum[SquaredEuclideanDistance[xnew[i[[1]] /. List -> Sequence], 
   i[[2]]], {i, data}]
 , {rx, ry, rz, tx, ty, tz}, Method -> "LevenbergMarquardt"]

Out:

{3.2423, {rx -> 0.500566, ry -> 0.334012, rz -> 0.199902, 
          tx -> 0.99985,  ty -> 1.99939,  tz -> 3.00021}}

(Within 1/1000 of the real values)

Edit

I worked a little with your data.
The problem is that your system is very badly conditioned. You need much more data to calculate such small rotations effectively.

These are the results I got:

Rotations in degrees:

rx = 179.99999999999999999999984968493536659553226696793
ry = 180.00000000000000000000006934755799995159952661222
rz = 180.0006286861217378980724139120849587855611645627

Translations

tx = 48.503663696727576867196234527227830090575281353092
ty = 63.974139455057300403798198525151849767949596684232
tz = -0.99999999999999999999997957276716543927459921348549

I should calculate the errors, but I've no time right now.

BTW, rz = Pi + 0.000011 (in radians)
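
The three angles near 180 degrees look odd at first, but rotations by exactly pi about x, y and z compose to the identity, so this result amounts to a tiny rotation about z of roughly 0.00063 degrees (about 1.1e-5 rad). A minimal sketch to check this, assuming the rotation order and sign conventions of `transformation_3D` from the question; `P3` and `rotate_xyz` are hypothetical names.

```cpp
#include <cmath>

struct P3 { double x, y, z; };

// Rotate p about x, then y, then z, with the same formulas as
// transformation_3D in the question (translation omitted).
P3 rotate_xyz(P3 p, double rx, double ry, double rz) {
    const P3 a{p.x,
               p.y * std::cos(rx) - p.z * std::sin(rx),
               p.y * std::sin(rx) + p.z * std::cos(rx)};
    const P3 b{a.z * std::sin(ry) + a.x * std::cos(ry),
               a.y,
               a.z * std::cos(ry) - a.x * std::sin(ry)};
    return {b.x * std::cos(rz) - b.y * std::sin(rz),
            b.x * std::sin(rz) + b.y * std::cos(rz),
            b.z};
}
```

Calling `rotate_xyz(p, pi, pi, pi)` should return `p` up to rounding, and `rotate_xyz(p, pi, pi, pi + eps)` should match a plain rotation of `p` by `eps` about z. This ambiguity is one reason raw Euler angles from a fit can be hard to read.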

HTH!

缱倦旧时光 2024-10-14 08:35:38

Well, I used ceres-solver to solve this, but I did make a modification to your data. Instead of "uz=1.0" I used "uz=0.0", which makes this entirely a 2D data fitting problem.

I got the following results.
trans: -88.6384, -16.3879, 0
rot: 0, 0, -6.97813e-05

After getting these results, I manually calculated the sum of the orthogonal distances of the transformed points to the corresponding lines and got 0.0280452.

#include <cmath>
#include <ceres/ceres.h>

// Apply the transform x = {tx, ty, tz, rx, ry, rz} to inPoint.
// Defined before CostFunctor so that the call in operator() resolves.
template <typename T>
void xform(const T* x, const T* inPoint, T* outPoint3) {
    T pOdd[3], pEven[3];

    // rotation about x (note: opposite sign convention to transformation_3D in the question)
    T xTheta = x[3];
    pOdd[0] = inPoint[0];
    pOdd[1] = inPoint[1] * cos(xTheta) + inPoint[2] * sin(xTheta);
    pOdd[2] = -inPoint[1] * sin(xTheta) + inPoint[2] * cos(xTheta);

    // rotation about y
    T yTheta = x[4];
    pEven[0] = pOdd[0] * cos(yTheta) + pOdd[2] * sin(yTheta);
    pEven[1] = pOdd[1];
    pEven[2] = -pOdd[0] * sin(yTheta) + pOdd[2] * cos(yTheta);

    // rotation about z
    T zTheta = x[5];
    pOdd[0] = pEven[0] * cos(zTheta) - pEven[1] * sin(zTheta);
    pOdd[1] = pEven[0] * sin(zTheta) + pEven[1] * cos(zTheta);
    pOdd[2] = pEven[2];

    // translation
    outPoint3[0] = pOdd[0] + x[0];
    outPoint3[1] = pOdd[1] + x[1];
    outPoint3[2] = pOdd[2] + x[2];
}

struct CostFunctor {
    CostFunctor(const double p[3], double ux, double uy) {
        p_[0] = p[0]; p_[1] = p[1]; p_[2] = p[2];
        // unit direction of the line; uz is forced to 0 (2D fit, see above)
        const double len = std::sqrt(ux * ux + uy * uy);
        n_[0] = ux / len; n_[1] = uy / len; n_[2] = 0.0;
    }

    template <typename T>
    bool operator()(const T* const x, T* residual) const {
        T pDash[3];
        T pIn[3];
        T temp[3];
        pIn[0] = T(p_[0]);
        pIn[1] = T(p_[1]);
        pIn[2] = T(p_[2]);
        // transform the input point p_ to pDash
        xform(x, &pIn[0], &pDash[0]);
        // dot(pDash, n), where n is the unit direction of the line through the origin
        T pDashDotN = pDash[0] * T(n_[0]) + pDash[1] * T(n_[1]) + pDash[2] * T(n_[2]);
        // projection of pDash onto the line
        temp[0] = pDashDotN * n_[0]; temp[1] = pDashDotN * n_[1]; temp[2] = pDashDotN * n_[2];
        // orthogonal vector from the projection to the point
        temp[0] = pDash[0] - temp[0]; temp[1] = pDash[1] - temp[1]; temp[2] = pDash[2] - temp[2];
        // squared distance to the line (note: Ceres squares each residual again in its cost)
        residual[0] = temp[0] * temp[0] + temp[1] * temp[1] + temp[2] * temp[2];
        return true;
    }

    // untransformed point
    double p_[3];
    // unit direction of the line
    double n_[3];
};
