Determining points from a set of pairwise distances

Posted on 2024-07-07 03:00:03

Given a matrix of distances between points, is there an algorithm for determining a set of n-dimensional points that has these distances (or that at least minimises the error)?

It is sort of like an n-dimensional version of the turnpike problem.

The best I can come up with is multidimensional scaling.

Comments (4)

饮湿 2024-07-14 03:00:03

You are on the right track with multidimensional scaling (MDS), but MDS is impractical for large datasets, as its time complexity is at least quadratic in the number of points. You may want to look at FastMap, whose time complexity is linear in the number of points and which is better suited to indexing. See:

Christos Faloutsos and King-Ip Lin: "FastMap: A Fast Algorithm for Indexing, Data-Mining and Visualization of Traditional and Multimedia Datasets", in Proc. SIGMOD, 1995, doi:10.1145/223784.223812
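To make the FastMap idea concrete, here is a minimal sketch of its projection step, assuming a full symmetric distance matrix is already in memory. The names `Matrix`, `farthest` and `fastmap` are made up for this illustration. For readability the sketch stores and updates the whole matrix of residual squared distances, which costs quadratic time and memory; the paper instead evaluates residual distances on demand to stay linear in the number of points.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Index of the point with the largest residual squared distance from `from`.
std::size_t farthest(const Matrix& d2, std::size_t from)
{
    std::size_t best = from;
    for (std::size_t i = 0; i < d2.size(); ++i)
        if (d2[from][i] > d2[from][best]) best = i;
    return best;
}

// Map n objects to k coordinates each, using only their pairwise distances.
Matrix fastmap(const Matrix& dist, std::size_t k)
{
    const std::size_t n = dist.size();
    Matrix coords(n, std::vector<double>(k, 0.0));

    // Residual squared distances; initially the squared input distances.
    Matrix d2(n, std::vector<double>(n, 0.0));
    for (std::size_t i = 0; i < n; ++i) {
        assert(dist[i].size() == n);  // the input must be a square matrix
        for (std::size_t j = 0; j < n; ++j) d2[i][j] = dist[i][j] * dist[i][j];
    }

    for (std::size_t axis = 0; axis < k && n > 0; ++axis) {
        // Heuristic pivot choice: from point 0 jump to the farthest point,
        // then to the point farthest from that one.
        const std::size_t b = farthest(d2, 0);
        const std::size_t a = farthest(d2, b);
        const double dab2 = d2[a][b];
        if (dab2 <= 1e-12) break;  // nothing left to explain
        const double dab = std::sqrt(dab2);

        // Cosine-law projection of every point onto the line through a and b.
        for (std::size_t i = 0; i < n; ++i)
            coords[i][axis] = (d2[a][i] + dab2 - d2[b][i]) / (2.0 * dab);

        // Remove the part of each squared distance explained by this axis.
        for (std::size_t i = 0; i < n; ++i)
            for (std::size_t j = 0; j < n; ++j) {
                const double diff = coords[i][axis] - coords[j][axis];
                d2[i][j] = std::max(0.0, d2[i][j] - diff * diff);
            }
    }
    return coords;
}
```

Calling `fastmap(dist, 2)` on an N x N matrix returns N rows of two coordinates each. When the input distances really come from points in k-dimensional Euclidean space the distances are reproduced exactly; otherwise the result is only an approximation.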

顾挽 2024-07-14 03:00:03

You can "cheat" and use an iterative numerical method for this. Place all of the points at "random" positions initially, then loop over every pair, moving the two points towards or away from each other in proportion to the difference between their current and required distances. Applying each move immediately would favour some points over others, but averaging the moves for all pairs and applying the average at the end of each pass removes this problem. Each pass is O(n²), and the method is very simple to implement and understand. In the 2-d example below the error is well under 10%, though it may not behave so well if the distances given are unrealistic (that is, if no set of points actually has them).

C++ example:

#include <cmath>
#include <cstdio>
#include <cstdlib>

#define DAMPING_FACTOR 0.99f

class point
{
public:
    float x;
    float y;
    point() : x(0), y(0) {}
};

// symmetric matrix with distances
float matrix[5][5] =    {
                            { 0.0f, 4.5f, 1.5f, 2.0f, 4.0f },
                            { 4.5f, 0.0f, 4.0f, 3.0f, 3.5f },
                            { 1.5f, 4.0f, 0.0f, 1.0f, 5.0f },
                            { 2.0f, 3.0f, 1.0f, 0.0f, 4.5f },
                            { 4.0f, 3.5f, 5.0f, 4.5f, 0.0f }
                        };

int main()
{
    // start from pseudo-random positions in [0, 10)
    point p[5];
    for(unsigned int i = 0; i < 5; ++i)
    {
        p[i].x = (float)(rand()%100)*0.1f;
        p[i].y = (float)(rand()%100)*0.1f;
    }

    // do 1000 iterations
    float dx = 0.0f, dy = 0.0f, d = 0.0f;
    float xmoves[5], ymoves[5];

    for(unsigned int c = 0; c < 1000; ++c)
    {
        for(unsigned int i = 0; i < 5; ++i) xmoves[i] = ymoves[i] = 0.0f;
        // iterate across each point x each point to work out the results of all of the constraints in the matrix
        // (every unordered pair is visited twice, once as (i,j) and once as (j,i))
        // collect moves together which are slightly less than enough (DAMPING_FACTOR) to correct half the distance between each pair of points
        // the 0.2f factor (1/5) averages the moves over the 5 points
        for(unsigned int i = 0; i < 5; ++i)
        for(unsigned int j = 0; j < 5; ++j)
        {
            if(i==j) continue;
            dx = p[i].x - p[j].x;
            dy = p[i].y - p[j].y;
            d = std::sqrt(dx*dx + dy*dy);
            if(d == 0.0f) continue; // coincident points have no direction; avoid dividing by zero
            dx /= d;
            dy /= d;
            d = (d - matrix[i][j])*DAMPING_FACTOR*0.5f*0.2f;

            xmoves[i] -= d*dx;
            ymoves[i] -= d*dy;

            xmoves[j] += d*dx;
            ymoves[j] += d*dy;
        }

        // apply all at once
        for(unsigned int i = 0; i < 5; ++i)
        {
            p[i].x += xmoves[i];
            p[i].y += ymoves[i];
        }
    }

    // output results
    printf("Result:\n");
    for(unsigned int i = 0; i < 5; ++i)
    {
        for(unsigned int j = 0; j < 5; ++j)
        {
            dx = p[i].x - p[j].x;
            dy = p[i].y - p[j].y;
            printf("%f ", std::sqrt(dx*dx + dy*dy));
        }
        printf("\n");
    }

    printf("\nDesired:\n");
    for(unsigned int i = 0; i < 5; ++i)
    {
        for(unsigned int j = 0; j < 5; ++j)
        {
            printf("%f ", matrix[i][j]);
        }
        printf("\n");
    }

    printf("\nAbsolute difference:\n");
    for(unsigned int i = 0; i < 5; ++i)
    {
        for(unsigned int j = 0; j < 5; ++j)
        {
            dx = p[i].x - p[j].x;
            dy = p[i].y - p[j].y;
            // std::fabs, not abs: the integer overload of abs would truncate the value
            printf("%f ", std::fabs(std::sqrt(dx*dx + dy*dy) - matrix[i][j]));
        }
        printf("\n");
    }

    return 0;
}
行至春深 2024-07-14 03:00:03

There is an algorithm for doing this in Programming Collective Intelligence, p. 49, "Viewing Data in Two Dimensions", which could be adapted to n dimensions.

Hey -- it's multidimensional scaling -- so I guess you are on the right track.
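As a rough illustration of what "adapted to n dimensions" could look like, here is a sketch of a pairwise relaxation in which each point is a coordinate vector of arbitrary length. It is not the book's code. The names `euclidean`, `stress` and `relax`, the step size `rate` and the iteration count are all assumptions chosen for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Euclidean distance between two points of equal dimension.
double euclidean(const std::vector<double>& a, const std::vector<double>& b)
{
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t k = 0; k < a.size(); ++k)
        sum += (a[k] - b[k]) * (a[k] - b[k]);
    return std::sqrt(sum);
}

// Sum over all pairs of (current distance - target distance)^2.
double stress(const Matrix& pts, const Matrix& target)
{
    double s = 0.0;
    for (std::size_t i = 0; i < pts.size(); ++i)
        for (std::size_t j = i + 1; j < pts.size(); ++j) {
            const double e = euclidean(pts[i], pts[j]) - target[i][j];
            s += e * e;
        }
    return s;
}

// Repeatedly nudge every pair of points towards its target distance.
// All moves of one pass are accumulated first and then applied together.
void relax(Matrix& pts, const Matrix& target, int iterations, double rate)
{
    assert(!pts.empty());
    const std::size_t n = pts.size();
    const std::size_t dim = pts[0].size();
    for (int it = 0; it < iterations; ++it) {
        Matrix moves(n, std::vector<double>(dim, 0.0));
        for (std::size_t i = 0; i < n; ++i)
            for (std::size_t j = i + 1; j < n; ++j) {
                const double d = euclidean(pts[i], pts[j]);
                if (d < 1e-12) continue;  // coincident points: no direction
                // Positive when the pair is too far apart, negative when too close.
                const double scale = rate * (d - target[i][j]) / d;
                for (std::size_t k = 0; k < dim; ++k) {
                    const double delta = scale * (pts[i][k] - pts[j][k]);
                    moves[i][k] -= delta;
                    moves[j][k] += delta;
                }
            }
        for (std::size_t i = 0; i < n; ++i)
            for (std::size_t k = 0; k < dim; ++k) pts[i][k] += moves[i][k];
    }
}
```

The dimension is simply the length of each row of `pts`, so the same code handles 2, 3 or more coordinates. A call such as `relax(pts, target, 2000, 0.1)` followed by `stress(pts, target)` shows how close the result is. A `rate` that is too large for the number of points can make the iteration diverge, so it may need tuning.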

追我者格杀勿论 2024-07-14 03:00:03

I can't edit the original, because I don't have enough rep, but I've tried to restate the problem here.

The OP has an input NxN matrix of distances. He wants to create an output array, of size N, of N-dimensional coordinates representing points, where the distance between each pair of points matches the value stored in the input matrix.

Note that this is not solvable in the general case:

Suppose I have a matrix like this

   A  B  C  
A  x  1  2  
B     x  0  
C        x  

A is 1 unit of distance (say 1 metre) away from B, and A is 2 metres away from C. But B and C are in the same spot, so A would have to be both 1 metre and 2 metres away from that spot.

In this particular case the minimal sum of errors is 1 metre, and there is an infinite variety of solutions which achieve that result.
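A quick way to spot an unsolvable input like this one is to test the triangle inequality on every triple of points. The sketch below does that; the function name `violates_triangle_inequality` and the tolerance `tol` are assumptions made for this illustration. Passing the check is necessary but not sufficient: a matrix can satisfy the triangle inequality and still have no exact Euclidean embedding.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Returns true if some triple (i, j, k) has d(i,k) > d(i,j) + d(j,k),
// in which case no set of points in any dimension can have these distances.
bool violates_triangle_inequality(const Matrix& d, double tol = 1e-9)
{
    const std::size_t n = d.size();
    for (std::size_t i = 0; i < n; ++i) {
        assert(d[i].size() == n);  // the input must be a square matrix
        for (std::size_t j = 0; j < n; ++j)
            for (std::size_t k = 0; k < n; ++k)
                if (d[i][k] > d[i][j] + d[j][k] + tol) return true;
    }
    return false;
}
```

For the A, B, C matrix above, filled in symmetrically with zeros on the diagonal, the triple (A, B, C) fails because 2 > 1 + 0, so the function returns true.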
