生成二维非简并点集 - C++

发布于 2024-11-26 11:22:02 字数 213 浏览 0 评论 0原文

我想在 2D 平面中创建一大组非退化的随机点云(整个集合中没有 3 个点在一条直线上)。我有一个简单的解决方案,它生成一个随机浮点对 P_new(x,y) 并检查到目前为止生成的每对点 (P1, P2) 是否位于同一行。这需要 O(n^2) 检查添加到列表中的每个新点,使得整个复杂性为 O(n^3),如果我想生成超过 4000 个点(需要超过 40 分钟),这会非常慢。 有没有更快的方法来生成这些非退化点集?

I want to create a large set of random point cloud in 2D plane that are non-degenerate (no 3 points in a straight line in the whole set). I have a naive solution which generates a random float pair P_new(x,y) and checks with every pair of points (P1, P2) generated till now if point (P1, P2, P) lie in same line or not. This takes O(n^2) checks for each new point added to the list making the whole complexity O(n^3) which is very slow if I want to generate more than 4000 points (takes more than 40 mins).
Is there a faster way to generate these set of non-degenerate points?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(5

呆° 2024-12-03 11:22:02

您可以计算和比较线性方程的系数,而不是在每个循环迭代中检查可能的点共线性。该系数应存储在可快速搜索的容器中。我考虑使用 std::set,但 unordered_map 可以适合其中任何一个,并且可以带来更好的结果。

综上所述,我建议使用以下算法:

  1. 生成随机点p
  2. 计算穿过 p 和现有点的线的系数(我的意思是通常<代码>A,B&C)。这里需要进行n次计算;
  3. 尝试在先前计算的集合中找到新计算的值。此步骤最多需要 n*log(n^2) 次操作。
  4. 如果搜索结果为负,则添加新值并将其系数添加到相应的集合中。其成本也约为O(log(n))

整个复杂度降低至O(n^2*log(n))
该算法需要额外存储n^2*sizeof(Coefficient)内存。但如果您只想计算 4000 个点,这似乎没问题。

Instead of checking the possible points collinearity on each cycle iteration, you could compute and compare coefficients of linear equations. This coefficients should be store in container with quick search. I consider using std::set, but unordered_map could fit either and could lead to even better results.

To sum it up, I suggest the following algorithm:

  1. Generate random point p;
  2. Compute coefficients of lines crossing p and existing points (I mean usual A,B&C). Here you need to do n computations;
  3. Trying to find newly computed values inside of previously computed set. This step requires n*log(n^2) operations at maximum.
  4. In case of negative search result, add new value and add its coefficients to corresponding sets. Its cost is about O(log(n)) too.

The whole complexity is reduced to O(n^2*log(n)).
This algorithm requires additional storing of n^2*sizeof(Coefficient) memory. But this seems to be ok if you are trying to compute 4000 points only.

萤火眠眠 2024-12-03 11:22:02

O(n^2 log n) 算法可以通过以下方式轻松构造:

对于集合中的每个点 P:

  1. 按极角对其他点进行排序(叉积作为比较函数,标准思想,参见 2D 凸)例如船体礼品包装算法)。在此步骤中,您应该仅考虑满足

    的 Q 点

    <前><代码>Qx> PX || Qy>=Py

  2. 迭代排序列表,相等的点位于同一行。

Q。排序的时间复杂度为 O(n log n),步骤 2 的时间复杂度为 O(n)。这给出了删除退化点的 O(n^2 log n) 。

O(n^2 log n) algorithm can be easily constructed in the following way:

For each point P in the set:

  1. Sort other points by polar angle (cross-product as a comparison function, standard idea, see 2D convex hull gift-wrapping algorithm for example). In this step you should consider only points Q that satisfy

    Q.x > P.x || Q.y >= P.y
    
  2. Iterate over sorted list, equal points are lying on the same line.

Sorting is done in O(n log n), step 2. is O(n). This gives O(n^2 log n) for removing degenerate points.

萌无敌 2024-12-03 11:22:02

确定一组点是否退化是 3SUM -难题。 (列出的第一个问题是确定三条线是否包含公共点;射影对偶下的等效问题是三个点是否属于一条公共线。)因此,希望生成和测试解决方案能够满足这一要求是不合理的。明显快于 n2

您对分发有什么要求?

Determining whether a set of points is degenerate is a 3SUM-hard problem. (The very first problem listed is determining whether three lines contains a common point; the equivalent problem under projective duality is whether three points belong to a common line.) As such, it's not reasonable to hope that a generate-and-test solution will be significantly faster than n2.

What are your requirements for the distribution?

爱给你人给你 2024-12-03 11:22:02
  1. 生成随机点 Q

  2. 为之前的点 P 计算 (dx, dy) = P - Q

  3. B = (asb(dx) > abs(dy) ? dy/dx : dx/dy)

  4. 按 B 值对点 P 的列表进行排序,以便与 Q 形成一条线的点将在排序列表内的附近位置。

  5. 遍历排序列表,检查 Q 与正在考虑的当前 P 值和一些比给定距离更近的下一个值形成一条线。

Perl 实现:

#!/usr/bin/perl

use strict;
use warnings;
use 5.010;

use Math::Vector::Real;
use Math::Vector::Real::Random;
use Sort::Key::Radix qw(nkeysort);

use constant PI => 3.14159265358979323846264338327950288419716939937510;

@ARGV <= 2 or die "Usage:\n  $0 [n_points [tolerance]]\n\n";

my $n_points = shift // 4000;
my $tolerance = shift // 0.01;

$tolerance = $tolerance * PI / 180;
my $tolerance_arctan = 3 / 2 * $tolerance;
#     I got to that relation using no so basic maths in a hurry.
#     it may be wrong!

my $tolerance_sin2 = sin($tolerance) ** 2;

sub cross2d {
    my ($p0, $p1) = @_;
    $p0->[0] * $p1->[1] - $p1->[0] * $p0->[1];
}

sub line_p {
    my ($p0, $p1, $p2) = @_;
    my $a0 = $p0->abs2 || return 1;
    my $a1 = $p1->abs2 || return 1;
    my $a2 = $p2->abs2 || return 1;
    my $cr01 = cross2d($p0, $p1);
    my $cr12 = cross2d($p1, $p2);
    my $cr20 = cross2d($p2, $p0);
    $cr01 * $cr01 / ($a0 * $a1) < $tolerance_sin2 or return;
    $cr12 * $cr12 / ($a1 * $a2) < $tolerance_sin2 or return;
    $cr20 * $cr20 / ($a2 * $a0) < $tolerance_sin2 or return;
    return 1;
}

my ($c, $f1, $f2, $f3) = (0, 1, 1, 1);

my @p;
GEN: for (1..$n_points) {
    my $q = Math::Vector::Real->random_normal(2);
    $c++;
    $f1 += @p;
    my @B = map {
        my ($dx, $dy) = @{$_ - $q};
        abs($dy) > abs($dx) ? $dx / $dy : $dy / $dx;
    } @p;

    my @six = nkeysort { $B[$_] } 0..$#B;

    for my $i (0..$#six) {
        my $B0 = $B[$six[$i]];
        my $pi = $p[$six[$i]];
        for my $j ($i + 1..$#six) {
            last if $B[$six[$j]] - $B0 > $tolerance_arctan;
            $f2++;
            my $pj = $p[$six[$j]];
            if (line_p($q - $pi, $q - $pj, $pi - $pj)) {
                $f3++;
                say "BAD: $q $pi-$pj";
                redo GEN;
            }
        }
    }
    push @p, $q;
    say "GOOD: $q";
    my $good = @p;
    my $ratiogood = $good/$c;
    my $ratio12 = $f2/$f1;
    my $ratio23 = $f3/$f2;
    print STDERR "gen: $c, good: $good, good/gen: $ratiogood, f2/f1: $ratio12, f3/f2: $ratio23                                  \r";
}
print STDERR "\n";

容差表示考虑三个点是否排成 π - max_angle(Q, Pi, Pj) 时可接受的度数误差。

它没有考虑向量相减时可能发生的数值不稳定性(即|Pi-Pj|可能比|Pi|小几个数量级)。消除该问题的一个简单方法是还要求任意两个给定点之间的最小距离。

将容差设置为 1e-6,程序只需几秒钟即可生成 4000 个点。将其转换为 C/C++ 可能会使其速度提高两个数量级。

  1. generate random point Q

  2. for previous points P calculate (dx, dy) = P - Q

  3. and B = (asb(dx) > abs(dy) ? dy/dx : dx/dy)

  4. sort the list of points P by its B value, so that points that form a line with Q will be in near positions inside the sorted list.

  5. walk over the sorted list checking where Q forms a line with the current P value being considered and some next values that are nearer than a given distance.

Perl implementation:

#!/usr/bin/perl

use strict;
use warnings;
use 5.010;

use Math::Vector::Real;
use Math::Vector::Real::Random;
use Sort::Key::Radix qw(nkeysort);

use constant PI => 3.14159265358979323846264338327950288419716939937510;

@ARGV <= 2 or die "Usage:\n  $0 [n_points [tolerance]]\n\n";

my $n_points = shift // 4000;
my $tolerance = shift // 0.01;

$tolerance = $tolerance * PI / 180;
my $tolerance_arctan = 3 / 2 * $tolerance;
#     I got to that relation using no so basic maths in a hurry.
#     it may be wrong!

my $tolerance_sin2 = sin($tolerance) ** 2;

sub cross2d {
    my ($p0, $p1) = @_;
    $p0->[0] * $p1->[1] - $p1->[0] * $p0->[1];
}

sub line_p {
    my ($p0, $p1, $p2) = @_;
    my $a0 = $p0->abs2 || return 1;
    my $a1 = $p1->abs2 || return 1;
    my $a2 = $p2->abs2 || return 1;
    my $cr01 = cross2d($p0, $p1);
    my $cr12 = cross2d($p1, $p2);
    my $cr20 = cross2d($p2, $p0);
    $cr01 * $cr01 / ($a0 * $a1) < $tolerance_sin2 or return;
    $cr12 * $cr12 / ($a1 * $a2) < $tolerance_sin2 or return;
    $cr20 * $cr20 / ($a2 * $a0) < $tolerance_sin2 or return;
    return 1;
}

my ($c, $f1, $f2, $f3) = (0, 1, 1, 1);

my @p;
GEN: for (1..$n_points) {
    my $q = Math::Vector::Real->random_normal(2);
    $c++;
    $f1 += @p;
    my @B = map {
        my ($dx, $dy) = @{$_ - $q};
        abs($dy) > abs($dx) ? $dx / $dy : $dy / $dx;
    } @p;

    my @six = nkeysort { $B[$_] } 0..$#B;

    for my $i (0..$#six) {
        my $B0 = $B[$six[$i]];
        my $pi = $p[$six[$i]];
        for my $j ($i + 1..$#six) {
            last if $B[$six[$j]] - $B0 > $tolerance_arctan;
            $f2++;
            my $pj = $p[$six[$j]];
            if (line_p($q - $pi, $q - $pj, $pi - $pj)) {
                $f3++;
                say "BAD: $q $pi-$pj";
                redo GEN;
            }
        }
    }
    push @p, $q;
    say "GOOD: $q";
    my $good = @p;
    my $ratiogood = $good/$c;
    my $ratio12 = $f2/$f1;
    my $ratio23 = $f3/$f2;
    print STDERR "gen: $c, good: $good, good/gen: $ratiogood, f2/f1: $ratio12, f3/f2: $ratio23                                  \r";
}
print STDERR "\n";

The tolerance indicates the acceptable error in degrees when considering if three points are in a line as π - max_angle(Q, Pi, Pj).

It does not take into account the numerical instabilities that can happen when subtracting vectors (i.e |Pi-Pj| may be several orders of magnitude smaller than |Pi|). An easy way to eliminate that problem would be to also require a minimum distance between any two given points.

Setting tolerance to 1e-6, the program just takes a few seconds to generate 4000 points. Translating it to C/C++ would probably make it two orders of magnitude faster.

暗藏城府 2024-12-03 11:22:02

O(n) 解决方案:

  1. 从 0..1 中选取一个随机数 r
  2. 添加到云中的点则为 P(cos(2 × π × r), sin(2 × π × r))

O(n) solution:

  1. Pick a random number r from 0..1
  2. The point added to the cloud is then P(cos(2 × π × r), sin(2 × π × r))
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文