两组可能不同尺寸之间的距离测量

发布于 2024-10-08 05:41:54 字数 911 浏览 0 评论 0原文

我有两组整数,A 和 B,大小不一定相同。根据我的需要,我将每 2 个元素 a 和 b(整数)之间的距离设为 abs(ab)

我将两个集合之间的距离定义如下:

  1. 如果集合大小相同,则最小化所有对 [a,b](a 与 A 之间的距离和 b 与 B 之间的距离)的总和,最小化所有可能的“对”分区”(有 n! 个可能的分区)。
  2. 如果集合的大小不同,则假设 A 的大小为 m,B 的大小为 n,其中 m < 1。 n,然后最小化 B 的所有大小为 m 的子集与 (1) 的距离。

我的问题是,根据上面写的定义,以下算法(只是直观的猜测)是否给出了正确的答案。

  • 构造一个大小为 m X n 的矩阵 D,其中 D(i,j) = abs(A(i)-B(j))
  • 找到D中最小的元素,将其累加,并删除该元素所在的行和列。累积下一个最小的条目,并继续累积,直到删除所有行和列。

例如,如果 A={0,1,4}B={3,4},则 D 为(其中元素上方和左侧):

3 4

0 3 4

1 2 3

4 1 0

距离为 0 + 2 = 2,来自将 44 配对,将 31 配对。

I have 2 sets of integers, A and B, not necessarily of the same size. For my needs, I take the distance between each 2 elements a and b (integers) to be just abs(a-b).

I am defining the distance between the two sets as follows:

  1. If the sets are of the same size, minimize the sum of distances of all pairs [a,b] (a from A and b from B), minimization over all possible 'pairs partitions' (there are n! possible partitions).
  2. If the sets are not of the same size, let's say A of size m and B of size n, with m < n, then minimize the distance from (1) over all subsets of B which are of size m.

My question is, is the following algorithm (just an intuitive guess) gives the right answer, according to the definition written above.

  • Construct a matrix D of size m X n, with D(i,j) = abs(A(i)-B(j))
  • Find the smallest element of D, accumulate it, and delete the row and the column of that element. Accumulate the next smallest entry, and keep accumulating until all rows and columns are deleted.

for example, if A={0,1,4} and B={3,4}, then D is (with the elements above and to the left):

3 4

0 3 4

1 2 3

4 1 0

And the distance is 0 + 2 = 2, coming from pairing 4 with 4 and 3 with 1.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

孤独陪着我 2024-10-15 05:41:54

请注意,此问题有时称为滑雪板和滑雪者问题,其中有 n 个滑雪板和 m 个长度和高度不同的滑雪者。目标是使滑雪板与滑雪者相匹配,以使高度和滑雪板长度之间的差异之和最小化。

为了解决这个问题,您可以使用最小权重二分匹配,这需要 O(n^3) 时间。

更好的是,您可以使用下面的简单动态编程算法使用 O(n) 额外内存实现 O(n^2) 时间。

最佳情况下,如果已经使用此 论文

O(n^2)动态规划算法:

if (size(B) > size(A))
  swap(A, B);
sort(A);
sort(B);
opt = array(size(B));
nopt = array(size(B));
for (i = 0; i < size(B); i++)
  opt[i] = abs(A[0] - B[i]);
for (i = 1; i < size(A); i++) {
  fill(nopt, infinity);
  for (j = 1; j < size(B); j++) {
    nopt[j] = min(nopt[j - 1], opt[j - 1] + abs(A[i] - B[j]));
  swap(opt, nopt);
}
return opt[size(B) - 1];

在上面的外部for循环的每次迭代i之后,opt[j]包含最优解使用元素 {B[0],..., B[j]} 匹配 {A[0],..., A[i]}

该算法的正确性依赖于这样的事实:在任何最优匹配中,如果a1与b1匹配,a2与b2匹配,且a1<0。 a2,则 b1 <= b2。

Note that this problem is referred to sometimes as the skis and skiers problem, where you have n skis and m skiers of varying lengths and heights. The goal is to match skis with skiers so that the sum of the differences between heights and ski lengths is minimized.

To solve the problem you could use minimum weight bipartite matching, which requires O(n^3) time.

Even better, you can achieve O(n^2) time with O(n) extra memory using the simple dynamic programming algorithm below.

Optimally, you can solve the problem in linear time if the points are already sorted using the algorithm described in this paper.

O(n^2) dynamic programming algorithm:

if (size(B) > size(A))
  swap(A, B);
sort(A);
sort(B);
opt = array(size(B));
nopt = array(size(B));
for (i = 0; i < size(B); i++)
  opt[i] = abs(A[0] - B[i]);
for (i = 1; i < size(A); i++) {
  fill(nopt, infinity);
  for (j = 1; j < size(B); j++) {
    nopt[j] = min(nopt[j - 1], opt[j - 1] + abs(A[i] - B[j]));
  swap(opt, nopt);
}
return opt[size(B) - 1];

After each iteration i of the outer for loop above, opt[j] contains the optimal solution matching {A[0],..., A[i]} using the elements {B[0],..., B[j]}.

The correctness of this algorithm relies on the fact that in any optimal matching if a1 is matched with b1, a2 is matched with b2, and a1 < a2, then b1 <= b2.

我不咬妳我踢妳 2024-10-15 05:41:54

为了获得最优,解决D上的分配问题。

分配问题在二分图中找到完美匹配,使得总边权重最小化,这完美地映射到您的问题。 P.编辑

解释了OP的问题如何映射到作业上。

为了简化解释,用特殊元素 e_k 扩展较小的集合。

设 A 为工作人员集合,B 为任务集合(内容只是标签)。

令成本为 A 和 B 中的元素(即 D 的条目)之间的距离。 e_k与任何东西之间的距离都是0。

然后,我们希望找到A和B的完美匹配(即每个工人都与一个任务匹配),使得成本最小化。这就是分配问题。

In order to get the optimum, solve the assignment problem on D.

The assignment problem finds a perfect matching in a bipartite graph such that the total edge weight is minimized, which maps perfectly to your problem. It is also in P.

EDIT to explain how OP's problem maps onto assignment.

For simplicity of explanation, extend the smaller set with special elements e_k.

Let A be the set of workers, and B be the set of tasks (the contents are just labels).

Let the cost be the distance between an element in A and B (i.e. an entry of D). The distance between e_k and anything is 0.

Then, we want to find a perfect matching of A and B (i.e. every worker is matched with a task), such that the cost is minimized. This is the assignment problem.

忆沫 2024-10-15 05:41:54

不,这不是最佳答案,例如:
A: {3,7}B:{0,4} 您将选择:{(3,4),(0,7)}< /strong> 且距离为 8,但您应该选择 {(3,0),(4,7)},此时距离为 6 >。

No It's not a best answer, for example:
A: {3,7} and B:{0,4} you will choose: {(3,4),(0,7)} and distance is 8 but you should choose {(3,0),(4,7)} in this case distance is 6.

战皆罪 2024-10-15 05:41:54

您的答案给出了最小值的良好近似值,但不一定是最佳最小值。您采用的是“贪婪”方法,这种方法通常更容易,并且给出了良好的结果,但不能保证最佳答案。

Your answer gives a good approximation to the minimum, but not necessarily the best minimum. You are following a "greedy" approach which is generally much easier, and gives good results, but can not guarantee the best answer.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文