在列表中查找匹配对的算法

发布于 2024-08-12 05:07:13 字数 1708 浏览 5 评论 0原文

我将在下面以我想要的精确形式表述该问题:

给出: 两个浮点列表 ND,长度 k 相同(k 是 2 的倍数)。 已知对于所有i=0,...,k-1,存在j != i使得D[j]*D[ i] == N[i]*N[j]。 (我使用从零开始的索引)

返回: 一个(长度k/2)对(i,j)的列表,使得D[j]*D[i] == N[i]* N[j]。 返回的对可能不是唯一的(任何有效的对列表都可以)

该算法的应用是查找广义回文特征值问题的特征值的倒数对。 等式条件等价于 N[i]/D[i] == D[j]/N[j],但在分母为零时也有效(这是一种确定的可能性)。特征值问题中的简并性导致这些对不唯一。

更一般地说,该算法相当于:

给定: 长度为 k 的列表 Xk 是 2 的倍数)。 已知对于所有 i=0,...,k-1,存在 j != i 使得 IsMatch(X[i], X[j]) 返回 true,其中 IsMatch 是一个布尔匹配函数,保证对所有 j != i 至少一个返回 true代码>我。

返回: 一个(长度k/2)对(i,j)的列表,使得所有对的IsMatch(i,j) == true在列表中。 返回的对可能不是唯一的(任何有效的对列表都可以)

显然,我的第一个问题可以根据第二个问题来表述 IsMatch(u,v) := { (u - 1/v) = = 0 }。现在,由于浮点精度的限制,永远不会有完全相等的情况,所以我想要最小化匹配误差的解决方案。换句话说,假设 IsMatch(u,v) 返回值 u - 1/v 并且我希望算法返回一个列表,IsMatch 返回该列表的最小集的错误。这是一个组合优化问题。我想我可以首先天真地计算所有可能的索引对 ij 之间的匹配误差,但随后我需要选择最小误差集,然后我不知道我会怎么做。

澄清

IsMatch 函数是自反的(IsMatch(a,b) 暗示 IsMatch(b,a)),但不具有传递性。然而,它是 3-传递的:IsMatch(a,b) && IsMatch(b,c) && IsMatch(c,d) 隐含 IsMatch(a,d)

附录

这个问题显然等同于图论中的最小权完美匹配问题。然而,就我而言,我知道应该有一个“良好”的完美匹配,因此边缘权重的分布不是完全随机的。我觉得应该以某种方式使用这些信息。现在的问题是,对于最小权重完美匹配问题是否有一个好的实现,可以使用我的先验知识在搜索的早期找到解决方案。我也愿意接受任何此类算法的简单实现的指导。

I will phrase the problem in the precise form that I want below:

Given:
Two floating point lists N and D of the same length k (k is multiple of 2).
It is known that for all i=0,...,k-1, there exists j != i such that D[j]*D[i] == N[i]*N[j]. (I'm using zero-based indexing)

Return:
A (length k/2) list of pairs (i,j) such that D[j]*D[i] == N[i]*N[j].
The pairs returned may not be unique (any valid list of pairs is okay)

The application for this algorithm is to find reciprocal pairs of eigenvalues of a generalized palindromic eigenvalue problem.
The equality condition is equivalent to N[i]/D[i] == D[j]/N[j], but also works when denominators are zero (which is a definite possibility). Degeneracies in the eigenvalue problem cause the pairs to be non-unique.

More generally, the algorithm is equivalent to:

Given:
A list X of length k (k is multiple of 2).
It is known that for all i=0,...,k-1, there exists j != i such that IsMatch(X[i],X[j]) returns true, where IsMatch is a boolean matching function which is guaranteed to return true for at least one j != i for all i.

Return:
A (length k/2) list of pairs (i,j) such that IsMatch(i,j) == true for all pairs in the list.
The pairs returned may not be unique (any valid list of pairs is okay)

Obviously, my first problem can be formulated in terms of the second with IsMatch(u,v) := { (u - 1/v) == 0 }. Now, due to limitations of floating point precision, there will never be exact equality, so I want the solution which minimizes the match error. In other words, assume that IsMatch(u,v) returns the value u - 1/v and I want the algorithm to return a list for which IsMatch returns the minimal set of errors. This is a combinatorial optimization problem. I was thinking I can first naively compute the match error between all possible pairs of indexes i and j, but then I would need to select the set of minimum errors, and I don't know how I would do that.

Clarification

The IsMatch function is reflexive (IsMatch(a,b) implies IsMatch(b,a)), but not transitive. It is, however, 3-transitive: IsMatch(a,b) && IsMatch(b,c) && IsMatch(c,d) implies IsMatch(a,d).

Addendum

This problem is apparently identically the minimum weight perfect matching problem in graph theory. However, in my case I know that there should be a "good" perfect matching, so the distribution of edge weights is not totally random. I feel that this information should be used somehow. The question now is if there is a good implementation to the min-weight-perfect-matching problem that uses my prior knowledge to arrive at a solution early in the search. I'm also open to pointers towards a simple implementation of any such algorithm.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(7

幼儿园老大 2024-08-19 05:07:14

好的,我最终使用了

complex_t num = N[i]*N[j] - D[i]*D[j];
complex_t den1 = N[j]*D[i];
complex_t den2 = N[i]*D[j];
if(std::abs(den1) < std::abs(den2)){
    costs[j*(j-1)/2+i] = std::abs(-num/den2);
}else if(std::abs(den1) == 0){
    costs[j*(j-1)/2+i] = std::sqrt(std::numeric_limits<double>::max());
}else{
    costs[j*(j-1)/2+i] = std::abs(num/den1);
}

/src/newC/nclMatch.c&q=minimum%20weight%20perfect%20matching%20c_match" rel="nofollow noreferrer">此移植的 Fortran 代码,我只需使用以下 效果很好,而且对于我的目的来说足够快。

Okay, I ended up using this ported Fortran code, where I simply specify the dense upper triangular distance matrix using:

complex_t num = N[i]*N[j] - D[i]*D[j];
complex_t den1 = N[j]*D[i];
complex_t den2 = N[i]*D[j];
if(std::abs(den1) < std::abs(den2)){
    costs[j*(j-1)/2+i] = std::abs(-num/den2);
}else if(std::abs(den1) == 0){
    costs[j*(j-1)/2+i] = std::sqrt(std::numeric_limits<double>::max());
}else{
    costs[j*(j-1)/2+i] = std::abs(num/den1);
}

This works great and is fast enough for my purposes.

雨落星ぅ辰 2024-08-19 05:07:14

您应该能够对 (D[i],N[i]) 对进行排序。您不需要除以零 - 您只需相乘即可,如下所示:

bool order(i,j) {
  float ni= N[i]; float di= D[i];
  if(di<0) { di*=-1; ni*=-1; }

  float nj= N[j]; float dj= D[j];
  if(dj<0) { dj*=-1; nj*=-1; }

  return ni*dj < nj*di;
}

然后,扫描排序列表以找到两个分隔点:(N == D) 和 (N == -D);您可以从那里开始匹配互惠对,使用:

abs(D[i]*D[j]-N[i]*N[j])<epsilon

作为有效性检查。将 (N == 0) 和 (D == 0) 点留在最后;你认为它们是消极的还是积极的并不重要,因为它们都会相互匹配。

编辑:或者,您可以分别处理 (N==0) 和 (D==0) 情况,将它们从列表中删除。然后,您可以使用 (N[i]/D[i]) 对其余索引进行排序。您可能仍然希望从 1.0 和 -1.0 开始,以确保可以将接近零的情况与完全零的情况相匹配。

You should be able to sort the (D[i],N[i]) pairs. You don't need to divide by zero -- you can just multiply out, as follows:

bool order(i,j) {
  float ni= N[i]; float di= D[i];
  if(di<0) { di*=-1; ni*=-1; }

  float nj= N[j]; float dj= D[j];
  if(dj<0) { dj*=-1; nj*=-1; }

  return ni*dj < nj*di;
}

Then, scan the sorted list to find two separation points: (N == D) and (N == -D); you can start matching reciprocal pairs from there, using:

abs(D[i]*D[j]-N[i]*N[j])<epsilon

as a validity check. Leave the (N == 0) and (D == 0) points for last; it doesn't matter whether you consider them negative or positive, as they will all match with each other.

edit: alternately, you could just handle (N==0) and (D==0) cases separately, removing them from the list. Then, you can use (N[i]/D[i]) to sort the rest of the indices. You still might want to start at 1.0 and -1.0, to make sure you can match near-zero cases with exactly-zero cases.

落日海湾 2024-08-19 05:07:13

我希望我能解决你的问题。

好吧,如果 IsMatch(i, j) 和 IsMatch(j, l)IsMatch(i, l)。更一般地,IsMatch 关系是传递的、交换的和自反的,即。它是一个等价关系。该算法会转换为列表中出现次数最多的元素(使用 IsMatch 而不是 =)。

I hope I got your problem.

Well, if IsMatch(i, j) and IsMatch(j, l) then IsMatch(i, l). More generally, the IsMatch relation is transitive, commutative and reflexive, ie. its an equivalence relation. The algorithm translates to which element appears the most times in the list (use IsMatch instead of =).

爱人如己 2024-08-19 05:07:13

(如果我理解这个问题...)
这是匹配两个列表中的每对产品的一种方法。

  1. 将每对 N 相乘,并将其保存到带有乘积以及构成乘积的元素的下标的结构中。
  2. 将每对 D 相乘,并将其与乘积以及构成乘积的元素的下标保存到结构的第二个实例中。
  3. 对产品上的两个结构进行排序。
  4. 对两个已排序的结构数组进行合并类型传递。每次您从一个数组中找到与另一个数组足够接近的产品时,您可以记录每个排序列表中的两个下标以进行匹配。
  5. 您还可以使用 ismatch 函数的一个排序列表,对产品进行二分搜索。

(If I understand the problem...)
Here is one way to match each pair of products in the two lists.

  1. Multiply each pair N and save it to a structure with the product, and the subscripts of the elements making up the product.
  2. Multiply each pair D and save it to a second instance of the structure with the product, and the subscripts of the elements making up the product.
  3. Sort both structions on the product.
  4. Make a merge-type pass through both sorted structure arrays. Each time you find a product from one array that is close enough to the other, you can record the two subscripts from each sorted list for a match.
  5. You can also use one sorted list for an ismatch function, doing a binary search on the product.
剧终人散尽 2024-08-19 05:07:13

好吧……将每对 D 相乘,并将其与乘积以及构成乘积的元素的下标保存到结构的第二个实例中。

well。。Multiply each pair D and save it to a second instance of the structure with the product, and the subscripts of the elements making up the product.

红颜悴 2024-08-19 05:07:13

我刚刚问了我的CS朋友,他提出了下面的算法。他在这里没有帐户(并且显然不愿意创建帐户),但我认为他的答案值得分享。

// We will find the best match in the minimax sense; we will minimize
// the maximum matching error among all pairs. Alpha maintains a
// lower bound on the maximum matching error. We will raise Alpha until
// we find a solution. We assume MatchError returns an L_1 error.

// This first part finds the set of all possible alphas (which are
// the pairwise errors between all elements larger than maxi-min
// error.
Alpha = 0
For all i:
    min = Infinity
    For all j > i:
        AlphaSet.Insert(MatchError(i,j))
        if MatchError(i,j) < min
            min = MatchError(i,j)
    If min > Alpha
        Alpha = min

Remove all elements of AlphaSet smaller than Alpha

// This next part increases Alpha until we find a solution
While !AlphaSet.Empty()
    Alpha = AlphaSet.RemoveSmallest()
    sol = GetBoundedErrorSolution(Alpha)
    If sol != nil
        Return sol

// This is the definition of the helper function. It returns
// a solution with maximum matching error <= Alpha or nil if
// no such solution exists.
GetBoundedErrorSolution(Alpha) :=
    MaxAssignments = 0
    For all i:
        ValidAssignments[i] = empty set;
        For all j > i:
            if MatchError <= Alpha
                ValidAssignments[i].Insert(j)
                ValidAssignments[j].Insert(i)

        // ValidAssignments[i].Size() > 0 due to our choice of Alpha
        // in the outer loop

        If ValidAssignments[i].Size() > MaxAssignments
            MaxAssignments = ValidAssignments[i].Size()
    If MaxAssignments = 1
        return ValidAssignments
    Else
        G = graph(ValidAssignments)
        // G is an undirected graph whose vertices are all values of i
        // and edges between vertices if they have match error less
        // than or equal to Alpha
        If G has a perfect matching
            // Note that this part is NP-complete.
            Return the matching
        Else
            Return nil

它依赖于能够计算出图的完美匹配,这是 NP 完全的,但至少它被简化为一个已知的问题。预计该解决方案是 NP 完全的,但这没关系,因为实际上给定列表的大小非常小。我会等待几天更好的答案,或者等待有人扩展如何以合理的方式找到完美的匹配。

I just asked my CS friend, and he came up with the algorithm below. He doesn't have an account here (and apparently unwilling to create one), but I think his answer is worth sharing.

// We will find the best match in the minimax sense; we will minimize
// the maximum matching error among all pairs. Alpha maintains a
// lower bound on the maximum matching error. We will raise Alpha until
// we find a solution. We assume MatchError returns an L_1 error.

// This first part finds the set of all possible alphas (which are
// the pairwise errors between all elements larger than maxi-min
// error.
Alpha = 0
For all i:
    min = Infinity
    For all j > i:
        AlphaSet.Insert(MatchError(i,j))
        if MatchError(i,j) < min
            min = MatchError(i,j)
    If min > Alpha
        Alpha = min

Remove all elements of AlphaSet smaller than Alpha

// This next part increases Alpha until we find a solution
While !AlphaSet.Empty()
    Alpha = AlphaSet.RemoveSmallest()
    sol = GetBoundedErrorSolution(Alpha)
    If sol != nil
        Return sol

// This is the definition of the helper function. It returns
// a solution with maximum matching error <= Alpha or nil if
// no such solution exists.
GetBoundedErrorSolution(Alpha) :=
    MaxAssignments = 0
    For all i:
        ValidAssignments[i] = empty set;
        For all j > i:
            if MatchError <= Alpha
                ValidAssignments[i].Insert(j)
                ValidAssignments[j].Insert(i)

        // ValidAssignments[i].Size() > 0 due to our choice of Alpha
        // in the outer loop

        If ValidAssignments[i].Size() > MaxAssignments
            MaxAssignments = ValidAssignments[i].Size()
    If MaxAssignments = 1
        return ValidAssignments
    Else
        G = graph(ValidAssignments)
        // G is an undirected graph whose vertices are all values of i
        // and edges between vertices if they have match error less
        // than or equal to Alpha
        If G has a perfect matching
            // Note that this part is NP-complete.
            Return the matching
        Else
            Return nil

It relies on being able to compute a perfect matching of a graph, which is NP-complete, but at least it is reduced to a known problem. It is expected that the solution be NP-complete, but this is OK since in practice the size of the given lists are quite small. I'll wait around for a better answer for a few days, or for someone to expand on how to find the perfect matching in a reasonable way.

谜兔 2024-08-19 05:07:13

您想要找到 j 使得 D(i)*D(j) = N(i)*N(j) {我假设 * 是普通实数乘法}

假设所有 N(i) 均非零,令

Z(i) = D(i)/N(i)。

问题:找到 j,使得 Z(i) = 1/Z(j)。

将集合分为正数和负数并分别处理。

为了清晰起见,记录日志。 z(i) = log Z(i)。

间接排序。然后,在排序视图中,您应该有类似 -5 -3 -1 +1 +3 +5 的内容。读出 +/- 对,这应该会给你原始索引。

我错过了什么,还是问题很简单?

You want to find j such that D(i)*D(j) = N(i)*N(j) {I assumed * is ordinary real multiplication}

assuming all N(i) are nonzero, let

Z(i) = D(i)/N(i).

Problem: find j, such that Z(i) = 1/Z(j).

Split set into positives and negatives and process separately.

take logs for clarity. z(i) = log Z(i).

Sort indirectly. Then in the sorted view you should have something like -5 -3 -1 +1 +3 +5, for example. Read off +/- pairs and that should give you the original indices.

Am I missing something, or is the problem easy?

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文