最优化问题-向量映射
A
和 B
是 N
维向量 (N=10
) 的集合,|B|> ;=|A|
(|A|=10^2
,|B|=10^5
)。相似性度量 sim(a,b) 是点积(必需)。任务如下:对于A
中的每个向量a
,找到B
中的向量b
,使得相似度之和所有对中的ss
是最大的。
我的第一次尝试是贪婪算法:
- 找到相似度最高的对并从 A,B 中删除该对
- 重复 (1) 直到 A 为空
但这种贪婪算法在这种情况下不是最佳的:
a_1=[1, 0] a_2=[.5, .4] b_1=[1, 1] b_2=[.9, 0] sim(a_1,b_1)=1 sim(a_1,b_2)=.9 sim(a_2,b_1)=.9 sim(a_2, b_2)=.45
算法返回 [a_1,b_1]
和 [a_2, b_2]
,ss=1.45
,但最佳解决方案产生 ss=1.8
。
有没有有效的算法来解决这个问题?谢谢
A
and B
are sets of N
dimensional vectors (N=10
), |B|>=|A|
(|A|=10^2
, |B|=10^5
). Similarity measure sim(a,b) is dot product (required). The task is following: for each vector a
in A
find vector b
in B
, such that sum of similarities ss
of all pairs is maximal.
My first attempt was greedy algorithm:
- find the pair with the highest similarity and remove that pair from A,B
- repeat (1) until A is empty
But such greedy algorithm is suboptimal in this case:
a_1=[1, 0] a_2=[.5, .4] b_1=[1, 1] b_2=[.9, 0] sim(a_1,b_1)=1 sim(a_1,b_2)=.9 sim(a_2,b_1)=.9 sim(a_2, b_2)=.45
Algorithm returns [a_1,b_1]
and [a_2, b_2]
, ss=1.45
, but optimal solution yields ss=1.8
.
Is there efficient algo to solve this problem? Thanks
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
这本质上是加权匹配问题。 org/wiki/Bipartite_graph" rel="nofollow">二分图。假设权重函数
f
是点积 (|ab|
)。我不认为你的权重函数的特殊结构会大大简化问题,所以你几乎要找到最大匹配。
您可以在这篇维基百科文章中找到解决此问题的一些基本算法。尽管乍一看它们似乎对您的数据不可行(
V = 10^5
、E = 10^7
),但我仍然会研究它们:其中一些可能允许您利用您的“蹩脚”顶点集,其中一部分的数量级小于另一部分。这篇文章似乎也相关,尽管没有列出任何算法。
不完全是一个解决方案,但希望它有所帮助。
This is essentially a matching problem in weighted bipartite graph. Just assume that weight function
f
is a dot product (|ab|
).I don't think the special structure of your weight function will simplify problem a lot, so you're pretty much down to finding a maximum matching.
You can find some basic algorithms for this problem in this wikipedia article. Although at first glance they don't seem viable for your data (
V = 10^5
,E = 10^7
), I would still research them: some of them might allow you to take advantage of your 'lame' set of vertixes, with one part orders of magnitude smaller than the other.This article also seems relevant, although doesn't list any algorithms.
Not exactly a solution, but hope it helps.
我在这里第二个尼基塔,这是一个分配(或匹配)问题。我不确定这对于您的问题在计算上是否可行,但您可以使用匈牙利算法,也称为 Munkres 分配算法,其中分配成本
(i,j)
是ai
和点积的负数>bj
。除非您碰巧知道 A 和 B 的元素是如何形成的,否则我认为这是解决您的问题最有效的已知算法。I second Nikita here, it is an assignment (or matching) problem. I'm not sure this is computationally feasible for your problem, but you could use the Hungarian algorithm, also known as Munkres' assignment algorithm, where the cost of assignment
(i,j)
is the negative of the dot product ofai
andbj
. Unless you happen to know how the elements of A and B are formed, I think this is the most efficient known algorithm for your problem.