Matching algorithm
I'm writing an application which divides a population of users into pairs for the purpose of performing a task together. Each user can specify various preferences about their partner, e.g.
- gender
- language
- age
- location (typically, within X miles/kilometers from where the user lives)
Ideally, I would like the user to be able to specify whether each of these preferences is a "nice to have" or a "must have", e.g. "I would prefer to be matched with a native English speaker, but I must not be matched with a female".
My objective is to maximise the overall average quality of the matches. For example, assume there are 4 users in the system, A, B, C, D. These users can be matched in 3 ways:
Option 1
Match    Score
A-B      5
C-D      4
---
Average  4.5

Option 2
Match    Score
A-C      2
B-D      3
---
Average  2.5

Option 3
Match    Score
A-D      1
B-C      9
---
Average  5
So in this contrived example, the 3rd option would be chosen because it has the highest overall match quality, even though A and D are not very well matched at all.
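The comparison above can be reproduced with a small brute-force sketch. It is only an illustration of the example, not a proposed solution: the names `SCORES`, `pairings` and `average_score` are made up here, and enumerating every pairing is only feasible for a handful of users, since the number of pairings of n users grows as (n-1)(n-3)...1.

```python
from statistics import mean

# Pairwise match scores from the example (unordered pairs of users).
SCORES = {
    frozenset("AB"): 5, frozenset("CD"): 4,
    frozenset("AC"): 2, frozenset("BD"): 3,
    frozenset("AD"): 1, frozenset("BC"): 9,
}


def pairings(users):
    """Yield every way to split an even-sized list of users into pairs."""
    if not users:
        yield []
        return
    first, rest = users[0], users[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def average_score(pairing):
    """Average match score over all pairs in one pairing."""
    return mean(SCORES[frozenset(pair)] for pair in pairing)


all_options = list(pairings(list("ABCD")))
best = max(all_options, key=average_score)
print(best, average_score(best))
```

For the four users above this enumerates exactly the three options listed and picks A-D with B-C.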
Is there an algorithm that can help me to:
- calculate the "match scores" shown above
- choose the pairings that will maximise the average match score (while respecting each user's absolute constraints)
It is not absolutely necessary that each user is matched, so given a choice between significantly lowering the overall quality of the matches, and leaving a few users without a match, I would choose the latter.
Obviously, I would like the algorithm that calculates the matches to complete as quickly as possible, because the number of users in the system could be quite large.
Finally, this system of computing match scores and maximising the overall average is just a heuristic I've come up with myself. If there's a much better way to calculate the pairings, please let me know.
Update
The problem I've described seems to be similar to the stable marriage problem, for which there is a well-known solution. However, in this problem I do not require the chosen pairs to be stable. My goal is to choose the pairs so that the average "match score" is maximized.
Comments (5)
By the looks of it, your problem is not bipartite, so it seems to me that you are looking for a maximum weight matching in a general graph. I don't envy you the task of writing this, as Edmonds' blossom shrinking algorithm is not easy to understand or to implement efficiently. There are implementations of this algorithm out there, one example being the C++ library LEMON (http://lemon.cs.elte.hu/trac/lemon). If you want a maximum cardinality maximum weight matching, you will have to use the maximum weight matching algorithm and add a large weight (the sum of all the weights) to each edge to force maximum cardinality as the first priority.
Alternatively, since you mentioned in one of the comments above that your match terms are not linear and so linear programming is out, you could take a constraint programming approach, which does not require the terms to be linear.
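To make the "add a large weight to each edge" trick concrete, here is a minimal sketch. It is not Edmonds' algorithm and not LEMON's API: it is an exact dynamic program over bitmasks, which is only practical for small graphs (roughly 20 nodes or fewer), and the function name `max_weight_matching` and its parameters are invented for this illustration. It assumes all weights are positive and that pairs missing from `weights` are forbidden (for instance, because a "must have" constraint rules them out).

```python
from functools import lru_cache


def max_weight_matching(n, weights, force_max_cardinality=False):
    """Exact maximum weight matching on nodes 0..n-1 (small n only).

    weights maps (u, v) with u < v to a positive score; pairs that are
    missing from the dict may not be matched.
    """
    # The trick from the answer: adding the sum of all weights to every edge
    # makes any matching with more pairs outweigh any matching with fewer.
    bonus = sum(weights.values()) if force_max_cardinality else 0

    @lru_cache(maxsize=None)
    def best(mask):
        # mask is the set of still-available nodes; returns (total, pairs).
        if mask == 0:
            return 0, ()
        u = (mask & -mask).bit_length() - 1   # lowest available node
        rest = mask & ~(1 << u)
        result = best(rest)                   # option: leave u unmatched
        for v in range(u + 1, n):
            if rest >> v & 1 and (u, v) in weights:
                total, pairs = best(rest & ~(1 << v))
                total += weights[(u, v)] + bonus
                if total > result[0]:
                    result = (total, pairs + ((u, v),))
        return result

    return sorted(best((1 << n) - 1)[1])


# Path graph 0-1-2-3: the middle edge is much heavier than the outer two.
path = {(0, 1): 2, (1, 2): 9, (2, 3): 2}
by_weight = max_weight_matching(4, path)
by_cardinality = max_weight_matching(4, path, force_max_cardinality=True)
print(by_weight, by_cardinality)
```

On the path graph, plain maximum weight keeps only the heavy middle pair and leaves two nodes unmatched, while the bonus forces both outer pairs. Since the question accepts leaving a few users unmatched, the variant without the bonus may be the closer fit.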
What maximum matching algorithms have you been looking at? I read your question too hastily at first: it seems you don't necessarily restrict yourself to a bipartite graph. This seems trickier.
I believe this problem could be represented as a linear programming problem. You could then use the simplex method to solve it.
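One way to write that down, as a sketch only: the symbols are assumptions introduced here, with E the set of pairs allowed by everyone's "must have" constraints, w_ij the match score of pair {i, j}, and x_ij the variable that says whether the pair is chosen.

```latex
\begin{aligned}
\max_{x}\quad & \sum_{\{i,j\} \in E} w_{ij}\, x_{ij} \\
\text{s.t.}\quad & \sum_{j \,:\, \{i,j\} \in E} x_{ij} \le 1 \quad \text{for every user } i, \\
& x_{ij} \ge 0 \quad \text{for every allowed pair } \{i,j\} \in E .
\end{aligned}
```

Note that in a non-bipartite graph this relaxation can return fractional answers. On a triangle with equal weights, for example, setting every x_ij to 1/2 beats any single pair. So the variables would need to be restricted to 0 or 1 (an integer program), or the formulation extended with odd-set constraints, before the result can be read as a pairing.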
To find a maximum matching in an arbitrary graph, there is a weighted variant of Edmonds' matching algorithm:
http://en.wikipedia.org/wiki/Edmonds's_matching_algorithm#Weighted_matching
See the footnotes there.
I provided a possible solution to a similar problem here. It's an algorithm for measuring dissimilarity: the more similar the measured data is to the expected data, the smaller your resulting number will be.
For your application you would set a person's preferences as the expected data and each other person you compare against would be the measured data. You would want to filter the 'measured data' to eliminate those cases like "must not be matched with a female", that you mention in your original question, before running the comparison.
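A rough sketch of that filter-then-compare idea follows. It is not the algorithm linked above: the `User` class, its fields, and the scoring rule (a plain count of unmet "nice to have" preferences) are all assumptions made for illustration, and "must not be matched with a female" is modelled here simply as a must-have of gender "male".

```python
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    traits: dict                                       # what this user is
    must_have: dict = field(default_factory=dict)      # hard constraints
    nice_to_have: dict = field(default_factory=dict)   # soft preferences


def acceptable(user, candidate):
    """True if the candidate meets every one of the user's hard constraints."""
    return all(candidate.traits.get(k) == v for k, v in user.must_have.items())


def dissimilarity(user, candidate):
    """Number of the user's soft preferences the candidate misses (0 is best)."""
    return sum(candidate.traits.get(k) != v
               for k, v in user.nice_to_have.items())


def ranked_candidates(user, others):
    """Filter out unacceptable candidates first, then sort by dissimilarity."""
    allowed = [c for c in others if c is not user and acceptable(user, c)]
    return sorted(allowed, key=lambda c: dissimilarity(user, c))


alice = User("alice", {"gender": "female", "language": "en"})
bob = User("bob", {"gender": "male", "language": "fr"})
carl = User("carl", {"gender": "male", "language": "en"})
dave = User("dave", {"gender": "male", "language": "en"},
            must_have={"gender": "male"}, nice_to_have={"language": "en"})

ranking = [c.name for c in ranked_candidates(dave, [alice, bob, carl, dave])]
print(ranking)
```

This score is one-directional. To pair users, a pair would presumably be allowed only if each side is acceptable to the other, with the two directional scores combined (for example, summed) into a single match score.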
Another option could be using a Chi-Square algorithm.