Comparison-based ranking algorithm

Posted on 2024-09-27 16:12:56

I would like to rank or sort a collection of items (with size potentially greater than 100,000) where the items have no intrinsic (comparable) value. All I have are comparisons between pairs of items, provided by users in a subjective manner.

Example: Consider a collection with elements [a, b, c, d] and comparisons by users b > a, a > d, d > c. The correct order of this collection would be [b, a, d, c].

This example is simple, but there could be more complicated cases:

  • Since the comparisons are subjective, a user could also say that c > b, which would conflict with the ordering above.
  • You may not have comparisons that "connect" all the items, e.g. only b > a and d > c. The ordering is then ambiguous: it could be [b, a, d, c] or [d, c, b, a]. In this case either ordering is acceptable.

If possible it would be nice to somehow take into account multiple instances of the same comparison and give those with higher occurrences more weight. But a solution without this condition would still be acceptable.

Something similar was used in Zuckerberg's FaceMash application, where he ranked people based on comparisons (if I understood it correctly), but I have not been able to find out what that algorithm actually was.

Is there an existing algorithm that solves the problem above? If so, I would rather not spend effort trying to come up with one. If there is no specific algorithm, are there perhaps certain types of algorithms or techniques you can point me to?

Comments (3)

辞旧 2024-10-04 16:12:56

This problem has already come up in another arena: competitive games! Here, too, the goal is to assign each player a global "rank" on the basis of a series of 1 vs. 1 comparisons. The difficulty, of course, is that the comparisons are not transitive (I take "subjective" in your question to mean "provided by a human being"). Kasparov beats Fischer, Fischer beats (don't know another chess player!) Bob, and Bob beats Kasparov, potentially.

This makes algorithms that rely on transitivity (i.e. a > b and b > c => a > c) useless, as you are likely to end up with a highly cyclic graph.

Several rating systems have been devised to tackle this problem.

The best-known system is probably the Elo rating system for competitive chess players. Its descendants (for instance, the Glicko rating system) are more sophisticated and take into account statistical properties of the win/loss record; in other words, how reliable is a rating? This is similar to your idea of giving more weight to records with more "games" played. The TrueSkill system used on Xbox Live for multiplayer video games builds on the same ideas.
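To make the Elo idea concrete, here is a minimal sketch of the standard Elo update applied to a list of (winner, loser) comparisons. The function name `elo_ratings`, the starting rating of 1500 and the K-factor of 32 are assumptions chosen for illustration, not anything prescribed by the question.

```python
from collections import defaultdict

def elo_ratings(comparisons, k=32.0, initial=1500.0):
    """Apply the Elo update once per (winner, loser) pair, in order.

    k and initial are illustrative defaults (assumptions), not tuned values.
    """
    ratings = defaultdict(lambda: initial)
    for winner, loser in comparisons:
        # Probability that `winner` wins under the Elo logistic model.
        expected = 1.0 / (1.0 + 10 ** ((ratings[loser] - ratings[winner]) / 400.0))
        # The winner gains what the loser gives up, so the total is conserved.
        delta = k * (1.0 - expected)
        ratings[winner] += delta
        ratings[loser] -= delta
    return dict(ratings)

# Repeated comparisons (b > a appears twice) simply move the ratings further.
comparisons = [("b", "a"), ("a", "d"), ("d", "c"), ("b", "a")]
ratings = elo_ratings(comparisons)
ranking = sorted(ratings, key=ratings.get, reverse=True)
print(ranking)
```

Note that a single pass like this depends on the order in which comparisons arrive, and with only a handful of comparisons per item the resulting ranking need not match the "obvious" chain order. Ratings only become meaningful once each item has been compared many times.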

禾厶谷欠 2024-10-04 16:12:56

You may be interested in the minimum feedback arc set problem. Essentially the problem is to find the minimum number of comparisons that "go the wrong way" if the elements are linearly ordered in some ordering. This is the same as finding the minimum number of edges that must be removed to make the graph acyclic. Unfortunately, solving the problem exactly is NP-hard.

A couple of links that discuss the problem:

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.86.8157&rep=rep1&type=pdf

http://en.wikipedia.org/wiki/Feedback_arc_set
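To illustrate what is being minimized, here is a small sketch that counts the comparisons going "the wrong way" for a given ordering, weighting repeated comparisons by how often they occur, and then finds a best ordering by brute force. The helper names `backward_weight` and `best_order_bruteforce` are made up for this example. The brute force tries every permutation, so it is only usable on a handful of items; for a large collection you would need a heuristic or approximation instead.

```python
from collections import Counter
from itertools import permutations

def backward_weight(order, weights):
    """Total weight of comparisons that contradict `order` (best item first)."""
    position = {item: i for i, item in enumerate(order)}
    return sum(w for (winner, loser), w in weights.items()
               if position[winner] > position[loser])

def best_order_bruteforce(comparisons):
    """Exact minimum by trying every permutation: factorial time, toy sizes only."""
    weights = Counter(comparisons)  # identical comparisons accumulate weight
    items = sorted({item for pair in weights for item in pair})
    best = min(permutations(items), key=lambda o: backward_weight(o, weights))
    return list(best), backward_weight(best, weights)

# b > a is reported twice; c > b conflicts with the chain b > a > d > c.
comparisons = [("b", "a"), ("b", "a"), ("a", "d"), ("d", "c"), ("c", "b")]
order, violated = best_order_bruteforce(comparisons)
print(order, violated)
```

In this example the comparisons form a cycle, so at least one of them has to be violated. The cheapest choices are the ones that were reported only once, which is how repeated comparisons end up carrying more weight.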

月依秋水 2024-10-04 16:12:56

I found this with a quick search; look for chapter 12.3, Topological sorting and Depth-first Search:

http://www.cs.cmu.edu/~avrim/451f09/lectures/lect1006.pdf

Your set of relations describes a directed acyclic graph (hopefully acyclic), so a topological sort of that graph is exactly what you need.
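As a rough sketch of this approach, the snippet below uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9+). The wrapper name `rank_by_toposort` is made up for this example, and returning `None` on conflicting comparisons is just one possible way to handle the case where the graph turns out not to be acyclic.

```python
from graphlib import CycleError, TopologicalSorter

def rank_by_toposort(comparisons):
    """Return items best-first, or None if the comparisons contain a cycle.

    comparisons: iterable of (better, worse) pairs.
    """
    sorter = TopologicalSorter()
    for better, worse in comparisons:
        # Declare `better` as a predecessor of `worse`, so it is emitted first.
        sorter.add(worse, better)
    try:
        return list(sorter.static_order())
    except CycleError:
        # Conflicting comparisons (e.g. c > b on top of b > a > d > c).
        return None

print(rank_by_toposort([("b", "a"), ("a", "d"), ("d", "c")]))
print(rank_by_toposort([("b", "a"), ("a", "d"), ("d", "c"), ("c", "b")]))
```

When the comparisons do not connect all items (say only b > a and d > c), any order consistent with them may come back, which matches the "either ordering is acceptable" case in the question. Repeated comparisons carry no extra weight here.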
