Algorithm for picking an option from a collection based on ratings?

Posted 2024-12-20 13:46:55

Let's say I have a collection of objects. I have another collection of likes, each one given by a specific user to a specific object. Thus, over time through user ratings, each object ends up with a different number of likes (all greater than 0).

I want to choose an object from this collection. Objects with more likes should be chosen more frequently, but objects with lower likes should also be sometimes chosen to give them a chance.

The algorithm I have in mind now is to order the objects by likes, and generate a random number, and use the number to choose a random object within a range. Assuming I had a hundred objects, 50% of the time objects from 0-10 are chosen, 25% of the time 10-15, and 25% of the time 15-100.
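For illustration, the tiered scheme described above might look roughly like the following plain JavaScript sketch. The function name `pickTiered`, the injectable `rng` parameter, and the fallback for very small lists are assumptions made for this example and are not part of the question.

```javascript
// Sketch of the tiered pick: `sorted` is assumed to be ordered by likes, descending.
// `rng` is injectable (hypothetical parameter) so the behaviour can be checked deterministically.
function pickTiered(sorted, rng = Math.random) {
  const n = sorted.length;
  if (n === 0) return undefined;

  // Tier boundaries scaled from the 0-10 / 10-15 / 15-100 example for 100 objects.
  const topEnd = Math.max(1, Math.floor(n / 10));
  const midEnd = Math.max(topEnd, Math.floor((n * 15) / 100));

  // First roll chooses the tier: 50% top, 25% middle, 25% rest.
  const roll = rng();
  let start, end;
  if (roll < 0.5) {
    start = 0; end = topEnd;
  } else if (roll < 0.75) {
    start = topEnd; end = midEnd;
  } else {
    start = midEnd; end = n;
  }

  // Assumption: if a tier is empty (tiny collections), fall back to the whole list.
  if (start >= end) { start = 0; end = n; }

  // Second roll chooses uniformly inside the tier.
  return sorted[start + Math.floor(rng() * (end - start))];
}
```

Note that this version still needs the full sorted array in memory, which is exactly the scalability concern raised next.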

The obvious problem with this algorithm is scalability. When there are millions of objects, returning an array of all of them takes time. Does anyone have a better solution? The database is implemented in MongoDB.

Comments (1)

缱绻入梦 2024-12-27 13:46:55

I would denormalize a bit and add a 'like' counter field to the objects being liked. Increment it when the objects get liked, decrement it when the objects are un-liked.

db.test.insert({
    stuff: "likable stuff",
    likes: 7
})

Then I would also have another field that represents the bucket the object is in as a result of its likes. So, for example, objects start out with this field set to 'ordinary', and once an object reaches 10 likes it becomes 'elite' (or whatever you want). Update the field when the object reaches that threshold. The idea here is that doing the work on the write makes the reads that much easier.

db.test.insert({
    stuff: "likable stuff",
    likes: 7,
    status: "ordinary/elite",
})
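As a rough sketch of the write-side bookkeeping, the counter and bucket update could be modeled in memory like this. The names `ELITE_THRESHOLD` and `applyLike` are hypothetical, and demoting an object back to 'ordinary' when it drops below the threshold is an assumption the answer does not spell out. In the shell, the counter part would typically be an `$inc` update on the `likes` field.

```javascript
// Threshold taken from the "10 likes" example; the constant name is hypothetical.
const ELITE_THRESHOLD = 10;

// Returns a new document with the like counter adjusted and the bucket recomputed.
// `delta` is +1 for a like and -1 for an un-like.
// Shell counterpart for the counter alone (not run here):
//   db.test.update({ _id: id }, { $inc: { likes: 1 } })
function applyLike(doc, delta) {
  // Never let the counter go negative.
  const likes = Math.max(0, doc.likes + delta);
  // Assumption: the bucket follows the counter in both directions.
  const status = likes >= ELITE_THRESHOLD ? "elite" : "ordinary";
  return { ...doc, likes, status };
}
```

For example, applying `applyLike(doc, 1)` to a document with 9 likes yields a document with 10 likes whose status is "elite".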

Ok so now selecting the set of objects that are in the groups you defined based on # of likes is easy right? db.collection.find({ status: 'elite' })

To randomize document selection within these sets you could randomly skip a given amount of records, but that will lead to awful performance and won't scale.

However, there is a trick you can do whereby you store randomly generated numbers in the documents themselves.

Let's insert one of these into a test db and check it out:

db.test.insert({
    stuff: "likable stuff",
    likes: 7,
    status: "ordinary/elite",
    random: Math.random()
})

Let's take a look at the document now:

{
    stuff: "likable stuff",
    likes: 7,
    status: "ordinary/elite",
    random: 0.9375813045563468
}

OK, here is where this gets really cool. Do a findOne() query where status is 'elite' and the stored random field is $gt another randomly generated number between 0 and 1 (called new_rand_num here).

db.collection.findOne({ status: "elite", random: { "$gt": new_rand_num } })

If the findOne() query doesn't return a result, do it again with $lt as you will be sure to find a document in at least one of the directions.
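The two-direction lookup can be illustrated with an in-memory stand-in for the collection. This is only a sketch of the logic, not shell code: `pickByRandomField` is a hypothetical name, the explicit ordering by `random` is an assumption (the answer does not specify a sort), and the fallback uses "less than or equal" rather than strictly `$lt` so that an exact match is not missed.

```javascript
// In-memory emulation of the lookup; `docs` plays the role of the collection.
// `r` is the freshly generated number (new_rand_num in the answer).
function pickByRandomField(docs, status, r = Math.random()) {
  const bucket = docs.filter((d) => d.status === status);

  // Emulates { status, random: { $gt: r } }, taking the nearest value above r.
  const above = bucket
    .filter((d) => d.random > r)
    .sort((a, b) => a.random - b.random);
  if (above.length > 0) return above[0];

  // Fallback in the other direction, taking the nearest value at or below r.
  const below = bucket
    .filter((d) => d.random <= r)
    .sort((a, b) => b.random - a.random);
  return below.length > 0 ? below[0] : null;
}
```

With at least one document in the bucket, one of the two directions always finds something; `null` only comes back when the bucket itself is empty.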

Now let's index on status and random.

// createIndex is the current name; ensureIndex is the older, deprecated alias
db.collection.createIndex({ status: 1, random: 1 })

What do you think?
