How to implement a personalized event stream?

Posted 2024-08-13 20:43:37

I'm going to work on an inspiration site for an NGO, and I'm looking to implement some sort of Facebook-esque event stream, with events like “Michael recommended apple pie”, “John commented on chocolate cake”, “Caramel fudge was posted 8 hours ago by Alice”, etc.

The thing is that these events are interest-based, so someone who is only interested in caramel and cherries should not see apple pies or chocolate cakes. There are a lot of permutations for this, and generating a user’s personalized event stream on the fly would mean some rather expensive database queries.

So my thinking was to pre-generate a relation between the receiving user and the posted event (probably a simple SQL join table) by doing some sort of background processing whenever an action event occurs.

The work required to weigh the preferences of hundreds of users against an event is bound to be substantial, so it cannot be done as part of the POST request that triggers it. I’ll have to do most of the work in a different process. I’m currently looking at Gearman for this task, but I’m very open to suggestions.

I’m not looking for someone to do my work for me, but if anyone has any prior experience with building this sort of thing, I'd love to hear your thoughts.

Comments (5)

旧城烟雨 2024-08-20 20:43:37

I have had some experience of building a news stream on a social networking site and yes, queries can get very complex very quickly when you have multiple types of events and multiple levels of interest (or privacy settings, or user permissions).

On the assumption that events are viewed more often than they are generated, it does make sense to do some denormalisation and calculate an event's potential viewers when the event happens, rather than every time somebody requests the news stream.

I would suggest running a background process which converts these event objects (related to their creators) into simpler message objects (related to their readers, the people who see them on the news stream). You may end up with many messages per event, but this will make requests to the front-end much quicker, and offload the work onto the background processes.

I've not used Gearman, but if it is the sort of thing which allows you to load up your app's environment in a background process and receive the events to process through a queue, then it's probably a good idea.

My simple solution was to roll my own using beanstalkd and my own PHP scripts.
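The queue-plus-worker shape looks roughly like the sketch below. It uses Python's in-process `queue.Queue` and a thread purely as a stand-in; it is not the beanstalkd or Gearman API, and `start_worker` and `publish` are hypothetical names. With a real job server the worker would be a separate process, but the division of labour is the same: the request handler only enqueues, and the worker does the expensive part.

```python
import queue
import threading

def start_worker(jobs, handle):
    """Run handle(event) for each queued event until None is received."""
    def run():
        while True:
            event = jobs.get()
            try:
                if event is None:   # sentinel: shut the worker down
                    return
                handle(event)
            finally:
                jobs.task_done()
    worker = threading.Thread(target=run, daemon=True)
    worker.start()
    return worker

def publish(jobs, event):
    """Called from the request handler: enqueue and return immediately."""
    jobs.put(event)
```

Here `handle` is whatever computes the readers for an event and writes their messages.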

夏夜暖风 2024-08-20 20:43:37

Don't know how your DB is structured (you might want to tell us more), but something obvious like

SELECT events.* FROM events, event_tags, user_tags
    WHERE event_tags.event_id = events.id
        AND event_tags.tag_id = user_tags.tag_id
        AND user_tags.user_id = <$user_id>

doesn't seem extremely heavy to me, assuming you have indices all over the place
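For anyone who wants to try this, here is a runnable sketch of the same query against SQLite from Python. The column types, the two index definitions and the function name `events_for_user` are assumptions for illustration, since the real schema is unknown. It uses explicit joins, a bound parameter in place of `<$user_id>`, and `DISTINCT` so that an event matching several of the user's tags is returned only once.

```python
import sqlite3

# Assumed schema; only the table and key column names come from the query above.
SCHEMA = """
CREATE TABLE events     (id INTEGER PRIMARY KEY, body TEXT);
CREATE TABLE event_tags (event_id INTEGER, tag_id INTEGER);
CREATE TABLE user_tags  (user_id INTEGER, tag_id INTEGER);
CREATE INDEX idx_user_tags_user ON user_tags (user_id, tag_id);
CREATE INDEX idx_event_tags_tag ON event_tags (tag_id, event_id);
"""

def events_for_user(conn, user_id):
    """Return (id, body) for every event sharing a tag with the user."""
    rows = conn.execute("""
        SELECT DISTINCT events.id, events.body
        FROM user_tags
        JOIN event_tags ON event_tags.tag_id = user_tags.tag_id
        JOIN events     ON events.id = event_tags.event_id
        WHERE user_tags.user_id = ?
        ORDER BY events.id DESC
    """, (user_id,))
    return rows.fetchall()
```

Whether this stays cheap at your data volume is something to measure rather than assume, but it is a reasonable baseline before adding any precomputation.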

老街孤人 2024-08-20 20:43:37

This sounds like something that can be solved with a proper index. I would build the solution around the presumption that the database is capable of handling it, but place a service in front of the database and let all clients go through this point. If things begin going too slow, you can introduce various types of caching in this layer. As with most performance decisions, trying to do it right up front is probably not a good idea.
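As a rough illustration of that layering, the sketch below puts a small service in front of whatever function actually queries the database, so a cache can be added later without touching the callers. The class name `StreamService`, the `load_stream` callback and the 30-second time-to-live are all hypothetical choices, not anything prescribed by the question.

```python
import time

class StreamService:
    """Single entry point for reading event streams, with a simple TTL cache."""

    def __init__(self, load_stream, ttl_seconds=30.0, clock=time.monotonic):
        self._load = load_stream      # function: user_id -> list of events
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache = {}              # user_id -> (expires_at, stream)

    def stream_for(self, user_id):
        now = self._clock()
        hit = self._cache.get(user_id)
        if hit is not None and hit[0] > now:
            return hit[1]             # still fresh: skip the database
        stream = self._load(user_id)
        self._cache[user_id] = (now + self._ttl, stream)
        return stream

    def invalidate(self, user_id):
        """Drop a cached stream, e.g. after the user's interests change."""
        self._cache.pop(user_id, None)
```

Starting with `ttl_seconds=0` effectively gives the uncached behaviour, so the cache only has to be switched on once measurements show the database is the bottleneck.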

青衫儰鉨ミ守葔 2024-08-20 20:43:37

Facebook developed their own database to do this sort of thing and open-sourced it. I don't know much about it, but I'm guessing it might be worth a look.

静赏你的温柔 2024-08-20 20:43:37

Have you looked at the Activity module? Here is an excerpt from its project page:

... keeps track of the things people do on your site and provides mini-feeds of these activities in blocks, in a specialized table, and via RSS. The module is extensible so that any other module can integrate with it. The messages that are produced are customizable via the admin interface and are context sensitive.

I'll be curious about what you come up with, because I need to do something like this in the semi-near future.
