Activity stream/feed: to denormalize or not?
I know variations of this question have been asked many times before (and I've read them, 2 of them being: 1, 2), but I just can't wrap my head around anything that just feels like the right solution.
Everything has been suggested from many to many relations, to fanout, to polymorphic associations, to NoSQL solutions, to message queues, to denormalization and combinations of them all.
I know this question is very situational, so I'll briefly explain mine:
- Many activities that trigger many events.
- Following, creating, liking, commenting, editing, deleting, etc.
- A user can follow another user's activity (the events they trigger).
- The most requested events will be the most recent events.
- The ability to view past events is desired.
- No sorting or searching of the feed is needed beyond ordering by date, descending.
- Scalability is a concern (performance and expandability).
For the time being, I ended up going with a denormalized setup, basically an events table consisting of: id, date, user_id, action, root_id, object_id, object, data.
user_id is the person who triggered the event. action is the action. root_id is the user the object belongs to. object is the object type. data contains the minimum amount of information needed to render the event in a user's stream.
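To make the table layout above concrete, here is a minimal sketch using Python's built-in sqlite3 module. The column types, the JSON encoding of data, the ISO-8601 date strings, the index, and the helper name record_event are all assumptions for illustration; the post only names the columns.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE events (
        id        INTEGER PRIMARY KEY,
        date      TEXT    NOT NULL,  -- ISO-8601 string (assumed format)
        user_id   INTEGER NOT NULL,  -- person who triggered the event
        action    TEXT    NOT NULL,  -- e.g. 'like', 'comment'
        root_id   INTEGER,           -- user the object belongs to
        object_id INTEGER,
        object    TEXT,              -- object type
        data      TEXT               -- JSON needed to render the event (assumed encoding)
    )
""")
# Assumed index: feeds are read per user, newest first.
conn.execute("CREATE INDEX events_user_date ON events (user_id, date DESC)")


def record_event(conn, date, user_id, action, root_id, object_id, obj, data):
    """Insert one denormalized event row and return its id (hypothetical helper)."""
    cur = conn.execute(
        "INSERT INTO events (date, user_id, action, root_id, object_id, object, data) "
        "VALUES (?, ?, ?, ?, ?, ?, ?)",
        (date, user_id, action, root_id, object_id, obj, json.dumps(data)),
    )
    return cur.lastrowid


# User 1 likes photo 42, which belongs to user 2.
event_id = record_event(
    conn, "2011-06-01T12:00:00", 1, "like", 2, 42, "photo", {"title": "Sunset"}
)
```

Storing the render payload in data is what makes the row denormalized: the feed can be drawn without joining back to the photo, comment, or other object tables.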
Then, to get the desired events, I just grab all rows in which user_id is the id of a user followed by the user whose stream we're grabbing.
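The read path described above could look roughly like the following sketch. The follows table, its column names, and the fetch_feed helper with its limit/before parameters are assumptions, since the post does not show how follow relationships are stored; the events table is trimmed to the columns the query touches.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table: who follows whom.
    CREATE TABLE follows (
        follower_id INTEGER NOT NULL,
        followed_id INTEGER NOT NULL,
        PRIMARY KEY (follower_id, followed_id)
    );
    CREATE TABLE events (
        id INTEGER PRIMARY KEY, date TEXT, user_id INTEGER, action TEXT, data TEXT
    );
""")
conn.executemany("INSERT INTO follows VALUES (?, ?)", [(1, 2), (1, 3)])
conn.executemany(
    "INSERT INTO events (date, user_id, action, data) VALUES (?, ?, ?, ?)",
    [
        ("2011-06-01T10:00:00", 2, "like", "{}"),
        ("2011-06-01T11:00:00", 3, "comment", "{}"),
        ("2011-06-01T12:00:00", 4, "create", "{}"),  # user 4 is not followed by user 1
    ],
)


def fetch_feed(conn, viewer_id, limit=20, before=None):
    """Return newest-first events triggered by users that viewer_id follows.

    Passing the date of the oldest event already shown as `before`
    pages further into the past.
    """
    sql = """
        SELECT e.id, e.date, e.user_id, e.action, e.data
        FROM events AS e
        JOIN follows AS f ON f.followed_id = e.user_id
        WHERE f.follower_id = ?
    """
    params = [viewer_id]
    if before is not None:
        sql += " AND e.date < ?"
        params.append(before)
    sql += " ORDER BY e.date DESC, e.id DESC LIMIT ?"
    params.append(limit)
    return conn.execute(sql, params).fetchall()


feed = fetch_feed(conn, viewer_id=1)
older = fetch_feed(conn, viewer_id=1, before="2011-06-01T11:00:00")
```

This is the pull (fan-out-on-read) variant: nothing is copied per follower, and the cost is paid by the join at read time. Paging on date alone can skip events that share a timestamp, so a real implementation would likely page on (date, id).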
It works, but the denormalization just feels wrong. Polymorphic associations seem similarly so. Fanout seems to be somewhere in between, but feels very messy.
With all my searching on the issue, and reading the numerous questions here on SO, I just can't get anything to click and feel like the right solution.
Any experience, insight, or help anyone can offer is greatly appreciated. Thanks.
Comments (2)
I've never dealt with social activity feeds, but based on your description they're quite similar to maintaining tricky business activity logs.
Personally, it's a case I tend to manage with separate tables for the applicable activity types, a revisions/logs table for each of these types, and, in each of the latter, a reference to a more central event logs table.
The central table makes it possible to build the feed and looks a lot like the solution you came up with: event_id, event_at, event_name, event_by, event_summary, event_type. (The event_type field is a varchar containing the name of the table or object.)
You probably don't need to maintain the history of everything in your case (surely this is less appropriate for friends-requests than for sales and stock movements), but maintaining some kind of central event logs table (in addition to other applicable tables to have the normalized data at hand) is, I think, the correct approach.
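As a rough sketch of that layout, assuming a single activity type: the event_logs columns come from the list above, while the comments and comment_logs tables, their columns, and the add_comment helper are hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Central event log; column names taken from the answer.
    CREATE TABLE event_logs (
        event_id      INTEGER PRIMARY KEY,
        event_at      TEXT    NOT NULL,
        event_name    TEXT    NOT NULL,
        event_by      INTEGER NOT NULL,
        event_summary TEXT,
        event_type    TEXT    NOT NULL   -- name of the table or object
    );
    -- Normalized data for one activity type (hypothetical).
    CREATE TABLE comments (
        id        INTEGER PRIMARY KEY,
        author_id INTEGER NOT NULL,
        body      TEXT    NOT NULL
    );
    -- Per-type revision/log table that points at the central log (hypothetical).
    CREATE TABLE comment_logs (
        id         INTEGER PRIMARY KEY,
        comment_id INTEGER NOT NULL REFERENCES comments(id),
        event_id   INTEGER NOT NULL REFERENCES event_logs(event_id),
        body       TEXT    NOT NULL
    );
""")


def add_comment(conn, author_id, body, at):
    """Write the comment, its central log entry, and its revision in one transaction."""
    with conn:  # commits on success, rolls back if any insert raises
        comment_id = conn.execute(
            "INSERT INTO comments (author_id, body) VALUES (?, ?)",
            (author_id, body),
        ).lastrowid
        event_id = conn.execute(
            "INSERT INTO event_logs "
            "(event_at, event_name, event_by, event_summary, event_type) "
            "VALUES (?, 'created', ?, ?, 'comments')",
            (at, author_id, body[:80]),
        ).lastrowid
        conn.execute(
            "INSERT INTO comment_logs (comment_id, event_id, body) VALUES (?, ?, ?)",
            (comment_id, event_id, body),
        )
    return comment_id, event_id


comment_id, event_id = add_comment(conn, 7, "Nice photo!", "2011-06-01T12:00:00")
```

The feed is then read from event_logs alone, and the per-type tables are consulted only when the full object or its history is needed.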
You might get some interesting insights by looking at audit log related questions:
https://stackoverflow.com/search?q=audit+log
I think using a combination of NoSQL/Memcached may suit your needs. Please see this URL for further ideas:
http://www.slideshare.net/danmckinley/etsy-activity-feeds-architecture