Efficient way to store sitewide user activity
I tried searching Stack Overflow and googling around a lot, but I can't find an answer to my problem (I guess I'm searching for the wrong keywords / terms).
We are in the process of building a recommendation engine. For now we log all user activity in custom logs (we use Ruby / Rails), so we need to do an end-of-day (EOD) scan of that file and arrange the entries by user. We also have other user data coming in from other places (their Facebook activity, Twitter timeline, etc.). By EOD we want all the data for a particular user saved somewhere, so we can run our analyzer code on all of that user's data to generate the recommendations.
The problem is that we are generating a lot of data. For the time being we are using a MySQL table to store all of it, but we are not sure how long we can keep doing this as our user base grows (we are still testing internally with about 10 users who generate a lot of activity). Plus, as eager developers, we would like to try out something new that meets our needs.
Any pointers in this direction will be very helpful.
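To make the end-of-day step concrete, here is a minimal sketch of scanning log lines and arranging them by user. The log format (`<timestamp> <user_id> <action>`, space-separated) and the method name `group_activity_by_user` are assumptions for illustration; the post does not say what the custom logs actually look like.

```ruby
# Hypothetical log format (an assumption, not from the post):
#   <ISO8601 timestamp> <user_id> <action>
# Returns a Hash mapping each user_id to the list of that user's events.
def group_activity_by_user(lines)
  activity = Hash.new { |hash, key| hash[key] = [] }
  lines.each do |line|
    timestamp, user_id, action = line.strip.split(" ", 3)
    next if action.nil? # skip blank or malformed lines
    activity[user_id] << { at: timestamp, action: action }
  end
  activity
end

# Made-up sample lines standing in for the real log file.
sample = [
  "2012-01-05T10:00:00Z 42 viewed_item:17",
  "",
  "2012-01-05T10:01:30Z 7 searched:shoes",
  "2012-01-05T10:02:10Z 42 liked_item:17",
]
grouped = group_activity_by_user(sample)
grouped.each { |user_id, events| puts "#{user_id}: #{events.length} event(s)" }
```

For a real file, something like `group_activity_by_user(File.foreach(path))` would stream it line by line, where `path` points at wherever the log lives. This holds everything in memory, so it only illustrates the shape of the job, not a solution to the scaling concern.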
Comments (1)
Check out Amazon Elastic MapReduce (EMR). It was built for this very type of thing.
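As a rough illustration of why the map/reduce model fits per-user aggregation, here is a small in-process sketch in plain Ruby. It does not call EMR or Hadoop at all; the method names (`map_line`, `reduce_user`, `run_job`) and the log format (`<timestamp> <user_id> <action>`) are hypothetical. The point is only the shape: a mapper emits `[user_id, action]` pairs, the framework groups pairs by key, and a reducer sees all of one user's values together.

```ruby
# Map step: turn one log line into zero or more [key, value] pairs.
# Assumed line format: <timestamp> <user_id> <action>
def map_line(line)
  _timestamp, user_id, action = line.strip.split(" ", 3)
  action.nil? ? [] : [[user_id, action]]
end

# Reduce step: receives every value emitted for one key.
# Here it simply counts how often each action occurred for that user.
def reduce_user(user_id, actions)
  counts = actions.each_with_object(Hash.new(0)) { |a, acc| acc[a] += 1 }
  [user_id, counts]
end

# Stand-in for the framework: map, group by key ("shuffle"), then reduce.
def run_job(lines)
  pairs = lines.flat_map { |line| map_line(line) }
  shuffled = pairs.group_by(&:first)
  shuffled.map { |user_id, kvs| reduce_user(user_id, kvs.map(&:last)) }.to_h
end

# Made-up sample data.
log_lines = [
  "2012-01-05T10:00:00Z 42 viewed_item:17",
  "2012-01-05T10:01:30Z 7 searched:shoes",
  "2012-01-05T10:02:10Z 42 viewed_item:17",
]
per_user = run_job(log_lines)
per_user.each { |user_id, counts| puts "#{user_id}: #{counts}" }
```

In a real cluster the grouping step is what the framework distributes across machines, so the mapper and reducer stay this small while the data grows.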