Best way to update the database (mongo) every hour?
I am preparing a small app that will aggregate data about the users on my website (via socket.io). I want to insert all the data into my MongoDB every hour.
What is the best way to do that? setInterval(fn, 3600000) seems to be a little bit lame :)
Comments (5)
Why not just hit the server with a curl request that triggers the database write? You can put the command in an hourly cron job and have the server listen on a local port.
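A minimal sketch of the curl-plus-cron idea, under some assumptions that are not in the original answer: the collected data sits in an in-memory array, the trigger is a hypothetical `POST /flush` route on port 3001, and `saveBatch` is a placeholder for the real MongoDB write.

```javascript
const http = require("http");

// In-memory buffer of events collected from socket.io (assumed shape).
const buffer = [];

// Remove and return everything currently buffered.
function drainBuffer() {
  return buffer.splice(0, buffer.length);
}

// Hypothetical placeholder: replace with a real write,
// for example collection.insertMany(batch) from the MongoDB driver.
async function saveBatch(batch) {
  console.log(`would write ${batch.length} documents`);
}

// Build (but do not start) a server whose only job is to trigger the write.
function createFlushServer(save = saveBatch) {
  return http.createServer(async (req, res) => {
    if (req.method !== "POST" || req.url !== "/flush") {
      res.writeHead(404);
      res.end();
      return;
    }
    const batch = drainBuffer();
    try {
      if (batch.length > 0) await save(batch);
      res.writeHead(200, { "Content-Type": "application/json" });
      res.end(JSON.stringify({ written: batch.length }));
    } catch (err) {
      buffer.unshift(...batch); // put the data back so the next run can retry
      res.writeHead(500);
      res.end();
    }
  });
}

// Bind to the loopback interface so only local processes (such as cron) can call it:
//   createFlushServer().listen(3001, "127.0.0.1");
// Matching crontab entry, run at minute 0 of every hour:
//   0 * * * * curl -s -X POST http://127.0.0.1:3001/flush
```

Keeping the schedule in cron means the Node process needs no timer of its own, at the cost of one extra moving part outside the app.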
You could have mongo store the last time you copied your data, and each time a request comes in you could check how long it has been since the last copy.
Or you could try setInterval(checkRestore, 60000) to check once a minute. checkRestore() would query the server to see whether the last update time is more than an hour old. There are a few ways to do that.
An easy way to store the date is to store the value of Date.now() (https://developer.mozilla.org/en/JavaScript/Reference/Global_Objects/Date) and then check for something like db.logs.find({lastUpdate:{$lt:Date.now()-3600000}}), where 3600000 is one hour in milliseconds.
I think I mixed a few different solutions together there, but hopefully something like that will work!
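The check described above could look roughly like this. It is only a sketch: `logs` is assumed to be a collection object from the MongoDB Node.js driver holding a single marker document with a `lastUpdate` field, and `copyData` is a hypothetical function that does the actual copy.

```javascript
const ONE_HOUR_MS = 60 * 60 * 1000; // 3,600,000 ms

// True when the last copy happened more than an hour before `now`.
function isCopyDue(lastUpdate, now = Date.now()) {
  return now - lastUpdate > ONE_HOUR_MS;
}

// Query filter matching documents whose lastUpdate is more than an hour old.
function staleFilter(now = Date.now()) {
  return { lastUpdate: { $lt: now - ONE_HOUR_MS } };
}

// Sketch of checkRestore(). Assumes the marker document already exists;
// if the collection is empty, nothing is copied.
async function checkRestore(logs, copyData) {
  const stale = await logs.findOne(staleFilter());
  if (stale) {
    await copyData();
    await logs.updateOne(
      { _id: stale._id },
      { $set: { lastUpdate: Date.now() } }
    );
  }
}

// Check once a minute:
//   setInterval(() => checkRestore(logs, copyData).catch(console.error), 60000);
```

Storing the timestamp in the database rather than in memory means the schedule survives a restart of the Node process.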
If you're using Node, a nice CRON-like tool to use is Forever. It uses the same CRON patterns to handle repeating jobs.
You can use cron, for example, and run your Node.js app as a scheduled job.
EDIT:
If the program has to run continuously, then setTimeout is probably one of the few possible choices (and it is quite simple to implement). Otherwise, you can offload your data to a temporary storage system such as Redis and then regularly run another Node.js program to move the data. However, this introduces a dependency on another database system and may add complexity, depending on your scenario. In this case Redis can also act as a kind of failsafe, in case your main Node.js app terminates unexpectedly and would otherwise lose part or all of its data batch.
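For the continuously running case, a self-rescheduling setTimeout might be sketched as follows. The names `msUntilNextHour`, `scheduleHourly` and `flush` are illustrative, not from the answer, and the hour boundary is computed on the epoch clock, so it lines up with local hours only in time zones whose offset is a whole number of hours.

```javascript
const HOUR_MS = 60 * 60 * 1000;

// Milliseconds from `now` until the next full hour on the epoch clock.
// Exactly on the hour, this returns a full hour.
function msUntilNextHour(now = Date.now()) {
  return HOUR_MS - (now % HOUR_MS);
}

// Run `flush` at the top of every hour. The next run is scheduled only
// after the current one finishes, so a slow write never overlaps the next.
function scheduleHourly(flush) {
  setTimeout(async () => {
    try {
      await flush();
    } catch (err) {
      console.error("hourly flush failed:", err);
    } finally {
      scheduleHourly(flush);
    }
  }, msUntilNextHour());
}

// Usage, where writeBufferToMongo is a hypothetical async function:
//   scheduleHourly(writeBufferToMongo);
```

Recomputing the delay on every run avoids the slow drift that a fixed-interval timer accumulates over days.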
You should aggregate in real time, not once per hour.
I'd take a look at this presentation by BuddyMedia to see how they do real-time aggregation down to the minute. I am using an adapted version of this approach for my real-time metrics and it works wonderfully.
http://www.slideshare.net/pstokes2/social-analytics-with-mongodb
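One common way to aggregate in real time with MongoDB is to increment pre-aggregated counters with an upsert on every event. The sketch below illustrates that general technique and is not necessarily what the presentation describes; the document layout (one document per metric per hour, with a `minutes` sub-document) and the name `buildMinuteIncrement` are assumptions.

```javascript
// Build the filter and update for one event: one document per metric per hour,
// with a counter for each minute inside it.
function buildMinuteIncrement(metric, date = new Date()) {
  const hour = new Date(date.getTime());
  hour.setUTCMinutes(0, 0, 0); // truncate to the start of the hour (UTC)
  const minute = date.getUTCMinutes();
  return {
    filter: { metric, hour },
    update: { $inc: { total: 1, [`minutes.${minute}`]: 1 } },
  };
}

// Usage with the MongoDB Node.js driver (`stats` is an assumed collection):
//   const { filter, update } = buildMinuteIncrement("pageview");
//   await stats.updateOne(filter, update, { upsert: true });
```

With this layout each event costs one small write, and the hourly totals are already in place when you need them, so no batch job is required.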