The theory behind sync services like Dropbox and their file indexing?
I have realised that by using the Amazon S3 service directly, I can save myself a lot of money. Instead of buying a client like GoodSync or Jungle Disk, I thought it would be interesting to create my own Windows syncing application, which would sync my files to S3.
I have discovered that I can use FileSystemWatcher to monitor for changes to files and directories, but I am looking for the theory behind how other services like Dropbox index their files. Things like comparing a file's size with the size recorded in an index somewhere on the client PC, then using this information to determine whether to sync or not.
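For illustration, here is a minimal sketch of that idea in C#: a FileSystemWatcher feeding a simple in-memory index keyed by full path. The SyncIndex class, the NeedsSync/MarkSynced methods and the C:\SyncFolder path are hypothetical names made up for this example, not from any particular library:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical local index: maps full path -> (size, last write time) as of the last sync.
class SyncIndex
{
    private readonly Dictionary<string, (long Size, DateTime LastWriteUtc)> _entries = new();

    public bool NeedsSync(string path)
    {
        var info = new FileInfo(path);
        if (!info.Exists) return false;

        // Sync if the file has never been seen, or its size/timestamp differ from the record.
        if (!_entries.TryGetValue(path, out var recorded)) return true;
        return recorded.Size != info.Length || recorded.LastWriteUtc != info.LastWriteTimeUtc;
    }

    public void MarkSynced(string path)
    {
        var info = new FileInfo(path);
        _entries[path] = (info.Length, info.LastWriteTimeUtc);
    }
}

class Program
{
    static void Main()
    {
        var index = new SyncIndex();

        // Watch a folder (the path is just an example) and check changed files against the index.
        using var watcher = new FileSystemWatcher(@"C:\SyncFolder")
        {
            IncludeSubdirectories = true,
            NotifyFilter = NotifyFilters.FileName | NotifyFilters.Size | NotifyFilters.LastWrite,
        };

        watcher.Changed += (sender, e) =>
        {
            if (index.NeedsSync(e.FullPath))
            {
                Console.WriteLine($"Upload candidate: {e.FullPath}");
                // ...upload to S3 here, then record the new state...
                index.MarkSynced(e.FullPath);
            }
        };

        watcher.EnableRaisingEvents = true;

        Console.WriteLine("Watching. Press Enter to quit.");
        Console.ReadLine();
    }
}
```

A real client would also persist the index to disk and handle the Created, Renamed and Deleted events, but the size/timestamp comparison above is the basic indexing idea described in the question.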
I am using C#, and references to libraries or code samples I could use would be helpful, but I am mainly looking for the best way to index files and for someone to point me in the right direction.
Thanks
Comments (1)
I've gone down this path myself. In fact, now that Mozy has dropped their unlimited plan and Carbonite chooses NOT to back up certain files (like 3GP files and *.dat files) unless you routinely go in and manually add them, I am very disgruntled with online backups.
But your question was about syncing. Dropbox does it best, but it's expensive, and I'm not sure S3 would be any cheaper.
Anyway, you will have a lot of hurdles. In my experience, the problems I ran into were:
1) Propagating deletes
2) FileSystemWatcher simply missing events, such as when files are rapidly added to a folder and then deleted (one mitigation is sketched after this list)
3) etc..
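On point 2, a rough sketch of one mitigation: enlarge the watcher's internal buffer and handle its Error event, so a buffer overflow triggers a full rescan instead of silently losing changes. RescanEverything here is just a placeholder for diffing the folder against your local index:

```csharp
using System;
using System.IO;

static class WatcherSetup
{
    public static FileSystemWatcher CreateWatcher(string path)
    {
        var watcher = new FileSystemWatcher(path)
        {
            IncludeSubdirectories = true,
            // The default buffer is 8 KB; 64 KB is the documented maximum and copes better with bursts.
            InternalBufferSize = 64 * 1024
        };

        // If the buffer still overflows, events are dropped silently and Error is raised.
        watcher.Error += (sender, e) =>
        {
            if (e.GetException() is InternalBufferOverflowException)
            {
                Console.WriteLine("Watcher buffer overflowed; scheduling a full rescan.");
                RescanEverything(path);
            }
        };

        watcher.EnableRaisingEvents = true;
        return watcher;
    }

    static void RescanEverything(string path)
    {
        // Placeholder: walk the tree and compare size/timestamps against the local index.
        foreach (var file in Directory.EnumerateFiles(path, "*", SearchOption.AllDirectories))
        {
            Console.WriteLine($"Would re-check against index: {file}");
        }
    }
}
```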
Now some ideas on how I would tackle this again:
1) Keep a small SQLite db for file names/paths locally
2) Copy files to a tmp directory before sending to S3.
3) On file changes/updates/deletions/etc., store that meta information in SQLite (a rough sketch follows below)
Anyway, just some ideas.
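To make ideas 1 and 3 a bit more concrete, here is a rough sketch of what the local index could look like, assuming the Microsoft.Data.Sqlite package (System.Data.SQLite is another option). The table layout, the LocalIndex class and the 'deleted' state used as a tombstone for propagating deletes are just one possible design, not anything Dropbox actually does:

```csharp
using System;
using Microsoft.Data.Sqlite; // NuGet package: Microsoft.Data.Sqlite

class LocalIndex
{
    private readonly SqliteConnection _conn;

    public LocalIndex(string dbPath)
    {
        _conn = new SqliteConnection($"Data Source={dbPath}");
        _conn.Open();

        // One row per tracked file: path, size, last write time, and a sync state.
        using var create = _conn.CreateCommand();
        create.CommandText = @"
            CREATE TABLE IF NOT EXISTS files (
                path     TEXT PRIMARY KEY,
                size     INTEGER NOT NULL,
                modified TEXT NOT NULL,   -- ISO-8601 UTC timestamp
                state    TEXT NOT NULL    -- e.g. 'pending', 'synced', 'deleted'
            );";
        create.ExecuteNonQuery();
    }

    // Insert or update the record for a file; a 'deleted' state acts as a tombstone
    // so the delete can later be propagated to S3 before the row is purged.
    public void Record(string path, long size, DateTime modifiedUtc, string state)
    {
        using var cmd = _conn.CreateCommand();
        cmd.CommandText = @"
            INSERT INTO files (path, size, modified, state)
            VALUES ($path, $size, $modified, $state)
            ON CONFLICT(path) DO UPDATE SET
                size     = excluded.size,
                modified = excluded.modified,
                state    = excluded.state;";
        cmd.Parameters.AddWithValue("$path", path);
        cmd.Parameters.AddWithValue("$size", size);
        cmd.Parameters.AddWithValue("$modified", modifiedUtc.ToString("o"));
        cmd.Parameters.AddWithValue("$state", state);
        cmd.ExecuteNonQuery();
    }
}
```

With tombstone rows in place, propagating deletes becomes a matter of scanning for rows in the 'deleted' state, issuing the corresponding S3 deletes, and only then purging them from the index.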