File replication solutions
I'm thinking about a Windows-hosted build process that will periodically drop files to disk, to be replicated to several other Windows servers in the same datacenter. The other machines would run IIS and serve those files to the masses.
The total corpus would be millions of files and hundreds of GB of data. The solution would have to deal with possible contention on the target servers, high-latency links (e.g. over a WAN), and cold-starting clean servers.
Solutions I've thought about so far:
- A queued system, with daemons that either wake periodically and copy, or run as services.
- SAN - expensive, complex, more expensive
- ROBOCOPY on a timed job - simple but effective. Lots of internal/indeterminate state, e.g. where it is in the copy, and errors.
- Off-the-shelf replication software - less expensive than a SAN but still expensive.
- UNC shared folders and no replication. Higher latency, lower cost - still need a clustering solution too.
- DFS Replication.
What else have other folks used?
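One way to reduce the "indeterminate state" concern with the timed ROBOCOPY option is to wrap the call and decode its exit code, which is a bitmask. The following is a minimal sketch, not a tested deployment: the function names, retry counts, and log path are assumptions chosen for illustration, while the switches (`/MIR`, `/Z`, `/R`, `/W`, `/NP`, `/LOG`) and the exit-code bits are standard robocopy behavior.

```python
import subprocess

# Robocopy's exit code is a bitmask; each bit reports one kind of outcome.
ROBOCOPY_FLAGS = {
    1: "files copied",
    2: "extra files or directories found at destination",
    4: "mismatched files or directories found",
    8: "some files or directories could not be copied",
    16: "fatal error, nothing was copied",
}


def build_robocopy_command(source, dest, log_path):
    """Build a mirror command with bounded retries and a log file."""
    return [
        "robocopy", source, dest,
        "/MIR",   # mirror the tree, including deletions
        "/Z",     # restartable mode, useful on unreliable links
        "/R:3",   # retry each failed file 3 times (assumed value)
        "/W:10",  # wait 10 seconds between retries (assumed value)
        "/NP",    # keep per-file progress percentages out of the log
        f"/LOG:{log_path}",
    ]


def interpret_exit_code(code):
    """Return (ok, reasons); a code of 8 or above means at least one failure."""
    reasons = [text for bit, text in ROBOCOPY_FLAGS.items() if code & bit]
    return code < 8, reasons or ["nothing to copy"]


def run_sync(source, dest, log_path):
    """Run robocopy (Windows only) and return the decoded result."""
    result = subprocess.run(build_robocopy_command(source, dest, log_path))
    return interpret_exit_code(result.returncode)
```

A scheduled job could call `run_sync` once per target server and raise an alert whenever the first element of the result is `False`, with the log file holding the per-file detail.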
Comments (4)
I've used rsync scripts with good success for this type of work, on thousands of machines in our case. I believe there is an rsync server for Windows, but I have not used it on anything other than Linux.
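As a rough illustration of what such a script might look like (not the actual script described above), here is a sketch that builds one rsync command per target host and runs a few in parallel. The host names, paths, and worker count are hypothetical, and the `host:path` form assumes rsync over a remote shell such as SSH rather than an rsync daemon.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_rsync_commands(source_dir, hosts, remote_dir):
    """Build one rsync command per target host."""
    # A trailing slash tells rsync to copy the directory's contents,
    # not the directory itself.
    source = source_dir.rstrip("/") + "/"
    return [
        # -a: archive mode, -z: compress in transit,
        # --delete: remove files that no longer exist at the source,
        # --partial: keep partially transferred files for later resumption
        ["rsync", "-az", "--delete", "--partial", source, f"{host}:{remote_dir}"]
        for host in hosts
    ]


def run_all(commands, max_workers=4):
    """Run the commands in parallel and return their exit codes in order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda cmd: subprocess.run(cmd).returncode, commands))
```

Capping `max_workers` keeps the source server from saturating its own disk or network link when there are many targets; any non-zero exit code marks a host that needs a retry.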
Though we do not have data on that scale to manage, we send and collect lots of files overnight between our main company and its agencies abroad. We have been using allwaysync for a while. It supports folder and FTP synchronization. It has a nice interface for analyzing and comparing folders and files, and it can of course be scheduled.
UNC shared folders with no replication have many downsides, especially if IIS is going to use UNC paths as home directories for sites. Under stress, you will run into http://support.microsoft.com/default.aspx/kb/810886 because of the number of simultaneous sessions against the server sharing the folder. You will also experience slow IIS site startups, since IIS is going to want to scan/index/cache the UNC folder (depending on IIS version and ASP settings).
I've seen tests with DFS that are very promising, exhibiting none of the above restrictions.
We use ROBOCOPY in my organization to pass files around. It runs very smoothly, and I feel it is worth a recommendation.
Additionally, you are not doing anything too crazy. If you are also proficient in Perl, I am sure you could write a quick script that will fulfill your needs.
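For a sense of how small such a script can be, here is a minimal sketch in Python rather than Perl. It copies only files that are missing or changed on the target, and writes each file under a temporary name before renaming it, so a web server reading the target is less likely to see a half-written file. The function names and the `.partial` suffix are assumptions for illustration, and the change check (size and modification time) is a simplification that would not catch every edit.

```python
import os
import shutil


def needs_copy(src, dst):
    """A file needs copying if the target is missing, differs in size, or is older."""
    if not os.path.exists(dst):
        return True
    s, d = os.stat(src), os.stat(dst)
    return s.st_size != d.st_size or int(s.st_mtime) > int(d.st_mtime)


def sync_tree(source_root, target_root):
    """Copy new or changed files; return the relative paths that were copied."""
    copied = []
    for dirpath, _dirnames, filenames in os.walk(source_root):
        rel_dir = os.path.relpath(dirpath, source_root)
        target_dir = os.path.normpath(os.path.join(target_root, rel_dir))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(target_dir, name)
            if needs_copy(src, dst):
                tmp = dst + ".partial"   # hypothetical temporary suffix
                shutil.copy2(src, tmp)   # copy2 preserves the modification time
                os.replace(tmp, dst)     # swap the finished file into place
                copied.append(os.path.normpath(os.path.join(rel_dir, name)))
    return copied
```

Because an empty target simply means every file needs copying, the same function covers the cold-start case; on a second run with no changes it returns an empty list. With millions of files the full tree walk itself becomes the bottleneck, which is where a queue of changed paths from the build process would help.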