Decentralized backup using the torrent protocol
I'm playing with the idea of building a backup client on top of the torrent protocol used today by download clients such as uTorrent or Vuze.
The client software would:
- Select the files you would like to back up
- Create a torrent-like descriptor file for each file
- Offer optional encryption of your files based on a key phrase
- Let you select the redundancy you would like to trade with other clients
(Redundancy would be based on a give-and-take principle. If you want to back up 100 MB five times, you would have to offer an extra 500 MB of your own storage space to the system. The backup would not be distributed among only 5 clients; it would use as many clients as possible that offer storage in exchange, chosen by the physical distance specified in settings.)
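To make the list above more concrete, here is a minimal Python sketch of three of the steps: a torrent-like descriptor built from per-piece SHA-1 hashes, a key derived from a key phrase, and the give-and-take storage rule. The function names, the dictionary layout, the 256 KiB piece size, and the PBKDF2 parameters are all assumptions made for illustration; none of them come from an existing client or from the torrent specification.

```python
import hashlib

PIECE_SIZE = 256 * 1024  # assumed piece length, chosen only for this sketch


def build_descriptor(name, data, piece_size=PIECE_SIZE):
    """Return a torrent-like descriptor: name, length, and one SHA-1 hash per piece."""
    pieces = [
        hashlib.sha1(data[i:i + piece_size]).hexdigest()
        for i in range(0, len(data), piece_size)
    ]
    return {"name": name, "length": len(data), "piece_size": piece_size, "pieces": pieces}


def derive_key(key_phrase, salt, iterations=200_000):
    """Derive a 32-byte key from a key phrase with PBKDF2-HMAC-SHA256 (assumed scheme)."""
    return hashlib.pbkdf2_hmac("sha256", key_phrase.encode("utf-8"), salt, iterations)


def required_contribution(backup_bytes, copies):
    """Give-and-take rule from the post: N copies of a backup cost N times its size."""
    return backup_bytes * copies


# Backing up 100 MB five times means offering 500 MB to other clients.
MB = 1_000_000
offered = required_contribution(100 * MB, 5)
```

The derived key would still have to be fed into an actual cipher, which this sketch leaves out.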
Optionally:
I'm thinking of including edge file sharing: if you have non-encrypted files in your backup storage, you could prefer clients that have port 80 open for public HTTP sharing. This gets tricky, though, since I have a hard time coming up with a simple scheme by which a visitor would pick the closest backup client.
Include a file manager that would allow FTP-style file transfers (with a GUI) between two systems using the torrent protocol.
I'm thinking about creating this as a service API project (sort of like http://www.elasticsearch.org ) that could be integrated with any container such as Tomcat or Spring, or with plain Swing.
This would be a P2P open source project. Since I'm not completely confident in my understanding of the torrent protocol, the question is:
Is the above feasible with the current state of torrent technology (and where should I look to recruit Java developers for this project)?
If this is the wrong spot to post this, please move it to a more appropriate site.
Comments (2)
You are considering the wrong technology for the job. What you want is an erasure code based on Vandermonde matrices. It gives you the same level of protection against lost data without needing to store nearly as many copies. There's an open source implementation by Luigi Rizzo that works perfectly.
This code lets you take an 8 MB block of data and cut it into any number of 1 MB chunks such that any eight of them can reconstruct the original data. That gives you the same level of protection as storing three times the data, without even doubling the amount of data stored.
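The storage comparison can be checked with a little arithmetic. The sketch below assumes every copy or chunk sits on a different peer and counts how many of them can be lost before the data is gone; the function names are made up for this example.

```python
def replication(copies):
    """Storing whole copies: cost is `copies` times the data, survives copies - 1 losses."""
    return {"storage_factor": float(copies), "losses_survived": copies - 1}


def erasure_code(k, n):
    """Any k of n equal-size chunks rebuild the data: cost is n / k, survives n - k losses."""
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")
    return {"storage_factor": n / k, "losses_survived": n - k}


three_copies = replication(3)      # 3.0x storage, survives 2 lost copies
ten_chunks = erasure_code(8, 10)   # 1.25x storage, survives 2 lost chunks
```

With the 8-of-n example from the answer, ten 1 MB chunks already tolerate two losses, the same count as three full copies, at 1.25 times the original size instead of 3 times.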
You can tune the parameters any way you want. Luigi Rizzo's implementation has a limit of 256 chunks, but you control the chunk size and the number of chunks required to reconstruct the data.
You do not need to generate or store all the possible chunks. If you cut an 80 MB block of data into 8 MB chunks such that any ten can recover the original data, you can construct up to 256 such chunks. You will likely only want 20 or so.
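To illustrate the "any k of n" property, here is a toy Vandermonde-style code in Python. It is not Luigi Rizzo's implementation: as a simplifying assumption it works over the prime field of integers mod 257 rather than GF(2^8), handles one symbol per chunk, and encoded values may reach 256, so they do not always fit in a byte. Encoding evaluates a polynomial whose coefficients are the data symbols, which is the same as multiplying the data by a Vandermonde matrix; decoding solves the resulting linear system.

```python
P = 257  # small prime field; a toy stand-in for the GF(2^8) arithmetic real codecs use


def encode(data, n):
    """Evaluate the polynomial with coefficients `data` at x = 1..n (mod P)."""
    k = len(data)
    if not 1 <= k <= n < P:
        raise ValueError("need 1 <= k <= n < 257")
    shares = []
    for x in range(1, n + 1):
        y = 0
        for coeff in reversed(data):  # Horner's rule
            y = (y * x + coeff) % P
        shares.append((x, y))
    return shares


def decode(shares, k):
    """Recover the k data symbols from any k shares by Gauss-Jordan elimination mod P."""
    if len(shares) < k:
        raise ValueError("not enough shares")
    # Augmented Vandermonde rows: [1, x, x^2, ..., x^(k-1) | y]
    rows = [[pow(x, j, P) for j in range(k)] + [y] for x, y in shares[:k]]
    for col in range(k):
        pivot = next(r for r in range(col, k) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = pow(rows[col][col], P - 2, P)  # modular inverse via Fermat's little theorem
        rows[col] = [(v * inv) % P for v in rows[col]]
        for r in range(k):
            if r != col and rows[r][col]:
                factor = rows[r][col]
                rows[r] = [(a - factor * b) % P for a, b in zip(rows[r], rows[col])]
    return [rows[i][k] for i in range(k)]


data = list(b"backup!!")                    # 8 data symbols
shares = encode(data, 20)                   # generate only 20 of the possible shares
survivors = shares[5:9] + shares[15:19]     # any 8 of them are enough
assert decode(survivors, k=len(data)) == data
```

A real codec applies the same matrix to every byte position of much larger chunks, which is why the chunk size and the number of required chunks can be tuned independently.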
You might have great difficulty enforcing the reciprocal storage feature, which I believe is critical to large-scale adoption (finally, a good use for those three-terabyte drives that you get in cereal boxes!). You might wish to study the mechanisms of Bitcoin to see if there are any tools you can steal or adopt for your own needs for distributed, non-repudiable proof of storage.
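As a starting point for thinking about proof of storage, here is a very simple challenge-response check in Python. It is not a Bitcoin mechanism and it is not non-repudiable; it only shows the basic idea that a peer holding a chunk can answer a fresh random challenge, while a peer that threw the chunk away cannot. It assumes the verifier still has the chunk (or has precomputed challenge/answer pairs), and all function names are hypothetical.

```python
import hashlib
import hmac
import os


def make_challenge():
    """Verifier side: a fresh random nonce, so old answers cannot be replayed."""
    return os.urandom(16)


def prove(nonce, chunk):
    """Storing peer side: hash the nonce together with the stored chunk."""
    return hashlib.sha256(nonce + chunk).hexdigest()


def verify(nonce, chunk, proof):
    """Verifier side: recompute the answer from its own copy and compare."""
    return hmac.compare_digest(prove(nonce, chunk), proof)


chunk = b"one stored backup chunk"
nonce = make_challenge()
answer = prove(nonce, chunk)          # computed by the peer that claims to store the chunk
accepted = verify(nonce, chunk, answer)
```

A scheme that third parties can audit without holding the data would need more than this, which is where studying existing distributed systems comes in.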