Transferring files using Java
I need to transfer lots of small files to a remote computer from within my Java program. I was wondering if somebody could suggest the best way to do so. It has to be really fast. Should I use an existing protocol implementation, maybe FTP?
One important thing is that most files will stay the same over time, or differ only slightly, so I was thinking of using Git for this purpose. Does anyone have experience with something like this?
Comments (5)
From your description, rsync is an absolutely perfect fit for your requirements, much superior to the alternatives that have been offered.
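One way to drive rsync from a Java program is to launch it as an external process. The sketch below is only an illustration under several assumptions: `rsync` is installed on both machines and on the `PATH`, SSH access to the remote host is already set up, and the class name `RsyncPush` and the remote target string are hypothetical. The `-a` (archive) and `-z` (compress) flags are standard rsync options.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

class RsyncPush {
    /** Builds the rsync command line; a trailing slash on the source means "copy the directory's contents". */
    static List<String> command(String sourceDir, String remoteTarget) {
        String source = sourceDir.endsWith("/") ? sourceDir : sourceDir + "/";
        return Arrays.asList("rsync", "-az", source, remoteTarget);
    }

    /** Runs rsync and returns its exit code (0 means success). Assumes rsync is on the PATH. */
    static int push(String sourceDir, String remoteTarget) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command(sourceDir, remoteTarget))
                .inheritIO() // forward rsync's output to this program's console
                .start();
        return process.waitFor();
    }
}
```

A call such as `RsyncPush.push("/data/out", "user@example.com:/srv/in/")` (both values are placeholders) would then send only the files that differ from the remote copy.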
Who is receiving the files that you send? Another application? You could use messaging software such as ActiveMQ
or
stick with the Java net APIs for FTP.
I wonder why you want to involve Git. Does it provide any API to find deltas? I don't think so. Git is a version control system, as far as I know.
The most efficient way to transfer lots of small files is as an archive, e.g. ZIP or TAR. If your network is relatively slow, compressing the archive before transmission will make a big difference. But if the network is really fast, compression may actually make the total time to transfer the files longer. The other factor that makes a big difference is the rate at which the file system can read and (especially) create files.
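As a rough illustration of the archive idea, the following sketch packs a directory of small files into a single ZIP stream using only `java.util.zip`. The class name `ZipBundler` and the `level` parameter are assumptions made for this example; `out` could be a file or a socket's output stream.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

class ZipBundler {
    /**
     * Writes every regular file under sourceDir into one ZIP archive on out
     * and returns the number of files written. The caller keeps ownership of out.
     */
    static int bundle(Path sourceDir, OutputStream out, int level) throws IOException {
        List<Path> files;
        try (Stream<Path> walk = Files.walk(sourceDir)) {
            files = walk.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }
        ZipOutputStream zip = new ZipOutputStream(out);
        zip.setLevel(level); // e.g. Deflater.BEST_SPEED on a fast network
        for (Path file : files) {
            // ZIP entry names use forward slashes regardless of platform.
            String name = sourceDir.relativize(file).toString().replace('\\', '/');
            zip.putNextEntry(new ZipEntry(name));
            Files.copy(file, zip);
            zip.closeEntry();
        }
        zip.finish(); // completes the archive without closing the underlying stream
        return files.size();
    }
}
```

Passing `Deflater.BEST_SPEED` or `Deflater.NO_COMPRESSION` as the level is one way to trade CPU time against bytes on the wire, which is the trade-off described above.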
The Git protocol can be really fast, but it achieves this by only sending files that have changed, and (where possible) sending differences instead of complete files. This approach cannot be used for regular file transfer. Rdist and rsync are older UNIX/Linux tools that take the same (differential) approach to transferring files as Git and other version control systems. They won't help you, for the same reasons that Git won't ... in general.
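To make "only sending files that have changed" concrete, here is a minimal sketch of file-level change detection: each side builds a manifest of content hashes, and only files whose hash differs (or that are missing remotely) need to be sent. This is not how Git or rsync are actually implemented (rsync, for instance, also computes block-level deltas within a file); the class name `ChangeDetector` and the idea of exchanging manifests are assumptions for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ChangeDetector {
    /** Maps each file's relative path to a SHA-256 hash of its content, sorted by path. */
    static Map<String, String> manifest(Path dir) throws IOException {
        List<Path> files;
        try (Stream<Path> walk = Files.walk(dir)) {
            files = walk.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        Map<String, String> result = new TreeMap<>();
        for (Path file : files) {
            String name = dir.relativize(file).toString().replace('\\', '/');
            result.put(name, sha256(Files.readAllBytes(file))); // fine for small files
        }
        return result;
    }

    /** Returns the paths in local that are missing from remote or have a different hash. */
    static List<String> changed(Map<String, String> local, Map<String, String> remote) {
        List<String> toSend = new ArrayList<>();
        for (Map.Entry<String, String> entry : local.entrySet()) {
            if (!entry.getValue().equals(remote.get(entry.getKey()))) {
                toSend.add(entry.getKey());
            }
        }
        return toSend;
    }

    private static String sha256(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(data);
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }
}
```

The cost of this scheme is that both sides must read and hash every file on each run, which is part of why it is not a general-purpose replacement for plain transfer.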
How do you feel about compressing those files and then using FTP? Is it possible to decompress them on the receiver's side?
Git is a version control system; there's no need to add Git's files on top of your own if you won't check the files out later. I'd rather use FTP.
Here's a nice article about Java FTP libraries (or you could make a system call to a console FTP client, but I don't like that idea).
The Apache VFS project is a Java library that you can use from your program to copy files between file systems (e.g. copy local files to FTP/SCP/HTTP).
Copying can be configured so that only files in the source that are newer than the destination are copied, reducing the amount of data sent.
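The "copy only newer files" policy can be sketched with plain `java.nio.file`, without any external library. Note that this is not the Apache VFS API; it only illustrates the same selection rule between two directories, and the class name `NewerOnlyCopy` is hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class NewerOnlyCopy {
    /**
     * Copies each file from sourceDir to targetDir only if the target is missing
     * or older than the source. Returns the relative paths that were copied.
     */
    static List<Path> copyNewer(Path sourceDir, Path targetDir) throws IOException {
        List<Path> sources;
        try (Stream<Path> walk = Files.walk(sourceDir)) {
            sources = walk.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }
        List<Path> copied = new ArrayList<>();
        for (Path source : sources) {
            Path relative = sourceDir.relativize(source);
            Path target = targetDir.resolve(relative);
            boolean newer = Files.notExists(target)
                    || Files.getLastModifiedTime(source)
                            .compareTo(Files.getLastModifiedTime(target)) > 0;
            if (newer) {
                Files.createDirectories(target.getParent());
                // COPY_ATTRIBUTES carries over the modification time, so an
                // unchanged file is skipped on the next run.
                Files.copy(source, target,
                        StandardCopyOption.REPLACE_EXISTING,
                        StandardCopyOption.COPY_ATTRIBUTES);
                copied.add(relative);
            }
        }
        return copied;
    }
}
```

Timestamp comparison is cheap but depends on clocks and on how precisely the file system stores modification times, so it can miss or repeat a copy in edge cases where a content hash would not.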