Is POSTing files over HTTP much less efficient than copying them over the network?

Posted 2024-09-27 07:44:17

We're developing a Windows service that will act as a sort of 'slave' process. It downloads a PDF, splits it into several PDFs, and then needs to send those PDFs back.

We're currently using an HTTP request to retrieve the PDF and a number of POSTs to send the files back. This lets the slave service run on pretty much any machine, and more slaves can easily be added to lighten the load as necessary.

My question is: is using HTTP for file transfers like this significantly slower than, for example, just using copy commands (which would work only if the slave is on the same machine or network)?

Using plain copy commands is feasible, but I like the flexibility of being able to add a new slave anywhere.

Comments (3)

深居我梦 2024-10-04 07:44:17

My completely gut-based thought is that some protocols, such as NFS, would in some circumstances be somewhat, and maybe even significantly, faster than HTTP. But I wouldn't treat that as enough to go on. Figure out how much of a difference would actually matter to you, then run some quick tests. Better yet, make a gut call on which option fits your needs better (HTTP is certainly much easier to get through firewalls and over the open internet, and maybe even over a VPN) and just try that first. If you hit a wall, experiment with other options.
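A quick test of the kind suggested here might look like the following Python sketch. It is only an illustration under assumptions: the receiver is a throwaway local HTTP server (`UploadHandler`, a hypothetical name), the payload is random bytes standing in for a PDF, and everything runs on one machine, so it measures protocol and stack overhead rather than real network conditions. The actual service and its endpoints are not described in the question.

```python
import http.server
import os
import shutil
import tempfile
import threading
import time
import urllib.request


class UploadHandler(http.server.BaseHTTPRequestHandler):
    """Hypothetical receiver: reads the POST body and discards it."""

    def do_POST(self):
        remaining = int(self.headers.get("Content-Length", 0))
        while remaining > 0:
            chunk = self.rfile.read(min(remaining, 64 * 1024))
            if not chunk:
                break
            remaining -= len(chunk)
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the benchmark output quiet


def time_http_post(path, url):
    """Return the seconds taken to POST the file at `path` to `url`."""
    with open(path, "rb") as f:
        data = f.read()
    request = urllib.request.Request(url, data=data, method="POST")
    request.add_header("Content-Type", "application/pdf")
    # Bypass any system proxy so the request really goes to the local server.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    start = time.perf_counter()
    with opener.open(request) as response:
        response.read()
    return time.perf_counter() - start


def time_copy(path, dest_dir):
    """Return the seconds taken to copy the file at `path` into `dest_dir`."""
    start = time.perf_counter()
    shutil.copy(path, dest_dir)
    return time.perf_counter() - start


def run_benchmark(size_bytes=5 * 1024 * 1024):
    """Time one HTTP POST and one file copy of the same random payload."""
    server = http.server.HTTPServer(("127.0.0.1", 0), UploadHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        with tempfile.TemporaryDirectory() as tmp:
            src = os.path.join(tmp, "sample.pdf")
            with open(src, "wb") as f:
                f.write(os.urandom(size_bytes))
            dest = os.path.join(tmp, "dest")
            os.mkdir(dest)
            url = "http://127.0.0.1:%d/upload" % server.server_address[1]
            return {
                "http_post": time_http_post(src, url),
                "copy": time_copy(src, dest),
            }
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    for name, seconds in run_benchmark().items():
        print(f"{name}: {seconds:.4f} s")
```

To get numbers that mean something for a real deployment, the same two timing functions would need to be pointed at the real server and a real network share, and run several times with realistic file sizes.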

Update: right after I posted this, I remembered that Backblaze, the online backup service, uses HTTPS for all internal data transfer to and from its storage appliances. This is documented in their post "Petabytes on a budget: How to build cheap cloud storage"; jump down to "A Backblaze Storage Pod Runs Free Software". There's some good thinking there on the advantages of HTTPS over a lower-level protocol. They have to transfer a lot of data, quickly, so if it works for them, there's a good chance it'll work for you.

寄居者 2024-10-04 07:44:17

The overhead of HTTP isn't very high unless your server is configured in a profoundly unusual way (or it's handling so much traffic that these requests are getting queued behind other HTTP requests from the outside world).

If you're talking about a machine (or, I should say, a process) set up specifically for this purpose that just happens to use HTTP as its transfer protocol, I doubt you'll see any noticeable delays in transferring the data.
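To put a rough number on that overhead, the sketch below counts the bytes in a minimal set of HTTP/1.1 request headers and compares them with the size of the file being sent. The host, path, and header set are assumptions chosen for illustration, and the calculation ignores response headers, TCP/IP framing, and any TLS handshake, so treat it as a lower bound on per-request overhead rather than a measurement.

```python
def http_post_overhead(body_size, host="example.internal", path="/upload"):
    """Return (header_bytes, header_fraction) for a minimal HTTP/1.1 POST.

    `host` and `path` are hypothetical placeholders; a real client will
    usually send a few more headers (User-Agent, Accept, and so on).
    """
    headers = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/pdf\r\n"
        f"Content-Length: {body_size}\r\n"
        "\r\n"
    )
    header_bytes = len(headers.encode("ascii"))
    return header_bytes, header_bytes / (header_bytes + body_size)


if __name__ == "__main__":
    for size in (10_000, 1_000_000, 50_000_000):
        header_bytes, fraction = http_post_overhead(size)
        print(f"{size:>10} byte body: {header_bytes} header bytes, "
              f"{fraction:.4%} of the request")
```

For files in the megabyte range the request headers are a tiny fraction of the bytes on the wire, which is consistent with the point made above; the per-request cost matters more when many very small files are sent one POST at a time.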

树深时见影 2024-10-04 07:44:17

rsync or a similar binary protocol will be faster because there's no overhead involved in building the HTTP request. It also has some other nice features, like rate limiting, so you don't overtax the target host.
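Rate limiting is not exclusive to rsync; if the HTTP approach is kept, something similar can be approximated on the sending side. The sketch below is a hypothetical helper, not part of any library: `throttled_chunks` reads a file-like object in chunks and sleeps between them so that the average throughput stays under a chosen cap. The `sleep` and `clock` parameters exist only so the behaviour can be checked without real waiting.

```python
import io
import time


def throttled_chunks(stream, max_bytes_per_sec, chunk_size=64 * 1024,
                     sleep=time.sleep, clock=time.monotonic):
    """Yield chunks from `stream`, pausing so the average rate stays under the cap."""
    start = clock()
    sent = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return
        yield chunk
        sent += len(chunk)
        # Earliest moment at which `sent` bytes are allowed to have gone out.
        earliest = start + sent / max_bytes_per_sec
        delay = earliest - clock()
        if delay > 0:
            sleep(delay)


if __name__ == "__main__":
    payload = io.BytesIO(b"x" * (256 * 1024))
    began = time.monotonic()
    total = sum(len(c) for c in throttled_chunks(payload, 512 * 1024))
    print(f"sent {total} bytes in {time.monotonic() - began:.2f} s")
```

The generator could be fed to an HTTP client that accepts an iterable request body, or consumed in a loop that writes each chunk to a socket; which of those fits depends on the client library in use.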

More importantly, you don't have to consume resources running a web server or worry about the uptime and management of a service like Apache.

However, if the current solution is fast enough for you, there's no reason to fix what isn't broken.
