Why does transferring a file over SFTP take longer than over FTP?

Posted 2024-12-27 05:02:52 · 193 words · 4 views · 0 comments

I manually copied a file to an FTP server, and the same file to an SFTP server.
The file is 140MB.

FTP: I get a rate of around 11MB/s

SFTP: I get a rate of around 4.5MB/s

I understand the file has to be encrypted before being sent. Is that the only impact on the file transfer? (And strictly speaking, that is encryption time rather than transfer time.)

I am surprised by these results.

Comments (10)

生活了然无味 2025-01-03 05:02:52

I'm the author of HPN-SSH, and a commenter here asked me to weigh in. I'd like to start with a couple of background items. First off, it's important to keep in mind that SSHv2 is a multiplexed protocol: multiple channels over a single TCP connection. As such, the SSH channels are essentially unaware of the underlying flow control algorithm used by TCP. This means that SSHv2 has to implement its own flow control algorithm. The most common implementation basically reimplements sliding windows, so you have the SSH sliding window riding on top of the TCP sliding window. The end result is that the effective size of the receive buffer is the minimum of the receive buffers of the two sliding windows. Stock OpenSSH has a maximum receive buffer size of 2MB, but this really ends up being closer to ~1.2MB. Most modern OSes have a buffer that can grow (using auto-tuning receive buffers) up to an effective size of 4MB. Why does this matter? If the receive buffer size is less than the bandwidth delay product (BDP), then you will never be able to fully fill the pipe, regardless of how fast your system is.
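As a rough illustration of that last point, the sketch below computes the throughput ceiling a fixed receive window imposes at different round-trip times. It assumes a simplified model in which at most one window of data can be unacknowledged per round trip; the function name and the RTT values are illustrative choices, and only the ~1.2MB figure comes from the text above.

```python
def throughput_ceiling(window_bytes: float, rtt_seconds: float) -> float:
    """Upper bound on throughput (bytes/s) when at most `window_bytes`
    can be in flight (sent but not yet acknowledged) per round trip."""
    return window_bytes / rtt_seconds


# ~1.2 MB effective receive window quoted above for stock OpenSSH.
EFFECTIVE_SSH_WINDOW = 1.2e6

# Illustrative round-trip times (assumption: not taken from the question).
for rtt_ms in (1, 10, 50, 100, 200):
    ceiling = throughput_ceiling(EFFECTIVE_SSH_WINDOW, rtt_ms / 1000)
    print(f"RTT {rtt_ms:>3} ms -> at most {ceiling / 1e6:7.1f} MB/s")
```

Under this model the ceiling falls in proportion to the RTT, no matter how fast the link or the CPU is.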

This is complicated by the fact that SFTP adds another layer of flow control on top of the TCP and SSH flow controls. SFTP uses a concept of outstanding messages. Each message may be a command, the result of a command, or bulk data flow. Each outstanding message may be up to a specific datagram size, so you end up with what you might as well think of as yet another receive buffer. The size of this receive buffer is datagram size * maximum outstanding messages (both of which may be set on the command line). The default is 32k * 64 (2MB). So when using SFTP you have to make sure that the TCP receive buffer, the SSH receive buffer, and the SFTP receive buffer are all of sufficient size (without being too large, or you can have overbuffering problems in interactive sessions).
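The arithmetic can be sketched as follows. The helper names are hypothetical, and the 4MB and 1.2MB inputs are simply the figures quoted earlier in this answer. In the stock OpenSSH `sftp` client the two SFTP parameters correspond to the `-B` (buffer size) and `-R` (number of outstanding requests) options.

```python
def sftp_window(request_bytes: int, outstanding_requests: int) -> int:
    """Data that SFTP allows in flight: request size * outstanding requests."""
    return request_bytes * outstanding_requests


def effective_window(tcp_bytes: float, ssh_bytes: float, sftp_bytes: float) -> float:
    """The smallest of the three stacked windows limits the transfer."""
    return min(tcp_bytes, ssh_bytes, sftp_bytes)


# Defaults quoted above: 32k requests, 64 outstanding.
default_sftp = sftp_window(32 * 1024, 64)
print(f"SFTP window: {default_sftp} bytes ({default_sftp / 2**20:.0f} MiB)")

# TCP auto-tuned to ~4 MB, stock OpenSSH effectively ~1.2 MB (figures from the answer).
limit = effective_window(4e6, 1.2e6, default_sftp)
print(f"Effective window: {limit / 1e6:.1f} MB")
```

With these numbers the SSH window, not the SFTP or TCP window, is the binding limit, which is consistent with the focus on SSH buffers in the rest of the answer.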

HPN-SSH directly addresses the SSH buffer problem by having a maximum buffer size of around 16MB. More importantly, the buffer dynamically grows to the proper size by polling the proc entry for the TCP connection's buffer size (basically poking a hole between layers 3 and 4). This avoids overbuffering in almost all situations. In SFTP we raise the maximum number of outstanding requests to 256. At least we should be doing that: it looks like that change didn't propagate as expected to the 6.3 patch set (though it is in 6.2); I'll fix that soon. There isn't a 6.4 version because the 6.3 patch applies cleanly against 6.4 (which is a one-line security fix on top of 6.3). You can get the patch set from SourceForge.

I know this sounds odd, but right-sizing the buffers was the single most important change in terms of performance. In spite of what many people think, the encryption is not the real source of poor performance in most cases. You can prove this to yourself by transferring data to hosts that are increasingly far away (in terms of RTT). You'll notice that the longer the RTT, the lower the throughput. That clearly indicates that this is an RTT-dependent performance problem.

Anyway, with this change I started seeing improvements of up to 2 orders of magnitude. If you understand TCP you'll understand why this made such a difference. It's not about the size of the datagram or the number of packets or anything like that. It's entirely because, in order to make efficient use of the network path, you must have a receive buffer equal to the amount of data that can be in transit between the two hosts. This also means that you may not see any improvement whatsoever if the path isn't sufficiently fast and long. If the BDP is less than 1.2MB, HPN-SSH may be of no value to you.
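To check whether a given path is past that 1.2MB threshold, the BDP can be computed directly. This is a minimal sketch; the function names and the link speeds used in the loop are assumptions chosen for illustration, not measurements from the question.

```python
def bdp_bytes(bandwidth_bits_per_s: float, rtt_seconds: float) -> float:
    """Bandwidth delay product: bytes that can be in transit on the path."""
    return bandwidth_bits_per_s / 8 * rtt_seconds


def min_rtt_for_benefit(bandwidth_bits_per_s: float, window_bytes: float = 1.2e6) -> float:
    """RTT (seconds) above which the BDP exceeds `window_bytes`."""
    return window_bytes * 8 / bandwidth_bits_per_s


# Illustrative link speeds (assumptions).
for label, bits_per_s in (("100 Mbit/s", 100e6), ("1 Gbit/s", 1e9)):
    rtt_ms = min_rtt_for_benefit(bits_per_s) * 1000
    print(f"{label}: BDP exceeds 1.2 MB once RTT is above {rtt_ms:.1f} ms")
```

On a 100 Mbit/s link, for example, the stock ~1.2MB window only becomes the limit at round-trip times of roughly 96 ms or more, so a LAN transfer at that speed would not be expected to benefit.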

The parallelized AES-CTR cipher is a performance boost on systems with multiple cores if you need to have full encryption end to end. Usually I suggest that people who have control over both the server and the client use the NONE cipher switch (encrypted authentication, bulk data passed in the clear), as most data isn't all that sensitive. However, this only works in non-interactive sessions like SCP. It doesn't work in SFTP.

There are some other performance improvements as well, but nothing as important as the right-sizing of the buffers and the encryption work. When I get some free time I'll probably pipeline the HMAC process (currently the biggest drag on performance) and do some more minor optimization work.

So if HPN-SSH is so awesome, why hasn't OpenSSH adopted it? That's a long story, and people who know the OpenBSD team probably already know the answer. I understand many of their reasons: it's a big patch which would require additional work on their end (and they are a small team), they don't care as much about performance as about security (though there are no security implications to HPN-SSH), etc. However, even though OpenSSH doesn't use HPN-SSH, Facebook does. So do Google, Yahoo, Apple, nearly every large research data center, NASA, NOAA, the government, the military, and most financial institutions. It's pretty well vetted at this point.

If anyone has any questions, feel free to ask, but I may not be keeping up to date on this forum. You can always send me mail via the HPN-SSH email address (google it).

音盲 2025-01-03 05:02:52

UPDATE: As a commenter pointed out, the problem I outline below was fixed some time before this post. However, I knew of the HPN-SSH project and I asked the author to weigh in. As they explain in the (rightfully) most upvoted answer, encryption is not the source of the problem. Yay for email and people smarter than myself!

Wow, a year-old question with nothing but incorrect answers. However, I must admit that I assumed the slowdown was due to encryption when I asked myself the same question. But ask yourself the next logical question: how quickly can your computer encrypt and decrypt data? If you think that rate is anywhere near the 4.5Mb/second reported by the OP (0.5625MB/s, or roughly half the capacity of a 5.25" floppy disk!), smack yourself a few times, drink some coffee, and ask yourself the same question again.


It apparently has to do with what amounts to an oversight in the packet size selection, or at least that's what the author of libssh2 says:

"The nature of SFTP and its ACK for every small data chunk it sends, makes an initial naive SFTP implementation suffer badly when sending data over high latency networks. If you have to wait a few hundred milliseconds for each 32KB of data then there will never be fast SFTP transfers. This sort of naive implementation is what libssh2 has offered up until and including libssh2 1.2.7."

So the speed hit is due to tiny packet sizes combined with a mandatory ACK response for each packet, which is clearly insane.
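A back-of-the-envelope estimate shows how much the per-chunk wait costs. This is a latency-only sketch under stated assumptions: it ignores bandwidth and CPU entirely, treats requests as sent in batches that each cost one round trip, and uses a 200 ms RTT as a stand-in for the "few hundred milliseconds" in the quote. The function name and the 64-request figure are illustrative.

```python
import math


def transfer_seconds(file_bytes: int, chunk_bytes: int, rtt_seconds: float,
                     outstanding: int = 1) -> float:
    """Latency-only estimate: every batch of `outstanding` chunks costs one
    round trip. outstanding=1 models the naive wait-for-each-ACK behaviour."""
    chunks = math.ceil(file_bytes / chunk_bytes)
    round_trips = math.ceil(chunks / outstanding)
    return round_trips * rtt_seconds


FILE_BYTES = 140_000_000   # the 140MB file from the question
CHUNK_BYTES = 32 * 1024    # 32KB per request, as in the quote
RTT = 0.2                  # assumed round-trip time in seconds

naive = transfer_seconds(FILE_BYTES, CHUNK_BYTES, RTT)
pipelined = transfer_seconds(FILE_BYTES, CHUNK_BYTES, RTT, outstanding=64)
print(f"one request at a time: {naive:.1f} s")
print(f"64 requests in flight: {pipelined:.1f} s")
```

Keeping many requests in flight removes most of the waiting, which is why raising the number of outstanding requests matters more on high-latency paths.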

The High Performance SSH/SCP (HPN-SSH) project provides an OpenSSH patch set which apparently improves the internal buffers as well as parallelizing encryption. Note, however, that even the non-parallelized versions ran at speeds above the 40Mb/s unencrypted speeds obtained by some commenters. The fix involves changing the way in which OpenSSH was calling the encryption libraries, NOT the cipher, and there is zero difference in speed between AES128 and AES256. Encryption takes some time, but it is marginal. It might have mattered back in the 90s, but (like the speed of Java vs C) it just doesn't matter anymore.

⊕婉儿 2025-01-03 05:02:52

Several factors affect the speed of an SFTP transfer:

  1. Encryption. Though symmetric encryption is fast, it is not so fast as to go unnoticed. If you are comparing speeds on a fast network (100 Mbit/s or faster), encryption becomes a brake on the transfer.
  2. Hash calculation and checking.
  3. Buffer copying. SFTP running on top of SSH causes each data block to be copied at least 6 more times (3 times on each side) than in plain FTP, where in the best case data can be passed to the network interface without being copied at all. Block copying takes a bit of time as well.
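One way to put a number on the second item is to time a MAC over blocks of data on your own machine. The sketch below uses HMAC-SHA256 from Python's standard library as a rough proxy. The function name, block size, and data volume are assumptions for illustration, and an SSH implementation may use a different MAC and a faster native code path.

```python
import hashlib
import hmac
import os
import time


def hmac_throughput(total_bytes: int = 16 * 1024 * 1024,
                    block_bytes: int = 32 * 1024) -> float:
    """Return HMAC-SHA256 throughput in bytes/s over `total_bytes` of data,
    processed one `block_bytes` block at a time."""
    key = os.urandom(32)
    block = os.urandom(block_bytes)
    blocks = total_bytes // block_bytes
    start = time.perf_counter()
    for _ in range(blocks):
        hmac.new(key, block, hashlib.sha256).digest()
    elapsed = time.perf_counter() - start
    return blocks * block_bytes / elapsed


print(f"HMAC-SHA256: {hmac_throughput() / 1e6:.0f} MB/s")
```

Comparing the printed rate with the link speed gives a first indication of whether hashing could plausibly be the limiting factor on a given machine.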

在巴黎塔顶看东京樱花 2025-01-03 05:02:52

This is for those still finding this question and looking for an answer that does not require patching OpenSSH. I am the author of an open-source GPL project called Push SFTP, which is available on GitHub (https://github.com/sshtools/push-sftp). It's a command-line client, similar to the standard SFTP command, that adds a push command that uploads files using parallelism.

It does not require a patched version of OpenSSH, and testing has shown an average 2.5x increase in throughput when using the push mechanism. This method works with standard OpenSSH servers on all major distributions. It relies on random access support, so there are circumstances where it does not work, for example where the SFTP server has custom file systems pointing to cloud-based storage like S3.
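To see why random access matters for a parallel upload, here is a minimal local sketch of the general idea: the file is split into chunks and each worker writes its chunk at the chunk's own offset. This is not Push SFTP's implementation. It copies between local files as a stand-in for remote file handles, and every name in it (`copy_chunk`, `parallel_copy`, `demo`) is hypothetical.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def copy_chunk(src_path: str, dst_path: str, offset: int, length: int) -> int:
    """Copy `length` bytes at `offset` from src to the same offset in dst."""
    with open(src_path, "rb") as src, open(dst_path, "r+b") as dst:
        src.seek(offset)
        data = src.read(length)
        dst.seek(offset)  # random access: each worker writes at its own offset
        dst.write(data)
    return len(data)


def parallel_copy(src_path: str, dst_path: str,
                  chunk_bytes: int = 1024 * 1024, workers: int = 4) -> int:
    """Copy src to dst in independent chunks; returns the bytes written."""
    size = os.path.getsize(src_path)
    # Pre-size the destination so every offset exists before workers start.
    with open(dst_path, "wb") as dst:
        dst.truncate(size)
    offsets = range(0, size, chunk_bytes)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        written = pool.map(
            lambda off: copy_chunk(src_path, dst_path, off,
                                   min(chunk_bytes, size - off)),
            offsets,
        )
        return sum(written)


def demo() -> bool:
    """Round-trip a 3.5 MiB random file and check that the copy is identical."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src.bin")
        dst = os.path.join(tmp, "dst.bin")
        with open(src, "wb") as f:
            f.write(os.urandom(3 * 1024 * 1024 + 512 * 1024))
        parallel_copy(src, dst)
        with open(src, "rb") as a, open(dst, "rb") as b:
            return a.read() == b.read()


print("copy identical:", demo())
```

A backend that can only append, or that stores whole objects, cannot accept writes at arbitrary offsets, which matches the limitation described above.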

In addition, as the developers of Maverick Synergy, the open-source Java SSH library that backs the client, we have built support into the API for other developers to use in their projects.

§普罗旺斯的薰衣草 2025-01-03 05:02:52

Encryption adds not only CPU overhead but also some network overhead.

终难愈 2025-01-03 05:02:52

Your results make sense. Since FTP operates over a non-encrypted channel, it is faster than SFTP (which is a subsystem on top of the SSH version 2 protocol). Also remember that SFTP is a packet-based protocol, unlike FTP, which is command-based.

Each packet in SFTP is encrypted before being written to the outgoing socket by the client and subsequently decrypted when received by the server. This of course leads to slower transfer rates, but a very secure transfer. Using compression such as zlib with SFTP improves the transfer time, but it still won't be anywhere near plain-text FTP. Perhaps a better comparison would be SFTP against FTPS, since both use encryption?

Speed for SFTP depends on the cipher used for encryption/decryption, the compression used (e.g. zlib), packet sizes, and the buffer sizes used for the socket connection.

梦里南柯 2025-01-03 05:02:52

SFTP is not FTP over SSH; it is a different protocol. It is similar to SCP but offers more capabilities.

残疾 2025-01-03 05:02:52

There are all sorts of things which can do this. One possibility is "traffic shaping". This is commonly done in office environments to reserve bandwidth for business-critical activities. It may also be done by the web hosting company, or by your ISP, for very similar reasons.

You can also set it up at home very simply.

For example, there may be a rule reserving minimum bandwidth for FTP, while SFTP might be falling under an "everything else" rule. Or there might be a rule capping bandwidth for SFTP, and someone else is using SFTP at the same time as you.

So: where are you transferring the file from and to?

浅唱々樱花落 2025-01-03 05:02:52

For comparison, I tried transferring a 299GB NTFS disk image from an i5 laptop running the Raring Ringtail Ubuntu alpha 2 live CD to an i7 desktop running Ubuntu 12.04.1. Reported speeds:

over wifi + powerline:
scp: 5MB/sec (40 Mbit/sec)

over gigabit ethernet + netgear G5608 v3:

scp: 44MB/sec

sftp: 47MB/sec

sftp -C: 13MB/sec

So, over a good gigabit link, sftp is slightly faster than scp, and 2010-era fast CPUs seem fast enough to encrypt,
but compression isn't a win in all cases.
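One plausible reason compression can lose is that some data barely shrinks, so the CPU time spent compressing buys nothing. The sketch below compares zlib's compression ratio on highly repetitive data and on random bytes. The sample data is invented for illustration and says nothing about the actual disk image in the test above.

```python
import os
import zlib


def compression_ratio(data: bytes, level: int = 6) -> float:
    """Compressed size divided by original size (lower means better)."""
    return len(zlib.compress(data, level)) / len(data)


# Highly repetitive, text-like sample (assumption: illustrative only).
text_like = b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n" * 20000
# Random bytes stand in for data that is already compressed or encrypted.
random_like = os.urandom(len(text_like))

print(f"repetitive text: {compression_ratio(text_like):.4f}")
print(f"random bytes:    {compression_ratio(random_like):.4f}")
```

When the ratio stays close to 1, as it does for the random sample, the compressor becomes pure overhead on a fast link.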

Over a bad gigabit ethernet link, though, I've had sftp far outperform scp. It has something to do with scp being very chatty; see "scp UNBELIEVABLY slow" on comp.security.ssh from 2008:
https://groups.google.com/forum/?fromgroups=#!topic/comp.security.ssh/ldPV3msFFQw
http://fixunix.com/ssh/368694-scp-unbelievably-slow.html

剩一世无双 2025-01-03 05:02:52

Yes, encryption adds some load to your CPU, but if your CPU is not ancient, that should not have as much of an effect as you describe.

If you enable compression for SSH, SCP is actually faster than FTP despite the SSH encryption (if I remember correctly, twice as fast as FTP for the files I tried). I haven't actually used SFTP, but I believe it uses SCP for the actual file transfer. So please try this and let us know :-)
