Copy or rsync command
The following command is working as expected...
cp -ur /home/abc/* /mnt/windowsabc/
Does rsync have any advantage over it? Is there a better way to keep the backup folder in sync every 24 hours?
Rsync is better since it copies only the updated parts of an updated file instead of the whole file. It can also compress data in transit and, when run over SSH, encrypt it, if you want. Check out this tutorial.
rsync is not necessarily more efficient, due to the more detailed inventory of files and blocks it performs. The algorithm is fantastic at what it does, but you need to understand your problem to know if it is really going to be the best choice.
On a very large file system (say many thousands or millions of files) where files tend to be added but not updated, "cp -u" will likely be more efficient. cp makes the decision to copy solely on metadata and can simply get to the business of copying.
Note that you might want some buffering, e.g. by using tar rather than straight cp, depending on the size of the files, network performance, other disk activity, etc. I find the following idea very useful:
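The snippet that followed this sentence did not survive in this copy of the thread. One common way to get that kind of buffering is a tar pipe, sketched below under assumptions: the src and dst directories are throwaway temporary directories created only for the demonstration, not paths from the original answer.

```shell
set -eu

# Hypothetical throwaway directories, used only for this demonstration.
src=$(mktemp -d)
dst=$(mktemp -d)
trap 'rm -rf "$src" "$dst"' EXIT

mkdir -p "$src/docs"
echo "hello" > "$src/docs/a.txt"

# One tar process reads the tree and writes an archive stream to stdout;
# a second tar process reads that stream from stdin and unpacks it under $dst.
# -p asks the extracting side to keep file permissions.
tar -C "$src" -cf - . | tar -C "$dst" -xpf -

ls -R "$dst"
```

The pipe between the two tar processes decouples reading from writing, which is where the buffering effect comes from.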
Metadata itself may actually become a significant overhead on very large (cluster) file systems, but rsync and cp will share this problem.
rsync seems to frequently be the preferred tool (and in general purpose applications is my usual default choice), but there are probably many people who blindly use rsync without thinking it through.
The command as written will create new directories and files with the current date and time stamp, and with yourself as the owner. If you are the only user on your system and you are doing this daily, it may not matter much. But if preserving those attributes matters to you, you can modify your command with the -p option.
The -p option preserves the ownership, timestamps, and mode of each file. This can be pretty important depending on what you're backing up.
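The modified command itself is missing from this copy of the thread. Here is a minimal sketch of what adding -p to the question's cp -ur invocation could look like, assuming GNU cp (needed for -u) and using hypothetical temporary directories in place of /home/abc and /mnt/windowsabc.

```shell
set -eu

# Hypothetical stand-ins for /home/abc and /mnt/windowsabc.
src=$(mktemp -d)
dst=$(mktemp -d)
trap 'rm -rf "$src" "$dst"' EXIT

echo "report" > "$src/report.txt"
chmod 640 "$src/report.txt"
# Give the source file an old modification time (1 Jan 2020, 00:00).
touch -t 202001010000 "$src/report.txt"

# -p preserves mode, timestamps and (where permitted) ownership;
# -u skips files whose destination copy is already at least as new;
# -r recurses into directories.
cp -pur "$src"/* "$dst"/

# Both listings should show the same mode and date.
ls -l "$src/report.txt" "$dst/report.txt"
```

Without -p, the copy would carry the time of the copy operation instead of the 2020 date set above.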
The alternative command with rsync would be
With rsync, -a indicates "archive", which preserves all the attributes mentioned above. -v indicates "verbose", which just lists what it is doing with each file as it runs. -z is left out here for local copies; it enables compression, which will help if you are backing up over a network. Finally, -h tells rsync to report sizes in human-readable units like MB, GB, etc.
Out of curiosity, I ran one copy to prime the system and avoid biasing against the first run, then I timed the following on a test run of 1GB of files from an internal SSD drive to a USB-connected HDD. These simply copied to empty target directories.
Both commands seem to be about the same, although zipping and unzipping obviously tax the system where bandwidth is not a bottleneck.
Especially if you use a copy-on-write filesystem like BTRFS or ZFS, rsync is much better. I use BTRFS, and I have this in my ~/.bashrc:
The important flag here for CoW filesystems like BTRFS is --inplace, because it only copies the changed parts of files, doesn't create new inodes for small changes between files, etc. See this.
It's not really a question of which is more efficient.
The commands rsync and cp are not equivalent and achieve different goals.
1- rsync can preserve the modification times of files (using the -a option).
2- rsync runs as multiple processes and transfers data over either local or network sockets (i.e. it forks itself into several processes).
3- The multiprocessing can increase your throughput when copying a large number of small files, and even with multiple larger files.
So the bottom line is that rsync is for large data and cp is for smaller local copies (MB to small GB range). When you start getting into multiple GB or the TB range, go with rsync. And for network copies, of course, rsync all the way.
I prefer to use rsync with the following options:
The above parameters can be defined as follows:
For a local copy, the only advantage of rsync is that it avoids copying a file that already exists in the destination directory. "Already exists" means (a) same file name, (b) same size, and (c) same timestamp. (Maybe same owner/group too; I am not sure...)
The "rsync algorithm" is great for incremental updates of a file over a slow network link, but it will not buy you much for a local copy, as it needs to read the existing (partial) file to run its "diff" computation.
So if you are running this sort of command frequently, and the set of changed files is small relative to the total number of files, you should find that rsync is faster than cp. (rsync also has a --delete option that you might find useful.)
Keep in mind that when transferring files within one machine (i.e. not over a network), using the -z flag can make a massive difference in the time taken for the transfer.
Transfer within same machine
If you are using cp, it doesn't preserve existing files when copying folders of the same name. Let's say you have these folders:
Then you copy one over the other:
Result:
This is at least what happens on macOS, and I wanted to preserve the differing files, so I used rsync.
I used rsync to transfer 330G of data from a local HD to an external HD via USB 3.0. It took me three days. The transfer rate went down to 800 Kb/s and rose to 50 M/s for a while only after pausing the job. It is a typical overbuffering issue. A bad experience for local file transfers: as the name indicates, (R)sync stands for REMOTE sync (optimized for transfers via network). As often happens, I discovered the "-z" flag only after I wondered about the issue and looked for an explanation.
rsync is much better than cp because it copies whole files and directories only the first time. The next time you run rsync on the same files or directories, only the new changes are copied to the destination folder, not the entire contents.