Overhead of syncing a large media folder using source control

Posted 2024-12-27 16:31:25 · 153 words · 0 views · 0 comments

I am thinking about using an existing infrastructure of remotely connected SVN repositories as a way to also sync media folders.

By media I mean about 1 TB of video content.

Are there any disadvantages to doing this with SVN?

Is the transfer protocol efficient for transferring big files?

Is there any storage overhead?

Thanks


Comments (3)

ζ澈沫 2025-01-03 16:31:25

As I said in my comment, SVN is about source control.
It's not a syncing tool, even if you can also use it for that purpose.

That said, SVN works well with binary files (like your media files).
Of course, you can't get a meaningful diff on binary files, but it's quite OK for storage.

As the SVN manual states:

Note that whether or not a file is binary does not affect the amount
of repository space used to store changes to that file, nor does it
affect the amount of traffic between client and server. For storage
and transmission purposes, Subversion uses a diffing method that works
equally well on binary and text files; this is completely unrelated to
the diffing method used by the 'svn diff' command.

But remember that, if you only need syncing, you may have better options than SVN.
So try to take a look at other solutions, like rsync for instance, before deciding to use SVN.
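
For comparison, here is a minimal sketch of what a one-way rsync run might look like when driven from Python. The paths in the usage note are hypothetical, and the helper name build_rsync_command is made up for illustration. The flags used (--archive, --partial, --human-readable, --dry-run) are standard rsync options; --partial keeps partially transferred files so an interrupted transfer of a big video does not have to start from scratch.

```python
import shlex
import subprocess


def build_rsync_command(source, destination, dry_run=True):
    """Build an rsync argument list for a one-way folder sync.

    dry_run defaults to True so that nothing is copied until the
    printed command has been checked by a human.
    """
    cmd = ["rsync", "--archive", "--partial", "--human-readable"]
    if dry_run:
        cmd.append("--dry-run")
    # A trailing slash on the source means "copy the directory's
    # contents", not the directory itself.
    cmd += [source.rstrip("/") + "/", destination]
    return cmd


def run_sync(source, destination, dry_run=True):
    """Print the command, then run it (requires rsync to be installed)."""
    cmd = build_rsync_command(source, destination, dry_run)
    print(shlex.join(cmd))
    subprocess.run(cmd, check=True)
```

Calling run_sync("/media/videos", "backuphost:/srv/videos") would first show what would be transferred; passing dry_run=False performs the actual copy. Both paths are placeholders.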

樱花落人离去 2025-01-03 16:31:25

I highly recommend that you do not use Subversion (or any version control system designed for source code) for doing this. Subversion can handle binary files, but you don't get the merge, diff, etc. capabilities that are the main reason for using something like Subversion. Technically, it would most likely work; however, there are much better tools for the job.

If all you are doing is synchronizing folders, try something like rsync.

Something like AlienBrain could give you versioning capabilities, if needed.

鱼忆七猫命九 2025-01-03 16:31:25

There is substantial overhead on the Subversion client side.

The main problem with using Subversion for large media files is that the standard client stores "pristine" copies of your files in the local working copy, effectively doubling the size of a checked-out copy of your library. This was a conscious choice on the designers' part: it is good for the standard Subversion use case of source control, but not optimal for large binary files. At one point there were alternative clients that did not have this problem, but I haven't looked lately.
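
To see how much of a checkout is taken up by this client-side copy, a rough sketch like the following can help. It assumes the administrative data lives in directories named .svn (a single one at the working-copy root in Subversion 1.7 and later, one per directory in older versions); the function name working_copy_usage is made up for illustration.

```python
import os


def working_copy_usage(root):
    """Return (tracked_bytes, svn_admin_bytes) for the tree under root.

    Files at or below any directory named ".svn" are counted as
    administrative data (this is where pristine copies are kept);
    everything else is counted as the visible working files.
    """
    tracked = 0
    admin = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        parts = os.path.relpath(dirpath, root).split(os.sep)
        in_admin = ".svn" in parts
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # do not follow or count symlinks
            size = os.path.getsize(path)
            if in_admin:
                admin += size
            else:
                tracked += size
    return tracked, admin
```

On a working copy of mostly incompressible video, the second number can be expected to be close to the first, which is the doubling described above.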

There is often overhead and inefficiency on the Subversion server side.

A secondary problem is that if you store compressed binary files, a small change in the file content can produce an entirely different compression result, rendering Subversion's delta storage strategy useless. Note that MP3 and many video formats are compressed, but the compression happens on blocks. So you will still see this problem in action (a small change affects the entire block), but it will not necessarily affect the entire file.
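
The effect is easy to reproduce with a general-purpose compressor. The sketch below uses zlib as a stand-in (media codecs work differently, and Subversion's real delta encoding is more sophisticated than a position-by-position comparison), so treat it only as an illustration of the principle: flip one bit in the input and count how many bytes of the compressed output change. All names here are made up for the example.

```python
import random
import zlib


def differing_bytes(a, b):
    """Count positions where a and b differ, plus any length difference."""
    mismatches = sum(x != y for x, y in zip(a, b))
    return mismatches + abs(len(a) - len(b))


def make_sample(size=200_000, seed=0):
    """Build deterministic, compressible sample data (not real media)."""
    rng = random.Random(seed)
    words = [b"frame", b"audio", b"video", b"track", b"index", b"chunk"]
    out = bytearray()
    while len(out) < size:
        out += rng.choice(words) + b" "
    return bytes(out[:size])


def compare(edit_offset=100):
    """Flip one bit in the sample; return (raw_diff, compressed_diff)."""
    original = make_sample()
    edited = bytearray(original)
    edited[edit_offset] ^= 0x01  # the "small change"
    edited = bytes(edited)
    raw_diff = differing_bytes(original, edited)
    packed_diff = differing_bytes(zlib.compress(original),
                                  zlib.compress(edited))
    return raw_diff, packed_diff
```

compare() returns 1 for the raw data and a larger number for the compressed data; how much larger depends on the input and on where the edit falls.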

Assumptions in SCM tools

I don't agree that SCM tools are necessarily bad choices for what you're trying to do, but most of them were not designed with this in mind and therefore have problems. For my media requirements, the workflow is a lot closer to an SCM tool than to a sync tool such as rsync or unison. However, some of the fundamental assumptions in SCM don't work well for large files.

If you have a workflow that creates a lot of changes to the files, you'll probably want to purge old versions of your data fairly aggressively. Most SCM tools are designed on the assumptions that old data is extremely important and that storing it is relatively cheap (the files are small and compress well). Therefore, while all the SCM tools I know (many!) have ways to purge old data, it's typically (and intentionally) not easy to do.

Conclusion

In conclusion, you have your choice of unpalatable options. SCM tools such as Subversion make assumptions about storage space and retention that can rapidly balloon storage requirements, but file sync tools usually don't support the workflow most people would really like for their media library.

When to use Subversion

If you don't mind doubling the disk space used on client devices, have a server with an additional 2x the disk space (room for expansion), and expect to change your files fairly rarely (e.g. just changing the MP3 tags or some such), then Subversion might work well for you.

If you're pushing the limits of your storage space (SSD?), then "out of the box" Subversion is probably not what you want, though you might still be able to use it as part of your solution.
