Can any version control system (e.g. SVN, Git, or Mercurial) keep only the latest version of a file and not its revisions (e.g. for binary files)?

Posted on 2024-09-05 09:14:11

Our project contains binary files such as .doc, .xls, and .jpg, and we don't need their past revisions (keeping just the latest version is fine). Is there a way to tell SVN, Git, Mercurial, or some other tool to skip the revision history for these files or for a particular folder?

Say there is a 4MB .doc file that I need to check in hundreds of times, but I don't care much about its past versions. If the system keeps 100 revisions of it, that's already 400MB, and checking it in 300 times means 1.2GB for one file, which is not good. Only the latest version matters, so that everybody can sync to it. I also don't want other people who check out the project to have to pull down 20GB of stuff. (Will Git and Mercurial keep all revisions in each person's local repository?)

Comments (10)

罪歌 2024-09-12 09:14:11

Note that this is not quite an answer.

If I forgo the discussion around not keeping the correct version of the file for posterity, I will at least comment on one part of your question, that might make you reconsider not keeping all the revisions of the file in the repository.

Version control systems typically don't store the entire file for each new revision; they store changes. Depending on the system, you might occasionally get a full copy of the file, but most changesets will contain only the changes.

For instance, in Mercurial, I tried this: first I downloaded the C# 3.0 language specification as a Word file from this URL: http://download.microsoft.com/download/3/8/8/388e7205-bc10-4226-b2a8-75351c669b09/CSharp%20Language%20Specification.doc

Then I committed this to a fresh Mercurial repository. The size before the commit (empty repository) was 80 bytes, the size of the file on disk was 2,387,968 bytes, and the repository after the commit was 2,973,696 bytes. Note that the file is now effectively stored twice: once in my working copy (the one I can edit), and once in my repository as part of my initial commit.

Then I opened the file, replaced all occurrences of 3.0 with 4.0 and all occurrences of C# with VB, and saved. Then I committed the new version with a single-letter comment. The size of the repository after that commit is 3,497,984 bytes. The difference is 512KB (there's some chunking involved in the repository, hence the size being an exact 512KB value).

If I now open the file again, change only the title page from VB back to C#, save, and commit again, the repository grows by 276KB, up to 3,780,608 bytes.

As you can see, a change does not commit an entire copy of the file, though granted, the differences aren't in the "10KB" range either.

Let's assume that the average size of each diff, for this file alone, falls somewhere in between those two, say halfway between the two values. This means that 300 commits to this file, averaging 394KB each, total about 115MB. That is not a lot.
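The estimate above can be reproduced with a few lines of arithmetic. This is only a sketch of the reasoning, using the repository sizes reported in this answer; the variable and function names are made up for illustration, and "KB" is taken to mean 1024 bytes, which is what the exact 512KB figure suggests.

```python
KIB = 1024
MIB = 1024 * KIB

# Repository sizes in bytes after each of the three commits, as reported above.
repo_sizes = [2_973_696, 3_497_984, 3_780_608]

# Growth caused by the second and third commits (the two measured diffs).
deltas = [later - earlier for earlier, later in zip(repo_sizes, repo_sizes[1:])]

# Assumption taken from the answer: a typical diff lands halfway between the two.
average_delta = sum(deltas) / len(deltas)


def projected_size(commits, bytes_per_commit):
    """Total bytes added by `commits` commits of `bytes_per_commit` each."""
    return commits * bytes_per_commit


if __name__ == "__main__":
    print("measured diffs (KiB):", [d // KIB for d in deltas])
    print("average diff (KiB):", average_delta / KIB)
    print("300 commits (MiB):", round(projected_size(300, average_delta) / MIB, 1))
```

Swapping in the full file size (2,387,968 bytes) for `bytes_per_commit` gives the worst case the question was worried about, which makes the gap between the two easy to compare.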

My suggestion is as follows:

  • Stop being cheapskates. Disk space is cheap compared to the headache you will have when someone says "I really wish I knew what that file looked like last week, before you corrupted it".

暮年 2024-09-12 09:14:11

I do know one that does this, but you're not going to like the answer.

It's Visual SourceSafe. Check the 'store only latest version' flag on a file and it stops keeping history.

If you want this feature with a decent SCM, I would recommend not putting the file in the SCM at all, and instead storing it elsewhere, such as in a document management solution or even just on a filesystem share.

小草泠泠 2024-09-12 09:14:11

A quick check of hard drive prices puts 1 terabyte (TB) internal drives around $75 USD each. Using your math, that's 250,000 copies of your 4MB file, or $0.0003 per copy. Typical overhead for a programmer for an hour is around $100.

What costs more: keeping all of the versions of that file, or paying a programmer to recreate an older version if you ever need that copy again?
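The comparison can be written out as a quick calculation. This is a rough sketch using only the figures quoted in this answer (a $75 drive, 1 TB counted as 1,000,000 MB, a 4MB file, $100 per programmer hour); the names are illustrative.

```python
DRIVE_PRICE_USD = 75.0
DRIVE_CAPACITY_MB = 1_000_000   # 1 TB, counted in decimal megabytes
FILE_SIZE_MB = 4
PROGRAMMER_HOUR_USD = 100.0

# How many full copies of the file fit on one drive, and what each one costs.
copies_per_drive = DRIVE_CAPACITY_MB // FILE_SIZE_MB
cost_per_copy = DRIVE_PRICE_USD / copies_per_drive


def storage_cost_usd(revisions):
    """Disk cost of keeping `revisions` full copies of the file."""
    return revisions * cost_per_copy


if __name__ == "__main__":
    print("copies per drive:", copies_per_drive)
    print("cost per copy (USD):", cost_per_copy)
    print("cost of 300 full copies (USD):", storage_cost_usd(300))
    print("one programmer hour (USD):", PROGRAMMER_HOUR_USD)
```

Note that this prices every revision as a full copy, which is the pessimistic case; a system that stores deltas would come out cheaper still.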

感悟人生的甜 2024-09-12 09:14:11

This is not a job for a VCS but for the filesystem, as Ken said.

However, if you really need such a 'feature', you could use the hooks mechanism to delete older versions of the file (let's say, older than 3 commits) from the history.

墨洒年华 2024-09-12 09:14:11

For your specific need, where you can remove past versions whenever you want, a VCS (a Version Control System, made to never lose a version) is not well suited.

A repository manager (which is a more advanced solution than a simple shared path on a filesystem) is what you are looking for.
(E.g. Nexus Sonatype, to mention only one.)

强者自强 2024-09-12 09:14:11

Perforce can do it for you.

Look at its file type modifiers:

+S
Only the head revision is stored
Older revisions are purged from the depot upon submission of new revisions. Useful for executable or .obj files.

-or-

+Sn
Only the most recent n revisions are stored, where n is a number from 1 to 10, or 16, 32, 64, 128, 256, or 512.
Older revisions are purged from the depot upon submission of more than n new revisions, or if you change an existing +Sn file's n to a number less than its current value. For details, see the Command Reference.
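To make the retention rule concrete, here is a toy model of it in Python. It is not Perforce's API or implementation: `LimitedRevisionFile` and its methods are hypothetical names, and the only things taken from the description above are the allowed values of n and the "purge older revisions on submit" behaviour. In Perforce itself the modifier is attached to the file type (for example `binary+S`).

```python
from collections import deque

# Values of n accepted by +Sn according to the description above.
ALLOWED_LIMITS = set(range(1, 11)) | {16, 32, 64, 128, 256, 512}


class LimitedRevisionFile:
    """Toy model of a file whose older revisions are purged on submit."""

    def __init__(self, keep=1):
        # keep=1 models plain +S (only the head revision is stored).
        if keep not in ALLOWED_LIMITS:
            raise ValueError(f"unsupported revision limit: {keep}")
        self._revisions = deque(maxlen=keep)  # drops the oldest entry when full
        self.head = 0

    def submit(self, content):
        """Store a new revision and return its revision number."""
        self.head += 1
        self._revisions.append((self.head, content))
        return self.head

    def stored_revisions(self):
        """Revision numbers whose content is still available."""
        return [number for number, _ in self._revisions]

    def latest(self):
        """Content of the head revision."""
        return self._revisions[-1][1]


if __name__ == "__main__":
    doc = LimitedRevisionFile(keep=2)
    for text in (b"draft 1", b"draft 2", b"draft 3"):
        doc.submit(text)
    print(doc.stored_revisions(), doc.latest())
```

Revision numbers keep increasing even though old content is dropped, which matches the idea that the history of submissions is recorded while only the most recent contents stay in the depot.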

千寻… 2024-09-12 09:14:11

The primary responsibility of a version control system is to keep a history of changes, so I don't think this is possible. Why use version control when you only want the latest version?

南风起 2024-09-12 09:14:11

In general, no: a VCS is intended to keep the entire history. However, all is not lost on the space front; all the systems you named will store binary diffs for each revision, not a complete copy of the entire file. This means that the space required will often be much less.
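As a rough illustration of why a stored diff can be much smaller than a full copy, here is a minimal delta sketch in Python. It is a toy built on the standard library's `difflib`, not the delta format any of these systems actually uses, and the function names are made up for the example. The idea is to record "copy this range from the old version" instructions plus only the literal bytes that are new.

```python
from difflib import SequenceMatcher


def make_delta(old, new):
    """Return a list of ops that rebuild `new` from `old`."""
    ops = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))        # reuse bytes old[i1:i2]
        elif tag in ("replace", "insert"):
            ops.append(("data", new[j1:j2]))    # literal bytes only in `new`
        # "delete" needs no op: those old bytes are simply never copied
    return ops


def apply_delta(old, ops):
    """Rebuild the new version from the old version plus the delta."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            out += old[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)


def literal_bytes(ops):
    """Number of new bytes the delta has to carry."""
    return sum(len(op[1]) for op in ops if op[0] == "data")


if __name__ == "__main__":
    old = b"".join(b"line %d\n" % i for i in range(200))
    new = old.replace(b"line 100\n", b"LINE 100 (edited)\n")
    delta = make_delta(old, new)
    assert apply_delta(old, delta) == new
    print("full size:", len(new), "literal bytes in delta:", literal_bytes(delta))
```

How well this works on real binary formats depends on the file: a small edit in a compressed or re-serialized document can change many bytes, which is consistent with "often", not "always", in the answer above.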

非要怀念 2024-09-12 09:14:11

Why not use SVN for the binary files and a DVCS for all the source files? This way, you keep all revisions server-side but only one copy client-side. And for the sources, you get the benefit of having a real VCS.

I understand wanting to keep all revisions of a binary file somewhere without paying the price on every "pull" that every developer makes in every clone they have. That might be excessive.

深空失忆 2024-09-12 09:14:11

If all you want is to sync files across computers, use Dropbox.

If you are using version control, then see what Lasse V. Karlsen wrote: disk space is cheap.
