Difference between the svn and git version control models
I would like to know the difference between the versioning approaches of git (or other DVCSs) and Subversion (or other CVCSs).
Here is what I found on http://www.xsteve.at/prg/vc_svn/svn.txt regarding this topic:
Subversion manages versioned trees as first order objects (the
repository is an array of trees), and the changesets are things that
are derived (by comparing adjacent trees.) Systems like Arch or
Bitkeeper are built the other way around: they're designed to manage
changesets as first order objects (the repository is a bag of
patches), and trees are derived by composing sets of patches together.
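To make the quoted distinction concrete, here is a minimal Python sketch. It assumes a tree can be modeled as a dict mapping a path to its content, which is an illustration only and not how either system stores data. The function names `derive_changeset` and `apply_changeset` are hypothetical. It shows both directions: deriving changesets by comparing adjacent trees, and rebuilding a tree by composing changesets.

```python
def derive_changeset(old_tree, new_tree):
    """Compare two snapshots and return the changes between them."""
    changes = {}
    for path in old_tree.keys() | new_tree.keys():
        before = old_tree.get(path)
        after = new_tree.get(path)
        if before != after:
            changes[path] = after  # None means the file was deleted
    return changes


def apply_changeset(tree, changes):
    """Return a new tree with the changeset applied to a copy of `tree`."""
    result = dict(tree)
    for path, content in changes.items():
        if content is None:
            result.pop(path, None)
        else:
            result[path] = content
    return result


# Snapshot model: the repository is an array of trees.
trees = [
    {"a.txt": "hello"},
    {"a.txt": "hello world", "b.txt": "new"},
    {"b.txt": "new"},
]

# Changesets are derived by comparing adjacent trees.
changesets = [derive_changeset(old, new) for old, new in zip(trees, trees[1:])]

# Changeset model: start from the first tree and compose the patches.
rebuilt = trees[0]
for changes in changesets:
    rebuilt = apply_changeset(rebuilt, changes)
```

In this toy model either representation can be converted into the other, so the quote is describing which one the system treats as primary rather than what information is available.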
But it's not clear how a Subversion repository stores changes, whether it contains the oldest variant of each versioned file, and so on. Why couldn't we generate a bunch of patches from it, as with git, for example? This is always mentioned as a principal difference between svn and git, one that simplifies or complicates merges, but unfortunately I still don't get the idea.
Comments (3)
There's a nice explanation of the main differences between VCSs based on changesets and VCSs based on snapshots at Martin's blog. I won't repeat it here.
However, I would stress one point that may not be obvious at first. Changeset-based VCSs make it really easy to track merges, which is much more difficult for snapshot-based systems like Subversion.
In a changeset based VCS, merges are simply changesets (or commits, as they're called in git) which have more than one parent changeset. The graphical representation of the repository usually shows a DAG (Directed acyclic graph) where the nodes represent changesets and the arrows represent parent-child relationships. When you see a node with more than one parent you know exactly what kind of merge occurred there.
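As a rough illustration of that graph, here is a small Python sketch. The history and the commit names `A` to `E` are made up, and the dict-of-parents layout is an assumption for the example, not the storage format of any real tool. A merge is simply a node with more than one parent.

```python
# Hypothetical history: each commit maps to the list of its parents.
parents = {
    "A": [],          # root commit
    "B": ["A"],       # work on one branch
    "C": ["A"],       # work on another branch
    "D": ["B", "C"],  # merge of B and C
    "E": ["D"],
}


def merge_commits(parents):
    """Return the commits that have more than one parent."""
    return sorted(c for c, ps in parents.items() if len(ps) > 1)


def ancestors(parents, commit):
    """Return every commit reachable from `commit` by following parents."""
    seen = set()
    stack = list(parents[commit])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen
```

Here `merge_commits(parents)` finds `D`, and `ancestors(parents, "E")` walks back through both sides of the merge, which is the information a graph viewer needs to draw the merge history.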
In Subversion, "merge tracking" is something new. Up until version 1.4 there was no such concept, so in order to know the history of merges you had to make notes in the log messages of your commits. Version 1.5 implemented merge tracking to make it easier to perform repeated merges from one branch to another without forcing the user to be explicit about revision ranges and the like. This is implemented with a property (svn:mergeinfo) associated with the directory receiving the merge. It tracks which revisions have already been merged from which branches. This is enough to infer which revisions should be merged in subsequent merges. But it doesn't make it easy to draw graphs showing the merge history, which is something you would like to see frequently as you work on a complex project with several developers.
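The bookkeeping idea can be sketched in a few lines of Python. This is a deliberately simplified model: it keeps the merged revisions as a set per source branch and does not parse or reproduce the real svn:mergeinfo property. The branch path, the revision numbers, and the function names `eligible_revisions` and `record_merge` are all hypothetical.

```python
def eligible_revisions(mergeinfo, source_branch, source_revisions):
    """Return the source revisions that have not been merged yet."""
    already_merged = mergeinfo.get(source_branch, set())
    return [r for r in sorted(source_revisions) if r not in already_merged]


def record_merge(mergeinfo, source_branch, revisions):
    """Return updated merge bookkeeping after merging `revisions`."""
    updated = {branch: set(revs) for branch, revs in mergeinfo.items()}
    updated.setdefault(source_branch, set()).update(revisions)
    return updated


# What the target directory remembers: revisions 3-5 were already merged.
mergeinfo = {"/branches/feature": {3, 4, 5}}

# Revisions that changed the source branch so far.
feature_revisions = [3, 4, 5, 7, 9]

# A repeated merge only needs the revisions not yet recorded.
to_merge = eligible_revisions(mergeinfo, "/branches/feature", feature_revisions)
mergeinfo = record_merge(mergeinfo, "/branches/feature", to_merge)
```

Note that this record lives on the target as a flat list of revisions per branch. It answers "what is left to merge" but does not by itself give the parent links that a graph of merges would need.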
Git is arranged with version trees as first-order objects in principle. That is, you deal with a graph of commit objects, each of which has a one-to-one relationship with a tree that is the state at that revision.
Note that how these are actually stored can be very different. Git started out simply compressing each file and tree/commit object individually. As I understand it, packing objects into a single file and storing just deltas for some objects was added much later.
So although patches seem to be ubiquitous in git user interfaces, they bear no relation to how the data is stored. The deltas stored in the pack files are binary-level deltas, not text-style diffs at all. Git will apply deltas to get objects and then diff them again to produce the patch on demand. This is in contrast to, for instance, CVS, which inherited a latest-version-plus-reverse-deltas storage system from RCS.
Based on what you quoted, it appears that Git and SVN are actually more similar than either is to CVS, for example.
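The "patch on demand" idea can be shown with a short Python sketch using the standard `difflib` module. The revision names and file contents are invented, and storing whole file contents in a dict is an assumption for the example, not Git's actual object format. The point is only that the store holds full snapshots and the patch is computed when someone asks for it.

```python
import difflib

# Hypothetical store: each revision holds the full content of one file.
snapshots = {
    "rev1": "line one\nline two\n",
    "rev2": "line one\nline 2\nline three\n",
}


def patch_between(old_rev, new_rev):
    """Compute a unified diff between two stored snapshots on demand."""
    old_lines = snapshots[old_rev].splitlines(keepends=True)
    new_lines = snapshots[new_rev].splitlines(keepends=True)
    diff = difflib.unified_diff(
        old_lines, new_lines, fromfile=old_rev, tofile=new_rev
    )
    return "".join(diff)


patch = patch_between("rev1", "rev2")
print(patch)
```

Nothing patch-shaped is stored here. Any pair of revisions can be compared, not just adjacent ones, which is one practical consequence of treating snapshots as the primary objects.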
Late and partial answer. I didn't think the following had been clarified above:
Important terms:
CVCS = Centralized Version Control System
DVCS = Distributed Version Control System (Git is one example)
REPOSITORY = A project's file tree, i.e. a directory with one or more subdirectories, with all of the many files for a single project. For example:
Centralized:
One (centralized) Repository shared by everyone.
Usage:
Permission to make changes is granted to all users.
Distributed:
One read-only Repository shared by everyone, plus at least one full copy of that Repository at each user's location.
In other words every user makes a copy of the entire project tree onto their local machine, or copies the entire file tree from the primary repository.
Usage:
Permission to make changes is controlled by the project owner who controls the primary repository. (In git we have a "pull request", or a request to the project owner who controls the central Repository, to pull in the new changes.)
I've oversimplified this to focus on the primary differences between centralized and distributed. (I admit that I'm still learning how the changes you asked about are actually recorded, and I hope to update this once I fully understand it.)
Ref: This is a good, more complete article.