How does Mercurial work with many developers?
I've looked at the Mercurial repositories of some well-known products, like TortoiseHg and Python, and even though I can see multiple people committing changes, the timeline always looks pretty clean, with just one branch moving forward.
However, let's say you have 14 people working on the same product, won't this quickly get into a branch nightmare with 14 parallel branches at any given time?
For instance, take just two people and a product at changeset X. Both developers start working on separate features on Monday morning, so both start from the same parent changeset.
When they commit, we have two branches. With 14 people, we would quickly have 10+ (might not be 14...) branches that need to be merged back into the default.
Or... What am I not seeing here? Perhaps it's not really a problem?
Edit: I see there's some confusion as to what I'm really asking about here, so let me clarify.
I know full well that Mercurial easily handles multiple branches and merging, and as one answer states, even when people work on the same files, they don't often work on the same lines, and even then, a conflict is easily handled. I also know that if two people end up creating a merge hell because they changed a lot of the same code in the same files, there's some overall planning failure here, since we've handed two features in the exact same place to two developers, instead of perhaps having them work together, or just giving both to one developer in the first place.
So that's not it.
What I'm curious about is how these open source projects manage such a clean history. It's not important to me (as one comment wondered) that the history is clean. I mean, we do work in parallel, and if the repository is able to reflect that, so much the better (in my opinion). However, the repositories I've looked at don't have that. They seem to be working along the Subversion model, where you can't commit before you've updated and merged, in which case the history is just one straight line.
So how do they do it?
Are they "rebasing" the changes so that they appear to follow the latest tip of the branch even though they were originally committed a bit further back in the branch history? Transplanting changesets to make them appear to have been committed on the main branch to begin with?
Or are the projects I've looked at so slow at adding new things (at the moment; I didn't look far back in the history) that in reality only one person has been working on them at a time?
Or are they pushing changes to one central maintainer who reviews and then integrates? It doesn't look like that since many of the projects I looked at had different names on the changesets.
Comments (6)
It's not really a problem. In a large project even when people work on the same feature, they don't usually work on the same file. When they work on the same file, they don't usually modify the same lines. And when they modify the same lines, then a merge should be done manually (for the affected lines).
This means in practice that 80+% of the merges can be done automagically by Mercurial itself.
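Let's take an example. You have a common ancestor, `base`, and two lines of development that diverged from it, `branch 1` and `branch 2`.
Edit: for clarity, by branch I refer here to unnamed branches.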
If you have a file changed in `branch 1` but the same file in `branch 2` is the same as in `base`, then the version in `branch 1` is chosen. If the file is modified in both `branch 1` and `branch 2`, the files are merged line by line using the same rule: if line 1 of file1 in `branch 1` is different from line 1 of file1 in `base`, but `branch 2` and `base` have the same line 1, then line 1 from `branch 1` is chosen (and so on and so forth).
For the lines that are modified in both branches, Mercurial interrupts the automated merging process and prompts the user to choose which lines to use, or edit the lines manually.
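To make the line-by-line rule above concrete, here is a minimal Python sketch. It is a toy, not Mercurial's actual merge code: the function name `merge_lines` is made up for illustration, and it assumes all three versions have the same number of lines, whereas real merge tools work on diff hunks and cope with inserted and deleted lines.

```python
def merge_lines(base, branch1, branch2):
    """Toy three-way merge (hypothetical helper, not a Mercurial API).

    Assumes the three versions have the same number of lines.
    Returns the merged lines and the indexes of conflicting lines.
    """
    merged, conflicts = [], []
    for i, (b, one, two) in enumerate(zip(base, branch1, branch2)):
        if one == two:
            # Both sides agree (unchanged, or changed identically).
            merged.append(one)
        elif one == b:
            # Only branch 2 touched this line, so take its version.
            merged.append(two)
        elif two == b:
            # Only branch 1 touched this line, so take its version.
            merged.append(one)
        else:
            # Both sides changed the line differently: a human must decide.
            merged.append(None)
            conflicts.append(i)
    return merged, conflicts


base = ["a", "b", "c", "d"]
branch1 = ["a", "B1", "c", "D1"]
branch2 = ["a", "b", "C2", "D2"]

merged, conflicts = merge_lines(base, branch1, branch2)
print(merged)     # ['a', 'B1', 'C2', None]
print(conflicts)  # [3]
```

Only the last line, which both branches changed in different ways, is left for manual resolution. The rest is picked automatically.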
Since deciding which lines to use is best done by the person(s) who modified those lines, a good practice is to have the person who implemented a feature perform the merge. That means that if you and I work on the same project, I implement my feature, then pull from a central/common repository (to get the latest version that everyone uses), then merge my new version with the pulled changes, then publish it to the common repository (at this point, the common repository has one main branch, with my changes merged into it). Then you pull that from the server and do the same with your changes.
This implies that everyone is free to do whatever they want in their local repository, and the common/official repository has one branch. It also means that you need to decide on a time frame in which people should merge their changes in.
I used to have three or four repositories on my machine, already compiled against different product versions (different branches of the repository), and a few different branches in my main repository (one for refactoring, one for development and so on). Whenever I brought one branch to a stable state (say, finished a refactoring), I would pull from the server, merge that branch into the pulled changes, then push it back to the server and let everyone know that if they had made any changes to the affected files, they should pull from the server first.
We used to synchronize implemented features every Monday morning. It took us about an hour to merge everything and then make a weekly build on the server to give to QA (on bad days it would take two members of the team two hours or so). Then everyone would pull the week's changes onto their machine and use them as a new base for the week. This was for an eight-developer team.
In your updated question it seems that you are more interested in ways of tidying up the history. When you have a history and want to make it into a single, neat, straight line, you want to use rebase, transplant and/or Mercurial Queues. Check out the docs for those three and you should get a sense of the workflow for how it's done.
Edit: Since I'm waiting for a compile, here follows a specific example of what I mean:
Running `hg glog` now shows this (diverging) history with two branches:
Do a rebase, making changeset 1 into a child of 2 rather than 0:
Now let's check history again:
Presto! Single line. :)
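The effect of that rebase on the shape of the history can be sketched with a toy model in Python. This is only an illustration under simplifying assumptions: the `parents` dictionary and the `heads` and `rebase` helpers are made up here, and a real rebase rewrites the moved changeset (it gets a new identifier) rather than just changing a pointer. In Mercurial itself, with the rebase extension enabled, the step would typically be something like `hg rebase -s 1 -d 2`.

```python
# Toy history: changeset number -> parent changeset (None for the root).
# Changesets 1 and 2 were both committed on top of 0, so history diverges.
parents = {0: None, 1: 0, 2: 0}


def heads(history):
    """Changesets that are not the parent of any other changeset."""
    return sorted(set(history) - set(history.values()))


def rebase(history, source, dest):
    """Return a new history where `source` becomes a child of `dest`.

    Hypothetical helper: a real rebase re-applies the change as a new
    changeset and may hit merge conflicts; the toy only moves a pointer.
    """
    new_history = dict(history)
    new_history[source] = dest
    return new_history


print(heads(parents))  # [1, 2]  two heads: diverging history

linear = rebase(parents, source=1, dest=2)
print(linear)          # {0: None, 1: 2, 2: 0}
print(heads(linear))   # [1]     one head: a single line 0 -> 2 -> 1
```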
Also note that I didn't do a merge. When you rebase like this, you will have to deal with merge conflicts and everything, just as if you had done a merge, because that's pretty much what happens under the hood. Experiment with this in a small test repo. For example, try changing the file added in `revision 0` rather than just adding more files.
I'm a Mercurial developer, so let me explain how we/I do it.
In the Mercurial project we accept contributions in the form of patches sent to the mailing list. When we apply those with `hg import`, we do an implicit rebase to the tip of the branch we are working on. This helps a lot with keeping the history clean.
As for my own changes, I use rebase or mq to linearize things before I push them, again to keep the history tidy. It's basically a matter of doing a pull, then a rebase, then a push.
You can combine the pull and rebase if you like (`hg pull --rebase`), but I've always liked to take one step at a time.
By the way, there are some disagreements about this practice of linearizing the history: some believe that the history should show how things really happened, with all the branches and merges and whatnot. I find that as long as you don't mess with public changesets, it's okay and useful to linearize history.
The Linux kernel is stored in thousands of repositories and probably millions of branches, and this doesn't seem to pose a problem. For large projects you need a repository strategy (e.g., the dictator–lieutenants strategy), but having many branches is the main strength of the modern DVCSes and not a problem at all.
Yes, we'll have to merge. To avoid extra heads on the main repository, the merging should be done by the developer in the child repositories.
So before you push your code to the parent repository, you first pull the latest changes, merge on your side and (try to) push. This should avoid unwanted heads in the master repo.
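The pull, merge, then (try to) push cycle can be sketched with a small Python model. Everything here is a made-up illustration (the `Repo` class and the changeset names `X`, `A`, `B`, `M` are assumptions), not Mercurial's API. The toy server simply refuses any push that would leave it with more heads than before, loosely mirroring how Mercurial by default aborts a push that creates new remote heads.

```python
class Repo:
    """Toy repository: maps changeset id -> tuple of parent ids."""

    def __init__(self, parents=None):
        self.parents = dict(parents or {"X": ()})

    def heads(self):
        # A head is a changeset that no other changeset names as parent.
        used = {p for ps in self.parents.values() for p in ps}
        return sorted(set(self.parents) - used)

    def commit(self, cid, *parent_ids):
        self.parents[cid] = tuple(parent_ids)

    def pull(self, other):
        self.parents.update(other.parents)

    def push(self, server):
        # Refuse the push if the server would end up with more heads.
        candidate = Repo({**server.parents, **self.parents})
        if len(candidate.heads()) > len(server.heads()):
            return False
        server.parents = candidate.parents
        return True


server = Repo()
alice, bob = Repo(server.parents), Repo(server.parents)

alice.commit("A", "X")            # both start from changeset X
bob.commit("B", "X")

first_push = alice.push(server)   # True: server head moves from X to A
second_push = bob.push(server)    # False: would leave heads A and B

bob.pull(server)                  # Bob now sees A next to his own B
bob.commit("M", "B", "A")         # merge changeset with two parents
third_push = bob.push(server)     # True: M is the only head again

print(first_push, second_push, third_push)  # True False True
print(server.heads())                       # ['M']
```

The rejected push is the "(try to)" part: whoever pushes second does the merge locally, so the shared repository keeps a single head.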
I don't know how the TortoiseHg team does things, but you can use Mercurial's rebase extension to "detach" a branch and drop it on the top of the tip, creating a single branch.
In practice, though, I don't get concerned about multiple branches, as long as I don't see more heads than there should be. Merging is not really a big deal.