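Finding a branch point with Git?
I have a repository with branches master and A and lots of merge activity between the two. How can I find the commit in my repository when branch A was created based on master?
My repository basically looks like this:
-- X -- A -- B -- C -- D -- F  (master)
          \     /   \     /
           \   /     \   /
             G -- H -- I -- J  (branch A)
I'm looking for revision A, which is not what git merge-base (--all) finds.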
[Note: apologies for using the historically-default name "master" -- I'd change it to "main", but that would break the shas of the example repo.]
I was looking for the same thing, and I found this question. Thank you for asking it!
However, I found that the answers I see here don't seem to quite give the answer you asked for (or that I was looking for) -- they seem to give the G commit, instead of the A commit.
So, I've created the following tree (letters assigned in chronological order), so I could test things out:
This looks a little different than yours, because I wanted to make sure that I got (referring to this graph, not yours) B, but not A (and not D or E). Here are the letters attached to SHA prefixes and commit messages (my repo can be cloned from here, if that's interesting to anyone):
So, the goal: find B. Here are three ways that I found, after a bit of tinkering:
1. visually, with gitk:
You should visually see a tree like this (as viewed from master):
or here (as viewed from topic):
In both cases, I've selected the commit that is B in my graph. Once you click on it, its full SHA is presented in a text input field just below the graph.
2. visually, but from the terminal:
git log --graph --oneline --all
(Edit/side-note: adding --decorate can also be interesting; it adds an indication of branch names, tags, etc. Not adding this to the command-line above since the output below doesn't reflect its use.)
which shows (assuming git config --global color.ui auto):
Or, in straight text:
In either case, we see the 6aafd7f commit as the lowest common point, i.e. B in my graph, or A in yours.
3. With shell magic:
You don't specify in your question whether you wanted something like the above, or a single command that'll just get you the one revision, and nothing else. Well, here's the latter:
Which you can also put into your ~/.gitconfig as (note: trailing dash is important; thanks Brian for bringing attention to that):
Which could be done via the following (convoluted with quoting) command-line:
Note: zsh could just as easily have been bash, but sh will not work -- the <() syntax doesn't exist in vanilla sh. (Thank you again, @conny, for making me aware of it in a comment on another answer on this page!)
This can then be used from the shell as:
Note: Alternate version of the above:
Thanks to liori for pointing out that the above could fall down when comparing identical branches, and coming up with an alternate diff form which removes the sed form from the mix, and makes this "safer" (i.e. it returns a result (namely, the most recent commit) even when you compare master to master):
As a .git-config line:
From the shell:
So, in my test tree (which was unavailable for a while, sorry; it's back), that now works on both master and topic (giving commits G and B, respectively). Thanks again, liori, for the alternate form.
So, that's what I [and liori] came up with. It seems to work for me. It also allows an additional couple of aliases that might prove handy:
Happy git-ing!
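The commands themselves did not survive in this copy of the answer. As a rough sketch of the idea it describes (compare the first-parent chains of two branches and take the first commit they share), the following avoids the <() syntax so that it also runs under plain sh. The function name oldest_ancestor and the temporary-file approach are assumptions made for illustration; this is not necessarily the original diff-based one-liner.

```shell
# Sketch: print the first commit on BRANCH's first-parent chain that also
# lies on BASE's first-parent chain. Defaults mirror the answer's text
# (master and HEAD); the function name is hypothetical.
oldest_ancestor() {
    base=${1:-master}
    branch=${2:-HEAD}
    list=$(mktemp) || return 1
    # First-parent history of the base branch, one full SHA per line.
    git rev-list --first-parent "$base" > "$list"
    # Walk the other branch newest-first and stop at the first shared SHA.
    git rev-list --first-parent "$branch" | grep -F -x -f "$list" | head -n 1
    rm -f "$list"
}
```

Called as oldest_ancestor master topic, it prints one SHA. Comparing a branch with itself returns that branch's most recent commit, which matches the behaviour described for the alternate form.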
You may be looking for git merge-base:
I've used git rev-list for this sort of thing. For example, (note the 3 dots)
will spit out the branch point. Now, it's not perfect; since you've merged master into branch A a couple of times, that'll spit out a couple of possible branch points (basically, the original branch point and then each point at which you merged master into branch A). However, it should at least narrow down the possibilities.
I've added that command to my aliases in ~/.gitconfig as:
so I can call it as:
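The command is missing from this copy. One way to get the behaviour described (a three-dot range that prints candidate branch points) is to ask git rev-list for the boundary commits of the symmetric difference. The sketch below assumes that is the intended technique; the function name branch_boundaries is made up for illustration.

```shell
# Sketch: list the boundary commits of the symmetric difference A...B.
# git rev-list --boundary prefixes boundary commits with "-"; keep only
# those lines and strip the prefix.
branch_boundaries() {
    git rev-list --boundary "$1...$2" | grep '^-' | cut -c2-
}
```

For two branches that simply diverged, branch_boundaries branch_A master prints a single SHA. After repeated merges it can print several, which is the "narrow down the possibilities" caveat above.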
If you like terse commands,
Here's an explanation.
The following command gives you the list of all commits in master that occurred after branch_name was created
Since you only care about the earliest of those commits you want the last line of the output:
The parent of the earliest commit that's not an ancestor of "branch_name" is, by definition, in "branch_name," and is in "master" since it's an ancestor of something in "master." So you've got the earliest commit that's in both branches.
The command
is just a way to show the parent commit reference. You could use
or whatever.
PS: I disagree with the argument that ancestor order is irrelevant. It depends on what you want. For example, in this case
it makes perfect sense to output C2 as the "branching" commit. This is when the developer branched out from "master." When he branched, branch "B" wasn't even merged in his branch! This is what the solution in this post gives.
If what you want is the last commit C such that all paths from origin to the last commit on branch "A" go through C, then you want to ignore ancestry order. That's purely topological and gives you an idea of since when you have two versions of the code going at the same time. That's when you'd go with merge-base based approaches, and it will return C1 in my example.
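The commands referred to in this answer were lost in this copy. Going only by its explanation (list master's first-parent commits that the branch does not contain, take the last line, then show that commit's parent), a sketch could look like the following. The function name branch_point and the default of master are assumptions.

```shell
# Sketch: parent of the earliest first-parent commit on MAIN that is not
# an ancestor of BRANCH. Prints nothing and returns non-zero when MAIN
# has no such commit.
branch_point() {
    branch=$1
    main=${2:-master}
    # Newest-first list; the last line is the earliest such commit.
    first=$(git rev-list --first-parent "^$branch" "$main" | tail -n 1)
    [ -n "$first" ] && git rev-parse "$first^"
}
```

Here git rev-parse is just one way of showing the parent; git log -1 "$first^" would serve equally well.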
Purpose: This answer tests the various answers presented in this thread.
Test repository
Correct solutions
The only solution which works is the one provided by lindes, which correctly returns A:
As Charles Bailey points out though, this solution is very brittle.
If you merge branch_A into master and then merge master into branch_A without intervening commits then lindes' solution only gives you the most recent first divergence.
That means that for my workflow, I think I'm going to have to stick with tagging the branch point of long running branches, since I can't guarantee that they can reliably be found later.
This really all boils down to git's lack of what hg calls named branches. The blogger jhw calls these lineages vs. families in his article Why I Like Mercurial More Than Git and his follow-up article More On Mercurial vs. Git (with Graphs!). I would recommend people read them to see why some mercurial converts miss having named branches in git.
Incorrect solutions
The solution provided by mipadi returns two answers, I and C:
The solution provided by Greg Hewgill returns I
The solution provided by Karl returns X:
Test repository reproduction
To create a test repository:
My only addition is the tag which makes it explicit about the point at which we created the branch and thus the commit we wish to find.
I doubt the git version makes much difference to this, but:
Thanks to Charles Bailey for showing me a more compact way to script the example repository.
In general, this is not possible. In a branch history a branch-and-merge before a named branch was branched off and an intermediate branch of two named branches look the same.
In git, branches are just the current names of the tips of sections of history. They don't really have a strong identity.
This isn't usually a big issue as the merge-base (see Greg Hewgill's answer) of two commits is usually much more useful, giving the most recent commit which the two branches shared.
A solution relying on the order of parents of a commit obviously won't work in situations where a branch has been fully integrated at some point in the branch's history.
This technique also falls down if an integration merge has been made with the parents reversed (e.g. a temporary branch was used to perform a test merge into master and then fast-forwarded into the feature branch to build on further).
Git 2.36 proposes a simpler command from:
with:
git rev-list --exclude-first-parent-only ^main branch_A gives you J -- I -- H -- G
tail -1 gives you G
git rev-parse G^ gives you its first parent: A or branch_A_tag
(PowerShell equivalent, from kumarchandresh's comment:
git log -1 --decorate --oneline $(git rev-parse "$(git rev-list --exclude-first-parent-only ^main branch_A_tag | select -last 1)^")
)
With the test script:
Which gives you:
Which is:
With Git 2.36 (Q2 2022), "git log"(man) and friends learned an option --exclude-first-parent-only to propagate UNINTERESTING bit down only along the first-parent chain, just like --first-parent option shows commits that lack the UNINTERESTING bit only along the first-parent chain.
See commit 9d505b7 (11 Jan 2022) by Jerry Zhang (jerry-skydio).
(Merged by Junio C Hamano -- gitster -- in commit 708cbef, 17 Feb 2022)
rev-list-options now includes in its man page:
As noted by anarcat in the comments, if your branch does not derive from master, but from main, or prod, or... any other branch, you can use:
philb also mentions in the comments the --boundary option (Output excluded boundary commits. Boundary commits are prefixed with -):
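The three steps listed in this answer (rev-list with --exclude-first-parent-only, tail -1, then rev-parse of the first parent) can be wrapped in one helper. This is only a sketch: the function name fork_commit and the defaults of main and HEAD are assumptions, and it needs Git 2.36 or later.

```shell
# Sketch (requires Git >= 2.36): find the commit a branch started from.
# Only MAIN's first-parent chain is excluded, so commits of BRANCH that
# were later merged into MAIN are still listed.
fork_commit() {
    main=${1:-main}
    branch=${2:-HEAD}
    # Oldest commit that belongs to BRANCH but not to MAIN's first-parent chain.
    oldest=$(git rev-list --exclude-first-parent-only "^$main" "$branch" | tail -n 1)
    # Its first parent is the commit the branch was created from.
    [ -n "$oldest" ] && git rev-parse "$oldest^"
}
```

For a different integration branch, pass its name as the first argument, for example fork_commit prod branch_A.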
How about something like
A simple way to just make it easier to see the branching point in git log --graph is to use the option --first-parent.
For example, take the repo from the accepted answer:
Now add --first-parent:
That makes it easier!
Note if the repo has lots of branches you're going to want to specify the 2 branches you're comparing instead of using --all:
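The example commands and their output are missing from this copy. A minimal sketch of the last suggestion, naming the two branches explicitly instead of passing --all, might look like this; the wrapper name show_first_parent_graph is an assumption.

```shell
# Sketch: draw the graph for the given branches only, following just the
# first parent of every merge so each branch shows as a single line.
show_first_parent_graph() {
    git log --graph --oneline --first-parent "$@"
}
```

For example, show_first_parent_graph master branch_A draws both first-parent lines down to the commit where they meet.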
Surely I'm missing something, but IMO, all the problems above are caused because we are always trying to find the branch point by going back in the history, and that causes all sorts of problems because of the merging combinations available.
Instead, I've followed a different approach, based on the fact that both branches share a lot of history: all the history before branching is exactly the same. So instead of going back, my proposal is about going forward (from the 1st commit), looking for the 1st difference between the two branches. The branch point will be, simply, the parent of the first difference found.
In practice:
And it's solving all my usual cases. Sure there are border ones not covered but... ciao :-)
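The script that followed "In practice:" is not preserved here. A rough sketch of the forward walk, under the assumption that each branch's history is linearised with --first-parent (the original may have ordered commits differently), could be written as below. The function name first_divergence_parent is hypothetical.

```shell
# Sketch: read both histories oldest-first and print the last commit
# they share before the first difference.
first_divergence_parent() {
    a=${1:-master}
    b=${2:-HEAD}
    list=$(mktemp) || return 1
    git rev-list --first-parent --reverse "$a" > "$list"
    # awk loads the first list, then compares the second line by line and
    # stops at the first mismatch; END prints the last matching commit.
    git rev-list --first-parent --reverse "$b" |
        awk 'NR == FNR { a[FNR] = $0; next }
             a[FNR] != $0 { exit }
             { last = $0 }
             END { if (last != "") print last }' "$list" -
    rm -f "$list"
}
```

As the answer says, this covers the usual cases only; histories whose first-parent chains do not start from the same root commit are not handled.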
After a lot of research and discussions, it's clear there's no magic bullet that would work in all situations, at least not in the current version of Git.
That's why I wrote a couple of patches that add the concept of a tail branch. Each time a branch is created, a pointer to the original point is created too, the tail ref. This ref gets updated every time the branch is rebased.
To find out the branch point of the devel branch, all you have to do is use devel@{tail}, that's it.
https://github.com/felipec/git/commits/fc/tail
I recently needed to solve this problem as well and ended up writing a Ruby script for this: https://github.com/vaneyckt/git-find-branching-point
I seem to be getting some joy with
The last line you get is the first commit on the branch, so then it's a matter of getting the parent of that. So
Seems to work for me and doesn't need diffs and so on (which is helpful as we don't have that version of diff)
Correction: This doesn't work if you are on the master branch, but I'm doing this in a script so that's less of an issue
Sometimes it is effectively impossible (with some exceptions where you might be lucky enough to have additional data) and the solutions here won't work.
Git doesn't preserve ref history (which includes branches). It only stores the current position for each branch (the head). This means you can lose some branch history in git over time. Whenever you branch for example, it's immediately lost which branch was the original one. All a branch does is:
You might assume that the first one committed to is the branch. This tends to be the case but it's not always so. There's nothing stopping you from committing to either branch first after the above operation. Additionally, git timestamps aren't guaranteed to be reliable. It's not until you commit to both that they truly become branches structurally.
While in diagrams we tend to number commits conceptually, git has no real stable concept of sequence when the commit tree branches. In this case you can assume the numbers (indicating order) are determined by timestamp (it might be fun to see how a git UI handles things when you set all the timestamps to the same).
This is what a human expects conceptually:
This is what you actually get:
You would assume B1 to be the original branch but it could in fact simply be a dead branch (someone did checkout -b but never committed to it). It's not until you commit to both that you get a legitimate branch structure within git:
You always know that C1 came before C2 and C3 but you never reliably know if C2 came before C3 or C3 came before C2 (because you can set the time on your workstation to anything, for example). B1 and B2 are also misleading as you can't know which branch came first. You can make a very good and usually accurate guess at it in many cases. It is a bit like a race track. All things generally being equal with the cars, you can assume that a car that comes in a lap behind started a lap behind. We also have conventions that are very reliable, for example master will nearly always represent the longest lived branch, although sadly I have seen cases where even this is not the case.
The example given here is a history preserving example:
Real here is also misleading because we as humans read it left to right, root to leaf (ref). Git does not do that. Where we do (A->B) in our heads git does (A<-B or B->A). It reads it from ref to root. Refs can be anywhere but tend to be leaves, at least for active branches. A ref points to a commit and commits only contain a link to their parent(s), not to their children. When a commit is a merge commit it will have more than one parent. The first parent is always the original commit that was merged into. The other parents are always commits that were merged into the original commit.
This is not a very efficient representation, rather an expression of all the paths git can take from each ref (B1 and B2).
Git's internal storage looks more like this (note that A as a parent appears twice):
If you dump a raw git commit you'll see zero or more parent fields. If there are zero, it means no parent and the commit is a root (you can actually have multiple roots). If there's one, it means there was no merge and it's not a root commit. If there is more than one it means that the commit is the result of a merge and all of the parents after the first are the commits that were merged in.
When both hit A their chain will be the same, before that their chain will be entirely different. The first commit that two other commits have in common is their common ancestor, the point from which they diverged. There might be some confusion here between the terms commit, branch and ref. You can in fact merge a commit. This is what merge really does. A ref simply points to a commit and a branch is nothing more than a ref in the folder .git/refs/heads; the folder location is what determines that a ref is a branch rather than something else such as a tag.
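To see those parent fields for yourself, git cat-file -p prints the raw commit object. The small helper below counts the parent lines in its header; it is a sketch, and the name parent_count is made up for illustration.

```shell
# Sketch: count the "parent" header lines of a raw commit object.
# sed stops at the first blank line so that text in the commit message
# is never mistaken for a header.
parent_count() {
    git cat-file -p "$1" | sed '/^$/q' | grep -c '^parent '
}
```

It prints 0 for a root commit, 1 for an ordinary commit, and 2 or more for a merge.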
Where you lose history is that merge will do one of two things depending on circumstances.
Consider:
In this case a merge in either direction will create a new commit with the first parent as the commit pointed to by the current checked out branch and the second parent as the commit at the tip of the branch you merged into your current branch. It has to create a new commit as both branches have changes since their common ancestor that must be combined.
At this point D (B1) now has both sets of changes from both branches (itself and B2). However the second branch doesn't have the changes from B1. If you merge the changes from B1 into B2 so that they are synchronised then you might expect something that looks like this (you can force git merge to do it like this, however, with --no-ff):
You will get that even if B1 has additional commits. As long as there aren't changes in B2 that B1 doesn't have, the two branches will be merged. It does a fast forward which is like a rebase (rebases also eat or linearise history), except unlike a rebase as only one branch has a change set it doesn't have to apply a changeset from one branch on top of that from another.
If you cease work on B1 then things are largely fine for preserving history in the long run. Only B1 (which might be master) will advance typically, so the location of B2 in B2's history successfully represents the point that it was merged into B1. This is what git expects you to do: to branch B from A, then you can merge A into B as much as you like as changes accumulate; however, when merging B back into A, it's not expected that you will work on B any further. If you carry on working on your branch after fast forward merging it back into the branch you were working on, then you're erasing B's previous history each time. You're really creating a new branch each time after a fast forward commit to source then commit to branch. What you end up with when you fast forward is lots of branches/merges that you can see in the history and structure, but without the ability to determine what the name of that branch was or whether what looks like two separate branches is really the same branch.
1 to 3 and 5 to 8 are structural branches that show up if you follow the history for either 4 or 9. There's no way in git to know which of the named and referenced branches each of these unnamed and unreferenced structural branches belongs to. You might assume from this drawing that 0 to 4 belongs to B1 and 4 to 9 belongs to B2, but apart from 4 and 9 we can't know which branch belongs to which branch; I've simply drawn it in a way that gives the illusion of that. 0 might belong to B2 and 5 might belong to B1. There are 16 different possibilities in this case of which named branch each of the structural branches could belong to. This is assuming that none of these structural branches came from a deleted branch or as a result of merging a branch into itself when pulling from master (the same branch name on two repos is in fact two branches; a separate repository is like branching all branches).
There are a number of git strategies that work around this. You can force git merge to never fast forward and always create a merge branch. A horrible way to preserve branch history is with tags and/or branches (tags are really recommended) according to some convention of your choosing. I really wouldn't recommend a dummy empty commit in the branch you're merging into. A very common convention is to not merge into an integration branch until you want to genuinely close your branch. This is a practice that people should attempt to adhere to, as otherwise you're working around the point of having branches. However, in the real world the ideal is not always practical, meaning doing the right thing is not viable for every situation. If what you're doing on a branch is isolated that can work, but otherwise you might be in a situation where multiple developers working on something need to share their changes quickly (ideally you might really want to be working on one branch, but not all situations suit that either, and generally two people working on a branch is something you want to avoid).
Here's an improved version of my previous answer. It relies on the commit messages from merges to find where the branch was first created.
It works on all the repositories mentioned here, and I've even addressed some tricky ones that spawned on the mailing list. I also wrote tests for this.
The following command will reveal the SHA1 of Commit A
git merge-base --fork-point A
Not quite a solution to the question but I thought it was worth noting the approach I use when I have a long-living branch:
At the same time I create the branch, I also create a tag with the same name but with an -init suffix, for example feature-branch and feature-branch-init.
(It is kind of bizarre that this is such a hard question to answer!)
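The convention described above takes two commands, which a small wrapper can combine. This is a sketch; the function name start_branch is an assumption.

```shell
# Sketch: create and switch to a new branch, and record its starting
# commit in a lightweight tag named <branch>-init.
start_branch() {
    git checkout -q -b "$1" && git tag "$1-init"
}
```

After start_branch feature-branch, the branch point can be recovered at any later time with git rev-parse feature-branch-init, however many merges have happened since.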
Seems like using reflog solves this: git reflog <branchname> shows all the commits of the branch including branch creation.
This is from a branch that had 2 commits before it was merged back to master.
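The sample reflog output is missing from this copy. To pull out just the creation point, one option is to walk the branch's reflog and keep its oldest entry. This is a sketch and the function name branch_created_at is hypothetical. It only works in the repository where the branch was created, because reflogs are local and their entries eventually expire.

```shell
# Sketch: print the commit recorded in the oldest reflog entry of a
# branch, which is normally the commit it was created from.
branch_created_at() {
    git log -g --format=%H "$1" | tail -n 1
}
```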
To find commits from the branching point, you could use this.
A solution is to find the first divergence, then take the parent. With zsh, this can be done as follows (EDIT: compared to my first answer, I've added the missing --topo-order; I forgot it as I did tests on a repository where all commits had the same date, as generated by a script):
The sed selects the first commit that appears in only one of the branches. Then the git rev-parse "$(...)^" outputs its parent.
Notes:
This works whether branch_A has been merged with master or not (in case of a merge, the head of branch_A corresponds to the last commit in this branch before the merge).
4.0 branch).
diff is not optimal since one just needs the first difference, but this seems to be the simplest solution with standard Unix utilities.
EDIT:
In case of a merge somewhere, it is possible that the first commits before the merge in one of the two branches are not part of the divergence seen by using diff (note that the result may still make sense in some cases, though). So it is better to look at the first divergence on both sides, i.e. for diff, the first insertion and the first deletion (when available).
Another possible issue with the above solution is that in case of a merge commit, ^ selects the first parent, while one would like merges to be regarded as symmetrical. So, ^@ is preferred in order to select both parents.
Finally, among the possible branch points obtained with the above remarks, one chooses the oldest one: this is done by git rev-list --topo-order --reverse --no-walk ... | head -n 1 below (note that because of --reverse, one cannot use -1 or -n 1 as a git rev-list option instead of the pipe to head -n 1).
So, here is a complete solution, usable as a script, with master and HEAD as defaults for the branches:
The --topo-order description is not very detailed, but this appears to work as expected on various complex examples. But it might also be possible to have even more complex examples where the branch point is not well defined.
非常有用——用于按拓扑顺序遍历提交——但是进行差异比较脆弱:有对于给定的图,有多种可能的拓扑排序,因此您依赖于排序的一定稳定性。也就是说,strongk7 的解决方案在大多数情况下可能都运行得很好。然而,由于必须遍历整个存储库的历史......两次,它比我的慢。 :-)The problem appears to be to find the most recent, single-commit cut between both branches on one side, and the earliest common ancestor on the other (probably the initial commit of the repo). This matches my intuition of what the "branching off" point is.
That in mind, this is not at all easy to compute with normal git shell commands, since
git rev-list
-- our most powerful tool -- doesn't let us restrict the path by which a commit is reached. The closest we have isgit rev-list --boundary
, which can give us a set of all the commits that "blocked our way". (Note:git rev-list --ancestry-path
is interesting but I don't how to make it useful here.)Here is the script: https://gist.github.com/abortz/d464c88923c520b79e3d. It's relatively simple, but due to a loop it's complicated enough to warrant a gist.
Note that most other solutions proposed here can't possibly work in all situations for a simple reason:
git rev-list --first-parent
isn't reliable in linearizing history because there can be merges with either ordering.git rev-list --topo-order
, on the other hand, is very useful -- for walking commits in topographic order -- but doing diffs is brittle: there are multiple possible topographic orderings for a given graph, so you are depending on a certain stability of the orderings. That said, strongk7's solution probably works damn well most of the time. However it's slower that mine as a result of having to walk the entire history of the repo... twice. :-)以下实现了 git 等价于 svn log --stop-on-copy 的功能,也可用于查找分支源。
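To make the git rev-list --boundary remark concrete, here is a minimal sketch. It is not the script from the gist. The helper name boundary_commits and its master/HEAD defaults are assumptions made for illustration. It prints the commits that bound the symmetric difference of the two branches, which git marks with a leading "-" in the --boundary output.

```shell
# Hypothetical helper (not from the gist): list the boundary commits
# between two branches.
boundary_commits() {
    trunk=${1:-master}   # assumed default, as elsewhere in this thread
    branch=${2:-HEAD}
    # "trunk...branch" selects commits reachable from exactly one side.
    # --boundary also prints the excluded frontier commits, each prefixed
    # with "-"; sed keeps only those lines and strips the prefix.
    git rev-list --boundary "$trunk...$branch" | sed -n 's/^-//p'
}
```

For two branches with a single common ancestor this should print just that ancestor. On heavily merged histories it may print several commits, which is why the boundary alone does not settle the question.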
The following implements the git equivalent of svn log --stop-on-copy and can also be used to find a branch's origin.
Approach
Like all rivers run to the sea, all branches run to master, and therefore we can find a merge-base between seemingly unrelated branches. As we walk back from the branch head through its ancestors, we can stop at the first potential merge base, since in theory it should be the origin point of this branch.
Notes
Details: https://stackoverflow.com/a/35353202/9950
Why not use a two-step approach?
The first command gives you all of the commits that Branch A has that master doesn't have (the function of ..), and uses tail -1 to return the last line of output, which is the first commit of the specified branch (Branch A).
Then, with that commit's SHA, the second command gives you all the commits prior to the specified commit (the function of ^1) and uses head -1 to return the first line of output, which is "one commit prior" to the earliest commit in Branch A, aka the "branch point".
Both steps can be combined into a single, executable command. Run it from within Branch A (the function of HEAD).
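The two steps described above can be sketched as a small shell function. This is a reconstruction from the description (.., tail -1, ^1, head -1, HEAD), not the answer's original command line. The function name branch_point_two_step and the master/HEAD defaults are assumptions.

```shell
# Hypothetical reconstruction of the two steps described above.
branch_point_two_step() {
    trunk=${1:-master}
    branch=${2:-HEAD}    # HEAD: run from within the branch
    # Step 1: commits the branch has that trunk lacks; git log lists the
    # newest first, so the last line is the branch's earliest commit.
    first=$(git log --format=%H "$trunk..$branch" | tail -1)
    [ -n "$first" ] || return 1   # the branch has nothing of its own
    # Step 2: history ending at that commit's first parent (^1); the
    # first line is the commit just before the branch's first commit.
    git log --format=%H "$first^1" | head -1
}
```

Note that if the branch has already been merged into master, the range in step 1 no longer reaches back to the original fork, so this sketch is likely to report a later commit, or nothing at all.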
Simple Answer
Run git merge-base with the two branches to find the common ancestor(s).
You can examine the reflog of branch A to find the commit it was created from, as well as the full history of the commits that branch has pointed to. Reflogs are stored in .git/logs.
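For the reflog approach, a minimal sketch might look like the following. The function name branch_creation_commit is an assumption. It relies on git log -g, which walks a ref's reflog newest entry first, so the last line is the oldest surviving entry. Reflogs are local to a clone and entries expire, so this only helps in the repository where the branch was created, and only while that entry still exists.

```shell
# Hypothetical helper: print the commit recorded in the oldest surviving
# reflog entry of a branch (normally its "Created from ..." entry).
branch_creation_commit() {
    branch=${1:-branch_A}   # assumed default name
    # -g (--walk-reflogs) lists reflog entries newest first, so the last
    # line holds the commit the branch pointed to when the log began.
    git log -g --format=%H "$branch" | tail -n 1
}
```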
You could use the following command to return the oldest commit in branch_a that is not reachable from master:
Perhaps with an additional sanity check that the parent of that commit is actually reachable from master...
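One way the suggested sanity check could be added is sketched below. The function name oldest_unmerged_commit and the exit codes are assumptions. It uses git merge-base --is-ancestor to test whether the candidate's first parent is reachable from master.

```shell
# Hypothetical helper: oldest commit on the branch that trunk cannot
# reach, printed only if its parent really is reachable from trunk.
oldest_unmerged_commit() {
    trunk=${1:-master}
    branch=${2:-branch_a}
    # Commits reachable from the branch but not from trunk, newest first.
    oldest=$(git rev-list "$branch" "^$trunk" | tail -n 1)
    [ -n "$oldest" ] || return 1
    # Sanity check: the first parent must be an ancestor of trunk.
    if git merge-base --is-ancestor "$oldest^" "$trunk"; then
        echo "$oldest"
    else
        echo "parent of $oldest is not reachable from $trunk" >&2
        return 2
    fi
}
```

The branch point would then be the parent of the printed commit, for example via git rev-parse "$(oldest_unmerged_commit master branch_a)^".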
I believe I've found a way that deals with all the corner cases mentioned here:
Charles Bailey is quite right that solutions based on the order of ancestors have only limited value; at the end of the day you need some sort of record that "this commit came from branch X". But such a record already exists: by default, git merge uses a commit message such as "Merge branch 'branch_A' into master", which tells you that all the commits from the second parent (commit^2) came from 'branch_A' and were merged into the first parent (commit^1), which is 'master'.
Armed with this information, you can find the first merge of 'branch_A' (which is when 'branch_A' really came into existence) and then find the merge-base, which is the branch point :)
I've tried this with the repositories of Mark Booth and Charles Bailey and the solution works; how couldn't it? The only way it wouldn't work is if you manually changed the default commit message for merges, so that the branch information is truly lost.
For usefulness:
Then you can do 'git branch-point branch_A'.
Enjoy ;)
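The idea above (find the first merge of the branch by its default message, then take the merge-base of that merge's two parents) can be sketched as a shell function. This is an illustration under assumptions, not the answer's own alias: the function name branch_point_from_merges, the exact --grep pattern, and the master default are all hypothetical choices.

```shell
# Hypothetical sketch: branch point derived from default merge messages.
branch_point_from_merges() {
    branch=$1
    trunk=${2:-master}
    # Merge commits on trunk whose message names the branch. With
    # --topo-order no parent is listed before its children, so the last
    # line is the earliest such merge.
    merge=$(git rev-list --topo-order --min-parents=2 \
        --grep="Merge branch '$branch'" "$trunk" | tail -n 1)
    [ -n "$merge" ] || return 1   # no matching merge found
    # ^1 is the trunk side of that merge, ^2 the branch side.
    git merge-base "$merge^1" "$merge^2"
}
```

As the answer notes, this depends entirely on the default merge messages being intact; a branch that was never merged, or merged with an edited message, gives no result.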