git rebase vs checkout

Posted on 2025-02-03 06:27:29 · 213 words · 3 views · 0 comments

For a while now, whenever I want to review new changes in a branch (remote or local), I have been checking out that branch. Recently, though, I came across the rebase command, which seems to have been created for this purpose, and I am wondering what the difference is between the two approaches. Can someone explain it in a simple way?

git checkout <branch_name>

git rebase <branch_name>

Comments (2)

我偏爱纯白色 2025-02-10 06:27:29

Rebase and checkout are wildly different commands, with different goals. Neither goal exactly matches your own—which is or seems to be to inspect something—but checkout comes much closer.

Can someone explain it Eli5?

I'm afraid I blow right past the vocabulary limits for that, but let's start with the proper basics, which too many Git users have skipped (for reasons good or bad, but the end result was bad).

Git is about commits

The basic unit of storage in Git is the commit. A Git repository is a collection of commits, stored in a big database that Git calls the object database. A Git repository has several more parts, which we'll get to in a moment, but this first one—the object database—is essential: without it there's no repository.

The object database is a simple key-value store, using what Git calls OIDs or Object IDs to look up the objects. The most important kind of object for our purposes (in fact, the only one we really care about) is the commit object, which holds the first part of any commit. So our commits, in Git, have these OIDs. We'll call them hash IDs to avoid getting caught up in too many TLAs (Three Letter Acronyms) and probably, eventually, RAS syndrome. Some call them SHA or SHA-1, because Git initially (and currently) uses the SHA-1 cryptographic hash as its hash IDs, but Git is no longer wedded to SHA-1, so "hash ID" or "OID" is more appropriate.

An OID or hash ID is a big ugly string of letters and digits, such as e54793a95afeea1e10de1e5ad7eab914e7416250. This is actually a very large number, expressed in hexadecimal. Git needs these to find its objects. The ID is unique to that particular object: no other object, in the big object database, can have that ID. Every commit you make has to get a new random-looking number, never used before and never to be used again, in any Git repository, unless it's being used to store your commit. Making this actually work is hard (technically, it's impossible [1]), but the sheer size of the hash ID makes it work in practice. A Git doomsday may come someday (see "How does the newly found SHA-1 collision affect Git?") but it won't be for a while yet.


[1] See the pigeonhole principle.
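To make the "very large number" concrete, here is a minimal Python sketch of how a SHA-1 repository derives an object ID. It hashes a blob (a file's contents) rather than a commit, because that is the simplest object type: Git prefixes the content with a small header and hashes the result. The function name git_blob_id is made up for this illustration.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 hash ID Git would assign to a blob with this content."""
    # Git hashes a short header ("blob <size>" plus a NUL byte) followed by the content.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


oid = git_blob_id(b"hello world\n")
print(oid)                    # 40 hexadecimal digits
print(len(oid) * 4, "bits")   # each hex digit carries 4 bits
print(int(oid, 16))           # the same ID viewed as one very large integer
```

The ID depends only on the bytes being hashed, so the same content always gets the same ID. For a commit, the hashed bytes include the parent hash IDs and the timestamps, which is why every new commit ends up with a new, random-looking ID.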


Git is not about branches or files

If Git commits did not store files, Git would be useless. So commits do store files. But commits are not files themselves, and a file is not Git's "unit of work" as it were. Git is about the commits, which sort of accidentally-on-purpose contain files.

The word branch, in Git, is very badly overused, almost to the point of meaninglessness. [2] There are at least two or three things people mean when they say branch here, and it can get very confusing, although once you've got the basics down you'll find yourself right among all the other people casually tossing the word branch out in a sentence, maybe more than once in the same sentence, with each word meaning something different, yet the whole thing seems totally obvious.

To help keep this straight, I like to use (or at least try to use) the phrase branch name when referring to a name like main or master, dev or develop, feature, and so on. A branch name, in Git, is a fast and important way to find one particular commit. Humans use these because human brains are no good at working with hash IDs: they're too big, ugly, and random-looking.

A repository therefore keeps a separate database—another simple key-value store—in which each key is a name and the value is the big ugly hash ID that goes with that name. Branch names are one of the many kinds of names that Git sticks in this second database. So, you can give Git a branch name; Git will look up the hash ID, and find the latest commit for that branch.
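As a rough illustration, the names database can be modeled as a plain dictionary. This is only a toy sketch: the pairing of names with hash IDs below is hypothetical, and resolve_branch and update_branch are invented helper names. The one real detail is that Git keeps branch names under the refs/heads/ prefix.

```python
# Toy model of Git's second database: names mapped to hash IDs.
# Which hash ID each name maps to here is hypothetical, for illustration only.
refs = {
    "refs/heads/main": "e54793a95afeea1e10de1e5ad7eab914e7416250",
    "refs/heads/dev": "565442c35884e320633328218e0f6dd13f3657d3",
}


def resolve_branch(name: str) -> str:
    """Look up the hash ID of the latest commit for a branch name."""
    return refs["refs/heads/" + name]


def update_branch(name: str, new_hash_id: str) -> None:
    """Make the branch name point to a different (usually newer) commit."""
    refs["refs/heads/" + name] = new_hash_id


print(resolve_branch("main"))
```

In a real repository, git rev-parse main performs this kind of lookup and prints the hash ID that the name main currently holds.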

In this sense, we use branches—or more precisely, branch names—in Git to get to our commits. But Git isn't about these branches, really; it's still about the commits.


[2] For an even more extreme example of this problem, see Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo. For more on Git's abuse of the word branch, see What exactly do we mean by "branch"?


What's in a commit

Now that we know Git is all about commits, let's take a look at an actual raw commit. Here's the one I referred to above:

$ git cat-file -p e54793a95afeea1e10de1e5ad7eab914e7416250
tree dc3d0156b95303a305c69ba9113c94ff114b7cd3
parent 565442c35884e320633328218e0f6dd13f3657d3
author Junio C Hamano <[email protected]> 1651786597 -0700
committer Junio C Hamano <[email protected]> 1651786597 -0700

Git 2.36.1

Signed-off-by: Junio C Hamano <[email protected]>

That's the raw commit object, and it actually consists entirely of the commit's metadata.

A commit object has two parts:

  • Every commit has a full snapshot of all of the files that make up that particular commit. In a real commit like the one above, that's the tree line, which is required: there must be one and only one tree.

  • Every commit also has some metadata. That's the entire chunk of text above, really (including the tree line itself).

Note that the metadata tells us who made the commit, and when: the magic number 1651786597 above is a date-and-time-stamp meaning Thu May 5 14:36:37 2022. The -0700 is the time zone, which in this case is Pacific Daylight Time or UTC-7. (It could also be Mountain Standard Time, which is UTC-7 as well and is used year-round in most of Arizona outside the Navajo Nation, but you can pretty safely bet that this was not Junio Hamano's actual location at the time.) It also has the committer's commit message, which in this case is remarkably short: compare with, e.g., a snippet from f8781bfda31756acdc0ae77da7e70337aedae7c9:

2.36 gitk/diff-tree --stdin regression fix

This only surfaced as a regression after 2.36 release, but the
breakage was already there with us for at least a year.

The diff_free() call is to be used after we completely finished with
a diffopt structure.  After "git diff A B" finishes producing
output, calling it before process exit is fine.  But there are
commands that prepares diff_options struct once, compares two sets
of paths, releases resources that were used to do the comparison,
then reuses the same diff_option struct to go on to compare the next
two sets of paths, like "git log -p".

After "git log -p" finishes showing a single commit, calling it
before it goes on to the next commit is NOT fine.  There is a
mechanism, the .no_free member in diff_options struct, to help "git
log" to avoid calling diff_free() after showing each commit and ...

which is a much better commit message. (Excluding the updated tests and a comment in log-tree.c, the fix itself just adds three lines to builtin/diff-tree.c.)

The other really important part of the metadata, which Git sets up on its own, is the parent line. There can be more than one parent line—or, rarely, no parent line—because each commit carries, in its metadata, a list of parent hash IDs. These are just the raw hash IDs of some existing commits in the repository, that were there when you, or Junio, or whoever, added a new commit. We'll see in a moment what these are for.
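The raw commit text shown earlier is simple enough to pick apart by hand. The following Python sketch, with the invented helpers parse_commit and when, splits it into the tree, the list of parents, and the message, and decodes the author timestamp. It assumes the simple one-line headers seen above; real commits can also carry multi-line headers such as signatures, which this sketch does not handle. The e-mail address appears redacted in the source and is copied as-is.

```python
from datetime import datetime, timedelta, timezone

# The raw commit text from above (the e-mail address is redacted in the source).
RAW_COMMIT = """\
tree dc3d0156b95303a305c69ba9113c94ff114b7cd3
parent 565442c35884e320633328218e0f6dd13f3657d3
author Junio C Hamano <[email protected]> 1651786597 -0700
committer Junio C Hamano <[email protected]> 1651786597 -0700

Git 2.36.1

Signed-off-by: Junio C Hamano <[email protected]>
"""


def parse_commit(raw: str) -> dict:
    """Split raw commit text into header fields and the commit message."""
    headers, _, message = raw.partition("\n\n")
    commit = {"tree": None, "parents": [], "message": message}
    for line in headers.splitlines():
        key, _, value = line.partition(" ")
        if key == "parent":
            commit["parents"].append(value)  # zero, one, or several parents
        else:
            commit[key] = value
    return commit


def when(person_line: str) -> datetime:
    """Decode the trailing '<seconds since 1970> <+/-HHMM>' of an author/committer line."""
    _, seconds, offset = person_line.rsplit(" ", 2)
    sign = -1 if offset[0] == "-" else 1
    delta = sign * timedelta(hours=int(offset[1:3]), minutes=int(offset[3:5]))
    return datetime.fromtimestamp(int(seconds), timezone(delta))


commit = parse_commit(RAW_COMMIT)
print(commit["tree"])
print(commit["parents"])
print(when(commit["author"]).isoformat())
```

The decoded timestamp comes out as 14:36:37 on 5 May 2022 at UTC-7, matching the date given above.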

Review so far

A repository has two databases:

  • One (usually much bigger) contains commits and other objects. These have hash IDs; Git needs the hash IDs to find them.
  • The other (usually much smaller) contains names, such as branch and tag names, and maps each name to one hash ID. For a branch name, the one hash ID we get here is, by definition, the latest commit for that branch.
  • The commits are the reason that all of this exists. Each one stores two things: a full snapshot, and some metadata.

A working tree

Now, one of the tricks to making the hash IDs work, in Git, is that no part of any object can ever change. A commit, once made, is the way it is forever. That commit, with that hash ID, holds those files and that metadata and thus has that parent (or those parents) and so on. Everything is frozen for all time.

The files inside a commit are stored in a special, read-only, compressed (sometimes highly compressed), de-duplicated format. That avoids having the repository bloat up even though most commits mostly re-use most of the files from their parent commit(s). Because the files are de-duplicated, the duplicates literally take no space. Only a changed file needs any space.
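De-duplication falls out naturally when contents are stored under a key computed from the contents themselves. The sketch below is a toy content-addressed store, not Git's actual storage code: the ObjectStore class and the file names are invented for illustration, and compression is left out entirely.

```python
import hashlib


class ObjectStore:
    """Toy content-addressed store: identical content is kept only once."""

    def __init__(self):
        self._objects = {}  # hash ID -> content

    def put(self, content: bytes) -> str:
        # Same header-plus-content hashing that SHA-1 Git uses for blobs.
        oid = hashlib.sha1(b"blob %d\0" % len(content) + content).hexdigest()
        self._objects.setdefault(oid, content)  # an existing object is never rewritten
        return oid

    def get(self, oid: str) -> bytes:
        return self._objects[oid]

    def __len__(self):
        return len(self._objects)


store = ObjectStore()
# Two snapshots (hypothetical file names); only main.c differs between them.
snapshot_1 = {
    "README": store.put(b"hello\n"),
    "main.c": store.put(b"int main(void) { return 0; }\n"),
}
snapshot_2 = {
    "README": store.put(b"hello\n"),
    "main.c": store.put(b"int main(void) { return 1; }\n"),
}
print(len(store))  # number of distinct objects actually stored
```

Two snapshots of two files each need only three stored objects here, because the unchanged README is shared between them.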

But there's an obvious problem:

  • Only Git can read these compressed-and-de-duplicated files.
  • Nothing, not even Git itself, can write them.

If we're going to get any work done, we must have ordinary files, that ordinary programs can both read and write. Where will we get those?

Git's answer is to provide, with any non-bare repository, [3] an area in which you can do your work. Git calls this area (a directory-tree or folder full of folders, or whatever terminology you like) your working tree, or work-tree for short. In fact, the typical setup is to have the repository proper live inside a hidden .git directory at the top level of the working tree. Everything inside this is Git's; everything outside it, at the top level of the working tree and in any sub-directory (folder) within it other than .git itself, is yours.


[3] A bare repository is one without a work-tree. This might seem kind of redundant or pointless, but it does actually have a function: see What problem is trying to solve a Git --bare repo?


What git checkout or git switch is about

When you check out some commit—with git checkout or git switch and a branch name—you're telling Git:

  • Use the branch name to find the latest commit by hash ID.
  • Remove, from my working tree, all the files that came out of whatever commit I've been using.
  • Replace, into my working tree, all the files that come out of the commit I just named.

Git takes a big short-cut here when it can: if you're moving from commit a123456 to b789abc, and most of the files in those two commits are de-duplicated, Git won't actually bother with the remove-and-replace for these files. This short-cut becomes important later, but if you start out thinking of git checkout / git switch as meaning: remove the current commit's files, change to a new current commit, and extract those files, you have a good start.
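The remove-and-replace idea, including the short-cut, can be sketched by comparing two snapshots. In this simplified Python model each snapshot is a dictionary from path to blob hash ID; the function checkout_plan, the paths, and the shortened IDs are all hypothetical. Real Git also has to consider uncommitted changes in the working tree, which this sketch ignores.

```python
def checkout_plan(current: dict, target: dict):
    """Work out what switching from one snapshot to another requires.

    Both arguments map a file path to the hash ID of that file's content.
    """
    remove = sorted(p for p in current if p not in target)
    write = sorted(p for p in target if current.get(p) != target[p])
    # Short-cut: a file with the same hash ID in both commits is left alone.
    untouched = sorted(p for p in target if current.get(p) == target[p])
    return remove, write, untouched


# Hypothetical snapshots with shortened, made-up hash IDs.
current = {"README": "aaa1", "old.txt": "bbb2", "main.c": "ccc3"}
target = {"README": "aaa1", "main.c": "ddd4", "new.txt": "eee5"}

remove, write, untouched = checkout_plan(current, target)
print("remove:", remove)
print("write:", write)
print("untouched:", untouched)
```

Because the files are de-duplicated, comparing hash IDs is enough to tell that a file is identical in both commits without reading its content.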

How commits get strung together

Let's revisit the commit itself for a bit now. Each commit has, in its metadata, some set of parent lines. Most commits (by far in most repositories) have exactly one parent and that's the thing to start with.

Let's draw the commits in a simple, tiny, three-commit repository. The three commits will have three big ugly random-looking hash IDs, but rather than make some up, let's just call them commits A, B, and C in that order. Commit A was the very first commit—which is a bit special because it has no parent commit—and then you made B while using commit A, and made C while using B. So we have this:

A <-B <-C

That is, commit C, the latest commit, has some files as its snapshot, and has, as its parent, the raw hash ID of commit B. We say that C points to B.

Meanwhile, commit B has some files as its snapshot, and has commit A as its parent. We say that B points to A.

Your branch name, which we'll assume is main, points to the latest commit C:

A--B--C   <-- main

(here I get lazy about drawing the arrows between commits as arrows, but they're still backwards-pointing arrows, really).

When you git checkout main, Git extracts all the commit-C files into your working tree. You have those files available to view and edit.

If you do edit some, you use git add and git commit to make a new commit. This new commit gets an all-new, never been used before anywhere in any Git repository in the universe, hash ID, but we'll just call this new commit D. Git will arrange for new commit D to point backwards to existing commit C, because C is the one you've been using, so let's draw in new commit D:

A--B--C   <-- main
       \
        D

(The backwards slash going up-and-left from D to C is why I get lazy about the arrows—there are some arrow fonts but they don't work all that well on StackOverflow, so we just have to imagine the arrow from D to C.)

But now D is the latest main commit, so git commit also stores D's hash ID into the name main so that main now points to D:

A--B--C
       \
        D   <-- main

(and now there's no reason to use extra lines to draw things; I just kept it for visual symmetry).

This is one way a branch grows, in Git. You check out the branch, so that it's your current branch. Its tip-most commit—the one towards the right in this drawing, or towards the top in git log --graph output—becomes your current commit and those are the files you see in your working tree. You edit those files, use git add, and run git commit, and Git packages up the new files—with automatic de-duplication, so that if you change a file back to the way it was in B or A, it gets de-duplicated here!—into a new commit, then stuffs the new commit's hash ID into the current branch name.
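The way a branch grows can be modeled in a few lines of Python. This is a toy sketch under stated assumptions: the Repo class is invented, the hash is computed over a JSON dump rather than Git's real commit format, and the snapshots are hypothetical. What it does mirror is the sequence described above: the new commit records the current tip as its parent, and the current branch name is then updated to hold the new hash ID.

```python
import hashlib
import json


class Repo:
    """Toy repository: a commit database, a names database, and a current branch."""

    def __init__(self):
        self.commits = {}   # hash ID -> {"snapshot", "parents", "message"}
        self.refs = {}      # branch name -> hash ID of its latest commit
        self.head = "main"  # the branch name we are "on"

    def commit(self, snapshot: dict, message: str) -> str:
        tip = self.refs.get(self.head)
        parents = [tip] if tip else []  # the very first commit has no parent
        body = {"snapshot": snapshot, "parents": parents, "message": message}
        # Toy hash over a JSON dump; real Git hashes its own commit format.
        oid = hashlib.sha1(json.dumps(body, sort_keys=True).encode()).hexdigest()
        self.commits[oid] = body
        self.refs[self.head] = oid      # the branch name now finds the new commit
        return oid


repo = Repo()
a = repo.commit({"README": "v1"}, "A")
b = repo.commit({"README": "v2"}, "B")
c = repo.commit({"README": "v3"}, "C")
d = repo.commit({"README": "v4"}, "D")
print(repo.refs["main"] == d, repo.commits[d]["parents"] == [c])
```

Nothing in commits A, B, or C changes when D is added; only the entry for main in the names database is rewritten.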

How branches form

Let's say we start out with that same three-commit repository:

A--B--C   <-- main

Let's now create a new branch name dev. This name must point to some existing commit. There are only three commits, so we have to pick one of A, B, or C, for the name dev to point to. The obvious one to use is the most recent: we probably don't need to go back in time to commit B or A to start adding new commits. So let's add dev so that it also points to C, by running:

git branch dev

We get:

A--B--C   <-- dev, main

It's hard to tell from our drawing: are we on dev or main? That is, if we run git status, which will it say, "on branch dev" or "on branch main"? Let's add a special name, HEAD in all uppercase like this, and attach it to one of the two branch names, to show which name we are using:

A--B--C   <-- dev, main (HEAD)

We are "on" branch main. If we make a new commit now, commit D will point back to commit C as usual, and Git will stick the new hash ID into the name main.

But if we run:

git checkout dev

Git will remove, from our working tree, all the commit-C files, and put in all the commit-C files instead. (Seems kind of silly, doesn't it? Short-cut! Git won't actually do any of that!) Now we have:

A--B--C   <-- dev (HEAD), main

and when we make our new commit D we get:

A--B--C   <-- main
       \
        D   <-- dev (HEAD)

If we git checkout main, Git will remove the commit-D files and install the commit-C files, and we'll be back to:

A--B--C   <-- main (HEAD)
       \
        D   <-- dev

and if we now make another new commit we will get:

        E   <-- main (HEAD)
       /
A--B--C
       \
        D   <-- dev

This is how branches work in Git. A branch name, like main or dev, picks out a last commit. From there, Git works backwards. Commit E might be the last main commit, but commits A-B-C are on main because we get to them when we start from E and work backwards.

Meanwhile, commit D is the last dev commit, but commits A-B-C are on dev because we get to them when we start from D and work backwards. Commit D is not on main because we never reach commit D when we start from E and work backwards: that skips right over D.
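"Start from the last commit and work backwards" is a plain graph walk. Here is a minimal Python sketch using the single-letter commit names from the drawings in place of real hash IDs; the function name commits_on is made up for this example.

```python
# Each commit lists its parents, as in the last drawing above.
parents = {
    "A": [],
    "B": ["A"],
    "C": ["B"],
    "D": ["C"],
    "E": ["C"],
}
branches = {"main": "E", "dev": "D"}


def commits_on(branch: str) -> set:
    """Collect every commit found by starting at the branch tip and working backwards."""
    seen, stack = set(), [branches[branch]]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


print(sorted(commits_on("main")))
print(sorted(commits_on("dev")))
```

Commits A, B, and C show up in both results, while D is found only from dev and E only from main.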

Review

We now know:

  • Git is about commits.
  • Commits store snapshots and metadata.
  • We organize the commits into branches using branch names to find the last commit.
  • We check out a commit to see its files as files, and to work on them. Otherwise they're special weird Gitty things that only Git can see.
  • No part of any commit can ever change, once it's made.

Now we'll get to git rebase.

What git rebase is about

We often find ourselves using Git and stuck in this kind of situation:

          F--G--H   <-- main
         /
...--A--B
         \
          C--D--E   <-- feature (HEAD)

and we say to ourselves: Gosh, it would be nice if we had started out feature later, when main had commit G and/or H in it, because we need what's in those now.

There's nothing fundamentally wrong with commits C-D-E and we could just use git merge, but for whatever reason—the boss says so, the co-workers have decided they like a rebase flow, whatever it might be—we decide that we're going to "improve" our C-D-E commits. We're going to re-make them so that they come after F-G-H, like this:

                  C'-D'-E'   <-- improved-feature (HEAD)
                 /
          F--G--H   <-- main
         /
...--A--B
         \
          C--D--E   <-- feature

We can, quite literally, do this by checking out commit H, making a new branch, and then re-doing our work:

git switch main
git switch -c improved-feature
... redo a bunch of work ...

What git rebase does is automate this for us. If we were to do it manually, each "redo" step would involve using git cherry-pick (which I won't go into in any detail here). The git rebase command automates the cherry-picking for us, and then adds one other twist: instead of requiring a new branch name like improved-feature, it simply yanks the old branch name off the old commits and makes it point to the new ones:

                  C'-D'-E'   <-- feature (HEAD)
                 /
          F--G--H   <-- main
         /
...--A--B
         \
          C--D--E   [abandoned]

The old abandoned commits are actually still there, in Git, for at least 30 days or so. But with no name by which to find them, you can only see those commits if you have saved their hash IDs, or have some trick by which to find those hash IDs. [4]

When the rebase finishes completely, our original commits are copied to new-and-improved commits. The new commits have new and different hash IDs, but since no human ever notices the actual hash IDs, a human who looks at this repository just sees three feature-branch-only commits and assumes they have magically been changed into the new improved ones. [5]
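The copy-then-move-the-name behaviour can be sketched on the same kind of toy graph. This Python model only tracks the commit graph and the branch names: a real rebase re-applies each commit's changes (as git cherry-pick does) and can stop on conflicts, none of which is modeled here. The function names ancestors and rebase, and the convention of adding a prime mark to the copies, are just for illustration.

```python
def ancestors(parents: dict, tip: str) -> set:
    """Every commit reachable by working backwards from tip."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def rebase(parents: dict, branches: dict, branch: str, upstream: str) -> list:
    """Copy the commits that only `branch` has onto the tip of `upstream`."""
    already_there = ancestors(parents, branches[upstream])
    to_copy = []
    commit = branches[branch]
    while commit not in already_there:   # walk back until we hit shared history
        to_copy.append(commit)
        commit = parents[commit][0]
    to_copy.reverse()                    # copy the oldest commit first

    new_tip = branches[upstream]
    for old in to_copy:
        new = old + "'"                  # a copy: new name, standing in for a new hash ID
        parents[new] = [new_tip]
        new_tip = new
    branches[branch] = new_tip           # yank the branch name over to the copies
    return to_copy


parents = {
    "A": [], "B": ["A"],
    "C": ["B"], "D": ["C"], "E": ["D"],   # feature
    "F": ["B"], "G": ["F"], "H": ["G"],   # main
}
branches = {"main": "H", "feature": "E"}

copied = rebase(parents, branches, "feature", "main")
print(copied, branches["feature"], parents["C'"])
```

After the call, the originals C, D, and E are still present in the graph and untouched. They are merely no longer reachable from any branch name, which is the "abandoned" state in the drawing.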


[4] Git comes with some handy tricks built-in, but we won't cover them here.

[5] Git sees the truth, and if you connect your Git repository to some other Git repository, they will have ... words, or a long conversation, about this, and it can make a big mess if you don't know what you're doing. Basically, if they still have your originals, you can wind up getting them back when you thought you'd gotten rid of them! Any time you connect two Git repositories, you generally have one of them hand over any new commits it has that the other one is missing. This is where the magic of the hash IDs really comes into effect: they do this all by hash ID alone.

The bottom line here is that you should only rebase commits when all users of those commits agree that those commits can be rebased. If you're the only user, you just have to agree with yourself, so that's a lot easier. Otherwise, get agreement in advance from all other users before you start rebasing.

情感失落者 2025-02-10 06:27:29

To review a remote branch that I don't have locally yet, I prefer git switch aBranch: its guess mode automatically creates a local branch aBranch that tracks the remote-tracking branch origin/aBranch (provided exactly one remote has a branch of that name), so a simple git pull is enough to update it for future reviews.

That would be the same as git switch -c <branch> --track <remote>/<branch>.

I also prefer setting

git config --global pull.rebase true
git config --global rebase.autoStash true

That way, a git pull on that branch would rebase any of my local commits on top of the updated branch, not only for my review, but also to check if my local (not yet pushed) code/commits still work on top of the updated remote branch.
