Git: go back to a specific commit ID without deleting history

Posted on 2025-01-10 04:46:15

Here is my commit log. I want to switch back to a specific commit ID (for example, Second).
When I use git checkout, that works, but I am no longer able to switch back to the last commit (Fourth).

HEAD points to the second commit and there is nothing after that when I log my commits.

How can I switch between my commits without deleting the history?

commit 61c71a9e5a6d9e29a4172e687172dd4b8523eb4a (HEAD -> main)
Author: mohhhe <[email protected]>
Date:   Fri Feb 25 19:08:36 2022 +0330

    Fourth

commit 9c3e8919cfa2c970f14056eef34ca12b49025f65
Author: mohhhe <[email protected]>
Date:   Fri Feb 25 19:08:13 2022 +0330

    Third

commit d33795596001197f382038a72d20faf0cfbe7ab7
Author: mohhhe <[email protected]>
Date:   Fri Feb 25 19:07:55 2022 +0330

    Second

commit 2fe7b1d8270fcfb41d73e69293da10734e37b069
Author: mohhhe <[email protected]>
Date:   Fri Feb 25 19:07:39 2022 +0330

    First

梦里梦着梦中梦 2025-01-17 04:46:15

An alleyway analogy

Imagine, for a moment, that you're at the entrance to a narrow street or alleyway in a big city, surrounded by skyscrapers. Looking down the alleyway, you can see a series of dumpsters. Now, walk halfway down the alleyway and look ahead of you. Half the dumpsters are gone! Where did they go? Nowhere: they're right behind you.

The same idea applies here: Git did not delete the commits you can't see. You just can't see them. Move back to a vantage point from which you can see them, and you'll see them again.

Reality, such as it is

In Git, a commit is a two-part entity: it holds a snapshot of all files—well, all the files that Git knew about, at the time you (or whoever) made that snapshot—and some metadata. Each commit is numbered, with a big, ugly, random-looking hash ID, like 61c71a9e5a6d9e29a4172e687172dd4b8523eb4a as shown in your output.

The hash ID is what Git needs to find the commit. The commit itself is stored as a bunch of parts, using a commit object and other internal supporting objects. The hash ID you see here is that of the commit object itself, which holds only the metadata: the snapshot is in a tree object, which has yet more sub-objects. But you don't normally need to know this; what you do need to know is that Git has a big database holding all of its objects, each of which is numbered, and that Git itself needs the number to retrieve the object.¹
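To look at the two parts yourself, here is a minimal sketch. It builds a throwaway repository (the temporary directory, the identity Demo User, and file.txt are all made up for illustration) and asks Git to print the commit object and the tree object it refers to:

```shell
set -e
cd "$(mktemp -d)"                        # throwaway directory for the demo
git init -q
git symbolic-ref HEAD refs/heads/main    # make sure the first branch is named main
git config user.name "Demo User"         # hypothetical identity, local to this repo
git config user.email "demo@example.com"
for msg in First Second Third Fourth; do
    echo "$msg" > file.txt
    git add file.txt
    git commit -q -m "$msg"
done

type=$(git cat-file -t HEAD)             # object type that HEAD resolves to
git cat-file -p HEAD                     # the metadata: tree, parent, author, committer, message
tree=$(git rev-parse 'HEAD^{tree}')      # hash ID of the snapshot's top-level tree object
git cat-file -p "$tree"                  # the snapshot's file listing (here just file.txt)
```

The first listing should contain only metadata plus a `tree` line; the files themselves show up only in the second listing.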

Humans, however, are very bad at numbers. What were those four hash IDs again? They're not worth memorizing anyway. Git offers you a very fast way to find one of those four hash IDs: the name main, which is easy for you to remember, finds one of them.

Over time, the one hash ID that main finds may change, but right now, it finds 61c71a9e5a6d9e29a4172e687172dd4b8523eb4a for you, and for Git. That commit is the latest commit on the branch main. It is by definition, because the name main holds that ID. So if you want Git to find the latest commit on main, you can simply ask Git for main, and Git will look up the name main and find that ID and hence find that commit.
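You can ask Git directly which hash ID a name currently holds. The sketch below assumes a scratch repository with four commits like yours (the identity and file.txt are placeholders) and uses git rev-parse to turn names into hash IDs:

```shell
set -e
cd "$(mktemp -d)"                        # throwaway directory for the demo
git init -q
git symbolic-ref HEAD refs/heads/main    # make sure the first branch is named main
git config user.name "Demo User"         # hypothetical identity, local to this repo
git config user.email "demo@example.com"
for msg in First Second Third Fourth; do
    echo "$msg" > file.txt
    git add file.txt
    git commit -q -m "$msg"
done

tip=$(git rev-parse main)                # the one hash ID that the name main holds
head=$(git rev-parse HEAD)               # HEAD is attached to main, so this is the same ID
subject=$(git log -1 --format=%s main)   # subject line of the commit main selects
echo "main -> $tip ($subject)"
```

The hash IDs will differ from the ones in your log, since the author, timestamps, and contents differ, but main should still select the commit whose message is Fourth.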

If and when you make a new commit, here's what Git will do (in some order or another; you don't really get to see if there's any particular order to this):

1. Make a snapshot of every file that Git knows about. To make Git see any update you've made to a file that Git already knows about, you must run git add on it. To make Git see any new file you created that did not exist until now, you must run git add on it. There's a lot more to it than this, but that's the first approximation to the reason you have to keep running git add: to tell Git that the new snapshot should use the new or updated file.
2. Gather up a bunch of metadata. The metadata Git will gather includes your name (as set in your user.name setting) and email address (from your user.email setting). It includes the current date-and-time down to the second. And, it includes the currently most-recent commit, whatever that is, on the branch you're on (in this case main).

Git writes all this out to make a new commit, which gains a new, unique, never-used-before, never-will-be-used-again, hash ID. This hash ID must never occur in any Git repository except to be used to identify this commit that you just made right now. (That's why the hash IDs are so big and ugly: so they can be unique.)

Git then stores the new commit's hash ID in the current branch name. So now the name main selects your new commit—the one you just made.
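The steps above can be watched from the outside. In this sketch (scratch repository, placeholder identity and file name), we record what main holds, make a fifth commit, and then check that main has moved and that the new commit's parent is the old tip:

```shell
set -e
cd "$(mktemp -d)"                        # throwaway directory for the demo
git init -q
git symbolic-ref HEAD refs/heads/main    # make sure the first branch is named main
git config user.name "Demo User"         # hypothetical identity, local to this repo
git config user.email "demo@example.com"
for msg in First Second Third Fourth; do
    echo "$msg" > file.txt
    git add file.txt
    git commit -q -m "$msg"
done

before=$(git rev-parse main)             # hash ID of Fourth, the current tip
echo "Fifth" > file.txt
git add file.txt                         # tell Git the next snapshot should use the updated file
git commit -q -m "Fifth"                 # snapshot + metadata, then main is updated
after=$(git rev-parse main)              # main now holds the new commit's hash ID
parent=$(git rev-parse 'main^')          # the parent recorded in the new commit's metadata
echo "old tip: $before"
echo "new tip: $after (parent $parent)"
```

The old commit is untouched; only the name main changed what it points to.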

¹That's because this big database is a key-value store, with the hash IDs being the keys. There's a slow method of walking the entire database and getting every <key, value> pair, but this takes many seconds, or even minutes, in a big repository: far too slow to be useful. A key lookup takes milliseconds, so that's what you want Git to be doing.

Commits thus form backwards-looking chains

What this all means is that the name main automatically and always selects the last commit in the branch named main. By definition, main is the end of the street / alleyway / superhighway / motorway / whatever it is. You add new commits by making new commits while you're on that "road", and that extends the "road" a bit further.

Another way to show this is to draw the commits using uppercase letters to stand in for the real hash IDs. Here, we have your original four commits, which we'll call A, B, C, and D for short:

A <-B <-C <-D   <--main

The name main will "point to" (contain the hash ID of) the last of these commits, commit D. Commit D has a snapshot—a copy of all the files, frozen for all time—and some metadata, and D's metadata says that the previous commit is commit C. We say that D points to C.

Commit C, of course, has a snapshot and metadata. The snapshot holds the files that Git knew about at the time you made C, frozen for all time, and the metadata holds the date-and-time and so on, including the hash ID of earlier commit B. We say that C points to B.

Commit B holds a snapshot and metadata too, and points backwards to commit A, which holds a snapshot and metadata. But commit A was the very first commit you made, in what had been, up until you made A, a totally-empty repository. So commit A doesn't point further backwards: it can't.
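The backwards-pointing links are stored as parent hash IDs, and git log can print them. A small sketch, again in a scratch repository with made-up identity and file name:

```shell
set -e
cd "$(mktemp -d)"                        # throwaway directory for the demo
git init -q
git symbolic-ref HEAD refs/heads/main    # make sure the first branch is named main
git config user.name "Demo User"         # hypothetical identity, local to this repo
git config user.email "demo@example.com"
for msg in First Second Third Fourth; do
    echo "$msg" > file.txt
    git add file.txt
    git commit -q -m "$msg"
done

# One line per commit: abbreviated hash, abbreviated parent hash, subject.
git log --format='%h parent=%p  %s' main

root=$(git rev-list --max-parents=0 main)        # the commit that has no parent
root_subject=$(git log -1 --format=%s "$root")
root_parents=$(git log -1 --format=%P "$root")   # empty: nothing to point back to
```

Each line's parent should match the hash on the line below it, and the last line (First) should show an empty parent.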

That's how your four commits are, in your repository. They can never change! They are completely read-only, and those four hash IDs are now used up forever.² The name main points to the last one—until you make a new commit. Then new commit E springs into being, pointing backwards to D, and Git updates the name main to point to E:

A <-B <-C <-D <-E   <--main

²Guaranteeing this is technically impossible. Git doesn't really try to prevent anyone else from getting the same hash ID; it relies on cryptographic trickery to make a collision so unlikely that we don't have to worry about it. Nobody will accidentally re-use your hash IDs, and the cryptography makes it hard to do on purpose, too.

Driving back into the past

But what happens when you want to visit an old commit? You ran:

git checkout d33795596001197f382038a72d20faf0cfbe7ab7

to tell Git to erase, from your work area, all the files that are safely stored forever in commit D, and go back to commit B: extract the stored-forever files from commit B into your work area. Git did that, and then git log showed you commits B and A and stopped. Why?

Git uses your HEAD to be able to see things

Git has a very special name, HEAD, that is not a branch name at all.³ Instead, this name HEAD is normally attached to a branch name. That's what your first git log shows:

commit 61c71a9e5a6d9e29a4172e687172dd4b8523eb4a (HEAD -> main)

Git has the name HEAD "pointing to" the name main here. I like to draw it this way instead:

A--B--C--D   <-- main (HEAD)

with the name HEAD "attached to" the name main. (I also got lazy about drawing the arrows between commits. Just remember that the connecting lines, from A to B to C to D, are really backwards-pointing arrows.)

Running git log tells Git: First, use HEAD to find a commit. Since HEAD is attached to main, Git uses main to find commit D. The git log command then shows you commit D—well, shows it by default; there are options you can give git log to change this—and then follows D's arrow back to C and shows C. Then git log follows C's arrow to B, and shows B, and follows B's arrow to A and shows A. Commit A has no backwards arrow, so git log can finally stop.

When you git checkout a commit by its hash ID, however, Git goes into what Git calls detached HEAD mode. Here, the name HEAD is no longer attached to a branch name. Instead, it points directly to a commit. If you choose commit B, you get this:

A--B   <-- HEAD
    \
     C--D   <-- main

The git log command works as before: it uses HEAD to find a commit. But this time HEAD finds commit B, not name main and then commit D. So git log shows B, and follows B's arrow back to A and shows A, and then runs out of commits to show and stops.
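Here is a sketch that reproduces what you saw, in a scratch repository (identity and file name are placeholders). It checks out the Second commit by hash ID, confirms that HEAD is detached, and counts how many commits are reachable from HEAD versus from main:

```shell
set -e
cd "$(mktemp -d)"                        # throwaway directory for the demo
git init -q
git symbolic-ref HEAD refs/heads/main    # make sure the first branch is named main
git config user.name "Demo User"         # hypothetical identity, local to this repo
git config user.email "demo@example.com"
for msg in First Second Third Fourth; do
    echo "$msg" > file.txt
    git add file.txt
    git commit -q -m "$msg"
done

second=$(git rev-parse 'main~2')         # hash ID of the Second commit
git checkout -q "$second"                # checking out a raw hash ID detaches HEAD

# git symbolic-ref fails when HEAD is not attached to a branch name.
if git symbolic-ref -q HEAD >/dev/null; then state=attached; else state=detached; fi

from_head=$(git rev-list --count HEAD)   # commits reachable by walking back from HEAD
from_main=$(git rev-list --count main)   # commits reachable by walking back from main
echo "HEAD is $state; $from_head commits from HEAD, $from_main from main"
```

Third and Fourth are still in the repository; they are simply not reachable by walking backwards from Second.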

If you want to see all your commits, you can:

git checkout main

which switches back to branch main, re-attaching your HEAD:

A--B--C--D   <-- main (HEAD)

and now you're starting git log from the end of the road—the last commit on main—and you'll see all four commits. Or, you can run:

git log main

which tells git log that it should use the name main to look up the commit to start with. This will find commit D, even though HEAD is still pointing directly to commit B.
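Both ways out can be tried in a scratch repository. The sketch below (placeholder identity and file name) starts from a detached HEAD at Second, shows that naming main still reaches all four commits, and then re-attaches HEAD:

```shell
set -e
cd "$(mktemp -d)"                        # throwaway directory for the demo
git init -q
git symbolic-ref HEAD refs/heads/main    # make sure the first branch is named main
git config user.name "Demo User"         # hypothetical identity, local to this repo
git config user.email "demo@example.com"
for msg in First Second Third Fourth; do
    echo "$msg" > file.txt
    git add file.txt
    git commit -q -m "$msg"
done

git checkout -q 'main~2'                 # detached HEAD at Second
git log --oneline main                   # start from main instead of HEAD: all four commits
while_detached=$(git rev-list --count main)

git checkout -q main                     # re-attach HEAD to the branch name
branch=$(git symbolic-ref --short HEAD)  # name HEAD is attached to
after_switch=$(git rev-list --count HEAD)
tip=$(git log -1 --format=%s)            # plain git log starts at Fourth again
```

Nothing was restored here, because nothing was ever removed; only the starting point of the walk changed.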

³It's technically possible to create a branch named HEAD. Don't do it.

More than one branch name

Once you understand the above, you're ready to handle multiple branch names. Suppose we have this:

A--B--C--D   <-- main (HEAD)

and we create a new name, such as develop, pointing to commit D, by running:

git branch develop

We now have this:

A--B--C--D   <-- develop, main (HEAD)

That is, both names, develop and main, point to commit D. The special name HEAD is currently attached to the name main though. Let's make a new commit on main, commit E, and draw it in:

           E   <-- main (HEAD)
          /
A--B--C--D   <-- develop

Commit E is now the latest commit on main, while commit D continues to be the latest commit on develop.

If you now run:

git checkout develop

or:

git switch develop

to switch to branch develop, we get:

           E   <-- main
          /
A--B--C--D   <-- develop (HEAD)

Commit E still exists, but Git will take all of E's files out of our work area, and put in all of D's files instead. The name HEAD is now attached to the name develop, not the name main, so git log will show commits D, C, B, and A and then stop. Running git log main will show E, then D, then C, and so on.
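The same sequence as a runnable sketch, in a scratch repository where the commit messages First through Fourth stand in for A through D (identity and file name are placeholders):

```shell
set -e
cd "$(mktemp -d)"                        # throwaway directory for the demo
git init -q
git symbolic-ref HEAD refs/heads/main    # make sure the first branch is named main
git config user.name "Demo User"         # hypothetical identity, local to this repo
git config user.email "demo@example.com"
for msg in First Second Third Fourth; do
    echo "$msg" > file.txt
    git add file.txt
    git commit -q -m "$msg"
done

git branch develop                       # new name pointing at the current commit (D)
echo "E" > file.txt
git add file.txt
git commit -q -m "E"                     # HEAD is attached to main, so only main moves

git checkout -q develop                  # swap E's files out and D's files in
develop_tip=$(git log -1 --format=%s)    # git log now starts at D (Fourth)
content=$(cat file.txt)                  # work area matches commit D again
main_tip=$(git log -1 --format=%s main)  # E is still there, reachable through main
```

On Git versions that have it, git switch develop could replace the git checkout line here.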

Note that commits up through D are on both branches. But now that we're on develop instead of main, let's make another new commit:

           E   <-- main
          /
A--B--C--D
          \
           F   <-- develop (HEAD)

Commits A through D are still on both branches, but now main and develop each have one commit that the other branch doesn't have. The two names pick the latest commits, which are E and F. E is the latest main-branch commit and F is the latest develop-branch commit. They're both "the latest commit"! If we make another new commit on develop, like this:

           E   <-- main
          /
A--B--C--D
          \
           F--G   <-- develop (HEAD)

then the two latest commits are now E and G. Each branch name "means" that particular commit, which is by definition the latest commit on that branch. Moreover, all the commits you (or Git) can find by starting at that "latest" commit, and working backwards, are "on" that branch. So when we have:

          I--J   <-- br1
         /
...--G--H   <-- main
         \
          K--L   <-- br2

we have three latest commits, and commits up through H are on all three branches. Pick one name to check out, and that's the set of commits you'll see with git log; the files in your work area will be those from that latest—or tip—commit.
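To see which commits are shared and which belong to only one branch, a sketch with two diverged names is enough (the same reasoning applies with three). It uses a scratch repository with placeholder identity and file name, where Fourth plays the role of the shared commit:

```shell
set -e
cd "$(mktemp -d)"                        # throwaway directory for the demo
git init -q
git symbolic-ref HEAD refs/heads/main    # make sure the first branch is named main
git config user.name "Demo User"         # hypothetical identity, local to this repo
git config user.email "demo@example.com"
for msg in First Second Third Fourth; do
    echo "$msg" > file.txt
    git add file.txt
    git commit -q -m "$msg"
done

git branch develop
echo "E" > file.txt; git add file.txt; git commit -q -m "E"    # only on main
git checkout -q develop
for msg in F G; do
    echo "$msg" > file.txt; git add file.txt; git commit -q -m "$msg"   # only on develop
done

git log --oneline --graph --all                    # draws both tips and the shared history
fork=$(git log -1 --format=%s "$(git merge-base main develop)")   # last commit on both branches
only_main=$(git log --format=%s develop..main)     # reachable from main but not from develop
only_develop=$(git rev-list --count main..develop) # reachable from develop but not from main
```

The graph output is Git's own version of the drawings used in this answer.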

Note that the commits never change: once you make a commit, it is good forever. However, we find commits through branch names, and those do move about. If we take the last example and move the name br2 back one hop:

          I--J   <-- br1
         /
...--G--H   <-- main
         \
          K   <-- br2
           \
            L   ???

we may never be able to find commit L again. It has become "lost", as there's no way to recover its hash ID. As long as we can find J and K, though, we can't lose H, even if we completely delete the name main. Deleting that name just means we no longer have direct access to commit H: we have to find it by working back one step from K, or two from J.
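A sketch of moving a name back one hop, in a scratch repository (placeholder identity and file name; K and L are commit messages here). It assumes we happened to save L's hash ID in a shell variable first, which is exactly the information one normally lacks:

```shell
set -e
cd "$(mktemp -d)"                        # throwaway directory for the demo
git init -q
git symbolic-ref HEAD refs/heads/main    # make sure the first branch is named main
git config user.name "Demo User"         # hypothetical identity, local to this repo
git config user.email "demo@example.com"
for msg in First Second Third Fourth; do
    echo "$msg" > file.txt
    git add file.txt
    git commit -q -m "$msg"
done

git checkout -q -b br2                   # new branch name, HEAD attached to it
for msg in K L; do
    echo "$msg" > file.txt; git add file.txt; git commit -q -m "$msg"
done
lost=$(git rev-parse br2)                # remember L's hash ID before moving the name

git checkout -q main                     # a checked-out branch cannot be force-moved this way
git branch -f br2 'br2~1'                # move the name br2 back one hop, to K

tip=$(git log -1 --format=%s br2)        # br2 now selects K
holders=$(git branch --contains "$lost") # no branch name reaches L any more
kind=$(git cat-file -t "$lost")          # the object itself has not been altered
```

Without the saved hash ID, none of the branch names leads to L, which is what "lost" means here.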