
The concept of a snapshot

Posted on 2024-12-05 19:29:27

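Basic Git terminology includes the concept of a snapshot.

The concept is used in the description of the Git workflow:

You modify files in your working directory.
You stage the files, adding snapshots of them to your staging area.
You do a commit, which takes the files as they are in the staging area and stores that snapshot permanently to your Git directory.

Can you explain exactly what a snapshot is, show a small example of some files and their snapshots, and explain why Git uses them instead of diffs like other VCSs?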


り繁华旳梦境 2024-12-12 19:29:27

A snapshot just means what a file's contents were at a given point in time. All version control systems operate conceptually on snapshots: you want to be able to see what your source code looked like at any given point in the past. Most of them also store diffs in order to save storage space. Git is unusual in two ways: the way diffs are computed and stored internally isn't directly tied to the file's history, and the diffs aren't computed at every point where they could be.
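To make "snapshot" concrete, here is a minimal Python sketch of the idea. It is a toy model, not Git's implementation: the flat `path -> id` dictionary and the helper names `blob_id` and `take_snapshot` are made up for illustration (Git actually uses nested tree objects). The one detail taken from Git is how a blob is named: the SHA-1 of a short header plus the file's bytes, in a default SHA-1 repository.

```python
import hashlib

def blob_id(content):
    """Name content the way Git names a blob: SHA-1 over 'blob <size>\\0' + bytes."""
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()

def take_snapshot(files, store):
    """Toy snapshot: save each file's full content, return {path: blob id}."""
    snapshot = {}
    for path, content in files.items():
        oid = blob_id(content)
        store[oid] = content   # identical content maps to the same key
        snapshot[path] = oid
    return snapshot

store = {}
main_c = b"int main(void) { return 0; }\n"

# Two points in time: README changes, main.c does not.
v1 = take_snapshot({"README": b"hello\n", "main.c": main_c}, store)
v2 = take_snapshot({"README": b"hello, world\n", "main.c": main_c}, store)

print(v1["main.c"] == v2["main.c"])   # unchanged file reuses the same blob
print(v1["README"] == v2["README"])   # changed file gets a new blob
print(len(store))                     # distinct contents actually stored
```

Each snapshot describes the whole tree, yet the unchanged file costs nothing extra because both snapshots point at the same stored content.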

Let's say you have a 1000-byte file that gets updated on practically every build. If you change one byte of it, Git will at first store a completely new copy of the file, with the one byte changed. This is where people flip out and say, "OMG, Git is so stupid, it should store the diffs right away. I'm sticking with Subversion."
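The 1000-byte case can be sketched in a few lines of Python. This assumes the loose-object layout Git uses for a freshly written blob (a `blob <size>\0` header plus the content, zlib-compressed, named by its SHA-1); the helper name `loose_object` and the sample data are invented for the example.

```python
import hashlib
import zlib

def loose_object(content):
    """Return (object id, compressed bytes), roughly as a loose blob is written."""
    raw = b"blob " + str(len(content)).encode() + b"\0" + content
    return hashlib.sha1(raw).hexdigest(), zlib.compress(raw)

original = bytes(range(250)) * 4      # a 1000-byte file
buf = bytearray(original)
buf[500] ^= 0xFF                      # change exactly one byte
edited = bytes(buf)

id_a, stored_a = loose_object(original)
id_b, stored_b = loose_object(edited)

changed = sum(a != b for a, b in zip(original, edited))
print(len(edited), changed)           # file size and number of differing bytes
print(id_a != id_b)                   # the edit produces a separate object

# The new object holds the full content; no other version is needed to read it.
restored = zlib.decompress(stored_b).split(b"\0", 1)[1]
print(restored == edited)
```

The point of the sketch is the last line: the second object can be read on its own, without consulting the first.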

However, think about how you actually use your source control. Almost everything you want to compare against is something that has changed since the last time you pushed. Because it hasn't computed the diffs yet, Git just happens to have a full, easily accessible cache of all those recently changed files, whereas a system that stores only a chain of diffs may have to start from a base version and apply many diffs to reconstruct the same content.
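A rough way to see the difference is to compare the two storage styles side by side. The Python below is a deliberately naive toy model, not a description of any particular VCS: the patch format (a list of `(offset, byte)` substitutions) and the function names are invented, and real diff-based systems have their own tricks for shortening the chain.

```python
def apply_patch(data, patch):
    """Apply a toy patch: a list of (offset, new_byte) substitutions."""
    out = bytearray(data)
    for offset, value in patch:
        out[offset] = value
    return bytes(out)

# Build 200 versions of a 1000-byte file, each changing one byte.
versions = [bytes(1000)]
patches = []
for n in range(1, 200):
    patch = [(n, n % 256)]
    patches.append(patch)
    versions.append(apply_patch(versions[-1], patch))

# Snapshot-style storage: every version is directly addressable.
snapshot_store = dict(enumerate(versions))

def read_from_snapshots(n):
    return snapshot_store[n]          # a single lookup

# Chain-style storage: version 0 in full plus one patch per later version.
def read_from_chain(n):
    data, steps = versions[0], 0
    for patch in patches[:n]:
        data = apply_patch(data, patch)
        steps += 1
    return data, steps

latest, steps = read_from_chain(199)
print(latest == read_from_snapshots(199), steps)
```

Both reads return the same bytes; the chain just needs one patch application per intervening version to get there.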

Then when you push to share your changes, Git packs those objects so they can be transferred over the network efficiently, and that is when diffs are computed. The same packing is applied to your local repository by git gc, which Git also runs automatically from time to time. However, a stored diff is not necessarily a diff from version n-1 to version n of the file. If content is repeated across many files, Git can take that into account. If the same change is made in several branches, Git can take that into account. If a file is moved, Git can take that into account. If some heuristic is discovered in the future that can make things more efficient, Git can take that into account without breaking existing clients. It's not wedded to the idea that the diff must always be from one consecutive version to the next.
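The idea that the diff base does not have to be the previous version of the same file can be sketched as follows. This is only an illustration under made-up assumptions: the copy/insert instruction format, the cost function, and the file names are hypothetical, and Python's `difflib` stands in for a real delta algorithm. Git's actual packing heuristics are more involved.

```python
from difflib import SequenceMatcher

def make_delta(base, target):
    """Encode target as 'copy from base' and 'insert literal' instructions."""
    ops = []
    matcher = SequenceMatcher(None, base, target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2 - i1))
        elif tag in ("replace", "insert"):
            ops.append(("insert", target[j1:j2]))
        # "delete" needs no instruction: those base bytes are just not copied
    return ops

def apply_delta(base, ops):
    """Rebuild the target from a base and its instruction list."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            out += base[op[1]:op[1] + op[2]]
        else:
            out += op[1]
    return bytes(out)

def delta_cost(ops):
    """Rough size estimate: literal bytes plus a fixed cost per instruction."""
    literal = sum(len(op[1]) for op in ops if op[0] == "insert")
    return literal + 4 * len(ops)

header = b"// Copyright Example Corp\n// Licensed under MIT\n" * 3
candidates = {
    "util.c@v1": b"int add(int a, int b) { return a + b; }\n",
    "math.c@v7": header + b"int mul(int a, int b) { return a * b; }\n",
}
# util.c@v2: the same function, now with the shared header prepended.
target = header + b"int add(int a, int b) { return a + b; }\n"

deltas = {name: make_delta(base, target) for name, base in candidates.items()}
best = min(deltas, key=lambda name: delta_cost(deltas[name]))

print(best, {name: delta_cost(d) for name, d in deltas.items()})
print(apply_delta(candidates[best], deltas[best]) == target)
```

In this toy case the cheapest base is a different file that happens to share the header, not the file's own previous version, which is the kind of freedom the paragraph above describes.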

It's fundamental design decisions like these that make Git so fast compared to other version control software.
