What are the different repository format versions (for the core.repositoryFormatVersion setting) in git?

Posted on 2024-10-19 21:52:08

I noticed a default option in git core.repositoryFormatVersion which defaults to 0, but what are "repository format versions" and what functional difference do they make?

Comments (2)

一绘本一梦想 2024-10-26 21:52:08

It's for future compatibility -- if the git developers ever find it necessary to change the way that repos are stored on disk to enable some new feature, then they can make upgraded repos have a core.repositoryformatversion of 1. Then newer versions of git that know about that new format will trigger the code to deal with it, and older versions of git that don't will gracefully error with "Expected git repo version <= 0, found 1. Please upgrade Git".

As of now, the only repo format version defined or recognized is 0, which denotes the format that every public release of git has used.
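The version check described above can be observed directly. The following is a minimal sketch, assuming a reasonably recent Git on the PATH; the temporary directory and the version number 99 are arbitrary choices for illustration, not anything Git defines.

```shell
# Create a throwaway repository (the temporary path is only for illustration).
repo=$(mktemp -d)
git init --quiet "$repo"

# Read the format version recorded in the repository's config file.
version=$(git -C "$repo" config core.repositoryformatversion)
echo "format version: $version"

# Pretend the repository uses a format from the future.
# 99 is an arbitrary, hypothetical number that no Git release understands.
git -C "$repo" config core.repositoryformatversion 99

# Any command that needs the repository should now refuse to run.
if git -C "$repo" status >/dev/null 2>&1; then
  future_result=accepted
else
  future_result=rejected
fi
echo "git status on version 99: $future_result"
```

The exact wording of the error message depends on the Git release, so the sketch only inspects the exit status.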

遗失的美好 2024-10-26 21:52:08

git 2.7 (Nov. 2015) adds a lot more information in the new Documentation/technical/repository-version.txt.
See commit 067fbd4, commit 00a09d5 (23 Jun 2015) by Jeff King (peff).
(Merged by Junio C Hamano -- gitster -- in commit fa46579, 26 Oct 2015)

You now can define "extensions", and use core.repositoryformatversion as a "marker" to signal the existence of said extensions, instead of having to bump the Git version number itself:

If we were to bump the repository version for every such change, then any implementation understanding version X would also have to understand X-1, X-2, and so forth, even though the incompatibilities may be in orthogonal parts of the system, and there is otherwise no reason we cannot implement one without the other (or more importantly, that the user cannot choose to use one feature without the other, weighing the tradeoff in compatibility only for that particular feature).

This patch documents the existing repositoryformatversion strategy and introduces a new format, "1", which lets a repository specify that it must run with an arbitrary set of extensions.

Extracts from the doc:

Every git repository is marked with a numeric version in the
core.repositoryformatversion key of its config file. This version
specifies the rules for operating on the on-disk repository data.

Note that this applies only to accessing the repository's disk contents
directly.
An older client which understands only format 0 may still connect via git:// to a repository using format 1, as long as the server process understands format 1.

Version 0

This is the format defined by the initial version of git, including but not limited to the format of the repository directory, the repository configuration file, and the object and ref storage.

Version 1

This format is identical to version 0, with the following exceptions:

  1. When reading the core.repositoryformatversion variable, a git
    implementation which supports version 1 MUST also read any
    configuration keys found in the extensions section of the
    configuration file.

  2. If a version-1 repository specifies any extensions.* keys that
    the running git has not implemented, the operation MUST NOT
    proceed.
    Similarly, if the value of any known key is not understood
    by the implementation, the operation MUST NOT proceed.

This can be used, for example:

  • to inform git that the objects should not be pruned based
    only on the reachability of the ref tips (e.g., because it
    has "clone --shared" children)

  • that the refs are stored in a format besides the usual
    "refs" and "packed-refs" directories
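Rule 2 of the version 1 description can be checked with a small experiment. This is a sketch under stated assumptions: Git 2.7 or later is installed, and `hypotheticalFeature` is a made-up extension name that no Git release implements.

```shell
# Throwaway repository; the path is only for illustration.
repo=$(mktemp -d)
git init --quiet "$repo"

# Opt in to format version 1, then declare an extension Git cannot know.
# "hypotheticalFeature" is an invented name used only for this demonstration.
git -C "$repo" config core.repositoryformatversion 1
git -C "$repo" config extensions.hypotheticalFeature true

# A version-1 repository with an unimplemented extension must not be operated on.
if git -C "$repo" status >/dev/null 2>&1; then
  unknown_ext_result=accepted
else
  unknown_ext_result=rejected
fi
echo "version 1 with unknown extension: $unknown_ext_result"
```

The version is set before the extension so that the repository is still valid at the moment each `git config` call runs.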

That is quite an original approach compared with the usual release version numbering and semver policies.

Because we bump to format "1", and because format "1" requires that a running git knows about any extensions mentioned, we know that older versions of the code will not do something dangerous when confronted with these new formats.

For example, if the user chooses to use database storage for refs, they may set the "extensions.refbackend" config to "db".
Older versions of git will not understand format "1" and bail.
Versions of git which understand "1" but do not know about "refbackend", or which know about "refbackend" but not about the "db" backend, will refuse to run.
This is annoying, of course, but much better than the alternative of claiming that there are no refs in the repository, or writing to a location that other implementations will not read.

Note that we are only defining the rules for format 1 here.
We do not ever write format 1 ourselves; it is a tool that is meant to be used by users and future extensions to provide safety with older implementations.


The first extension, available with git 2.7, is preciousObjects:

If this extension is used in a repository, then no operations should run which may drop objects from the object storage. This can be useful if you are sharing that storage with other repositories whose refs you cannot see.

The doc mentions:

When the config key extensions.preciousObjects is set to true, objects in the repository MUST NOT be deleted (e.g., by git-prune or git repack -d).

That is:

For instance, if you do:

$ git clone -s parent child
$ git -C parent config extensions.preciousObjects true
$ git -C parent config core.repositoryformatversion 1

you now have additional safety when running git in the parent repository.
Prunes and repacks will bail with an error, and git gc will skip those operations (it will continue to pack refs and do other non-object operations).
Older versions of Git, when run in the repository, will fail on every operation.

Note that we do not set the preciousObjects extension by default when doing a "clone -s", as doing so breaks backwards compatibility. It is a decision the user should make explicitly.
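The effect on pruning can be reproduced in a scratch directory. This is a minimal sketch, assuming Git 2.7 or later; the `parent` and `child` names mirror the example above, and both repositories are empty, so nothing of value is at risk.

```shell
# Scratch area holding both repositories; the path is only for illustration.
work=$(mktemp -d)
git init --quiet "$work/parent"

# A shared clone borrows objects from the parent (cloning an empty repository only warns).
git clone --quiet -s "$work/parent" "$work/child" 2>/dev/null

# Mark the parent's objects as precious and opt in to format version 1.
git -C "$work/parent" config extensions.preciousObjects true
git -C "$work/parent" config core.repositoryformatversion 1

# Pruning is expected to bail out with an error instead of deleting objects.
if git -C "$work/parent" prune >/dev/null 2>&1; then
  prune_result=ran
else
  prune_result=refused
fi
echo "git prune in parent: $prune_result"
```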


Note that this core.repositoryformatversion business is old. Really old. commit ab9cb76, Nov. 2005, Git 0.99.9l.
It was done initially for the db version:

This makes init-db repository version aware.

It checks if an existing config file says the repository being reinitialized is of a wrong version and aborts before doing further harm.


Git 2.22 (Q2 2019) will avoid leaks around the repository_format structure.

See commit e8805af (28 Feb 2019), and commit 1301997 (22 Jan 2019) by Martin Ågren.
(Merged by Junio C Hamano -- gitster -- in commit 6b5688b, 20 Mar 2019)

setup: fix memory leaks with struct repository_format

After we set up a struct repository_format, it owns various pieces of
allocated memory. We then either use those members, because we decide we
want to use the "candidate" repository format, or we discard the
candidate / scratch space.
In the first case, we transfer ownership of the memory to a few global variables. In the latter case, we just silently drop the struct and end up leaking memory.

Introduce an initialization macro REPOSITORY_FORMAT_INIT and a
function clear_repository_format(), to be used on each side of
read_repository_format(). To have a clear and simple memory ownership,
let all users of struct repository_format duplicate the strings that
they take from it, rather than stealing the pointers.

Call clear_...() at the start of read_...() instead of just zeroing
the struct, since we sometimes enter the function multiple times.
Thus, it is important to initialize the struct before calling read_...(), so
document that.
It's also important because we might not even call read_...() before we call clear_...(), see, e.g., builtin/init-db.c.

Teach read_...() to clear the struct on error, so that it is reset to
a safe state, and document this. (In setup_git_directory_gently(), we
look at repo_fmt.hash_algo even if repo_fmt.version is -1, which we
weren't actually supposed to do per the API. After this commit, that's
ok.)


With Git 2.28 (Q3 2020), the runtime itself can upgrade the repository format version automatically, for example on an unshallow fetch.

See commit 14c7fa2, commit 98564d8, commit 01bbbbd, commit 16af5f1 (05 Jun 2020) by Xin Li (livid).
(Merged by Junio C Hamano -- gitster -- in commit 1033b98, 29 Jun 2020)

fetch: allow adding a filter after initial clone

Signed-off-by: Xin Li

Retroactively adding a filter can be useful for existing shallow clones as they allow users to see earlier change histories without downloading all git objects in a regular --unshallow fetch.

Without this patch, users can make a clone partial by editing the repository configuration to convert the remote into a promisor, like:

git config core.repositoryFormatVersion 1
git config extensions.partialClone origin
git fetch --unshallow --filter=blob:none origin

Since the hard part of making this work is already in place and such edits can be error-prone, teach Git to perform the required configuration change automatically instead.

Note that this change does not modify the existing Git behavior which recognizes setting extensions.partialClone without changing repositoryFormatVersion.


Warning: In 2.28-rc0, we corrected a bug where some repository extensions were honored by mistake even in version 0 repositories (these configuration variables in the extensions.* namespace were supposed to have special meaning only in repositories whose version number is 1 or higher), but this was a bit too big a change.

See commit 62f2eca, commit 1166419 (15 Jul 2020) by Jonathan Nieder (artagnon).
(Merged by Junio C Hamano -- gitster -- in commit d13b7f2, 16 Jul 2020)

Revert "check_repository_format_gently(): refuse extensions for old repositories"

Reported-by: Johannes Schindelin
Signed-off-by: Jonathan Nieder

This reverts commit 14c7fa269e42df4133edd9ae7763b678ed6594cd.

The core.repositoryFormatVersion field was introduced in ab9cb76f661 ("Repository format version check.", 2005-11-25, Git v0.99.9l -- merge), providing a welcome bit of forward compatibility, thanks to some welcome analysis by Martin Atukunda.

The semantics are simple: a repository with core.repositoryFormatVersion set to 0 should be comprehensible by all Git implementations in active use; and Git implementations should error out early instead of trying to act on Git repositories with higher core.repositoryFormatVersion values representing new formats that they do not understand.

A new repository format did not need to be defined until 00a09d57eb8 (introduce "extensions" form of core.repositoryformatversion, 2015-06-23).

This provided a finer-grained extension mechanism for Git repositories.

In a repository with core.repositoryFormatVersion set to 1, Git implementations can act on "extensions.*" settings that modify how a repository is interpreted.

In repository format version 1, unrecognized extensions settings cause Git to error out.

What happens if a user sets an extension setting but forgets to increase the repository format version to 1?
The extension settings were still recognized in that case; worse, unrecognized extensions settings do not cause Git to error out.

So combining repository format version 0 with extensions settings produces in some sense the worst of both worlds.

To improve that situation, since 14c7fa269e4 (check_repository_format_gently(): refuse extensions for old repositories, 2020-06-05) Git instead ignores extensions in v0 mode. This way, v0 repositories get the historical (pre-2015) behavior and maintain compatibility with Git implementations that do not know about the v1 format.

Unfortunately, users had been using this sort of configuration and this behavior change came to many as a surprise:

  • users of "git config --worktree" that had followed its advice to enable extensions.worktreeConfig (without also increasing the repository format version) would find their worktree configuration no longer taking effect
  • tools such as copybara that had set extensions.partialClone in existing repositories (without also increasing the repository format version) would find that setting no longer taking effect

The behavior introduced in 14c7fa269e4 might be a good behavior if we were traveling back in time to 2015, but we're far too late.

For some reason I thought that it was what had been originally implemented and that it had regressed.

Apologies for not doing my research when 14c7fa269e4 was under development.

Let's return to the behavior we've had since 2015: always act on extensions.* settings, regardless of repository format version.

While we're here, include some tests to describe the effect on the "upgrade repository version" code path.
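The "worst of both worlds" behavior for version 0 can be seen with a short experiment. This is a sketch under several assumptions: the installed Git is a final release (not 2.28-rc0), `git init` creates a version 0 repository on this system, and `hypotheticalFeature` is an invented extension name that no Git release implements.

```shell
# Throwaway repository, left at whatever version "git init" wrote
# (assumed here to be 0).
repo=$(mktemp -d)
git init --quiet "$repo"

# An extension Git does not know is not an error in a version 0 repository.
# "hypotheticalFeature" is a made-up name used only for this demonstration.
git -C "$repo" config extensions.hypotheticalFeature true
if git -C "$repo" status >/dev/null 2>&1; then
  v0_unknown=ignored
else
  v0_unknown=rejected
fi
git -C "$repo" config --unset extensions.hypotheticalFeature

# A long-standing extension such as preciousObjects is still acted on,
# even though the format version was never raised to 1.
git -C "$repo" config extensions.preciousObjects true
if git -C "$repo" prune >/dev/null 2>&1; then
  v0_known=ignored
else
  v0_known=honored
fi

echo "unknown extension in v0: $v0_unknown"
echo "preciousObjects in v0: $v0_known"
```

Setting core.repositoryformatversion to 1 whenever an extension is configured avoids the ambiguity, because unknown extensions are then rejected instead of silently skipped.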
