Are concurrent git pushes always safe if the second push is simply a fast-forward of the first?

Posted 2024-12-20 06:02:51


I want to automatically push commits in the post-receive hook from a central repo on our LAN to another central repo in the cloud. The LAN repo is created using git clone --mirror git@cloud:/path/to/repo or equivalent commands.
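For concreteness, here is a minimal sketch of the kind of post-receive hook I have in mind (the remote name cloud is only an assumption for illustration; any configured remote pointing at git@cloud:/path/to/repo would do):

    #!/bin/sh
    # post-receive hook on the LAN repo (sketch).
    # "cloud" is assumed to be a remote configured for git@cloud:/path/to/repo;
    # --mirror propagates every local ref (branches and tags) to the cloud repo.
    git push --mirror cloud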

Because the files being committed will be large relative to our upstream bandwidth, it's entirely possible something like this could happen:

  1. Alice initiates a push to the LAN repo.
  2. Bill pulls from the LAN repo while the post-receive hook is running.
    • The LAN repo is in the middle of pushing to the cloud repo.
    • This also means Bill's local repo contains the commits Alice pushed. Confirmed through testing.
  3. Bill initiates a push to the LAN repo.
    • Bill's push is a fast-forward of Alice's push, so the LAN repo will accept it.

When the post-receive hook for the LAN repo executes, a second push from the LAN repo to the cloud repo will start and the two will run concurrently.

I'm not worried about the git objects. The worst-case scenario is that both pushes upload all of the objects from Alice's push, but that shouldn't matter as far as I understand git's internals.

I'm concerned about the refs. Suppose Alice pushed over a much slower connection, so that Bill's push finishes first. Suppose packet loss or something else causes the hook's push of Bill's commits from the LAN repo to the cloud to finish before the hook's push of Alice's commits from the LAN repo to the cloud. If both Alice and Bill are pushing the master branch and Bill's push finishes first, what will the master ref be on the cloud repo? I want it to be Bill's HEAD, since that's the later push, but I'm concerned it will be Alice's HEAD.

Further clarification:

I realize Alice's push from her machine to the LAN repo will fail if Bill's push from his machine to the LAN repo finishes first. In that case, the LAN repo's post-receive hook will not execute. Furthermore, please assume nobody will be doing force pushes, so if the post-receive hook runs on the LAN repo, all ref changes are fast-forwards.


Comments (2)

ゞ花落谁相伴 2024-12-27 06:02:51


If Bill's push finishes first, Alice's push will fail, because before updating a ref git checks that the ref in the repo still has the value it had before. In this scenario it will not, so Alice will see an error message and will need to resolve it herself; the same goes for Bill in the reverse case. So in your post-receive hook you should check that the old and new refs actually differ, and skip the push to the cloud repo entirely when they don't, to save some work.
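A minimal sketch of that guard, assuming the hook pushes to a remote named cloud (post-receive reads one "<old-sha1> <new-sha1> <refname>" line per updated ref on stdin):

    #!/bin/sh
    # post-receive (sketch): only forward refs whose value actually changed.
    while read old new ref; do
        [ "$old" = "$new" ] && continue   # this ref did not change, skip it
        git push cloud "$ref"
    done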

I still see a problem in your scenario, though, and it is with the push to the cloud. The hook can hit the SAME issue when it pushes two valid refs up to the cloud location, except that now the script won't know whether it needs to retry a failed push, because it can't tell whether the failed ref is older or newer than the one that got through... especially if the updates weren't simple fast-forwards, which can happen from time to time. If the hook simply force-pushed no matter what, there would be a chance that the cloud is left with an OLD ref until another hook run pushes something newer later. In Alice's case she would have merged the changes from upstream (or used any number of other solutions), but the script probably shouldn't have that kind of decision-making capability.

In the hook you might be able to do some script magic on the current repo to inspect timestamps and the like and only push when the update is a fast-forward, but that seems messy, and it is more likely that a merge is needed anyway. I think a better solution than a post-receive hook is a cron (or scheduled) task every five minutes (or however frequently you want) that simply runs a git pull on the master branch of your remote mirror. If you don't have access to that repo, you can do the force push from your LAN repo with a cron job instead; I think this is safer and less complicated than the hook. It ensures the branch on the cloud backup is in the right place every few minutes, and it doesn't risk pushing an older ref and then never picking up the newest one until a user pushes again, as the hook does.
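For example (purely a sketch; the repository path, remote name, and schedule are assumptions), the cron entry on the LAN side could look like:

    # Every five minutes, force the cloud repo's refs to match the LAN mirror.
    */5 * * * * cd /srv/git/repo.git && git push --mirror cloud >/dev/null 2>&1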

荒岛晴空 2024-12-27 06:02:51


Git 2.4+ (Q2 2015) will introduce atomic pushes, which should make it easier for the server to manage the order of pushes.
See the work done by Stefan Beller (stefanbeller):

  • commit ad35eca t5543-atomic-push.sh: add basic tests for atomic pushes

This adds tests for the atomic push option.
The first four tests check if the atomic option works in good conditions and the last three patches check if the atomic option prevents any change to be pushed if just one ref cannot be updated.

Use an atomic transaction on the remote side if available.
Either all refs are updated, or on error, no refs are updated.
If the server does not support atomic pushes the push will fail.

This adds support to send-pack to negotiate and use atomic pushes iff the server supports it. Atomic pushes are activated by a new command line flag --atomic.

This adds the atomic protocol option to allow receive-pack to inform the client that it has atomic push capability.
This commit makes the functionality introduced in the previous commits go live for the serving side.
The changes in documentation reflect the protocol capabilities of the server.

   atomic
   ------

If the server sends the 'atomic' capability it is capable of accepting atomic pushes.
If the pushing client requests this capability, the server will update the refs in one atomic transaction.
Either all refs are updated or none.
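So, assuming the LAN repo has a remote named cloud and a new enough git on both ends, the hook's push could request all-or-nothing behaviour like this (a sketch, not a tested setup):

    # Push all branches in one atomic transaction: if any ref update
    # is rejected by the cloud repo, none of the refs are updated.
    git push --atomic --all cloud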


With Git 2.29 (Q4 2020), "git push" that wants to be atomic and wants to send a push certificate learned not to prepare and sign the push certificate when it fails the local check (hence, due to atomicity, it is known that no certificate is needed).

See commit a4f324a (19 Sep 2020) by Han Xin (chiyutianyi).
(Merged by Junio C Hamano -- gitster -- in commit b5847b9, 25 Sep 2020)

send-pack: run GPG after atomic push checking

Signed-off-by: Han Xin

The refs update commands can be sent to the server side in two different ways: GPG-signed or unsigned.
We should run these two operations in the same "Finally, tell the other end!" code block, but they are separated by the "Clear the status for each ref" code block.
This will result in a slight performance loss, because the failed atomic push will still perform unnecessary preparations for shallow advertise and GPG-signed commands buffers, and user may have to be bothered by the (possible) GPG passphrase input when there is nothing to sign.

Add a new test case to t5534 to ensure GPG will not be called when the GPG-signed atomic push fails.
