Efficiently backing up multiple versions of a git repository using branch namespacing
At work, we use Perforce for version control. This causes three problems: 1) With this centralized model, we can't check in changes until they are ready for regression, which means we have no revision control during development. 2) We don't back up our client views of the depot, so our work is unsafe until we can check it in. 3) We have trouble sharing code with each other unless we beg to have an integration branch set up. I am trying to set up an optional git workflow for developers who want to use git to get around these problems.
The plan is to use git-p4 to interface with the Perforce server and create a private git repo. This takes care of 1). I plan to use the integration-manager workflow described in Pro Git (http://progit.org/book/ch5-1.html) to have our developers publish public repos, taking care of 3).
Finally, I want a place where developers can push their changes so that they will be pulled into the nightly and offsite backups. We don't back up our client views now because nightly archival backups of everyone's client view are space inefficient. We have a lot of developers, and they produce a lot of code. We can't be redundantly backing up everyone's client view; we only want to preserve the unique changes they are making.
My thinking was to have one bare git repo, call it omni-backup, that everyone can push all of their branches to (feel free to suggest alternatives). This would take advantage of git's content-addressed storage (objects are keyed by their SHA-1 hash) and ensure that only unique versions of each file are backed up. The trick is that all the backups have to live in the same repo to get the space efficiency.
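The space argument can be checked with a small sketch. It assumes only that git is installed; the file names and contents are made up for illustration. Two files with identical contents get the same blob ID, so a single repository would store that content once no matter how many developers push it.

```shell
#!/bin/sh
# Sketch: identical contents hash to the same blob ID, different contents do not.
set -e
work=$(mktemp -d)

# Hypothetical files: Bob and Jane hold the same contents, a third file differs.
printf 'same contents\n' > "$work/bob.txt"
printf 'same contents\n' > "$work/jane.txt"
printf 'different contents\n' > "$work/other.txt"

# git hash-object computes the blob ID; without -w it does not need a repository.
bob_id=$(git hash-object "$work/bob.txt")
jane_id=$(git hash-object "$work/jane.txt")
other_id=$(git hash-object "$work/other.txt")

echo "bob:   $bob_id"
echo "jane:  $jane_id"
echo "other: $other_id"
```

The first two IDs printed should match and the third should differ, which is why the backups only deduplicate when they share one object store.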
The problem arises when two people with completely different branches choose the same name for their branch. For example, Bob has a feature branch and Jane has a feature branch, but they're for different features. If Bob pushes to omni-backup, Jane won't be able to, because her push wouldn't be a fast-forward.
What I would ideally like is that when Bob pushes his feature branch, it is renamed to bob-feature on the omni-backup remote, and when he pulls feature from omni-backup, he gets bob-feature back.
This doesn't seem terribly easy to accomplish in git. It looks like I could use the post-receive hook documented at http://www.kernel.org/pub/software/scm/git/docs/git-receive-pack.html to rewrite the name of the ref immediately after it is written, and then something could be done to reverse the process on the way back, but it feels fragile. Does anyone have a better idea?
Edit, for VonC (because code looks terrible in comments):
Your way sounds promising, VonC, but I don't see how the fact that it's a fetch gets around the namespacing problem. Are you suggesting a cron job that knows how to rename the branches?
Something like this (really dirty):
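# Assumes @users, $LDAPSERVER and $REPO are defined elsewhere.
foreach my $user (@users) {
    my $url = "$LDAPSERVER/$user/$REPO";
    # git ls-remote prints one "<sha>\trefs/heads/<name>" line per branch.
    my @branches = map { m{refs/heads/(\S+)} ? $1 : () } `git ls-remote --heads $url`;
    foreach my $branch (@branches) {
        # Force-fetch each branch into a user-prefixed local branch.
        system "git fetch $url +refs/heads/$branch:refs/heads/$user-$branch";
    }
}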
Comments (2)
If you can get the developers to follow certain guidelines, git push can do this correctly. If you run this command:
where omni-backup is the remote name for the repository, then Bob's feature branch will be pushed to bob-feature on omni-backup. But if entrusting that to developers is undesirable, reversing the direction of flow and having omni-backup pull from the developers' repositories, as VonC suggests, is the better solution.
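The command itself did not survive in the answer above. A plausible form, offered only as an assumption, is a push with a <src>:<dst> refspec, which git push does support. The sketch below builds throwaway repositories in a temporary directory (the paths, the identity, and the restored-feature branch name are all hypothetical) and shows the rename on the way in and the reverse mapping on the way out.

```shell
#!/bin/sh
# Sketch: rename a branch while pushing by giving git push a <src>:<dst> refspec.
set -e
tmp=$(mktemp -d)

# Throwaway stand-ins for the shared backup repo and Bob's working repo.
git init -q --bare "$tmp/omni-backup.git"
git init -q "$tmp/bob"
cd "$tmp/bob"
git -c user.name=Bob -c user.email=bob@example.com \
    commit -q --allow-empty -m "Bob's work"
git checkout -q -b feature
git remote add omni-backup "$tmp/omni-backup.git"

# Local branch "feature" is stored as "bob-feature" on the remote.
git push -q omni-backup feature:bob-feature

# Reverse direction: fetch the remote branch into a local branch of a chosen name.
git fetch -q omni-backup bob-feature:restored-feature

# List the branch names that now exist on the backup remote.
remote_heads=$(git ls-remote --heads omni-backup | awk '{print $2}')
echo "$remote_heads"
```

The guideline developers would have to follow is then simply to always prefix the destination with their own name; nothing on the server enforces it, which is the weakness the answer points out.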
Why would you need developers to push to the omni-backup repo?
For backup purposes, I would rather register each developer's repository as a remote and run git fetch every night (from the omni-backup server) against all the remotes.
That way, no branch name collision is possible. It is also a more automated process: the developer doesn't have to explicitly push anything to a repo he or she doesn't work with directly and would only think of as a backup.
Then I would produce a nice little git archive out of omni-backup and store it away.
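A minimal sketch of this fetch-based setup follows, using throwaway repositories in a temporary directory; the developer names, paths, and commit messages are assumptions for illustration. Each remote's branches are tracked under refs/remotes/<remote-name>/, which is why two developers can both have a branch called feature without clashing.

```shell
#!/bin/sh
# Sketch: the backup repo fetches from each developer; branches land per remote.
set -e
tmp=$(mktemp -d)

# Two hypothetical developer repos, each with an unrelated branch named "feature".
for dev in bob jane; do
    git init -q "$tmp/$dev"
    git -C "$tmp/$dev" -c user.name="$dev" -c user.email="$dev@example.com" \
        commit -q --allow-empty -m "$dev's feature work"
    git -C "$tmp/$dev" checkout -q -b feature
done

# The backup repo registers every developer repo as a remote named after its owner.
git init -q --bare "$tmp/omni-backup.git"
cd "$tmp/omni-backup.git"
for dev in bob jane; do
    git remote add "$dev" "$tmp/$dev"
done

# This is the step a nightly job would run.
git fetch -q --all

# Show where each developer's branches ended up.
git for-each-ref --format='%(refname)' refs/remotes
```

One caveat on the last step of the answer: git archive exports the files of a single tree, not the history. If the goal is to store every ref with its history in one file, git bundle create with --all may be closer to what is wanted, though that is a guess at the intent.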