Using Git to distribute nightly builds to the studio
Short version
I need to distribute nightly builds to 70+ people each morning and would like to use git to load-balance the transfer. Before I start designing the system, I would like to know whether there are any tips or pitfalls, or whether the idea itself is flawed.
Long version
Each morning we need to distribute our nightly build to a studio of 70+ people (artists, testers, programmers, production, etc.). Up until now we have copied the build to a server and written a sync program that fetches it (using Robocopy underneath). Even with mirrors set up, the transfer is unacceptably slow: a sync takes an hour or longer at peak times (roughly 15 minutes off-peak), which points to a hardware I/O bottleneck.
A brilliant (though definitely not original) idea I had was to distribute the load throughout the studio. After investigating writing a client using the infamous BitTorrent protocol, it occurred to me that I could just use git: by design it would give us both build distribution and revision management, with the added benefit of being serverless.
Questions
1) How do you get started using git? I have experience with centralized source-control systems like Perforce and SVN. Reading the documentation, it appears that all you need to do is run git init path\\to\folder and then, on another machine, run git clone url?
2) Where do I get the url for the above git clone command? Can I define it myself? I find the concept of a url strange, as git does not have a central server. Or does it, e.g. something similar to a BitTorrent tracker?
3) What would be the better option for identifying builds: changelist numbers or labels?
4) Is it possible to limit the number of revisions stored? This would be useful because, in addition to the nightly builds, we also have several CI builds throughout the day that we want to distribute, and it does not make sense to keep an unlimited number of revisions lingering around. In Perforce you can limit the revisions by setting a property.
Comments (3)
I don't think git will really help in your situation. Yes, it is distributed, but not in the sense of "distributing something to as many people as possible". It does not help you reduce bandwidth load, and there will be additional load if you use git over SSH. Maybe you should take a step back and give the BitTorrent protocol another chance.
Note: putting binaries in a distributed repo isn't a solution that will scale well over time, because the repo keeps getting bigger. (You have alternative git setups here.)
The advantage is that the central Git repo computes the delta (which will be much faster than a robocopy) and sends that delta in response to a git fetch done by a downstream repo.
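One way to keep downstream machines from accumulating every past build, and to see the git fetch step in action, is a shallow clone. This is only a sketch, not something the answer above prescribes. The temp-directory layout, the file name game.bin, and the buildbot identity are all hypothetical, and the hub repository itself still stores the full history.

```shell
#!/bin/sh
set -eu

# Hypothetical throwaway layout; in the studio the hub would live on a share or server.
work=$(mktemp -d)
hub="$work/builds.git"
staging="$work/staging"
client="$work/client"

# Commit identity for the (hypothetical) build machine.
export GIT_AUTHOR_NAME=buildbot GIT_AUTHOR_EMAIL=buildbot@example.invalid
export GIT_COMMITTER_NAME=buildbot GIT_COMMITTER_EMAIL=buildbot@example.invalid

git init --quiet --bare "$hub"
git init --quiet "$staging"

# Helper: commit the given content as the current build and push it to the hub.
publish_build() {
    echo "$1" > "$staging/game.bin"
    git -C "$staging" add game.bin
    git -C "$staging" commit --quiet -m "$1"
    git -C "$staging" push --quiet "$hub" HEAD
}

publish_build "build 1"

# --depth is ignored for plain local paths, hence the file:// URL.
git clone --quiet --depth 1 "file://$hub" "$client"

publish_build "build 2"

# Fetch only the newest commit and move the working tree to it.
git -C "$client" fetch --quiet --depth 1 origin
git -C "$client" reset --quiet --hard FETCH_HEAD

cat "$client/game.bin"
git -C "$client" rev-list --count HEAD   # number of commits the client can see
```

With a depth of 1 the client only ever sees the latest commit, although objects from older builds may stay on disk until git's garbage collection prunes them.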
Yes, that's the essence of it. Create a repository somewhere and then you can clone it from somewhere else.
The repository you init in 1) has to be accessible from the machine you're cloning to. Git is serverless, but every repository has to get its content from somewhere. So all of your 70+ machines will have to know where they should get the new build. And if you want to distribute the load, you'll have to figure out a strategy for who gets their update from whom.
The URL could be a file path, a network path, an SSH host with a path, etc.
Tags would work well.
You could perhaps rebase the git repo to remove old revisions. See "Completely remove (old) git commits from history".
However, I don't think it will solve your original problem of distributing the load. Other avenues should be investigated, multicast copying for instance. Perhaps MQcast and MQcatch can help you?
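The mechanics described above (create a repository, make it reachable, clone it by path, identify builds with tags, and choose who fetches from whom) can be sketched roughly as follows. Everything here is an assumption for illustration. The temp-directory paths stand in for network shares or SSH hosts, and the tag names nightly-0001 and nightly-0002, the file game.bin, and the buildbot identity are hypothetical. It shows the commands only and says nothing about whether this actually spreads the load.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
hub="$work/builds.git"     # repository every machine can reach
staging="$work/staging"    # build machine's working copy
first="$work/first"        # workstation that fetches from the hub
second="$work/second"      # workstation that fetches from "first" instead

# Commit identity for the (hypothetical) build machine.
export GIT_AUTHOR_NAME=buildbot GIT_AUTHOR_EMAIL=buildbot@example.invalid
export GIT_COMMITTER_NAME=buildbot GIT_COMMITTER_EMAIL=buildbot@example.invalid

git init --quiet --bare "$hub"
git init --quiet "$staging"

# Helper: commit content as a build, tag it, and push branch and tag to the hub.
# $1 = tag name, $2 = file content
publish_build() {
    echo "$2" > "$staging/game.bin"
    git -C "$staging" add game.bin
    git -C "$staging" commit --quiet -m "$1"
    git -C "$staging" tag "$1"
    git -C "$staging" push --quiet "$hub" HEAD "refs/tags/$1"
}

publish_build nightly-0001 "build 1"

# The clone URL is simply the location of the hub: here, a file path.
git clone --quiet "$hub" "$first"
git -C "$first" checkout --quiet nightly-0001

publish_build nightly-0002 "build 2"

# Next morning: pick up the new tag and switch to it.
git -C "$first" fetch --quiet --tags origin
git -C "$first" checkout --quiet nightly-0002

# Any clone can act as the source for another clone, which is one way
# to arrange "who gets their update from whom".
git clone --quiet --branch nightly-0002 "$first" "$second"

cat "$second/game.bin"
git -C "$second" tag --list
```

In a real setup the paths would be replaced by whatever locations the machines can actually reach, and the choice of which workstation clones from which is the strategy that still has to be designed.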