Using the BitTorrent protocol to distribute nightly and CI builds


This question continues from what I learnt from my question yesterday, titled using git to distribute nightly builds.

In the answers to that question it was clear that git would not suit my needs, and I was encouraged to re-examine using BitTorrent.


Short Version

Need to distribute nightly builds to 70+ people each morning; I would like to use BitTorrent to load-balance the transfer.

Long Version

NB. You can skip the paragraph below if you have read my previous question.

Each morning we need to distribute our nightly build to a studio of 70+ people (artists, testers, programmers, production etc). Up until now we have copied the build to a server and written a sync program that fetches it (using Robocopy underneath); even with mirrors set up, the transfer is unacceptably slow, taking up to an hour or longer to sync at peak times (off-peak is roughly 15 minutes), which points to a hardware I/O bottleneck and possibly a network bandwidth one as well.
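For context, the sync program's underlying copy boils down to a mirroring Robocopy call along these lines; the share and destination paths are hypothetical, and /MT (multithreaded copy) is only available in newer Robocopy versions:

```bat
rem Mirror the build share locally, retrying twice with a 5 second wait.
robocopy \\buildserver\builds\nightly C:\builds\nightly /MIR /MT:16 /R:2 /W:5
```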

What I know so far

What I have found so far:

  • I have found the excellent Wikipedia entry on the BitTorrent protocol, which was an interesting read (I had previously known only the basics of how torrents work). I also found this StackOverflow answer on the BITFIELD exchange that happens after the client-server handshake (see the sketch after this list).

  • I have also found the MonoTorrent C# Library (GitHub Source) that I can use to write our own tracker and client. We cannot use off-the-shelf trackers or clients (e.g. uTorrent).
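On the BITFIELD message mentioned in the first bullet: after the handshake a peer may send one bitfield message whose bits, most significant bit first, mark the pieces it already has. A minimal decoding sketch, assuming the 4-byte length prefix and the message ID byte (5) have already been stripped off:

```csharp
// Decode a raw BITFIELD payload into per-piece availability.
// Piece 0 is the most significant bit of byte 0.
static bool[] DecodeBitfield(byte[] payload, int pieceCount)
{
    var have = new bool[pieceCount];
    for (int piece = 0; piece < pieceCount; piece++)
    {
        int byteIndex = piece / 8;
        int bitIndex = 7 - (piece % 8); // MSB-first within each byte
        have[piece] = (payload[byteIndex] & (1 << bitIndex)) != 0;
    }
    return have;
}
```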

Questions

In my initial design, I have our build system creating a .torrent file and adding it to the tracker. I would super-seed the torrent using our existing mirrors of the build.
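As background for the questions below: the heart of creating a .torrent is the piece hashing. The content is split into fixed-size pieces (files concatenated, so pieces can span file boundaries) and each piece's SHA-1 becomes part of the metainfo's "pieces" value. A minimal single-file sketch of that step; the piece length is a free parameter, and a library such as MonoTorrent would do this for you:

```csharp
using System.IO;
using System.Security.Cryptography;

// SHA-1 every pieceLength-byte chunk; the concatenated 20-byte digests
// become the "pieces" value of the .torrent's info dictionary.
static byte[] ComputePieceHashes(string path, int pieceLength)
{
    using var sha1 = SHA1.Create();
    using var stream = File.OpenRead(path);
    using var result = new MemoryStream();
    var buffer = new byte[pieceLength];
    int read;
    while ((read = ReadFully(stream, buffer)) > 0)
        result.Write(sha1.ComputeHash(buffer, 0, read), 0, 20);
    return result.ToArray();
}

// Stream.Read may return short counts, so fill the buffer in a loop.
static int ReadFully(Stream stream, byte[] buffer)
{
    int total = 0, n;
    while (total < buffer.Length &&
           (n = stream.Read(buffer, total, buffer.Length - total)) > 0)
        total += n;
    return total;
}
```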

Using this design, would I need to create a new .torrent file for each new build? In other words, would it be possible to create a "rolling" .torrent where, if the content of the build has only changed by 20%, that is all that needs to be downloaded to get the latest?

... Actually. In writing the above question, I think that I would need to create a new file; however, I would be able to download to the same location on the user's machine, and the hash check will automatically determine what I already have. Is this correct?
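That intuition matches how clients resume: on startup they re-hash what is already on disk against the new .torrent and only download pieces whose hashes do not match. A sketch of that check, reusing the hypothetical ComputePieceHashes from above (torrentPieces is the raw 20-bytes-per-piece blob from the metainfo):

```csharp
using System;

// Mark which local pieces already match the new .torrent, so only
// the changed pieces would actually be downloaded.
static bool[] VerifyExistingPieces(string path, int pieceLength, byte[] torrentPieces)
{
    byte[] local = ComputePieceHashes(path, pieceLength);
    int pieceCount = torrentPieces.Length / 20;
    var valid = new bool[pieceCount];
    for (int i = 0; i < pieceCount; i++)
        valid[i] = (i + 1) * 20 <= local.Length &&
                   local.AsSpan(i * 20, 20).SequenceEqual(torrentPieces.AsSpan(i * 20, 20));
    return valid;
}
```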

In response to comments

  1. For a completely fresh sync, the entire build (including: the game, source code, localized data, and disc images for PS3 and X360) is ~37,000 files, coming in just under 50GB. This is going to increase as production continues. This sync took 29 minutes to complete at a time when only 2 other syncs were happening (roughly 50GB in 29 minutes works out to about 29 MB/s of aggregate throughput), which is low-peak if you consider that at 9am we would have 50+ people wanting to get latest.

  2. We have investigated the disk I/O and network bandwidth with the IT dept; the conclusion was that the network storage was being saturated. We are also recording sync statistics to a database, and those records show that even with a handful of users we are getting unacceptable transfer rates.

  3. In regard to not using off-the-shelf clients, it is a legal concern to have an application like uTorrent installed on users' machines, given that other items can easily be downloaded using that program. We also want a custom workflow for determining which build you want to get (e.g. only PS3 or X360, depending on which DEVKIT you have on your desk), notifications of new builds being available, etc. Creating a client using MonoTorrent is not the part that I'm concerned about.

Answers (4)

还不是爱你 2024-12-10 15:45:19


To the question whether or not you need to create a new .torrent, the answer is: yes.

However, depending a bit on the layout of your data, you may be able to do some simple semi-delta-updates.

If the data you distribute is a large collection of individual files, where some files may have changed with each build, you can simply create a new .torrent file and have all clients download it to the same location as the old one (just as you suggest). The clients would first check the files that already exist on disk, update the ones that have changed, and download any new files. The main drawback is that removed files would not actually be deleted on the clients.

If you're writing your own client anyway, deleting files on the filesystem that aren't in the .torrent file is a fairly simple step that can be done separately.
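A sketch of that separate cleanup step; torrentFiles would be the set of torrent-relative paths read from the new .torrent, and Path.GetRelativePath assumes .NET Core or later:

```csharp
using System.Collections.Generic;
using System.IO;

// Delete anything under the download root that the new .torrent
// no longer mentions.
static void RemoveStaleFiles(string downloadRoot, HashSet<string> torrentFiles)
{
    foreach (string file in Directory.EnumerateFiles(downloadRoot, "*", SearchOption.AllDirectories))
    {
        string relative = Path.GetRelativePath(downloadRoot, file);
        if (!torrentFiles.Contains(relative))
            File.Delete(file);
    }
}
```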

This does not work if you distribute an image file, since bits that stayed the same across versions may have moved, which yields different piece hashes.

I would not necessarily recommend using super-seeding. Depending on how strict the super seeding implementation you use is, it may actually harm transfer rates. Keep in mind that the purpose of super seeding is to minimize the number of bytes sent from the seed, not to maximize the transfer rate. If all your clients are behaving properly (i.e. using rarest first), the piece distribution shouldn't be a problem anyway.
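For reference, "rarest first" just means counting how many connected peers advertise each piece and requesting the least common piece you still need. A minimal selection sketch, assuming the bool[] bitfields from the BITFIELD example in the question:

```csharp
using System.Collections.Generic;

// Among the pieces this peer can supply and we still need,
// pick the one the fewest peers in the swarm have.
static int PickRarestPiece(bool[] weHave, bool[] peerHas, List<bool[]> swarmBitfields)
{
    int best = -1, bestCount = int.MaxValue;
    for (int piece = 0; piece < weHave.Length; piece++)
    {
        if (weHave[piece] || !peerHas[piece])
            continue;
        int count = 0;
        foreach (var bitfield in swarmBitfields)
            if (bitfield[piece]) count++;
        if (count < bestCount) { bestCount = count; best = piece; }
    }
    return best; // -1: this peer has nothing we need
}
```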

Also, creating a torrent and hash-checking a 50 GiB torrent both put a lot of load on the drive, so you may want to benchmark the BitTorrent implementation you use to make sure it is performant enough. At 50 GiB, the difference between implementations may be significant.
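A crude way to get that benchmark, reusing the hypothetical ComputePieceHashes sketch from the question (the path is made up; real implementations overlap reads with hashing, so treat this as a lower bound):

```csharp
using System;
using System.Diagnostics;

var sw = Stopwatch.StartNew();
ComputePieceHashes(@"C:\builds\nightly\disc.iso", 1 << 20); // 1 MiB pieces
sw.Stop();
Console.WriteLine($"Hash check took {sw.Elapsed.TotalMinutes:F1} minutes");
```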

彡翼 2024-12-10 15:45:19


Just wanted to add a few non-BitTorrent suggestions for your perusal:

  • If the delta between nightly builds is not significant, you may be able to use rsync to reduce your network traffic and decrease the time it takes to copy the build (see the example after this list). At a previous company we used rsync to submit builds to our publisher, as we found our disc images didn't change much build-to-build.

  • Have you considered simply staggering the copy operations so that clients aren't slowing down the transfer for each other? We've been using a simple Python script internally when we do milestone branches: the script sleeps until a random time in a specified range, wakes up, then downloads and checks out the required repositories and runs a build. The user runs the script when leaving work for the day, and when they return they have a fresh copy of everything ready to go.
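For the rsync suggestion above, the invocation is a one-liner; the rsync daemon module and destination path here are hypothetical:

```bash
# Archive mode, compress on the wire, delete files dropped from the build.
rsync -az --delete buildserver::nightly/ /builds/nightly/
```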

哭泣的笑容 2024-12-10 15:45:19


You could use BitTorrent Sync, which is somewhat of an alternative to Dropbox but without a server in the cloud. It allows you to synchronize any number of folders, and files of any size, with several people, and it uses the same algorithms as the BitTorrent protocol. You can create a read-only folder and share the key with others. This method removes the need to create a new torrent file for each build.

囍孤女 2024-12-10 15:45:19


Just to throw another option into the mix, have you considered BITS (the Windows Background Intelligent Transfer Service)? I haven't used it myself, but from reading the documentation it supports a distributed peer caching model, which sounds like it would achieve what you want.
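For a quick feel of BITS, the BitsTransfer PowerShell module can drive a transfer; the URL and destination below are hypothetical, and peer caching itself is enabled separately (via Group Policy or bitsadmin):

```powershell
Import-Module BitsTransfer
# Queue the nightly build as a background download.
Start-BitsTransfer -Source http://buildserver/nightly/build.zip `
                   -Destination C:\builds\build.zip -Priority Normal
```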

The downside is that it is a background service, so it will give up network bandwidth in favour of user-initiated activity; nice for your users, but possibly not what you want if you need data on a machine in a hurry.

Still, it's another option.
