SVN performance after many revisions

Posted on 2024-07-06 13:25:00

My project currently uses an SVN repository that gains several hundred new revisions per day.
The repository resides on a Win2k3 server and is served through Apache/mod_dav_svn.

I now fear that performance will degrade over time due to the large number of revisions.
Is this fear reasonable?
We are already planning to upgrade to 1.5, so having thousands of files in one directory will not be a problem in the long term.

Subversion only stores the delta (differences) between 2 revisions, so this helps save a LOT of space, especially if you only commit code (text) and no binaries (images and docs).

Does that mean that in order to check out revision 10 of the file foo.baz, svn will take revision 1 and then apply the deltas 2-10?

Comments (9)

柠檬 2024-07-13 13:25:00

What type of repo do you have? FSFS or BDB?

(Let's assume FSFS for now, since that's the default.)

In the case of FSFS, each revision is stored as a diff against the previous. So, you would think that yes, after many revisions, it would be very slow.

However, this isn't the case. FSFS uses what are called "skip deltas" to avoid having to do too many lookups on previous revs.

(So, if you are using an FSFS repo, Brad Wilson's answer is wrong.)

In the case of a BDB repo, the HEAD (latest) revision is full-text, but the earlier revisions are built as a series of diffs against the head. This means the previous revs have to be re-calculated after each commit.

For more info: http://svn.apache.org/repos/asf/subversion/trunk/notes/skip-deltas
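
As a rough illustration of why lookups stay cheap, here is a minimal Python sketch of the skip-delta idea. It assumes the simplified rule described in the note linked above: a file's Nth change is stored as a delta against the change whose number is N with its lowest set bit cleared. The function names are made up for this example, and real FSFS differs in its details (for instance, it counts a node's predecessors rather than repository revision numbers), so treat this as a model and not as Subversion's implementation.

```python
def skip_delta_base(count: int) -> int:
    """Return the change that change `count` is stored as a delta against.

    Simplified model: clear the lowest set bit of `count`.
    """
    return count & (count - 1)


def delta_chain(count: int) -> list[int]:
    """List the changes visited when rebuilding change `count`, ending at 0."""
    chain = [count]
    while count > 0:
        count = skip_delta_base(count)
        chain.append(count)
    return chain


if __name__ == "__main__":
    for n in (10, 54, 35000):
        chain = delta_chain(n)
        # The number of deltas to apply is one less than the chain length.
        print(n, chain, "deltas applied:", len(chain) - 1)
```

In this model, rebuilding change 54 walks 54, 52, 48, 32, 0 and applies 4 deltas, and change 35,000 needs only 6, instead of walking through every intermediate change. The number of deltas equals the number of 1 bits in N, so it grows roughly with log N.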

P.S. Our repo is about 20GB, with about 35,000 revisions, and we have not noticed any performance degradation.

我的影子我的梦 2024-07-13 13:25:00

Subversion stores the most current version as full text, with backward-looking diffs. This means that updates to head are always fast, and what you incrementally pay for is looking farther and farther back in history.

两相知 2024-07-13 13:25:00

I personally haven't dealt with Subversion repositories with codebases bigger than 80K LOC for the actual project. The biggest repository I've actually had was about 1.2 gigs, but this included all of the libraries and utilities that the project uses.

I don't think day-to-day usage will be affected much, but anything that needs to look through many different revisions might slow down a tad. It may not even be noticeable.

Now, from a sys admin point of view, there are a few things that can help you minimize performance bottlenecks. Since Subversion is mostly a file-based system, you can do this:

  • Put the actual repositories on a separate drive
  • Make sure that no file-locking apps, other than svn, are running on that drive
  • Make the drives at least 7,200 RPM. You could try getting 10,000 RPM, but it may be overkill
  • Upgrade the LAN to gigabit if everybody is in the same office.

This may be overkill for your situation, but that's what I've usually done for other file-intensive applications.

If you ever "outgrow" Subversion, then Perforce will be your next step up. It's hands down the fastest source control app for very large projects.

如何视而不见 2024-07-13 13:25:00

We're running a subversion server with gigabytes worth of code and binaries, and it's up to over twenty thousand revisions. No slowdowns yet.

软的没边 2024-07-13 13:25:00

Subversion only stores the delta (differences) between 2 revisions, so this helps save a LOT of space, especially if you only commit code (text) and no binaries (images and docs).

Additionally, I've seen a lot of very big projects using svn, and I have never heard them complain about performance.

Maybe you are worried about checkout times? Then I guess this would really be a networking problem.

Oh, and I've worked on CVS repositories with 2 GB+ of stuff (code, images, docs) and never had a performance problem. Since svn is a great improvement on CVS, I don't think you should worry about it.

Hope this helps ease your mind a little ;)

倾其所爱 2024-07-13 13:25:00

I do not think that our Subversion server has slowed down with age. We currently have several terabytes of data, mostly binary, and we check out/commit up to 50 GB per day. In total we currently have 50,000 revisions. We use FSFS as the storage type and connect either directly over SVN: (Windows server) or via Apache mod_dav_svn (Gentoo Linux server).

I cannot confirm that svn slows down over time: we set up a clean server as a baseline for comparison and could NOT measure any significant degradation.

However, I have to say that our Subversion setup is unusually slow by default, and the cause is evidently Subversion itself, since we also tried a different computer system.

For some unknown reason, Subversion seems to be completely limited by server CPU. Our checkout/commit rates top out at 15-30 MB/s per client, because at that point one server CPU core is fully used. This is the same for an almost empty repository (1 GB, 5 revisions) as for our full server (~5 TB, 50,000 revisions). Tuning, such as setting compression to 0 (off), did not improve this.

Our high-bandwidth FC array (which delivers ~1 GB/s) idles, the other cores idle, and the network (currently 1 Gbit/s for clients, 10 Gbit/s for the server) idles as well. Okay, not really idling, but if only 2-3% of the available capacity is used, I call it idling.
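
To put the numbers in this comment side by side, here is a small arithmetic sketch in Python. It assumes decimal units (1 GB/s = 1000 MB/s, 1 Gbit/s = 125 MB/s) and takes the 15-30 MB/s per-client rate at face value; the names are illustrative only.

```python
MB_PER_GBIT = 1000 / 8  # 125 MB/s per Gbit/s, assuming decimal units


def utilization_percent(observed_mb_s: float, capacity_mb_s: float) -> float:
    """Share of a component's capacity used by the observed transfer rate."""
    return 100.0 * observed_mb_s / capacity_mb_s


# Capacities quoted in the comment, converted to MB/s.
capacities_mb_s = {
    "FC array (~1 GB/s)": 1000.0,
    "server link (10 Gbit/s)": 10 * MB_PER_GBIT,
    "client link (1 Gbit/s)": 1 * MB_PER_GBIT,
}

if __name__ == "__main__":
    for name, capacity in capacities_mb_s.items():
        low = utilization_percent(15, capacity)
        high = utilization_percent(30, capacity)
        print(f"{name}: {low:.1f}% to {high:.1f}%")
```

Under these assumptions a single client's transfer uses roughly 1 to 3 percent of the array and of the server link, which matches the figure quoted above, while a single 1 Gbit/s client link would be at roughly 12 to 24 percent. None of them is saturated, which is consistent with the CPU being the bottleneck.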

It is no fun to watch all components idle while we wait for our working copies to be checked out or committed. Basically, I have no idea what the server process is doing that fully consumes one CPU core throughout a checkout/commit.

For now I am just trying to find a way to tune Subversion. If that is not possible, we might need to switch to another system.

Therefore, the answer: no, SVN does not degrade in performance over time; it is slow from the start.

Of course, if you do not need (high) performance, you won't have a problem.
By the way, all of the above applies to Subversion 1.7, the latest stable version.

屋檐 2024-07-13 13:25:00

The only operations that are likely to slow down are those that read information from multiple revisions (e.g. svn blame).

时光沙漏 2024-07-13 13:25:00

I am not sure... I am using SVN with Apache on CentOS 5.2. It worked OK. The revision number was around 8230, and on all client machines commits were so slow that we had to wait at least 2 minutes for a 1 KB file. I am talking about a single file that is not large.

Then I made a new repository, starting from rev. 1. Now it works OK. Fast.
I used svnadmin create xxxxxx.
I did not check whether it is FSFS or BDB...
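
For anyone in the same position, the backend of an existing repository can be checked without any svn tooling: a repository created by svnadmin keeps a small text file at db/fs-type containing the backend name (fsfs or bdb). A minimal Python sketch, where the function name is a placeholder chosen for this example:

```python
from pathlib import Path


def repo_fs_type(repo_root: str) -> str:
    """Return the backend name recorded in <repo_root>/db/fs-type."""
    marker = Path(repo_root) / "db" / "fs-type"
    # The file holds a single word such as "fsfs" or "bdb".
    return marker.read_text(encoding="ascii").strip()
```

Calling repo_fs_type with the server-side path of the repository (the directory passed to svnadmin create) returns the backend name, which tells you which of the storage behaviours discussed in this thread applies.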

岛徒 2024-07-13 13:25:00

Maybe you should consider improving your workflow.

I don't know whether a repository will have performance issues under these conditions, but your ability to go back to a sane revision will.

In your case, you may want to add a validation process: each team commits to its team leader's repo, each team leader commits to the team manager's repo, and the team manager commits to the read-only, clean company repo. At each stage, you make a clean selection of which commits must go up to the top.

This way, anybody can go back to a clean copy with an easy-to-browse history. Merges are much easier, and developers can still commit their mess as much as they want.
