Is Git recommended for large (>250GB) content repositories?

Posted on 2024-07-24 20:34:44

The web application is a custom-built CMS with several sub-applications, each of which has code and content residing in the same directory structure. Because of the application framework's architecture, the code and content are intertwined (the content depends on the code for its display and other functionality) and hence inseparable. The content is not stored as BLOBs; it is stored as files, and the underlying DB is used to link them. The size of a sub-application ranges from 20GB to 250GB and more (this is the killer).

The web application will undergo some code enhancements (new sub-applications, bug fixes, etc.), and at the same time users will add and update content through the already live system. Hence a deployment/release process is required and, most importantly, a version control system needs to be chosen for both code and content.

Git comes into the picture for these reasons: it is open source and free, branching and merging are easy, and it is not centralized, so there is no single point of failure.

BUT after some initial research on the web, I found some disappointing facts that apply to our application: using Git for large systems like ours is painful (checkout, clone, merge, push, pull), and the commands are complicated ("geeky" would be more appropriate) for a developer base that has no DVCS experience and mostly uses Windows.

I am not fixed on Git, but if I have to go for a centralized approach (in the really WORST case), what should it be (CVS and SVN apart)? I have read that Perforce is a stable one and is also used at Google (I expect some flak here!!).

Please share your views, guidance and comments. I really need them.

Comments (7)

毁虫ゝ 2024-07-31 20:34:44

I just happened to be reading this blog post not one minute ago. It's a bit of a rant about the scalability of Git.

Edit: Eight years later, Git has Large File Storage (LFS), and Microsoft is open sourcing Git Virtual File System (GVFS, https://blogs.msdn.microsoft.com/visualstudioalm/2017/02/03/announcing-gvfs-git-virtual-file-system/) so they can use Git to develop Windows.
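For what it's worth, here is a rough sketch of what opting content files into Git LFS could look like. Git LFS is driven by `.gitattributes` entries of the form `*.ext filter=lfs diff=lfs merge=lfs -text` (this is what `git lfs track` writes). The extension list and the helper name `lfs_attribute_lines` are my own assumptions for illustration; the question does not say which file types the CMS content uses.

```python
# Hypothetical list of content file extensions; replace with the real asset types.
CONTENT_EXTENSIONS = ["jpg", "png", "pdf", "mp4"]


def lfs_attribute_lines(extensions):
    """Return .gitattributes lines that route matching files through Git LFS."""
    # Normalize (lowercase, no leading dot) and de-duplicate before formatting.
    normalized = sorted({ext.lower().lstrip(".") for ext in extensions})
    return [f"*.{ext} filter=lfs diff=lfs merge=lfs -text" for ext in normalized]


if __name__ == "__main__":
    # Print the lines so they can be reviewed before going into .gitattributes.
    print("\n".join(lfs_attribute_lines(CONTENT_EXTENSIONS)))
```

Whether LFS actually helps here depends on how the content changes over time, so treat this as a starting point rather than a recommendation.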

花桑 2024-07-31 20:34:44

First, I don't agree that Git is inappropriate for non-technical users. Yes, there are certain features that newbies won't use (e.g. git-send-email). But there are also GUIs like TortoiseGit to make simple things simple.

However, I think you're approaching things the wrong way. Basically, you have content that will change frequently and needs to be editable very easily by Joe Bloggs, and code that will be modified less frequently by coders. The traditional solution is to use a real CMS (e.g. Alfresco, SugarCRM, Drupal, etc.) or a wiki (MediaWiki, MoinMoin, etc.), with optional plug-ins. Keep in mind that wikis (and most CMSes) allow versioning of content in a "user-friendly" way.

Even if you must keep your in-house code, I think you should still want to extract the content so the two can be treated separately. Once you have the code and content separate, your repository will be a more reasonable size. Then you can use whatever VCS you want (though I'm not really sure you're right that Git is inherently bad for large repos).

清秋悲枫 2024-07-31 20:34:44

Git does not scale for large repositories. It's not the space, it's the number of files. Please read the blog article I wrote a while back about this.

In my experience, if you want a scalable, fast, centralized source control system, P4 is the way to go.
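If you want to check which dimension applies to your own tree (file count versus raw size), a small script along these lines would give you both numbers. This is only a sketch: the function name `repo_footprint` is made up for illustration, and skipping `.git` and `.svn` directories is my assumption about what you would want to exclude.

```python
import os


def repo_footprint(root):
    """Return (file_count, total_bytes) for the working tree under root."""
    file_count = 0
    total_bytes = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune VCS metadata directories so only working-tree files are measured.
        dirnames[:] = [d for d in dirnames if d not in (".git", ".svn")]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # do not follow or count symbolic links
            file_count += 1
            total_bytes += os.path.getsize(path)
    return file_count, total_bytes


if __name__ == "__main__":
    count, size = repo_footprint(".")
    print(f"{count} files, {size / 1024 ** 3:.2f} GiB")
```

Running it once per sub-application would show whether the 250GB ones are a few huge files or a very large number of small ones.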

救赎№ 2024-07-31 20:34:44

Is SVN really such a bad option?

PROS:

  • Can handle large repositories, e.g. many Linux distros use it, as do Apache and Sourceforge
  • Has a nice GUI front end in TortoiseSVN to keep your Windows users happy
  • Can be used with Windows integrated authentication to keep admins happy
  • Many different backup strategies can be adopted based on your requirements (svnadmin hotcopy or dump, svnsync, post-commit hooks) to help ease your single-point-of-failure concern.

CONS:

  • Centralised VCS

Disclaimer: I've never used Perforce and have been a happy SVN admin and user for ~6 years (since v0.29)
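As an illustration of the hotcopy strategy mentioned above, a timestamped backup could be scripted roughly like this. `svnadmin hotcopy REPOS_PATH NEW_REPOS_PATH` is the real command; the function names and the paths in the usage note are hypothetical, and a real setup would also need rotation of old copies and error reporting.

```python
import subprocess
from datetime import datetime


def hotcopy_command(repo_path, backup_root, now=None):
    """Build the `svnadmin hotcopy` argument list for a timestamped backup."""
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    destination = f"{backup_root.rstrip('/')}/hotcopy-{stamp}"
    return ["svnadmin", "hotcopy", repo_path, destination]


def run_backup(repo_path, backup_root):
    """Run the hot copy; raises CalledProcessError if svnadmin exits non-zero."""
    subprocess.run(hotcopy_command(repo_path, backup_root), check=True)
```

For example, `run_backup("/srv/svn/cms", "/backups")` (both paths made up) could be called from a nightly scheduled job.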

怎樣才叫好 2024-07-31 20:34:44

There's a utility script called git-split that chops up a git repo to make it more efficient.

反话 2024-07-31 20:34:44

Microsoft just released Git Virtual File System (GVFS) specifically to handle large codebases with Git. More details here at MSDN.

Also, Microsoft hosts the Windows source in a monstrous 300GB Git repository.

I do not have any experience using GVFS.

三五鸿雁 2024-07-31 20:34:44

I have used Git only once, for a school project (a PHP site with Zend Framework).

We used Git, but the teacher needed the final release in an SVN repo.

Comparing the checkout sizes:

The Git checkout was half the size (in MB) of the SVN checkout.

My two cents.
