Are Subversion externals an antipattern?

Posted 2024-07-09 04:17:29

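Subversion lets you embed working copies of other repositories using externals, allowing easy version control of third-party library software in your project.

While these seem ideal for the reuse of libraries and version control of vendor software, they are not without their critics:

"Please don't use Subversion externals (or similar in other tools), they are an antipattern and, therefore, unnecessary"

Are there hidden risks in using externals? Please explain why they would be considered an antipattern.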

谁对谁错谁最难过 2024-07-16 04:17:29

I am the author of the quote in the question, which came from a previous answer.

Jason is right to be suspicious of brief statements such as mine, and to ask for an explanation. Of course, if I fully explained everything in that answer, I would need to have written a book.

Mike is also right to point out that one of the problems with an svn:external-like feature is that changes in the targeted source could break your own source, especially if that targeted source is in a repository that you do not own.

In further explaining my comment, let me first say that there are "safe" ways to use the svn:external-like feature, just as with any other tool or feature. However, I refer to it as an antipattern because the feature is far more likely to be misused. In my experience, it has always been misused, and I find myself very unlikely to ever use it in that safe manner nor to ever recommend that use. Please further note that I mean NO disparagement to the Subversion team--I love Subversion, although I plan to move on to Bazaar.

The primary issue with this feature is that it encourages, and is typically used for, directly linking the source of one build ("project") to the source of another, or linking the project to a binary (DLL, JAR, etc.) on which it depends. Neither of these uses is wise, and they constitute an antipattern.

As I said in my other answer, I believe that an essential principle for software builds is that each project constructs exactly ONE binary or primary deliverable. This can be considered an application of the principle of separation of concerns to the build process. This is particularly true regarding one project directly referencing the source of another, which is also a violation of the principle of encapsulation. Another form of this kind of violation is attempting to create a build hierarchy to construct an entire system or subsystem by recursively invoking sub-builds. Maven strongly encourages/enforces this behavior, which is one of the many reasons that I don't recommend it.

Finally, I find that there are various practical matters that make this feature undesirable. For one, svn:external has some interesting behavioral characteristics (but the details escape me for the moment). For another, I always find that I need such dependencies to be explicitly visible to my project (build process), not buried as some source control metadata.

So, what is a "safe" manner of using this feature? I would consider that to be when it is used temporarily by only one person, such as a way to "configure" a working environment. I could see where a programmer might create their own folder in the repository (or one for each programmer) where they would configure svn:external links to the various other parts of the repository that they are currently working on. Then, a checkout of that one folder will create a working copy of all their current projects. When a project is added or finished, the svn:external definitions could be adjusted and the working copy updated appropriately. However, I prefer an approach that is not tied to a particular source control system, such as doing this with a script that invokes the checkouts.
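The script-based alternative mentioned above might look roughly like the following. This is only a sketch: the repository URL and the project list are hypothetical placeholders, and it assumes the `svn` command-line client is on the PATH.

```python
import subprocess
from pathlib import Path

# Hypothetical repository root and project list; adjust to your own layout.
REPO_ROOT = "https://svn.example.com/repos/main"
CURRENT_PROJECTS = ["billing/trunk", "reporting/trunk"]


def checkout_commands(repo_root, projects, workspace):
    """Return one `svn checkout` argument list per project."""
    commands = []
    for project in projects:
        # Use the first path segment as the working-copy directory name.
        target = Path(workspace) / project.split("/")[0]
        commands.append(["svn", "checkout", f"{repo_root}/{project}", str(target)])
    return commands


def configure_workspace(workspace, dry_run=True):
    """Print the checkouts (dry run) or actually run them."""
    for command in checkout_commands(REPO_ROOT, CURRENT_PROJECTS, workspace):
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.run(command, check=True)
```

Calling `configure_workspace("work")` prints the commands it would run; passing `dry_run=False` executes them. The list of current projects lives in an ordinary file that each developer edits, rather than in source control metadata.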

For the record, my most recent exposure to this issue occurred during the summer of 2008 at a consulting client that was using svn:external on a massive scale--EVERYTHING was cross-linked to produce a single master working copy. Their Ant and Jython-based (for WebLogic) build scripts were built on top of this master working copy. The net result: NOTHING could be built stand-alone. There were literally dozens of subprojects, but not one was safe to check out or work on by itself. Therefore, any work on this system first required a checkout/update of over 2 GB of files (they put binaries in the repository also). Getting anything done was an exercise in futility, and I left after trying for three months (there were many other antipatterns present as well).

EDIT: Expound on recursive builds -

Over the years (especially the last decade), I have built massive systems for Fortune 500 companies and large government agencies involving many dozens of subprojects arranged in directory hierarchies that are many levels deep. I have used Microsoft Visual Studio projects/solutions to organize .NET-based systems, Ant or Maven 2 for Java-based systems, and I have begun using distutils and setuptools (easy_install) for Python-based systems. These systems have also included huge databases, typically in Oracle or Microsoft SQL Server.

I have had great success designing these massive builds for ease of use and repeatability. My design standard is that a new developer can show up on their first day, be given a new workstation (perhaps straight from Dell with just a typical OS installation), be given a simple setup document (usually just one page of installation instructions), and be able to fully set up the workstation and build the full system from source, unsupervised, unassisted, and in half a day or less. Invoking the build itself involves opening a command shell, changing to the root directory of the source tree, and issuing a one-line command to build EVERYTHING.

Despite that success, constructing such a massive build system requires great care and close adherence to solid design principles, just as with constructing a massive business-critical application/system. I have found that a crucial part is that each project (which produces a single artifact/deliverable) must have a single build script, which must have a well-defined interface (commands for invoking portions of the build process), and it must stand alone from all other (sub)projects. Historically, it is easy to build the whole system, but hard/impossible to build only one piece. Only recently have I learned to carefully ensure that each project truly stands alone.

In practice, this means that there must be at least two layers of build scripts. The lowest layer consists of the project build scripts that produce each deliverable/artifact. Each such script resides in the root directory of its project source tree (indeed, this script DEFINES its project source tree). These scripts know nothing about source control and expect to be run from the command line. They reference everything in the project relative to the build script, and they reference their external dependencies (tools or binary artifacts, no other source projects) based on a few configurable settings (environment variables, configuration files, etc.).
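As a rough illustration of that lowest layer, a project build script could resolve its paths along these lines. The directory names (`src`, `build`, `lib`) and the `PROJECT_LIB_DIR` environment variable are assumptions made for this sketch, not part of any particular tool.

```python
from pathlib import Path


def resolve_settings(script_path, env):
    """Resolve build paths for one project.

    Everything inside the project is located relative to the build script;
    the only external input is the location of binary dependencies.
    """
    project_root = Path(script_path).resolve().parent
    return {
        "src": project_root / "src",
        "out": project_root / "build",
        # External binary artifacts come from a configurable setting,
        # never from another project's source tree.
        "lib_dir": Path(env.get("PROJECT_LIB_DIR", project_root / "lib")),
    }
```

Inside a real script this would be called as `resolve_settings(__file__, os.environ)`. Note that nothing here knows about source control or about where the project sits in a larger tree.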

The second layer of build scripts is also intended to be invoked from the command line, but these know about source control. Indeed, this second layer is often a single script that is invoked with a project name and a version, then it checks out the source for the named project to a new temporary directory (perhaps specified on the command line) and invokes its build script.
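A second-layer script of this kind could be sketched as follows. The repository URL, the `<project>/tags/<version>` layout, and the per-project `build.py` entry point are all hypothetical; the only external command used is `svn checkout`.

```python
import subprocess
import sys
import tempfile

# Hypothetical repository root; adjust to your own server.
REPO_ROOT = "https://svn.example.com/repos/main"


def source_url(project, version, repo_root=REPO_ROOT):
    """Map a project name and version to a URL (assumes a tags/ layout)."""
    return f"{repo_root}/{project}/tags/{version}"


def build_project(project, version, work_dir=None):
    """Check out one project into a fresh directory and run its own build."""
    work_dir = work_dir or tempfile.mkdtemp(prefix=f"{project}-{version}-")
    subprocess.run(
        ["svn", "checkout", source_url(project, version), work_dir],
        check=True,
    )
    # The project-level script (assumed here to be build.py) knows nothing
    # about source control; it is simply run from the project root.
    subprocess.run([sys.executable, "build.py"], cwd=work_dir, check=True)
    return work_dir
```

A command-line wrapper would pass `sys.argv[1]` and `sys.argv[2]` as the project name and version. Because the work directory is arbitrary, the same function can serve a developer workstation or a continuous integration server.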

There may need to be more variation to accommodate continuous integration servers, multiple platforms, and various release scenarios.

Sometimes there is a need for a third layer of scripts that invokes the second layer of scripts (which invoke the first layer) for the purpose of building specific subsets of the overall project set. For example, each developer may have their own script that builds the projects that they are working on today. There may be a script to build everything in order to generate the master documentation, or to calculate metrics.

Regardless, I have found that attempting to treat the system as a hierarchy of projects is counterproductive. It ties the projects to each other so that they cannot be freely built alone, or in arbitrary locations (temporary directory on the continuous integration server), or in arbitrary order (assuming dependencies are satisfied). Often, attempting to force a hierarchy breaks any IDE integration that one might attempt.

Finally, building a massive hierarchy of projects can simply be too performance intensive. For example, during the spring of 2007 I attempted a modest source hierarchy (Java plus Oracle) that I built using Ant, which eventually failed because the build always aborted with a Java OutOfMemoryError. This was on a 2 GB RAM workstation with 3.5 GB swap space for which I had tuned the JVM to be able to use all available memory. The application/system was relatively trivial in terms of amount of code, but the recursive build invocations eventually exhausted memory, no matter how much memory I gave it. Of course, it also took forever to execute (30-60 minutes was common, before it aborted). I know how to tune VERY well, but ultimately I was simply exceeding the limits of the tools (Java/Ant in this case).

So do yourself a favor: construct your build as stand-alone projects, then compose them into a full system. Keep it light and flexible. Enjoy.

EDIT: More on antipatterns

Strictly speaking, an antipattern is a common solution that looks like it solves the problem but doesn't, either because it leaves important gaps or because it introduces additional problems (often worse than the original problem). A solution necessarily involves one or more tools plus the technique for applying them to the problem at hand. Therefore, it is a stretch to refer to a tool or a specific feature of a tool as an antipattern, and it seems that people are detecting and reacting to that stretch--fair enough.

On the other hand, since it seems to be common practice in our industry to focus on tools rather than technique, it is the tool/feature that gets the attention (a casual survey of questions here on StackOverflow seems to illustrate this easily). My comments, and this question itself, reflect that practice.

However, sometimes it seems particularly justified to make that stretch, such as in this case. Some tools seem to "lead" the user to particular techniques for applying them, to the point where some argue that tools shape thought (slightly rephrased). It is mostly in that spirit that I suggest that svn:external is an antipattern.

To more strictly state the issue, the antipattern is to design a build solution that includes tying projects together at the source level, or to implicitly version the dependencies between projects, or to allow such dependencies to implicitly change, because each of these invokes very negative consequences. The nature of the svn:external-like feature makes avoiding those negative consequences very difficult.

Properly handling the dependencies between projects involves addressing those dynamics along with the base problem, and the tools and techniques lead down a different path. An example that should be considered is Ivy, which helps in a manner similar to Maven but without the many downsides. I am investigating Ivy, coupled with Ant, as my short-term solution to the Java build problem. Long term, I am looking to incorporate the core concepts and features into an open-source tool that facilitates a multiplatform solution.

我做我的改变 2024-07-16 04:17:29

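I don't think this is an antipattern at all. I did a few quick searches on Google and came up with basically nothing... no one is complaining that using svn:externals is bad or harmful. Of course, there are some caveats you have to be aware of... and it is not something you should sprinkle heavily across all of your repositories... but as for the original quote, that is just his personal (and subjective) opinion. He never really discussed svn:externals, except to condemn them as an antipattern. Sweeping statements like that, without any support or at least some reasoning as to how the person arrived at them, are always suspect.

That said, there are some issues with using externals. As Mike answered, they can be very helpful for pointing to stable branches of released software... especially software that you already control. We use them internally in a number of projects for utility libraries and the like. We have a small group that enhances and maintains the utility library base, but that base code is shared across many projects. We don't want the various teams just checking in utility project code, and we don't want to deal with a million branches, so for us svn:externals works very well. For some people, they may not be the answer. However, I strongly disagree with the statement "Please don't use..." and with the claim that these tools represent an antipattern.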

草莓酥 2024-07-16 04:17:29

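The primary risk of using svn:externals is that the referenced repository will change in a way that breaks your code or introduces a security vulnerability. If the external repository is also under your control, then this may be acceptable.

Personally, I only use svn:externals to point to "stable" branches of a repository that I own.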

做个少女永远怀春 2024-07-16 04:17:29

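An old thread, but I want to address the concern that a changing external can break your code. As noted earlier, this is usually the result of using the externals property incorrectly. In almost all cases, an external reference should point to a specific revision number in the external repository URI. This ensures that the external never changes unless you change it to point to a different revision number.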
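To make the pinning concrete, here is a small sketch that formats a pinned definition line and checks whether an existing line is pinned. It assumes the Subversion 1.5+ externals syntax (`-rREV URL LOCALPATH`, or a peg revision written as `URL@REV`); the helper names are invented for illustration, and the simple whitespace split ignores quoted paths.

```python
import re


def pinned_external(url, revision, local_dir):
    """Format one svn:externals line pinned to a specific revision."""
    return f"-r{revision} {url} {local_dir}"


def is_pinned(definition):
    """True if the line carries an operative (-r) or peg (@REV) revision."""
    tokens = definition.split()
    has_operative = any(re.fullmatch(r"-r\d*", token) for token in tokens)
    has_peg = any(re.search(r"@\d+$", token) for token in tokens)
    return has_operative or has_peg
```

A check like `is_pinned` could be run over the output of `svn propget svn:externals` to flag definitions that float with the head of the external repository.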

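For some of our internal libraries, which we use as externals in our end-user projects, I have found it useful to create a tag of the library at the Major.Minor version, within which we enforce no breaking changes. With a four-part versioning scheme (Major.Minor.BugFix.Build), we allow the tag to be kept up to date with BugFix.Build changes (again, enforcing no breaking changes). This allows us to use an external reference to the tag without a revision number. In the event of a major or otherwise breaking change, a new tag is created.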
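The mapping from a four-part version to its Major.Minor tag can be sketched as below. The `tags/<Major>.<Minor>` directory layout and the function name are assumptions for illustration only.

```python
def external_for_tag(library_url, version, local_dir):
    """Build an svn:externals line that follows a library's Major.Minor tag.

    BugFix and Build are deliberately dropped: the tag absorbs those
    changes, so the reference needs no revision number.
    """
    major, minor, _bugfix, _build = version.split(".")
    return f"{library_url}/tags/{major}.{minor} {local_dir}"
```

With this scheme, versions 2.3.1.457 and 2.3.2.460 resolve to the same external, while 2.4.0.1 or 3.0.0.1 would point at a new tag.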

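Externals themselves are not bad, but that does not stop people from creating bad implementations of them. It doesn't take much research, just a little reading of the documentation, to learn how to use them safely and effectively.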
