Are Subversion externals an antipattern?
Subversion lets you embed working copies of other repositories using externals, allowing easy version control of third-party library software in your project.
While these seem ideal for the reuse of libraries and version control of vendor software, they aren't without their critics:
Please don't use Subversion externals (or similar in other tools), they are an anti-pattern and, therefore, unnecessary
Are there hidden risks in using externals? Please explain why they would be considered an antipattern.
I am the author of the quote in the question, which came from a previous answer.
Jason is right to be suspicious of brief statements such as mine, and to ask for an explanation. Of course, if I fully explained everything in that answer, I would need to have written a book.
Mike is also right to point out that one of the problems with an svn:external-like feature is that changes in the targeted source could break your own source, especially if that targeted source is in a repository that you do not own.
In further explaining my comment, let me first say that there are "safe" ways to use the svn:external-like feature, just as with any other tool or feature. However, I refer to it as an antipattern because the feature is far more likely to be misused. In my experience, it has always been misused, and I find myself very unlikely to ever use it in that safe manner nor to ever recommend that use. Please further note that I mean NO disparagement to the Subversion team--I love Subversion, although I plan to move on to Bazaar.
The primary issue with this feature is that it encourages, and is typically used for, directly linking the source of one build ("project") to the source of another, or linking the project to a binary (DLL, JAR, etc.) on which it depends. Neither of these uses is wise, and they constitute an antipattern.
As I said in my other answer, I believe that an essential principle for software builds is that each project constructs exactly ONE binary or primary deliverable. This can be considered an application of the principle of separation of concerns to the build process. This is particularly true regarding one project directly referencing the source of another, which is also a violation of the principle of encapsulation. Another form of this kind of violation is attempting to create a build hierarchy to construct an entire system or subsystem by recursively invoking sub-builds. Maven strongly encourages/enforces this behavior, which is one of the many reasons that I don't recommend it.
Finally, I find that there are various practical matters that make this feature undesirable. For one, svn:external has some interesting behavioral characteristics (but the details escape me for the moment). For another, I always find that I need such dependencies to be explicitly visible to my project (build process), not buried as some source control metadata.
So, what is a "safe" manner of using this feature? I would consider that to be when it is used temporarily by only one person, such as a way to "configure" a working environment. I could see where a programmer might create their own folder in the repository (or one for each programmer) where they would configure svn:external links to the various other parts of the repository that they are currently working on. Then, a checkout of that one folder will create a working copy of all their current projects. When a project is added or finished, the svn:external definitions could be adjusted and the working copy updated appropriately. However, I prefer an approach that is not tied to a particular source control system, such as doing this with a script that invokes the checkouts.
For the record, my most recent exposure to this issue occurred during the summer of 2008 at a consulting client that was using svn:external on a massive scale--EVERYTHING was cross-linked to produce a single master working copy. Their Ant & Jython-based (for WebLogic) build scripts were built on top of this master working copy. The net result: NOTHING could be built stand-alone. There were literally dozens of subprojects, but not one was safe to check out or work on by itself. Therefore, any work on this system first required a checkout/update of over 2 GB of files (they put binaries in the repository also). Getting anything done was an exercise in futility, and I left after trying for three months (there were many other antipatterns present as well).
EDIT: Expound on recursive builds -
Over the years (especially the last decade), I have built massive systems for Fortune 500 companies and large government agencies involving many dozens of subprojects arranged in directory hierarchies that are many levels deep. I have used Microsoft Visual Studio projects/solutions to organize .NET-based systems, Ant or Maven 2 for Java-based systems, and I have begun using distutils and setuptools (easy_install) for Python-based systems. These systems have also included huge databases, typically in Oracle or Microsoft SQL Server.
I have had great success designing these massive builds for ease of use and repeatability. My design standard is that a new developer can show up on their first day, be given a new workstation (perhaps straight from Dell with just a typical OS installation), be given a simple setup document (usually just one page of installation instructions), and be able to fully set up the workstation and build the full system from source, unsupervised, unassisted, and in half a day or less. Invoking the build itself involves opening a command shell, changing to the root directory of the source tree, and issuing a one-line command to build EVERYTHING.
Despite that success, constructing such a massive build system requires great care and close adherence to solid design principles, just as with constructing a massive business-critical application/system. I have found that a crucial part is that each project (which produces a single artifact/deliverable) must have a single build script, which must have a well-defined interface (commands for invoking portions of the build process), and it must stand alone from all other (sub)projects. Historically, it is easy to build the whole system, but hard/impossible to build only one piece. Only recently have I learned to carefully ensure that each project truly stands alone.
In practice, this means that there must be at least two layers of build scripts. The lowest layer consists of the project build scripts that produce each deliverable/artifact. Each such script resides in the root directory of its project source tree (indeed, this script DEFINES its project source tree). These scripts know nothing about source control, they expect to be run from the command line, they reference everything in the project relative to the build script, and they reference their external dependencies (tools or binary artifacts, no other source projects) based on a few configurable settings (environment variables, configuration files, etc.).
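To make the lowest layer a little more concrete, here is a minimal Python sketch of the two lookups such a project build script needs: paths inside the project resolved relative to the script itself, and external binary dependencies resolved from configurable settings. The DEP_<NAME>_HOME variable naming scheme, the defaults mapping, and both function names are assumptions invented for this illustration, not part of any tool mentioned above.

```python
import os
from pathlib import Path


def resolve_dependency(name, env=None, defaults=None):
    """Return the directory that holds the binary artifact `name`.

    Lookup order: environment variable DEP_<NAME>_HOME, then `defaults`.
    Both conventions are hypothetical and exist only for this sketch.
    """
    env = os.environ if env is None else env
    defaults = defaults or {}
    key = "DEP_" + name.upper().replace("-", "_") + "_HOME"
    location = env.get(key) or defaults.get(name)
    if location is None:
        raise LookupError(f"dependency {name!r} is not configured; set {key}")
    return Path(location)


def project_path(build_script, relative):
    """Resolve `relative` against the build script's directory, not the cwd."""
    return Path(build_script).resolve().parent / relative
```

A build script could call project_path(__file__, "src") to find its sources and resolve_dependency("log-lib") to find a prebuilt library, which keeps the dependency visible in the build rather than buried in source control metadata.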
The second layer of build scripts is also intended to be invoked from the command line, but these know about source control. Indeed, this second layer is often a single script that is invoked with a project name and a version, then it checks out the source for the named project to a new temporary directory (perhaps specified on the command line) and invokes its build script.
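A second-layer script of this kind might look roughly like the following Python sketch. It only assumes the standard svn checkout URL PATH command; the repository root URL, the tags/<version> layout, and the build.py file name are hypothetical placeholders that would differ in a real setup.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical repository root; replace with the real one.
REPO_ROOT = "https://svn.example.com/repos"


def source_url(project, version, repo_root=REPO_ROOT):
    """Build the URL of a released project, assuming a <project>/tags/<version> layout."""
    return f"{repo_root}/{project}/tags/{version}"


def plan(project, version, workdir, repo_root=REPO_ROOT):
    """Return the commands to run: check out the source, then invoke its build script."""
    target = Path(workdir) / f"{project}-{version}"
    checkout = ["svn", "checkout", source_url(project, version, repo_root), str(target)]
    # The first-layer script is assumed to be named build.py at the project root.
    build = [sys.executable, str(target / "build.py")]
    return [checkout, build]


def run(project, version, workdir=None):
    """Execute the plan in a fresh temporary directory unless one is given."""
    workdir = workdir or tempfile.mkdtemp(prefix="build-")
    for command in plan(project, version, workdir):
        subprocess.run(command, check=True)  # stop at the first failure
```

Keeping plan() separate from run() means the command list can be inspected or logged without touching the repository, and only this layer knows that Subversion is involved.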
There may need to be more variation to accommodate continuous integration servers, multiple platforms, and various release scenarios.
Sometimes there is a need for a third layer of scripts that invokes the second layer of scripts (which invoke the first layer) for the purpose of building specific subsets of the overall project set. For example, each developer may have their own script that builds the projects that they are working on today. There may be a script to build everything in order to generate the master documentation, or to calculate metrics.
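As one possible shape for that third layer, the sketch below picks an order in which to build a chosen subset of projects. It uses graphlib.TopologicalSorter from the Python standard library (Python 3.9 or later). The depends_on mapping and the project names are made up for illustration, and dependencies outside the selected subset are assumed to be already available as binary artifacts.

```python
from graphlib import TopologicalSorter


def build_order(selected, depends_on):
    """Order the selected projects so each one comes after its selected dependencies.

    `depends_on` maps a project name to the names it depends on. Dependencies
    that are not in `selected` are ignored here, on the assumption that they
    are consumed as prebuilt artifacts rather than rebuilt.
    """
    selected = set(selected)
    graph = {
        project: {dep for dep in depends_on.get(project, ()) if dep in selected}
        for project in selected
    }
    return list(TopologicalSorter(graph).static_order())
```

Each name in the returned list would then be handed to the second-layer script in turn, so the subset script itself stays a few lines long.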
Regardless, I have found that attempting to treat the system as a hierarchy of projects is counterproductive. It ties the projects to each other so that they cannot be freely built alone, or in arbitrary locations (temporary directory on the continuous integration server), or in arbitrary order (assuming dependencies are satisfied). Often, attempting to force a hierarchy breaks any IDE integration that one might attempt.
Finally, building a massive hierarchy of projects can simply be too performance intensive. For example, during the spring of 2007 I attempted a modest source hierarchy (Java plus Oracle) that I built using Ant, which eventually failed because the build always aborted with a Java OutOfMemoryError. This was on a 2 GB RAM workstation with 3.5 GB swap space for which I had tuned the JVM to be able to use all available memory. The application/system was relatively trivial in terms of amount of code, but the recursive build invocations eventually exhausted memory, no matter how much memory I gave it. Of course, it also took forever to execute (30-60 minutes was common before it aborted). I know how to tune VERY well, but ultimately I was simply exceeding the limits of the tools (Java/Ant in this case).
So do yourself a favor, construct your build as stand-alone projects, then compose them into a full system. Keep it light and flexible. Enjoy.
EDIT: More on antipatterns
Strictly speaking, an antipattern is a common solution that looks like it solves the problem but doesn't, either because it leaves important gaps or because it introduces additional problems (often worse than the original problem). A solution necessarily involves one or more tools plus the technique for applying them to the problem at hand. Therefore, it is a stretch to refer to a tool or a specific feature of a tool as an antipattern, and it seems that people are detecting and reacting to that stretch--fair enough.
On the other hand, since it seems to be common practice in our industry to focus on tools rather than technique, it is the tool/feature that gets the attention (a casual survey of questions here on StackOverflow easily illustrates this). My comments, and this question itself, reflect that practice.
However, sometimes it seems particularly justified to make that stretch, such as in this case. Some tools seem to "lead" the user to particular techniques for applying them, to the point where some argue that tools shape thought (slightly rephrased). It is mostly in that spirit that I suggest that svn:external is an antipattern.
To more strictly state the issue, the antipattern is to design a build solution that includes tying projects together at the source level, or to implicitly version the dependencies between projects, or to allow such dependencies to implicitly change, because each of these invokes very negative consequences. The nature of the svn:external-like feature makes avoiding those negative consequences very difficult.
Properly handling the dependencies between projects involves addressing those dynamics along with the base problem, and the tools and techniques lead down a different path. An example that should be considered is Ivy, which helps in a manner similar to Maven but without the many downsides. I am investigating Ivy, coupled with Ant, as my short-term solution to the Java build problem. Long term, I am looking to incorporate the core concepts and features into an open-source tool that facilitates a multiplatform solution.
I don't think this is an anti-pattern at all. I did a few quick searches on Google and came up with basically nothing... nobody is complaining that using svn:externals is bad or harmful. Of course there are some caveats that you have to be aware of... and it's not something that you should just sprinkle heavily into all of your repositories... but as for the original quotation, that's just his personal (and subjective) opinion. He never really discussed svn:externals, except to condemn them as an anti-pattern. Such sweeping statements, without any support or at least reasoning as to how the person came to make the statement, are always suspect.
That said, there are some issues with using externals. As Mike answered, they can be very helpful for pointing to stable branches of released software... especially software that you already control. We use them internally in a number of projects for utility libraries and such. We have a small group that enhances and works on the utility library base, but that base code is shared across a number of projects. We don't want various teams just checking in utility project code and we don't want to deal with a million branches, so for us svn:externals works very well. For some people, they may not be the answer. However, I would strongly disagree with the statement "Please don't use..." and with the claim that these tools represent an anti-pattern.
The main risk with using svn:externals is that the referenced repository will be changed in a way that breaks your code or introduces a security vulnerability. If the external repository is also under your control, then this may be acceptable.
Personally, I only use svn:externals to point to "stable" branches of a repository that I own.
An old thread, but I want to address the concern that a changing external could break your code. As pointed out previously, this is most often due to an incorrect usage of the external property. External references should, in almost all instances, point to a specific revision number in the external repository URI. This ensures that the external will never change unless you change it to point to a different revision number.
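One way to enforce that habit is a small check that flags external definitions with no fixed revision. The Python sketch below treats a definition as pinned when it carries an operative revision (-r148 or -r 148) or a peg revision (URL@148); it is a simplified reading of the externals definition syntax, not a full parser, and the function names are invented for this example.

```python
import re

# An operative revision such as "-r148" or "-r 148" anywhere in the definition.
_OPERATIVE = re.compile(r"(?:^|\s)-r\s*\d+(?=\s|$)")
# A numeric peg revision such as "http://host/path@148".
_PEG = re.compile(r"@\d+(?=\s|$)")


def is_pinned(definition):
    """Return True if one externals definition line names a fixed revision."""
    return bool(_OPERATIVE.search(definition) or _PEG.search(definition))


def unpinned(externals_text):
    """Return the definition lines of an svn:externals value that are not pinned."""
    result = []
    for line in externals_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blank lines and comments
            continue
        if not is_pinned(line):
            result.append(line)
    return result
```

The input text could come from running svn propget svn:externals on a directory; a non-empty result from unpinned() would then be a reason to fail a pre-release check.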
For some of our internal libraries, which we use as externals in our end-user projects, I've found it useful to create a tag of the library at Major.Minor version, where we enforce no breaking changes. With a four-point versioning scheme (Major.Minor.BugFix.Build), we allow the tag to be kept current with BugFix.Build changes (again, enforcing no breaking changes). This allows us to use an external reference to the tag without a revision number. In the case of major or other breaking changes, a new tag is created.
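The tagging rule can be stated in a few lines of Python. This is only a sketch of the Major.Minor.BugFix.Build convention described above; the function names are hypothetical and the strict four-part check is an assumption of the example.

```python
def floating_tag(version):
    """Map a Major.Minor.BugFix.Build version to its Major.Minor tag name."""
    parts = version.split(".")
    if len(parts) != 4 or not all(part.isdigit() for part in parts):
        raise ValueError(f"expected Major.Minor.BugFix.Build, got {version!r}")
    return f"{parts[0]}.{parts[1]}"


def stays_under_same_tag(current, candidate):
    """True when the candidate differs only in BugFix.Build, so the tag may be updated in place."""
    return floating_tag(current) == floating_tag(candidate)
```

For example, 2.3.1.457 and 2.3.2.460 both map to the tag 2.3, so projects referencing that tag pick up the fix, while 2.4.0.1 maps to a new tag that consumers must opt into.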
Externals themselves aren't bad, but that doesn't stop people from creating bad implementations of them. It doesn't take much research, just a little bit of reading through some documentation, to learn how to use them safely and effectively.
If a plain external is an anti-pattern because it can break your repository, then one with an explicit revision shouldn't be.
Excerpt from svn book:
I think it all depends on your purpose in using the feature; it is not an anti-pattern by itself.
There are definite flaws in Subversion externals, but we seem to use them reasonably successfully for including libraries (both our own and vendor) that the current project depends on. So I don't see them as an "anti-pattern". The important usage points for me are:
I too would be interested in any major risks of this arrangement, and better approaches.
Saying that a is b does not make a a b unless you say why this is so.
The main flaw I see with external references in Subversion is that you're not guaranteed that the repository is present when you update your working copy.
Subversion external references can be used, and abused, and the feature itself is nothing but just that, a feature. It cannot be said to be a pattern, nor an antipattern.
I've read the answer by the person you quote, and I must say that I disagree. If your project requires files version XYZ from a repository, an external subversion reference can easily give you that.
Yes, you can use it wrong by not specifically specifying which version of that reference you need. Will that give you problems? Likely!
Is it an antipattern? Well, it depends. If you follow the link given by the author of the text you quote, i.e., here, then no. That something can be used to provide a bad solution does not make the entire method of doing so an antipattern. If that were the rule, then I would say that programming languages by and large are antipatterns, because in every programming language you can make bad solutions.