How do you manage open source and commercial versions of the same project using source control?
We are developing an open source project, and we are using Mercurial for source control. The Mercurial repository for this project is public (we are using Bitbucket).
Now we have a client for whom we need to customize our open source software. These customizations must be kept private, so we probably need to create a new Hg repository for this client; this new repository would be private.
The problem is that from time to time we would need to merge changes (such as new features or bug fixes) from the open repository into our private repository.
What is the best way to achieve this? I read that it is possible to merge two or more Mercurial repositories, but that the history will be lost. Merging could also be painful because of many conflicts. If we get a few more clients in the future, how should we manage their repositories? Should we use one repository and multiple branches? What if the two project versions start to head in different directions, and the two repositories become increasingly different?
Please share your experience about this.
Thanks in advance!
Comments (4)
What you describe is a standard thing with a distributed version control system: developing in two repositories and keeping one a subset of the other. Start by making a clone for the private development, for example with `hg clone open private`.
Then go into `private` and make the new features there. Commit as normal. The `private` repository will now contain more changesets than the `open` repository, namely the new features.
When bugfixes and new features are put into the `open` repository as part of the normal open source process, you pull them into the `private` repository with `hg pull` followed by `hg merge`.
That way you keep the invariant: the `private` repository always contains everything in the open version, plus the private enhancements. If you're working on the private version and discover a bug, remember to check whether the bug exists in the open version too. If so, fix it in the open version first and merge the bugfix into the private version. If you fix a bug in the private version by mistake, use `hg transplant` to copy the bugfix over to the open version.
There won't be any loss of history. You will have to resolve the merge as usual when you do `hg merge`, and the conflicts will only be as large as required by your private changes.
The important thing to remember is to never push (or pull) the other way, unless you want to begin releasing some of the private changes into the open source version.
You can use this setup several times with different clients and you can also push/pull changesets between different private repositories as needed if several clients require the same private enhancement.
In principle the basic model is relatively simple: have a separate private repository which is a clone (branch) of the public one, make all private changes there, and regularly merge the public one into the private one. There are no problems with regard to history preservation; I don't know why you read that would happen.
However, the challenge is to not end up in an unmaintainable merge hell, and this can only be achieved through strict discipline.
The most basic rules of thumb for any long-lived branch are:
Keep the private branch as small as possible. Minimise the amount of changes in there, and keep them small: don't start refactoring huge parts of the code or changing indentation. In a one-way merge situation like this one, any code that you modify has the potential to conflict, even way down the line.
Merge frequently. The more frequent the better. If you don't, every time you do want to integrate the changes from the public repository you will end up with one super-merge that has a ton of conflicts.
Additionally, you should be disciplined in organising and writing your code to facilitate this scenario. Have clear rules about what goes on which branch, and section off the pieces of code accordingly.
Ideally you would model the customised functionality as a plug-in or external library, or even a separate project. That may not always be easily achievable; in that case, at least try to write all private modifications as subclasses of the original classes, which you instantiate through factory methods. By making all your changes in independent files that only exist on the private branch, you minimise the risk of conflicts.
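The subclass-plus-factory idea above could look roughly like the following Python sketch. All names here (`ReportGenerator`, `AcmeReportGenerator`, `register_generator`, `make_generator`) are hypothetical and only illustrate the layout; the real project may be in another language and structured differently.

```python
# --- code that lives in the public repository (hypothetical names) ---

class ReportGenerator:
    """Stands in for any class of the open source version."""

    def header(self):
        return "Report"

    def render(self, rows):
        # Join the header and the rows into one text block.
        return "\n".join([self.header()] + list(rows))


# Registry used by the factory; the public version only knows "default".
_GENERATORS = {"default": ReportGenerator}


def register_generator(name, cls):
    """Make an additional implementation available to the factory."""
    _GENERATORS[name] = cls


def make_generator(name="default"):
    """Factory method: callers never instantiate a concrete class directly."""
    return _GENERATORS[name]()


# --- file that exists only in the private repository ---

class AcmeReportGenerator(ReportGenerator):
    """Customer-specific behaviour added by overriding, not by editing."""

    def header(self):
        return "ACME confidential report"


register_generator("acme", AcmeReportGenerator)

print(make_generator("acme").render(["row 1", "row 2"]))
```

Because the private change is a new file plus one registration call, merges from the public repository never touch the customised lines, so they have little to conflict with.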
Also write automated tests. Lots of them. Otherwise you won't promptly detect merge problems (which will happen), and the private branch will often be broken.
Finally, a tip: make a push hook on the public repository that denies any push containing a changeset that you know is private. This will prevent accidental publication of the private code and potentially save you a lot of headaches.
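The core check of such a push hook might be sketched in Python as below. This is only the decision logic, under some assumptions: the function names and the hashes are made up, the list of private changeset hashes is maintained by hand, and the wiring into Mercurial (for instance a `pretxnchangegroup` hook that passes in the hashes of the incoming changesets and aborts the push on a non-zero exit code) is not shown.

```python
def find_private(incoming, private_prefixes):
    """Return the incoming changeset hashes that match a known private hash.

    `private_prefixes` may hold full or abbreviated hashes (hypothetical list).
    """
    prefixes = [p.strip().lower() for p in private_prefixes if p.strip()]
    blocked = []
    for node in incoming:
        node = node.strip().lower()
        if node and any(node.startswith(p) for p in prefixes):
            blocked.append(node)
    return blocked


def check_push(incoming, private_prefixes):
    """Return (exit_code, message); a non-zero exit code means 'reject'."""
    blocked = find_private(incoming, private_prefixes)
    if blocked:
        short = ", ".join(node[:12] for node in blocked)
        return 1, "push rejected, private changesets: " + short
    return 0, "ok"


# Example with made-up hashes: the second incoming changeset is private.
private = ["9f3c2a7d41be"]
incoming = ["1b2e" * 10, "9f3c2a7d41be" + "0" * 28]
print(check_push(incoming, private))
```

Listing only the first changeset of each private line of work should be enough in this sketch, since a push that contains a descendant of a private changeset must also contain that changeset unless it is already on the server.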
Well, some extensions and variations.
Increasing the number of forks in this case costs "+2 commands, +1 repository" per fork.
Increasing the number of forks in this case costs "+1 command, +1 branch" per fork.
Increasing the number of forks in this case costs "+1 patch" per fork and maybe "+1 queue" (see above). I'd prefer a single queue with guards, for simplicity and manageability.
As usual, a project consists of a set of modules.
In my experience it is sometimes even better to keep some modules in separate source control repositories, for example a utility module or a core module such as a web framework or DAO (ORM) module. In that case you are able to use branches in source control as they are meant to be used: to support trunk development and maintenance of each released version in the same repository, so that branches can be merged.
So I propose that you redesign the structure of your application modules in a way that lets you separate core (open source) functions from commercial (customer-dependent) customization.
To manage open source and commercial releases you then need separate assembly procedures. They can be more or less similar, or the commercial release can even use an open source release as a set of complete artifacts and extend them.
In fact this is a very interesting task; I spent a lot of time on it last year. My decision was to have one core repository (open source) with a fully functional Maven task to release it, and a separate repository for each customer that keeps only design customization and some customer-specific business logic (just use aliases in the customer's Spring XML to override your "core" Spring services; see BeanDefinitionOverriding). The Maven task for my customer is based on the core artifacts and often extends some of them (see, for example, "overlays" in maven-war-plugin, which allow you to extend an existing WAR). Working this way you will never have a clone of the same class in another branch: you will use it or extend it, exactly as you use log4j classes in your application. You should just extend the open source release.
Another interesting task is how to manage config files. I recommend looking at the Maven Remote Resources Plugin instead of the default Maven Resources Plugin. It allows you to have templates of configuration files and move all values to Maven profiles, which should be specific to each customer. Also look at the Maven Tiles Plugin: it helped me dramatically simplify "pom.xml" in the customer's project (I can reuse "tiles" of the Maven build and assembly procedure).