数据库结构和源代码控制 - 最佳实践
背景
我在一家公司工作了几年,所有数据库对象都存储在源代码管理中,每个对象一个文件。我们有一份添加新项目时维护的所有对象的列表(以允许我们按顺序运行脚本并处理依赖关系)和一个 VB 脚本,该脚本运行以创建一个针对数据库运行的大脚本。
所有表都是“如果不存在则创建”,所有 SP 等都被删除并重新创建。
到目前为止,我现在工作的地方以数据库为主,并且没有数据库对象的源代码控制,但我们确实使用 redgate 的工具来更新我们的生产数据库(SQL 比较),这非常方便,并且需要很少的工作。
问题
您如何处理数据库对象?我喜欢将它们置于源代码控制之下(并且,当我们使用 GIT 时,我希望能够处理脚本中的合并冲突,而不是数据库中的冲突),但我将被迫过去使用 SQL 来更新数据库比较容易。
我真的不想让我们在 GIT 中更新脚本,然后使用 SQL 比较从我们的 DEV DB 更新生产数据库,因为我宁愿拥有“事实的一个版本”,但我真的不想重新编写一个自定义软件,将所有脚本捆绑在一起。
我认为 Visual Studio 数据库版本可能会做类似的事情,但我不确定我们是否有预算。
我确信这已经被问死了,但我找不到任何似乎完全有我正在寻找的答案的东西。 与此类似,但不完全相同:
开始赏金,因为我有兴趣征求更多意见 - 这里的答案是合理的,但我觉得确实应该有一个更简单的方法。
感谢所有出色的答案 - 都有各自的优点,所以我将投最高票,但为所有的投入欢呼。
Background
I came from several years working in a company where all the database objects were stored in source control, one file per object. We had a list of all the objects that was maintained when new items were added (to allow us to have scripts run in order and handle dependencies) and a VB script that ran to create one big script for running against the database.
All the tables were 'create if not exists' and all the SP's etc. were drop and recreate.
Up to the present and I am now working in a place where the database is the master and there is no source control for DB objects, but we do use redgate's tools for updating our production database (SQL compare), which is very handy, and requires little work.
Question
How do you handle your DB objects? I like to have them under source control (and, as we're using GIT, I'd like to be able to handle merge conflicts in the scripts, rather than the DB), but I'm going to be pressed to get past the ease of using SQL compare to update the database.
I don't really want to have us updating scripts in GIT and then using SQL compare to update the production database from our DEV DB, as I'd rather have 'one version of the truth', but I don't really want to get into re-writing a custom bit of software to bundle the whole lot of scripts together.
I think that visual studio database edition may do something similar to this, but I'm not sure if we will have the budget for it.
I'm sure that this has been asked to death, but I can't find anything that seems to quite have the answer I'm looking for.
Similar to this, but not quite the same:
What are the best practices for database scripts under code control
Started a bounty, as I'm interested in canvassing for a few more opinions - the answers here are sound, but I feel that there should really be an easier way.
Thanks for all the great answers - all have their merits, so I'm going to take the highest vote, but cheers for all the input.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(10)
看看这个关于数据库版本控制的原理和实践的五部分系列(作者:K. Scott Allen):
这五个部分很重要,但基本上的想法是有一个基线,然后更改脚本(使用版本表)。更新数据库意味着应用当前版本“之上”的更改脚本。而且这个策略对VCS非常友好(没有冲突)。
Have a look at this five part series on the principles and practices of database version control (by K. Scott Allen):
The five parts are important but basically the idea is to have a baseline and then change scripts (with a version table). Updating the database means applying change scripts "above" the current version. And this strategy is very VCS friendly (no conflicts).
我们使用 Visual Studio Database Edition (DBPro) 将所有数据库对象置于源代码控制之下。它是一个很棒的工具,可以版本控制我们的模式,进行构建、验证,允许代码分析、模式比较、部署、数据比较、重构等。它从头开始设计为一个数据库管理和版本控制系统。强烈推荐。
这是 DBPro 首席架构师的博客网站:
We have all our database objects under source control using Visual Studio Database Edition (DBPro). It is a wonderful tool that version controls our schema, does builds, validations, allows code analysis, schema comparisons, deployments, data comparisons, refactoring etc. It was designed from the ground up to be a DB management and version control system. Highly recommended.
This is the blog site of the lead architect for DBPro:
click here
使用第 3 方 SSMS 插件ApexSQL Source Control,可以自动编写数据库对象脚本并将其推送到远程 Git 存储库,如果您更喜欢使用本地存储库,甚至可以克隆到本地存储库。
ApexSQL 源代码控制支持开箱即用的 Git 源代码控制系统。这意味着您不需要安装任何额外的 Git 客户端。除此之外,分支和合并是集成的,并可通过加载项 UI 进行使用。
Using a 3rd party SSMS add-in ApexSQL Source Control, database objects can be automatically scripted and pushed to a remote Git repository, or even to a cloned local one, if you prefer working with local repository.
ApexSQL Source Control support Git source control system out of the box. That means you don’t need any additional Git client installed. Besides this, Branching and Merging are integrated and available through the add-in UI.
假设您使用 .net 框架,请查看 Fluent Migrator 以及 聆听有关该项目的代码播客。
我认为主要目标是轻松地对迁移进行编码,就像使用与数据库无关的方法使用流畅的界面进行正常编码一样。
它构建在 .net 框架之上。并适用于多种数据库格式,包括 SQL Server、SqlLite 和 MySQL。
这种方法的优点是它与代码的其余部分一起存在,因此可以由 SCM
示例进行管理:
Assuming that you use the .net framework, have a look at the Fluent Migrator and also the Hearding Code Podcast that talks about the project.
The main aim as I see it is to easily code the migrations as you do your normal coding using a fluent interface using a database agnostic approach.
It is built on top of the .net framework. and works with a number of database formats including SQL Server, SqlLite and MySQL.
The advantage of the this approach is that it lives with the rest of your code and can therefore be managed by SCM
Example:
如果您已经在使用 Red Gate 工具,则可以考虑使用 SQL 源代码管理,它与 SQL Compare 和 SQL Data Compare 并行工作,以允许源代码管理中存在一种事实版本。目前尚处于早期访问阶段,但大部分功能都可供试用。您可以从 http://www.red-gate.com/Products 下载此文件/SQL_Source_Control/index.htm 。但目前仅支持SVN和TFS。你标准化了 GIT 吗?
David(红门产品经理)
If you're already using Red Gate tools, you might consider using SQL Source Control, which works side by side with SQL Compare and SQL Data Compare to allow one version of the truth to exist in source control. It's in early access at the moment, but most of the functionality is in there to be tried out. You can download this from http://www.red-gate.com/Products/SQL_Source_Control/index.htm . However, it only supports SVN and TFS for the moment. Have you standardized on GIT?
David (Product Manager at Red Gate)
我们有一个系统,其中数据库名义上是我们的源代码控制系统中的主机,我们维护一系列“架构更改”脚本(.sql 文件),每个脚本负责幂等地回滚更改,然后应用它。每个脚本都只是编号,因此我们有 000.sql(用于创建数据库并设置标准对象)、001.sql 等。
在开发过程中,开发人员编写架构更改脚本并针对开发数据库运行它。每个更改都需要在
dba.change_info
表中添加一行,其中包含更改编号和简要说明。为了回滚更改,只需运行它的第一部分即可。对于 SQL Server,回滚部分的幂等性是通过在发出 DROP 命令之前检查 sysobjects 等来处理的 - 类似于“drop ... if contains”结构。如果要更改模型而不是简单地添加模型,则架构更改可能需要进行数据迁移,并且还用于维护参考数据。在发布过程中,DBA(我们是一家小公司,所以这是由开发人员之一担任的角色)在停止旧版本的应用程序和启动之间将发布的架构更改应用到生产数据库更新的。
这完全是一个手动过程,但满足了诸如将数据从一个模型迁移到另一个模型等要求:例如,将布尔标志扩展为一组选项,或者将多对一关联转换为多对多关联。无论如何,这通常不是可以使用简单的模式比较工具生成的东西。它还允许角色分离 - 尽管实际上我们都可以完全访问生产,但那里有足够的解耦,以便“DBA”可以读取和审查要在生产中应用的 .sql 文件。
至少从理论上讲,只需按照 000.sql 的顺序运行所有模式更改即可构建完整的数据库(仅包含参考数据)。在实践中,我们并不定期这样做,而是将生产数据库复制到开发人员,然后在发布之前运行回归测试之前应用更改脚本。这用于测试更改脚本本身,但仅适用于中等规模的生产数据库。
We have a system where the database is nominally the master-inside our source control system, we maintain a sequence of "schema change" scripts (.sql files), each of which is responsible for idempotently rolling back the change and then applying it. Each script is just numbered, so we have 000.sql (which creates the database and sets up standard objects), 001.sql etc.
During development, a developer writes a schema change script and runs it against the development database. Each change is required to add a row into a
dba.change_info
table, containing the change number and a brief description. In order to roll back a change, one can just run the first section of it. For SQL Server, the idempotence of the rollback section is handled by examining sysobjects etc before issuing DROP commands- similar to "drop ... if exists" constructs. Schema changes may need to do migration of data if a model is being changed rather than simply being added, and also are used to maintain reference data.During the release process, a DBA (we're a small company, so this is a role taken on by one of the developers anyway) applies the schema changes for the release to the production database between stopping the old version of the applications and starting the updated ones.
This is all quite a manual process, but satisfies requirements such as migrating data from one model to another: e.g. expanding a boolean flag to a set of options, or converting a many-to-one association to a many-to-many. This typically isn't something that can be generated with simple schema-comparison tools anyway. It also allows for role separation- although in practice we all have full access to production, there is enough decoupling there so that the "DBA" can read and review the .sql files to be applied in production.
In theory, at least, a complete database (containing only reference data) could be built by simply running all schema changes in order for 000.sql onwards. In practice we don't regularly do this, but rather copy our production database to dev and then apply the change scripts before running regression tests prior to a release. This serves to test the change scripts themselves, but is only practical with a medium size production database.
我对 RedGate 工具包不是很熟悉,但它是否与 dbGhost,必须有一个实用程序允许您将数据库对象编写为每个对象一个文件的脚本。在这种情况下,我建议执行以下操作:
diff
(如果有)。这将表明DEV数据库的结构已更改并且未反映在源代码管理中,.diff
文件为此)如果您有许多 DEV 数据库(每个用户或开发分支一个)并且太麻烦,那么更好的组合可能是在 STAGE(发布前测试)版本上执行此类任务数据库的,此时您可以将 PROD 模式存储在存储库中,并且仅在预发布测试阶段从 STAGE 更新它,在此阶段您将确保您的模式更改也位于存储库中。
这样,开发人员仍然可以按照通常的方式工作:首先更改 DEV 数据库上的架构,希望您能在您想要的
灵活性
和单一事实
之间取得平衡。在我的团队中,一旦更改 DEV 数据库,我们就会立即对 VCS 进行更改,但我们仍然有这样的任务来比较不同数据库(DEV、STAGE 和 PROD)之间的模式。基本上,我们遵循我曾经在 How 中回答的内容您应该从源代码控制构建数据库吗?。
I am not very familiar with RedGate toolkit, but if it is any similar to dbGhost, there must be a utility that allows you to script the database objects to the files one per object. In this case I would suggest following:
diff
if any. This will indicate that the structure of the DEV database has changed and is not reflected in the source control,.diff
file for this)If you many DEV databases (one per user or development branch) and it is too cumbersome, then probably a better combination would be to do such task on the
STAGE
(TEST just before release) version of the database, at which point you would store the PROD schema in the repository and would update it from the STAGE only during the pre-release testing phase, where you will ensure that your schema changes are also in the repository.This way developers can still work in the usual way: change the schema on the DEV database first, and hopefully you get the balance between the
flexibility
andone truth
you would like.In my team we add change to VCS as soon as we change the DEV database, but we still do have such task to compare the schema between different databases (DEV, STAGE and PROD). Basically, we follow what I once answered in How should you build your database from source control?.
在工作中,我们大量使用一个强大的工具,它是 ActiveRecord 的一部分(这是 Rails Web 框架附带的默认 ORM,名为 迁移。
基本迁移如下所示:
为每个数据库更改创建一个迁移,并且它们是按顺序创建的您可以运行预定义的方法以正确的顺序运行这些迁移,以便始终可以创建和/或回滚数据库,其中一些功能如下:
迁移具有用于创建表、删除表、更新的方法。表、添加索引等。全套迁移还会自动添加一个 id 列,并且 t.timestamps 部分自动生成一个“created_at”字段和一个“ update_at”字段。
大多数语言都有诸如此类的 ORM 工具,它们允许数据库以类似代码的状态进行维护,这对于开发人员来说很容易理解,并且对于 DBA 来说也足够简单使用和维护。
At work we make heavy use of a powerful tool which comes as part of ActiveRecord (which is the default ORM that comes with the Rails web framework called Migrations.
A basic migration would look like the following:
There is a Migration created for every database change, and they are created in sequential order by timestamp. You can run pre-defined methods to run these migrations in the proper order so your database can always be created and/or rolled back. Some of the functions are below:
Migrations have methods for creating tables, dropping tables, updating tables, adding indexes, etc. The full suite. The migrations also automatically add an
id
column, and thet.timestamps
section automatically generates a "created_at" field and an "updated_at" field.Most languages have ORM facilities such as these, and they allow the database to be maintained in a code-like state, which is easy for developers to understand, as well as being simple enough for DBA's to use and maintain.
我目前在建模工具(DeZine for Databases)中维护数据库设计,并将其存储在源代码控制下。在我的表设计中,我添加了一个包含两行的表,其中包含架构和参考数据的版本号,每次更改/发布数据库时都会更新该表(用户不访问该表)。
参考数据保存在 Excel 电子表格中(也在源代码控制下),该电子表格可以生成 INSERT 语句的 SQL 脚本来填充新数据库。
当需要新版本时,会发送架构脚本、参考数据脚本和安装程序包。安装程序包重命名旧数据库,从脚本创建新数据库并导入新参考数据(也可能已更改)。然后,用户的数据从旧的(重命名的)数据库复制到新数据库。
这样做的优点是,当出现问题时,您可以恢复到原始数据库,因为它尚未被修改。
I currently maintain a database design in a modelling tool (DeZine for Databases) and store that under source control. Within my table design I add a table with two rows which have the version number of the schema and of the reference data, this is updated each time the database is changed/released (users do not access this table).
Reference data is maintained in an Excel spreadsheet (also under source control) which can generate a SQL script of INSERT statements to populate new databases.
When a new release is required the schema script, reference data script and an installer package are sent out. The installer package renames the old database, creates a new database from the script and imports the new reference data (which may also have changed). The user's data is then copied from the old (renamed) database to the new one.
This has the advantage that when things go wrong you can revert to the original database as it has not been modified.
有一个专门的工具可以解决这个问题。它的名称为 Wizardby:
(来源:googlecode.com)
There's a special tool for this exact thing. It's called Wizardby:
(source: googlecode.com)