预加载数据的好方法是什么?
是否有将数据加载到数据库中以与新安装的应用程序一起使用的最佳实践?例如,要运行应用程序 foo,它需要一些基本数据才能启动。我过去使用过几个选项:
TSQL 用于需要预加载的每一行:
IF NOT EXISTS (SELECT * FROM Master.Site WHERE Name = @SiteName)
INSERT INTO [Master].[Site] ([EnterpriseID], [Name], [LastModifiedTime], [LastModifiedUser])
VALUES (@EnterpriseId, @SiteName, GETDATE(), @LastModifiedUser)
另一个选项是电子表格。每个选项卡代表一个表格,当我们意识到需要时,数据就会输入到电子表格中。然后,程序可以读取该电子表格并填充数据库。
有一些复杂的因素,包括表之间的关系。所以,这并不像自己加载表格那么简单。例如,如果我们创建 Security.Member 行,那么我们想要将这些成员添加到 Security.Role,我们需要一种维护这种关系的方法。
另一个因素是并非所有数据库都会丢失此数据。一些地点已经拥有大部分数据,而其他地点(可能是世界各地的新地点)将从头开始。
任何想法表示赞赏。
Are there best practices out there for loading data into a database, to be used with a new installation of an application? For example, for application foo to run, it needs some basic data before it can even be started. I've used a couple options in the past:
TSQL for every row that needs to be preloaded:
IF NOT EXISTS (SELECT * FROM Master.Site WHERE Name = @SiteName)
INSERT INTO [Master].[Site] ([EnterpriseID], [Name], [LastModifiedTime], [LastModifiedUser])
VALUES (@EnterpriseId, @SiteName, GETDATE(), @LastModifiedUser)
Another option is a spreadsheet. Each tab represents a table, and data is entered into the spreadsheet as we realize we need it. Then, a program can read this spreadsheet and populate the DB.
There are complicating factors, including the relationships between tables. So, it's not as simple as loading tables by themselves. For example, if we create Security.Member rows, then we want to add those members to Security.Role, we need a way of maintaining that relationship.
Another factor is that not all databases will be missing this data. Some locations will already have most of the data, and others (that may be new locations around the world), will start from scratch.
Any ideas are appreciated.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
如果数据不是很多,则配置数据的裸初始化 - 我们通常使用任何数据库创建/修改来编写脚本。
使用脚本,您可以进行很多控制,因此您可以仅插入缺失的行,删除已知已过时的行,不覆盖已自定义的某些列等。
如果数据很多,那么您可能想要外部文件 - 我会避免使用电子表格,而使用纯文本文件(批量插入)。您可以将其加载到暂存区域,并仍然使用可能在脚本中使用的技术,以确保不会破坏目标中的任何特殊自定义。由于它受脚本控制,因此您可以控制操作顺序以确保引用完整性。
If it's not a lot of data, the bare initialization of configuration data - we typically script it with any database creation/modification.
With scripts you have a lot of control, so you can insert only missing rows, remove rows which are known to be obsolete, not override certain columns which have been customized, etc.
If it's a lot of data, then you probably want to have an external file(s) - I would avoid a spreadsheet, and use a plain text file(s) instead (BULK INSERT). You could load this into a staging area and still use techniques like you might use in a script to ensure you don't clobber any special customization in the destination. And because it's under script control, you've got control of the order of operations to ensure referential integrity.
我建议结合凯德的答案指出的两种方法。
步骤 1. 将所有需要的数据加载到临时表中(例如,在 Sybase 上,将表“db1..table1”的数据加载到“temp..db1_table1”中)。为了能够处理大型数据集,请使用批量复制机制(无论您的数据库服务器支持哪种机制),而不写入事务日志。
步骤 2. 运行一个脚本,作为主要步骤,将迭代要加载的每个表,如果需要,在新创建的临时表上创建索引,将临时表中的数据与主表进行比较,并插入/更新/删除差异。然后,根据需要,脚本可以执行辅助任务,例如您提到的安全角色设置。
I'd recommend a combination of the 2 approaches indicated by Cade's answer.
Step 1. Load all the needed data into temp tables (on Sybase, for example, load data for table "db1..table1" into "temp..db1_table1"). In order to be able to handle large datasets, use bulk copy mechanism (whichever one your DB server supports) without writing to transaction log.
Step 2. Run a script which as a main step will iterate over each table to be loaded, if needed create indexes on newly created temp table, compare the data in temp table to main table, and insert/update/delete differences. Then as needed the script can do auxillary tasks like the security role setup you mentioned.