How should I migrate data from a "bad" database design to a usable design?

Posted 2024-07-15 06:25:35 · 500 words · 7 views · 0 comments

The project I have inherited revolves mainly around one unnormalized table. There were some attempts at normalization, but the necessary constraints were never put in place.

Example: In the Project table, there is a client name (among other values) and there is also a clients table which just contains client names [no keys anywhere]. The clients table is just used as a pool of values to offer the user when adding a new project. There isn't a primary key on the clients table or a foreign key.

"Design patterns" such as this are common throughout the current state of the database and in the applications that use it. The tools I have at my disposal are SQL Server 2005, SQL Server Management Studio, and Visual Studio 2008. My initial approach has been to manually determine which information needs normalization and then run SELECT INTO queries. Is there a better approach than going case by case, or is there any way this could be automated?

Edit:
Also, I've discovered that the "work order number" isn't an IDENTITY (autonumber, unique) field; the numbers are generated sequentially and are unique to each work order. There are some gaps in the existing numbering, but all values are unique. Is writing a stored procedure to generate dummy rows before migrating the best approach for this?

Comments (7)

不交电费瞎发啥光 2024-07-22 06:25:35

The best approach to migrating to a usable design? CAREFULLY

Unless you're willing to break (and fix) every application that currently uses the database, your options are limited, because you can't change the existing structure very much.

Before you begin, think carefully about your motivations - if you have an existing issue (a bug to fix, an enhancement to make) then go ahead slowly. However, it's rarely worthwhile to monkey around with a working production system just to achieve an improvement that no one else will ever notice. Note that this can play in your favour - if there's an existing issue, you can point out to management that the most cost-effective way to fix it is to alter the database structure. This means you have management support for the changes - and (hopefully) their backing if something goes pear-shaped.

Some practical thoughts ...

Make one change at a time ... and only one change. Make sure each change is correct before you move on. The old proverb of "measure twice, cut once" is relevant.

Automate, automate, automate ... Never make changes to the production system "live" using SQL Server Management Studio. Write SQL scripts that perform the entire change in one go; develop and test them against a copy of the database to make sure you get them right. Don't put that copy on the production server, where you might accidentally run a script against the live database; use a dedicated test server (if the database is under 4 GB, SQL Server Express running on your own box will do).

Backups ... the first step in any script should be to back up the database, so that you've got a way back if something does go wrong.

Documentation ... if someone comes to you in twelve months, asking why feature X of their application is broken, you'll need a history of the exact changes made to the database to help with diagnosis and repair. A good first step is to keep all your change scripts.

Keys ... it's usually a good idea to keep the primary and foreign keys abstract, within the database and not revealed through the application. Things that look like keys at a business level (like your work order number) have a disturbing habit of having exceptions. Introduce your keys as additional columns with appropriate constraints, but don't change the definitions of existing ones.
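The automation, backup, and key advice above can be combined into one change script. What follows is a minimal sketch rather than a definitive implementation: it uses Python's built-in sqlite3 module as a stand-in for SQL Server so that it runs anywhere, and the `project` and `client` tables, their column names, and the `apply_change` helper are all hypothetical. On SQL Server the same shape would be a T-SQL script that begins with BACKUP DATABASE and wraps the change in a transaction.

```python
import sqlite3


def apply_change(conn: sqlite3.Connection, backup_conn: sqlite3.Connection) -> None:
    """Back up first, then apply one schema change as a single transaction."""
    # Step 1: copy the whole database so there is a way back.
    conn.backup(backup_conn)

    # Step 2: everything below either commits together or not at all.
    conn.execute("BEGIN")
    try:
        # New table with an abstract (surrogate) key and real constraints.
        conn.execute(
            "CREATE TABLE client ("
            " client_id INTEGER PRIMARY KEY,"
            " name TEXT NOT NULL UNIQUE)"
        )
        conn.execute(
            "INSERT INTO client (name) "
            "SELECT DISTINCT client_name FROM project "
            "WHERE client_name IS NOT NULL ORDER BY client_name"
        )
        # The key is added as an extra column; client_name is left untouched
        # so existing applications keep working.
        conn.execute(
            "ALTER TABLE project ADD COLUMN client_id INTEGER "
            "REFERENCES client (client_id)"
        )
        conn.execute(
            "UPDATE project SET client_id = ("
            " SELECT client_id FROM client"
            " WHERE client.name = project.client_name)"
        )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise


# Hypothetical legacy data: client names stored as free text, no keys.
conn = sqlite3.connect(":memory:", isolation_level=None)  # explicit transactions
conn.execute("CREATE TABLE project (work_order_no INTEGER, client_name TEXT)")
conn.executemany(
    "INSERT INTO project VALUES (?, ?)",
    [(1001, "Acme"), (1002, "Globex"), (1004, "Acme")],
)
backup = sqlite3.connect(":memory:")
apply_change(conn, backup)
rows = conn.execute(
    "SELECT work_order_no, client_name, client_id FROM project "
    "ORDER BY work_order_no"
).fetchall()
```

Because the old `client_name` column survives, applications that read it are unaffected, and the new `client_id` column can be adopted one query at a time.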

Good luck!

盛夏已如深秋| 2024-07-22 06:25:35
  1. Create the new database the way you think it should be structured.
  2. Create an importError table in the new database with columns like "oldId" and "errorDesc".
  3. Write a straightforward, procedural, legible script that attempts to select a row from the old structure and insert it into the new structure. If an insert fails, log as specific an error as possible to the importError table (specifically, why the insert failed).
  4. Run the script.
  5. Validate the new data. Check whether there are errors logged to the importError table. If the data is invalid or there are errors, refactor your script and run it again, possibly modifying your new database structure where necessary.
  6. Repeat steps 1-5 until you have a solid conversion script.

The result of this process will be that you have:
a) a new db structure that is validated against the old structure and tested against "pragmatism";
b) a log of potential issues you may need to code against (such as errors that you can't fix through your conversion because they would require a concession in your schema that you don't want to make).

(I might note that it's helpful to write the script in your scripting/programming language of choice, rather than in, say, SQL.)
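The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: sqlite3 stands in for both the old and new SQL Server databases, and the `project` and `import_error` tables, their columns, and the `migrate` function are hypothetical names chosen for the example.

```python
import sqlite3


def migrate(old: sqlite3.Connection, new: sqlite3.Connection) -> int:
    """Copy rows one at a time; log each failure instead of stopping."""
    migrated = 0
    rows = old.execute(
        "SELECT rowid, work_order_no, client_name FROM project"
    ).fetchall()
    for old_id, work_order_no, client_name in rows:
        try:
            with new:  # commits on success, rolls back this row on failure
                new.execute(
                    "INSERT INTO project (work_order_no, client_name) "
                    "VALUES (?, ?)",
                    (work_order_no, client_name),
                )
            migrated += 1
        except sqlite3.Error as exc:
            # Record which old row failed and why, then carry on.
            with new:
                new.execute(
                    "INSERT INTO import_error (old_id, error_desc) VALUES (?, ?)",
                    (old_id, str(exc)),
                )
    return migrated


# Old structure: no constraints, so bad rows are possible.
old = sqlite3.connect(":memory:")
old.execute("CREATE TABLE project (work_order_no INTEGER, client_name TEXT)")
old.executemany(
    "INSERT INTO project VALUES (?, ?)",
    [(1001, "Acme"), (1002, None), (1001, "Globex"), (1003, "Globex")],
)
old.commit()

# New structure: the constraints the old one lacked, plus the error log.
new = sqlite3.connect(":memory:")
new.execute(
    "CREATE TABLE project ("
    " work_order_no INTEGER NOT NULL UNIQUE,"
    " client_name TEXT NOT NULL)"
)
new.execute("CREATE TABLE import_error (old_id INTEGER, error_desc TEXT)")

migrated_count = migrate(old, new)
failed_ids = [
    r[0] for r in new.execute("SELECT old_id FROM import_error ORDER BY old_id")
]
```

After each run, the contents of `import_error` show which old rows need a data fix or a schema decision before the next iteration.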

以为你会在 2024-07-22 06:25:35

I can't think of a sensible way of automating this. Some human input is key in such refactorings if you want the output to be useful.

Re the work order number: assuming you want this to continue being an IDENTITY column, can you perhaps load the data, find the largest value, then use ALTER TABLE to make it IDENTITY? I don't have any T-SQL tools to hand, so I can't test, unfortunately. Alternatively, just consider it a natural key.
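As far as I know, SQL Server does not let ALTER TABLE add the IDENTITY property to an existing column; the usual route is to create the new table with an IDENTITY column and load the old numbers with SET IDENTITY_INSERT switched on. Either way, dummy rows should not be needed, because an auto-numbered column tolerates gaps. The sketch below only demonstrates that idea, with sqlite3's AUTOINCREMENT standing in for IDENTITY; the `work_order` table and its sample data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE work_order ("
    " work_order_no INTEGER PRIMARY KEY AUTOINCREMENT,"
    " description TEXT NOT NULL)"
)

# Legacy numbers are inserted explicitly, gaps included (1003 and 1004 missing).
legacy_rows = [(1001, "Replace pump"), (1002, "Inspect valve"), (1005, "Repaint")]
conn.executemany(
    "INSERT INTO work_order (work_order_no, description) VALUES (?, ?)",
    legacy_rows,
)

# A new work order gets the next number after the largest existing one.
cur = conn.execute("INSERT INTO work_order (description) VALUES (?)", ("New job",))
new_number = cur.lastrowid
conn.commit()

existing = [
    r[0]
    for r in conn.execute("SELECT work_order_no FROM work_order ORDER BY work_order_no")
]
```

Here the new row receives 1006: the gaps stay as they are and numbering simply continues past the highest migrated value.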

一场春暖 2024-07-22 06:25:35

I recommend using stored procedures to aid the translation process.

Specifically:

  1. One by one, replace queries used in the code with stored procedures. As part of the replacement, write unit (or integration) tests against the stored procedures directly. Consider a code-level StoredProcs helper class to consolidate database access there.
  2. After all queries are sprocs, you can refactor the database, using those unit tests to make sure you're not changing expected behavior.
  3. Added advantage: You'll have those unit tests to guard against future breakages.
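To illustrate the idea, here is a small sketch in which one test pins down the behaviour of a data-access call and is then run against both the old and the refactored schema. It is an assumption-laden stand-in: sqlite3 has no stored procedures, so methods on a hypothetical `ProjectStore` helper class play that role, and the table and column names are made up for the example.

```python
import sqlite3


class ProjectStore:
    """Single place for database access against the old, flat schema."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn

    def client_for_work_order(self, work_order_no: int):
        row = self.conn.execute(
            "SELECT client_name FROM project WHERE work_order_no = ?",
            (work_order_no,),
        ).fetchone()
        return row[0] if row else None


class NormalizedProjectStore(ProjectStore):
    """Same call, reimplemented against the refactored schema."""

    def client_for_work_order(self, work_order_no: int):
        row = self.conn.execute(
            "SELECT c.name FROM project p "
            "JOIN client c ON c.client_id = p.client_id "
            "WHERE p.work_order_no = ?",
            (work_order_no,),
        ).fetchone()
        return row[0] if row else None


def check_contract(store: ProjectStore) -> bool:
    """The test that must pass before and after the refactoring."""
    assert store.client_for_work_order(1001) == "Acme"
    assert store.client_for_work_order(9999) is None
    return True


old_db = sqlite3.connect(":memory:")
old_db.execute("CREATE TABLE project (work_order_no INTEGER, client_name TEXT)")
old_db.execute("INSERT INTO project VALUES (1001, 'Acme')")

new_db = sqlite3.connect(":memory:")
new_db.execute("CREATE TABLE client (client_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
new_db.execute(
    "CREATE TABLE project ("
    " work_order_no INTEGER PRIMARY KEY,"
    " client_id INTEGER REFERENCES client (client_id))"
)
new_db.execute("INSERT INTO client VALUES (1, 'Acme')")
new_db.execute("INSERT INTO project VALUES (1001, 1)")

old_ok = check_contract(ProjectStore(old_db))
new_ok = check_contract(NormalizedProjectStore(new_db))
```

The application only ever calls `client_for_work_order`, so the schema underneath can change as long as `check_contract` keeps passing.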

与君绝 2024-07-22 06:25:35

You did not say whether you need to keep the current application interface, or whether you are planning to rewrite any queries in the application.

Either way, I would

  • design the new schema
  • write T-SQL batches, using cursors where necessary, to migrate the data

Cursors, while not a first choice in operational queries, are great for this type of task, because you can go about it in a very structured way. These scripts tend to be very readable, which is important when a script does not work right away and you have to go through a few iterations.

廻憶裏菂餘溫 2024-07-22 06:25:35

You can use SQL Server Integration Services (SSIS), which is part of SQL Server 2005, to help you with the migration. It is used to transfer data from one form to another:

http://en.wikipedia.org/wiki/SQL_Server_Integration_Services
http://www.microsoft.com/sqlserver/2005/en/us/integration-services.aspx

So尛奶瓶 2024-07-22 06:25:35

Just to add a simple hint: when you have your entity-relationship diagram on one A4 or A3 sheet in front of you, proper normalization will mean there are no many-to-many relationships left.
Also check this book, or at least the site.
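As a rough illustration of that hint, a many-to-many relationship is normally resolved into two one-to-many relationships through a junction table. The sketch below assumes a hypothetical case where projects and technicians are related many-to-many, and uses sqlite3 as a stand-in for SQL Server; none of these table names come from the original question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript(
    """
    CREATE TABLE project (
        project_id INTEGER PRIMARY KEY,
        title TEXT NOT NULL
    );
    CREATE TABLE technician (
        technician_id INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    -- Junction table: one row per (project, technician) pairing.
    CREATE TABLE project_technician (
        project_id INTEGER NOT NULL REFERENCES project (project_id),
        technician_id INTEGER NOT NULL REFERENCES technician (technician_id),
        PRIMARY KEY (project_id, technician_id)
    );
    INSERT INTO project VALUES (1, 'Pump overhaul'), (2, 'Valve audit');
    INSERT INTO technician VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO project_technician VALUES (1, 1), (1, 2), (2, 1);
    """
)


def technicians_on(conn: sqlite3.Connection, project_id: int) -> list:
    """Names of the technicians assigned to one project, alphabetically."""
    rows = conn.execute(
        "SELECT t.name FROM project_technician pt "
        "JOIN technician t ON t.technician_id = pt.technician_id "
        "WHERE pt.project_id = ? ORDER BY t.name",
        (project_id,),
    ).fetchall()
    return [r[0] for r in rows]


pump_team = technicians_on(conn, 1)
```

The composite primary key prevents the same pairing from being stored twice, and the foreign keys reject pairings that point at rows which do not exist.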
