Adding a clustered index to a SQL table: what dangers are there for a live production system?
I've been put in charge of a 10-year-old transactional system where the majority of the business logic is implemented at the database level (triggers, stored procedures, etc.). Win2000 server, MSSQL 2000 Enterprise.
No immediate plans for replacing or updating the system are being considered.
The core process is a program that executes transactions - specifically, it executes a stored procedure with various parameters; let's call it sp_ProcessTrans. The program executes the stored procedure at asynchronous intervals.
By itself, things work fine, but there are 30 instances of this program on remotely located workstations, all of them asynchronously executing sp_ProcessTrans and then retrieving data from the SQL server. Execution is pretty regular - ranging from 0 to 60 times a minute, depending on what items the program instance is responsible for.
Performance of the system has dropped considerably with 10 years of data growth: the reason is the deadlocks, specifically deadlock wait times, on the Employee table.
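To confirm that the waits really are deadlocks rather than plain blocking, SQL Server 2000 can write deadlock details to its error log through trace flags. A minimal sketch, assuming sysadmin rights and that the extra logging is acceptable on this server:

```sql
-- Trace flag 1204 reports the locks and statements involved in each deadlock.
-- Trace flag 3605 sends that output to the SQL Server error log.
-- The -1 argument applies the flags to all connections.
DBCC TRACEON (1204, 3605, -1)

-- ... let the 30 stations run for a while, then read the error log ...

-- Turn the flags off again when done.
DBCC TRACEOFF (1204, 3605, -1)
```

The logged entries name the objects and indexes involved, which should show whether the Employee table is really where the deadlocks occur.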
I have discovered:
- In sp_ProcessTrans's execution, it selects from an Employee table 7 times
- The select is done on a field that is NOT the primary key
- No index exists on this field. Thus a table scan is performed 7 times per transaction
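The scan described in the list above can be checked without running the query, by asking SQL Server for the estimated plan. This is only a sketch: the column name EmpNum and the literal value are hypothetical stand-ins, since the real field name is not given here.

```sql
-- SET SHOWPLAN_TEXT must be the only statement in its batch, hence the GO separators.
SET SHOWPLAN_TEXT ON
GO

-- EmpNum is an assumed column name; the query is not executed, only its plan is returned.
SELECT * FROM dbo.Employee WHERE EmpNum = 1234567
GO

SET SHOWPLAN_TEXT OFF
GO
```

A "Table Scan" or "Clustered Index Scan" operator in the output indicates the whole table is being read; after a suitable index exists, the same query should show a seek instead.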
So the reason for deadlocks is clear. I created a non-unique ordered clustered index on the field (almost unique, NUM(7), very rarely changes). There was immediate improvement in the test environment.
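For reference, an index of that kind could be created roughly as follows. The column name EmpNum and the index name IX_Employee_EmpNum are assumptions made for illustration; only the Employee table name comes from the description above.

```sql
-- A table can have only one clustered index, so list the existing ones first.
-- If the primary key is already clustered, this index would have to be nonclustered instead.
EXEC sp_helpindex 'dbo.Employee'

-- Non-unique clustered index on the lookup field (EmpNum and the index name are hypothetical).
CREATE CLUSTERED INDEX IX_Employee_EmpNum
    ON dbo.Employee (EmpNum)
```

Building a clustered index physically reorders the table and rebuilds any existing nonclustered indexes, so the run time on 10 years of data may be much longer than on a small test copy.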
The problem is that I cannot simulate the deadlocks in a test environment. I'd need 30 workstations, and I'd need to simulate 'realistic' activity on those stations, so virtualization is out.
I need to know if I must schedule downtime.
Creating an index shouldn't be a risky operation for MSSQL, but is there any danger (data corruption, extra wait time, etc.) in creating this field index on the production database while the transactions are still taking place? I can select a time when transactions are fairly quiet through the 30 stations.
Are there any hidden dangers I'm not seeing? (I'm not looking forward to restoring the DB if something goes wrong. It would take a lot of time with 10 years of data.)
Comments (1)
Data corruption shouldn't be an issue, but if you try adding an index to a live production table you are likely to experience problems as the table will not be responsive to queries during the index creation. Creating an index will apply an exclusive table lock until it is complete, and the time this takes will depend on numerous factors (especially the number of rows).
Scheduled downtime is strongly recommended, and it is also a good habit to get into. And obviously take a backup first, and have a plan in case you have to undo what you're intending.
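A minimal sketch of the backup and the undo step, using SQL Server 2000 syntax. The database name TransDB, the backup path, and the index name IX_Employee_EmpNum are all hypothetical and need to be replaced with the real ones.

```sql
-- Full backup before the change (hypothetical database name and path).
-- WITH INIT overwrites any existing backup sets in that file.
BACKUP DATABASE TransDB
    TO DISK = 'D:\Backup\TransDB_pre_index.bak'
    WITH INIT

-- Undo step if the new index causes trouble.
-- SQL Server 2000 uses the table.index form; the index name is assumed.
DROP INDEX Employee.IX_Employee_EmpNum
```

Dropping a clustered index also locks the table and rebuilds its nonclustered indexes, so the undo step should be planned for the downtime window too, not run ad hoc under load.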