SQL Server performance with a large table
I'm looking for some advice on a table structure in SQL Server.
Basically I will have a table with about 30 columns of strings, ints and decimals. A service will write to this table about 500 times a day. Each record in the table can be either 'inactive' or 'active'. The table will grow constantly, and at any one time there will be about 100 'active' records that need to be returned.
While the table is small, returning the 'active' records is responsive. My concern is 12-18 months down the line, when the table is much larger, or even later, when there will be millions of records in the table.
From a performance point of view, is it better to maintain two tables, one for 'active' records and one for 'inactive' records, or will creating an index on the active column solve any potential performance issues?
Comments (3)
It certainly will be more performant to have a small "active" table. The most obvious cost is that maintaining the records correctly is more troublesome than with one table. I would probably not do so immediately, but bear it in mind as a potential optimisation.
An index on the active column is going to massively improve matters. A multi-column index (or indices) suited to the query (or queries) you run most often would help even more. For example, if you often ask for active rows created after a certain date, then a single index on both date and active could serve that retrieval. Likewise, if you wanted all active rows ordered by id, then an index on both id and active could be used.
Testing with Database Engine Tuning Advisor can be very informative here, though it is less good at predicting the best approach for data that you expect to change in the months to come, as is the case here.
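As a rough sketch of the multi-column index idea, assuming a hypothetical table `dbo.Records` with columns `Id`, `IsActive` and `CreatedAt` (none of these names come from the question), something along these lines would let a query for active rows created after a given date be answered from a single index:

```sql
-- Hypothetical table; only the columns relevant to the example are shown.
CREATE TABLE dbo.Records (
    Id        INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    IsActive  BIT      NOT NULL,
    CreatedAt DATETIME NOT NULL
);

-- Composite index: equality column (IsActive) first, range column (CreatedAt) second.
CREATE NONCLUSTERED INDEX IX_Records_IsActive_CreatedAt
    ON dbo.Records (IsActive, CreatedAt);

-- A query shaped like this can seek on both columns of the index.
-- The cut-off date is an arbitrary placeholder.
SELECT Id, CreatedAt
FROM dbo.Records
WHERE IsActive = 1
  AND CreatedAt >= '20110101';
```

Putting the equality column first is a common rule of thumb rather than a guarantee; the actual execution plan is worth checking against real data.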
An indexed view may well be your best approach, as that way you can create the closest thing to a partial index that is available in SQL Server 2005 (which your tags suggest you are using). See http://technet.microsoft.com/en-us/library/cc917715.aspx#XSLTsection124121120120. This will create an index based on your general search/join/order criteria, but only on the relevant rows (ignoring the others entirely).
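A minimal sketch of what such an indexed view might look like, again assuming a hypothetical `dbo.Records` table with `Id`, `IsActive` and `CreatedAt` columns:

```sql
-- Indexed views require these session settings when the view and index are created.
SET ANSI_NULLS ON;
SET QUOTED_IDENTIFIER ON;
GO

-- Hypothetical table; only the columns relevant to the example are shown.
CREATE TABLE dbo.Records (
    Id        INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    IsActive  BIT      NOT NULL,
    CreatedAt DATETIME NOT NULL
);
GO

-- The view must be schema-bound, use two-part names, and list its columns explicitly.
CREATE VIEW dbo.ActiveRecords
WITH SCHEMABINDING
AS
SELECT Id, CreatedAt
FROM dbo.Records
WHERE IsActive = 1;
GO

-- The first index on a view must be unique and clustered; this materialises
-- only the active rows.
CREATE UNIQUE CLUSTERED INDEX IX_ActiveRecords_Id
    ON dbo.ActiveRecords (Id);
GO

-- NOEXPAND asks the optimiser to use the view's index directly, which matters
-- on editions that do not match indexed views automatically.
SELECT Id, CreatedAt
FROM dbo.ActiveRecords WITH (NOEXPAND);
```

The trade-off is that every insert or update touching `IsActive` also has to maintain the view's index, which at 500 writes a day is unlikely to be noticeable.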
Better still, if you can use SQL Server 2008, then use a filtered index (what Microsoft have decided to call partial indices). See http://technet.microsoft.com/en-us/library/cc280372.aspx for more on them.
If you'd tagged with 2008 rather than 2005 I'd definitely be suggesting filtered indices. As it is, I'd probably go for the indexed view, but might just go for the multi-column index.
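For comparison, a filtered index on SQL Server 2008 or later could be sketched as follows. The table and column names (`dbo.Records`, `IsActive`, `CreatedAt`) are hypothetical, and the choice of `CreatedAt` as the key column is only an assumption about how the active rows are usually sorted or searched:

```sql
-- Hypothetical table; only the columns relevant to the example are shown.
CREATE TABLE dbo.Records (
    Id        INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    IsActive  BIT      NOT NULL,
    CreatedAt DATETIME NOT NULL
);

-- Filtered index: only rows with IsActive = 1 are stored in the index,
-- so it stays at roughly 100 entries however large the table grows.
CREATE NONCLUSTERED INDEX IX_Records_Active_CreatedAt
    ON dbo.Records (CreatedAt)
    WHERE IsActive = 1;

-- The query predicate should match the index filter for the index to be usable.
SELECT Id, CreatedAt
FROM dbo.Records
WHERE IsActive = 1
ORDER BY CreatedAt;
```

One thing to watch: if the `IsActive` value is passed as a parameter rather than written as a literal, the optimiser may not be able to prove the filter matches and can fall back to another access path.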
Index the active field and rebuild the index each weekend and you will be good for ages if it's really only 500 records a day.
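A sketch of the weekly maintenance step, assuming a hypothetical table `dbo.Records` and index `IX_Records_IsActive` (both names are illustrative). It could be scheduled as a SQL Server Agent job, or run by hand:

```sql
-- Hypothetical table and index; only the relevant columns are shown.
CREATE TABLE dbo.Records (
    Id        INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    IsActive  BIT      NOT NULL,
    CreatedAt DATETIME NOT NULL
);

CREATE NONCLUSTERED INDEX IX_Records_IsActive
    ON dbo.Records (IsActive);

-- Check how fragmented the table's indexes are before deciding to rebuild.
SELECT index_id, avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID('dbo.Records'), NULL, NULL, 'LIMITED');

-- Rebuild the index; this also refreshes its statistics.
ALTER INDEX IX_Records_IsActive ON dbo.Records REBUILD;
```

At this write volume the fragmentation figure may stay low for a long time, in which case a weekly rebuild costs little but may also gain little.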
365 days times 500 is 182,500 rows a year. You wrote that there will be "millions of records in the table", but at only 500 a day even two million would take about eleven years.
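The arithmetic can be checked directly in a query window. The two-million target is an assumed reading of "millions", not a figure from the question:

```sql
-- Rows added per year at 500 inserts a day.
SELECT 365 * 500 AS RowsPerYear;                      -- 182500

-- Years needed to reach an assumed two million rows.
SELECT 2000000 / (365 * 500.0) AS YearsToTwoMillion;  -- just under 11
```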
An index is probably the way to go for performance on a table like that.
You could also consider moving data that you are sure you won't use, except in certain specific reports, into another table.
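If you do go the second-table route, one way to move rows across is a single `DELETE ... OUTPUT INTO` statement, which is available from SQL Server 2005. This is only a sketch: the table names, columns and the twelve-month cut-off are all hypothetical.

```sql
-- Hypothetical main table; only the relevant columns are shown.
CREATE TABLE dbo.Records (
    Id        INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    IsActive  BIT      NOT NULL,
    CreatedAt DATETIME NOT NULL
);

-- Hypothetical archive table with the same columns but no IDENTITY,
-- so the original Id values can be copied across unchanged.
CREATE TABLE dbo.RecordsArchive (
    Id        INT      NOT NULL PRIMARY KEY,
    IsActive  BIT      NOT NULL,
    CreatedAt DATETIME NOT NULL
);

-- Move inactive rows older than twelve months in one atomic statement:
-- each deleted row is written to the archive table as it is removed.
DELETE FROM dbo.Records
OUTPUT DELETED.Id, DELETED.IsActive, DELETED.CreatedAt
INTO dbo.RecordsArchive (Id, IsActive, CreatedAt)
WHERE IsActive = 0
  AND CreatedAt < DATEADD(month, -12, GETDATE());
```

Note that the target of `OUTPUT INTO` cannot have enabled triggers or take part in a foreign key relationship, so the archive table is best kept plain.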