SQL Server 2000 表优化
我一整天都在优化一个大约有 900 万行的 SQL Server 2000 数据库表。我唯一的数据库经验是使用具有数百行的表,因此我从未真正需要处理优化。
我正在根据 21 位数字进行选择和更新。
使用索引 char(21) 类型,查询需要超过 2 秒的时间,并且 SQL Server 进程需要 1.5gigs 的 ram 和 100% 的 cpu。
使用索引 bigint 类型时,我的查询需要几毫秒,并且该过程需要约 100MB 的内存。
我只是想了解这里发生了什么,这是正常的,还是有一种特定的方法可以对 char 类型进行索引以获得更好的性能?
这是我的一些 sql:
CREATE TABLE data
(
tableID int NOT NULL IDENTITY PRIMARY KEY,
tag char(21) NOT NULL UNIQUE,
dataColumn1 datetime NULL,
dataColumn2 char(8) NULL,
lastModified datetime NOT NULL
)
来自 c# 的参数化查询:
SELECT tag FROM data WHERE tag = @tag;
感谢您的帮助。
Ive been working all day on optimizing a SQL Server 2000 db table with ~9million rows. My only db experience has been with tables with a few hundred rows, so I've never really had to deal with optimization.
I am doing selects and updates based on a 21 digit number.
Using an indexed char(21) type, the queries take more then 2 seconds and the SQL Server process takes 1.5gigs of ram and 100% cpu.
With a indexed bigint type my queries take a few milliseconds and the process takes ~100MB of ram.
I just want to understand what is happening here, is this normal, or is there a specific way a char type can be indexed for better performance?
Heres some of my sql:
CREATE TABLE data
(
tableID int NOT NULL IDENTITY PRIMARY KEY,
tag char(21) NOT NULL UNIQUE,
dataColumn1 datetime NULL,
dataColumn2 char(8) NULL,
lastModified datetime NOT NULL
)
Parameterized query from c#:
SELECT tag FROM data WHERE tag = @tag;
Thanks for any help.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
这确实并不罕见,SQL 处理数字比处理字符要好得多。 bigInt 字段使用 8 个字节,正好适合一个内存页。一个 char 字段需要 21 个字节,这几乎是将其放入索引中的存储量的三倍。
另一个考虑因素是索引是否聚集?聚集索引的执行速度比非聚集索引快得多。除了数字性能更好并且在索引中使用更少空间这一简单而笼统的说法之外,还有许多其他因素需要考虑。
This really isn't unusual, SQL handles numbers much better than characters. A bigInt field uses 8 bytes, which fits neatly into a memory page. A char field takes 21 bytes which almost triples the amount of storage to put it in an index..
Another consideration, is either index clustered? The clustered index will perform much faster than a non-clustered index. There are a lot of additional factors to consider, beyond the simple, general statement that numbers will perform better and use less space in the index.
字符比较速度稍慢 - 必须考虑排序规则 - 更不用说 21 个字符的字符串和 bigint(8 字节)之间的物理大小差异。索引查找的效率不高,因为它必须评估 char(21) 值中的每个字节,决定字符的排序顺序,然后决定如何与您要查找的值中的匹配字符进行比较。
聚集索引对于几乎所有查询都会有更好的性能,因为数据(包括索引页 iirc;我不是 DBA)是按磁盘查找顺序排列的。或者至少更接近它。
Character comparisons are somewhat slower - the collation sequence has to be considered - not to mention the physical difference in size between a 21-character string and a bigint (8 bytes). The index seek just can't be as efficient because it has to evaluate each byte in your char(21) values, decide what the character's sort order is, then decide how that compares to the matching character in the value you're looking for.
A clustered index will perform better for almost any query because the data (including the index pages iirc; I'm not a DBA) are in disk-seek order. Or at least closer to it.
我的钱花在您尝试查询 char 列的某些函数的可能性上,即:
显然我只是在猜测,但这是
char
/ 性能问题的常见来源varchar
/nvarchar
列。如果您正在编写这样的查询,您必须意识到将列包装在函数中会阻止 SQL Server 执行索引查找。还要考虑您实际返回的行数;表中的 900 万行不一定很多,但结果集中即使有 100,000 行也是一个巨大的数量。有时瓶颈只是流式传输所有结果。
如果这些都不能描述您的问题,那么我建议使用有关架构、您拥有的索引以及运行缓慢的查询的一些信息来更新您的帖子。尽管 SQL Server 2000 已经发展多年,但您还可以进行进一步的优化,因此您的选择很有限。
根据您发布的架构,假设您在
tag
列上有一个索引,并且您仅选择tag
并且没有其他可能未被覆盖的列,那么您的 < code>WHERE 条件是tag
列上的简单相等,并且返回的结果数量相当小,我认为您无法进行任何进一步的优化。您已经将其简化为最简单的情况。My money's on the possibility that you're trying to query on some function of the char column, i.e. something like:
Obviously I'm just guessing, but this is a common source of performance issues with
char
/varchar
/nvarchar
columns. If you're writing queries like this, you have to realize that wrapping the column in a function prevents SQL Server from performing an index seek.Also take into account the number of rows you're actually returning; 9 million rows isn't necessarily a lot to have in the table, but even 100,000 rows is a huge amount to have in a result set. Sometimes the bottleneck is just streaming all the results.
If none of this describes your problem, then I suggest updating your post with some information about the schema, the indexes you have, and the query that's running slow. There are further optimizations you can do, although SQL Server 2000 is getting on in years, so your options are limited.
Based on the schema you posted, assuming you have an index on the
tag
column, that you are only selectingtag
and no other columns that might not be covered, that yourWHERE
condition is a simple equality on thetag
column, and that the number of results returned is reasonably small, I don't think that there are any further optimizations you can do. You've pretty much reduced this to the simplest possible case.