Long and thin SQL Server 2005 table
I have an SQL table with the following structure:
Code1 - int
Code2 - int
Val1 - real
Val2 - real
Val3 - real
There is only one index (clustered), where Code1 is the first indexed column and Code2 is the second. The size of a single record is 20 bytes.
I need to be able to store around 150,000,000 records, and the biggest select operation would be on 500,000 records. I assume that the size of the table will be around 3 GB.
I would like to know whether this design will work, or whether there might be 'unexplained' problems or slowdowns when dealing with such a big table.
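As a rough check on the 3 GB figure, the arithmetic can be sketched as below. The raw number comes straight from the row count and record size given above. The per-row and per-page overheads are assumptions based on the commonly documented estimate for the leaf level of a clustered index (a unique key, no variable-length columns, 100% fill factor), so treat the second figure as an approximation rather than a measured size.

```python
ROWS = 150_000_000        # from the question
FIXED_DATA_BYTES = 20     # 2 x int (4 bytes) + 3 x real (4 bytes)
NUM_COLS = 5

# Raw payload only: this is where the 3 GB estimate comes from.
raw_bytes = ROWS * FIXED_DATA_BYTES

# Assumed storage overheads (not stated in the question):
#   null bitmap = 2 + (columns + 7) // 8 bytes, row header = 4 bytes,
#   2-byte slot entry per row, 8096 usable bytes per 8192-byte page.
null_bitmap = 2 + (NUM_COLS + 7) // 8
row_size = FIXED_DATA_BYTES + null_bitmap + 4
rows_per_page = 8096 // (row_size + 2)
leaf_pages = -(-ROWS // rows_per_page)   # ceiling division
leaf_bytes = leaf_pages * 8192

print(f"raw payload : {raw_bytes / 1e9:.2f} GB")
print(f"leaf level  : {leaf_bytes / 1e9:.2f} GB "
      f"({rows_per_page} rows/page, {leaf_pages} pages)")
```

Under these assumptions the on-disk size comes out noticeably above the raw payload, which is worth keeping in mind when sizing storage and memory.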
Comments (2)
Basically, a table with 150 million rows is nothing for SQL Server - it won't even break a sweat :-)
The point really is: how do you access the data? What kind of queries will you have? For example, if your queries have a WHERE clause on just the second column (Code2), then a clustered index on (Code1, Code2) is not a good setup, because the index is ordered by Code1 first and cannot be used to seek on Code2 alone.
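The effect of column order in a composite key can be seen in a small sketch. It uses Python's built-in `sqlite3` module as a stand-in, not SQL Server, so the plan text and optimizer details differ; the table name `Measurements` and the sample data are hypothetical. A `WITHOUT ROWID` table with a composite primary key is used here because it stores rows in key order, which is loosely analogous to a clustered index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table mirroring the question's columns; the composite
# primary key plays the role of the clustered index on (Code1, Code2).
conn.execute("""
    CREATE TABLE Measurements (
        Code1 INTEGER NOT NULL,
        Code2 INTEGER NOT NULL,
        Val1  REAL,
        Val2  REAL,
        Val3  REAL,
        PRIMARY KEY (Code1, Code2)
    ) WITHOUT ROWID
""")
conn.executemany(
    "INSERT INTO Measurements VALUES (?, ?, ?, ?, ?)",
    [(c1, c2, 0.0, 0.0, 0.0) for c1 in range(100) for c2 in range(100)],
)

def plan(sql):
    """Return the detail text of SQLite's query plan for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

# Filter on the leading key column: the key can be used for a seek.
plan_code1 = plan("SELECT * FROM Measurements WHERE Code1 = 42")
# Filter on the second key column only: the whole table is scanned.
plan_code2 = plan("SELECT * FROM Measurements WHERE Code2 = 42")

print(plan_code1)
print(plan_code2)
```

On SQL Server the equivalent check would be to compare the execution plans of the two queries; if Code2-only lookups turn out to be common, a separate nonclustered index on Code2 is the usual remedy.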
Also: how is the data distributed in your fields? Which are selective, and which are more uniform? If Code1 or Code2 is highly selective (e.g. a single value selects significantly less than 2% of the data), then use that field in your selects whenever possible. Indexing something like a "gender" field, which might have only two or three distinct values, won't really help, since any select that filters on that field will return far too much data for the index to be effective.
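To make the selectivity rule of thumb concrete, here is a minimal sketch that measures the largest fraction of rows a single value would select and compares it with the 2% figure mentioned above. The helper name `worst_case_selectivity` and both sample columns are made up for illustration; on a real table the same numbers would come from a `GROUP BY` count or the index statistics.

```python
from collections import Counter

THRESHOLD = 0.02  # the "significantly less than 2%" rule of thumb above

def worst_case_selectivity(values):
    """Largest fraction of rows returned by filtering on one value."""
    counts = Counter(values)
    return max(counts.values()) / len(values)

# Hypothetical sample columns with 10,000 rows each.
gender_like = ["M", "F"] * 5000                 # 2 distinct values
code_like = [i % 1000 for i in range(10_000)]   # 1,000 distinct values

gender_fraction = worst_case_selectivity(gender_like)
code_fraction = worst_case_selectivity(code_like)

for name, fraction in [("gender_like", gender_fraction),
                       ("code_like", code_fraction)]:
    verdict = "selective" if fraction < THRESHOLD else "not selective"
    print(f"{name}: {fraction:.3%} of rows per value -> {verdict}")
```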
There is a very complete question about SQL performance and large tables: Very large tables in SQL Server