寻找 ROW_OVERFLOW_DATA 如何发生的精确性
我目前正处于计划重写 CRM 应用程序中的大型模块的初始阶段。
我目前正在研究的一个领域是数据库优化,我还没有做出任何决定,但我只是想确保我正确理解 ROW_OVERFLOW_DATA 的概念 - http://msdn.microsoft.com/en-us/library/ms186981.aspx
我们使用的是 SQL Server 2005,据我了解,行大小限制为 8,060 字节,超过此限制将会发生溢出。
我运行了一个查询来获取特定读取密集型数据库的最大行大小,
SELECT OBJECT_NAME (sc.[id]) tablename
, COUNT (1) nr_columns
, SUM (sc.length) maxrowlength
FROM syscolumns sc
join sysobjects so
on sc.[id] = so.[id]
WHERE so.xtype = 'U'
GROUP BY OBJECT_NAME (sc.[id])
ORDER BY SUM (sc.length) desc
这给了我一些 maxrowlength 略高于 8,000 但低于 10,000 的表。另一个查询显示平均行大小实际上非常小,大约 1,000 字节。
我的问题是: ROW_OVERFLOW_DATA 是基于每行还是每列?一旦扩展了 8,060 字节限制,导致溢出的整个列是否会移至另一页,还是只是特定行?
例如,给出以下简化模式:
col1 (int) | col 2 (varchar (4000)) | col 3(varchar(5000))
1 | 4000 characters | 5000 characters ***This row is overflowing
2 | 4000 characters | 100 characters
3 | 150 characters | 150 characters
4 | 500 characters | 600 characters
第 1 行到第 4 行的每个第 3 列都会被 24 字节指针替换还是仅被 rowID 1 替换?
我想知道,因为如果每一行都有一个指针,那么修复它就变得很重要,如果只有几行,也许我们会受到性能影响。
另外,我看到许多博客建议将可为空的列移到数据库的末尾,这样如果值实际上为 NULL,它们就不会占用任何行空间。这是真的吗?我们倾向于将时间戳和跟踪列保留在最后,因为它更容易可视化。现在我想知道我们是否不应该将它们进一步移动,因为它们永远不会为空。
I'm currently in the initial phases of planning a rewrite for a large module in our CRM application.
One area I am currently looking into is database optimization, I haven't made any decision yet but I just want to make sure I understand the concept of ROW_OVERFLOW_DATA properly - http://msdn.microsoft.com/en-us/library/ms186981.aspx
We are using SQL server 2005, it's my understanding that the row size limit is 8,060 bytes and that after that overflow will occur.
I ran a query to get my max row size for a particular read intensive database
SELECT OBJECT_NAME (sc.[id]) tablename
, COUNT (1) nr_columns
, SUM (sc.length) maxrowlength
FROM syscolumns sc
join sysobjects so
on sc.[id] = so.[id]
WHERE so.xtype = 'U'
GROUP BY OBJECT_NAME (sc.[id])
ORDER BY SUM (sc.length) desc
This gave me a few tables with a maxrowlength that was sligtly above 8,000, but under 10,000. Another query shows that the average row size is actually quite small, around 1,000 bytes.
My question is: is ROW_OVERFLOW_DATA based on each row or is it per column? Once the 8,060 bytes limit is expanded is the entire column that caused it to overflow moved to another page or is it only the specific row?
So for example given the following simplified schema:
col1 (int) | col 2 (varchar (4000)) | col 3(varchar(5000))
1 | 4000 characters | 5000 characters ***This row is overflowing
2 | 4000 characters | 100 characters
3 | 150 characters | 150 characters
4 | 500 characters | 600 characters
Would every the col 3 of row 1 to 4 get replaced by a 24 bytes pointer or only rowID 1?
I am wondering cause if it's every row gets a pointer it becomes important to fix it, if it's only a few rows maybe we can take the performance hit.
Also, I've seen many blogs suggesting to move nullable columns toward the end of the database so that if the values are in fact NULL they don't take any row space. Is this true? We tend to keep our timestamp and tracking columns at the end cause it's easier to visualize. Now I am wondering if maybe we shouldn't move them further up as they are never NULL.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
如果有一亿行数据溢出,您会移动整列吗?否。
仅供参考,来自 Paul Randal 的 technet 文章是这个东西的上帝(我的大胆)
和 MSDN (我的粗体)
至于您的 NULLable 列,这是错误的。无论表定义中的列顺序如何,可为 NULL 的列都存储在磁盘结构的末尾。以及来自 的参考Paul Randal:存储引擎内部:再次剖析记录。任何以前的答案我在这里
If you have one row in, say, a 100 million that overflows would you move the whole column? No.
For reference, a technet article from Paul Randal who is the God of this stuff (my bold)
And MSDN (my bold)
As for your NULLable columns, this is false. NULLable columns are stored at the end of the disk structure anyway regardless of column order in the table definition. And a reference from Paul Randal: Inside the Storage Engine: Anatomy of a record again. Any some previous answers from me here on SO
仅当某一特定行溢出时,该行的违规数据才会被移至单独的溢出页中 - 想象一下,如果仅仅因为一列中的一个值溢出而整个表需要重建,那会多么令人头痛!
我没有听说过将 NULLables 移动到表末尾的想法 - 我必须检查一下!
Only if a particular row overflows will the offending data for that row be moved off into a separate overflow page - imagine the headache if the entire table needed rebuilding just because one value in one column overflowed!
I'd not heard of the idea of moving NULLables to the end of the table - I'll have to check into that!