寻找 ROW_OVERFLOW_DATA 如何发生的精确性

发布于 2024-11-08 01:59:22 字数 1381 浏览 5 评论 0原文

我目前正处于计划重写 CRM 应用程序中的大型模块的初始阶段。

我目前正在研究的一个领域是数据库优化,我还没有做出任何决定,但我只是想确保我正确理解 ROW_OVERFLOW_DATA 的概念 - http://msdn.microsoft.com/en-us/library/ms186981.aspx

我们使用的是 SQL Server 2005,据我了解,行大小限制为 8,060 字节,超过此限制将会发生溢出。

我运行了一个查询来获取特定读取密集型数据库的最大行大小,

SELECT OBJECT_NAME (sc.[id]) tablename
, COUNT (1) nr_columns
, SUM (sc.length) maxrowlength
FROM syscolumns sc
join sysobjects so
on sc.[id] = so.[id]
WHERE so.xtype = 'U'
GROUP BY OBJECT_NAME (sc.[id])
ORDER BY SUM (sc.length) desc

这给了我一些 maxrowlength 略高于 8,000 但低于 10,000 的表。另一个查询显示平均行大小实际上非常小,大约 1,000 字节。

我的问题是: ROW_OVERFLOW_DATA 是基于每行还是每列?一旦扩展了 8,060 字节限制,导致溢出的整个列是否会移至另一页,还是只是特定行?

例如,给出以下简化模式:

col1 (int) | col 2 (varchar (4000)) | col 3(varchar(5000))
    1      |    4000 characters   |    5000 characters ***This row is overflowing
    2      |    4000 characters   |    100 characters
    3      |    150 characters    |    150 characters
    4      |    500 characters    |    600 characters

第 1 行到第 4 行的每个第 3 列都会被 24 字节指针替换还是仅被 rowID 1 替换?

我想知道,因为如果每一行都有一个指针,那么修复它就变得很重要,如果只有几行,也许我们会受到性能影响。

另外,我看到许多博客建议将可为空的列移到数据库的末尾,这样如果值实际上为 NULL,它们就不会占用任何行空间。这是真的吗?我们倾向于将时间戳和跟踪列保留在最后,因为它更容易可视化。现在我想知道我们是否不应该将它们进一步移动,因为它们永远不会为空。

I'm currently in the initial phases of planning a rewrite for a large module in our CRM application.

One area I am currently looking into is database optimization, I haven't made any decision yet but I just want to make sure I understand the concept of ROW_OVERFLOW_DATA properly - http://msdn.microsoft.com/en-us/library/ms186981.aspx

We are using SQL server 2005, it's my understanding that the row size limit is 8,060 bytes and that after that overflow will occur.

I ran a query to get my max row size for a particular read intensive database

SELECT OBJECT_NAME (sc.[id]) tablename
, COUNT (1) nr_columns
, SUM (sc.length) maxrowlength
FROM syscolumns sc
join sysobjects so
on sc.[id] = so.[id]
WHERE so.xtype = 'U'
GROUP BY OBJECT_NAME (sc.[id])
ORDER BY SUM (sc.length) desc

This gave me a few tables with a maxrowlength that was sligtly above 8,000, but under 10,000. Another query shows that the average row size is actually quite small, around 1,000 bytes.

My question is: is ROW_OVERFLOW_DATA based on each row or is it per column? Once the 8,060 bytes limit is expanded is the entire column that caused it to overflow moved to another page or is it only the specific row?

So for example given the following simplified schema:

col1 (int) | col 2 (varchar (4000)) | col 3(varchar(5000))
    1      |    4000 characters   |    5000 characters ***This row is overflowing
    2      |    4000 characters   |    100 characters
    3      |    150 characters    |    150 characters
    4      |    500 characters    |    600 characters

Would every the col 3 of row 1 to 4 get replaced by a 24 bytes pointer or only rowID 1?

I am wondering cause if it's every row gets a pointer it becomes important to fix it, if it's only a few rows maybe we can take the performance hit.

Also, I've seen many blogs suggesting to move nullable columns toward the end of the database so that if the values are in fact NULL they don't take any row space. Is this true? We tend to keep our timestamp and tracking columns at the end cause it's easier to visualize. Now I am wondering if maybe we shouldn't move them further up as they are never NULL.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

野侃 2024-11-15 01:59:22

如果有一亿行数据溢出,您会移动整列吗?否。

仅供参考,来自 Paul Randal 的 technet 文章是这个东西的上帝(我的大胆)

您使用的行溢出功能非常适合允许偶尔的行超过 8,060 字节,但它不太适合大多数行 strong> 过大可能会导致查询性能下降,正如您所经历的那样。

这样做的原因是,当一行即将变得过大时,该行中的一个可变长度列会被推到“行外”。这意味着该列取自数据或索引页上的行并移动到文本页。替换旧列值的是一个指向数据文件中列值的新位置的指针。

MSDN (我的粗体)

ROW_OVERFLOW_DATA 分配单元

对于表(堆或聚集表)、索引或索引视图使用的每个分区,都有一个 ROW_OVERFLOW_DATA 分配单元。此分配单元包含零 (0) 个页,直到 IN_ROW_DATA 分配单元中具有可变长度列(varchar、nvarchar、varbinary 或 sql_variant)的数据行超过 8 KB 行大小限制。当达到大小限制时,SQL Server 将最大宽度的列从该行移动到 ROW_OVERFLOW_DATA 分配单元中的页面。指向该行外数据的 24 字节指针保留在原始页上。

至于您的 NULLable 列,这是错误的。无论表定义中的列顺序如何,可为 NULL 的列都存储在磁盘结构的末尾。以及来自 的参考Paul Randal:存储引擎内部:再次剖析记录。任何以前的答案我在这里

If you have one row in, say, a 100 million that overflows would you move the whole column? No.

For reference, a technet article from Paul Randal who is the God of this stuff (my bold)

The feature you are using, row-overflow, is great for allowing the occasional row to be longer than 8,060 bytes, but it is not well suited for the majority of rows being oversized and can lead to a drop in query performance, as you are experiencing.

The reason for this is that when a row is about to become oversized, one of the variable-length columns in the row is pushed "off-row." This means the column is taken from the row on the data or index page and moved to a text page. In place of the old column value, a pointer is substituted that points to the new location of the column value in the data file.

And MSDN (my bold)

ROW_OVERFLOW_DATA Allocation Unit

For every partition used by a table (heap or clustered table), index, or indexed view, there is one ROW_OVERFLOW_DATA allocation unit. This allocation unit contains zero (0) pages until a data row with variable length columns (varchar, nvarchar, varbinary, or sql_variant) in the IN_ROW_DATA allocation unit exceeds the 8 KB row size limit. When the size limitation is reached, SQL Server moves the column with the largest width from that row to a page in the ROW_OVERFLOW_DATA allocation unit. A 24-byte pointer to this off-row data is maintained on the original page.

As for your NULLable columns, this is false. NULLable columns are stored at the end of the disk structure anyway regardless of column order in the table definition. And a reference from Paul Randal: Inside the Storage Engine: Anatomy of a record again. Any some previous answers from me here on SO

只怪假的太真实 2024-11-15 01:59:22

仅当某一特定行溢出时,该行的违规数据才会被移至单独的溢出页中 - 想象一下,如果仅仅因为一列中的一个值溢出而整个表需要重建,那会多么令人头痛!

我没有听说过将 NULLables 移动到表末尾的想法 - 我必须检查一下!

Only if a particular row overflows will the offending data for that row be moved off into a separate overflow page - imagine the headache if the entire table needed rebuilding just because one value in one column overflowed!

I'd not heard of the idea of moving NULLables to the end of the table - I'll have to check into that!

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文