SQL Server 2008 R2 Express DataReader Performance

Posted on 2024-12-23 12:01:58

I have a database that contains 250,000 records. I am using a DataReader to loop the records and export to a file. Just looping the records with a DataReader and no WHERE conditions is taking approx 22 minutes. I am only selecting two columns (the id and a nvarchar(max) column with about 1000 characters in it).

Does 22 minutes sound correct for SQL Server Express? Would the 1GB of RAM or 1CPU have an impact on this?
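
The kind of loop being described looks roughly like this (a sketch only; the connection string, file path, table, and column names are placeholders, not the asker's actual code):

using System.Data;
using System.Data.SqlClient;
using System.IO;

// Sketch: stream rows out to a tab-delimited file.
using (SqlConnection conn = new SqlConnection(@"Data Source=.\SQLEXPRESS;Initial Catalog=MyDb;Integrated Security=True"))
using (SqlCommand cmd = new SqlCommand("SELECT id, comments FROM myTable", conn))
using (StreamWriter writer = new StreamWriter(@"C:\export\output.txt"))
{
    conn.Open();
    // SequentialAccess streams large nvarchar(max) values instead of
    // buffering each whole row; columns must then be read in ordinal order.
    using (SqlDataReader reader = cmd.ExecuteReader(CommandBehavior.SequentialAccess))
    {
        while (reader.Read())
        {
            writer.Write(reader.GetInt32(0));
            writer.Write('\t');
            writer.WriteLine(reader.GetString(1));
        }
    }
}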

Comments (2)

睫毛溺水了 2024-12-30 12:01:58

22 minutes sounds way too long for a single basic (non-aggregating) SELECT against 250K records (even 22 seconds sounds awfully long for that to me).

To say why, it would help if you could post some code and your schema definition. Do you have any triggers configured?

With 1K characters in each record (2KB), 250K records (500MB) should fit within SQL Express' 1GB limit, so memory shouldn't be an issue for that query alone.

Possible causes of the performance problems you're seeing include:

  • Contention from other applications
  • Having rows that are much wider than just the two columns you mentioned
  • Excessive on-disk fragmentation of either the table or the DB MDF file (one way to check is sketched after this list)
  • A slow network connection between your app and the DB
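
On the fragmentation point, a quick way to measure it (the table name dbo.myTable is a placeholder) is the index physical-stats DMV, available in SQL Server 2005 and later:

SELECT OBJECT_NAME(ips.object_id) AS table_name,
       ips.index_id,
       ips.avg_fragmentation_in_percent,
       ips.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.myTable'), NULL, NULL, 'LIMITED') AS ips;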

Update: I did a quick test. On my machine, reading 250K 2KB rows with a SqlDataReader takes under 1 second.

First, create a test table with 256K rows (this only took about 30 seconds):

CREATE TABLE dbo.data (num int PRIMARY KEY, val nvarchar(max))
GO
-- Seed one row with a ~1000-character payload
DECLARE @txt nvarchar(max)
SET @txt = N'put 1000 characters here....'
INSERT dbo.data VALUES (1, @txt);
GO
-- Double the row count 18 times: 2^18 = 262,144 rows
INSERT dbo.data
    SELECT num + (SELECT COUNT(*) FROM dbo.data), val FROM dbo.data
GO 18
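
A quick sanity check that the doubling produced the expected row count (it matches the SelectRows figure in the results below):

SELECT COUNT(*) FROM dbo.data;  -- 2^18 = 262,144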

Test web page to read data and display the statistics:

using System;
using System.Collections;
using System.Data.SqlClient;
using System.Text;

public partial class pages_default
{
    protected override void OnLoad(EventArgs e)
    {
        base.OnLoad(e);
        using (SqlConnection conn = new SqlConnection(DAL.ConnectionString))
        {
            using (SqlCommand cmd = new SqlCommand("SELECT num, val FROM dbo.data", conn))
            {
                conn.Open();
                // Collect client-side provider statistics for this connection
                conn.StatisticsEnabled = true;
                using (SqlDataReader reader = cmd.ExecuteReader())
                {
                    // Drain the result set without touching any column values
                    while (reader.Read())
                    {
                    }
                }
                // Dump every statistic (ExecutionTime, BytesReceived, etc.)
                StringBuilder result = new StringBuilder();
                IDictionary stats = conn.RetrieveStatistics();
                foreach (string key in stats.Keys)
                {
                    result.Append(key);
                    result.Append(" = ");
                    result.Append(stats[key]);
                    result.Append("<br/>");
                }
                this.info.Text = result.ToString();
            }
        }
    }
}

Results (ExecutionTime in milliseconds):

IduRows = 0
Prepares = 0
PreparedExecs = 0
ConnectionTime = 930
SelectCount = 1
Transactions = 0
BytesSent = 88
NetworkServerTime = 0
SumResultSets = 1
BuffersReceived = 66324
BytesReceived = 530586745
UnpreparedExecs = 1
ServerRoundtrips = 1
IduCount = 0
BuffersSent = 1
ExecutionTime = 893
SelectRows = 262144
CursorOpens = 0

I repeated the test with SQL Enterprise and SQL Express, with similar results.

Capturing the "val" element from each row (string val = (string)reader["val"];) increased ExecutionTime to 4093 ms. Using DataTable.Load(reader) took about 4600 ms.
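
A slightly tighter way to capture the values is cached-ordinal, typed reads against the same dbo.data table (a sketch of a common micro-optimization over the string indexer, not something measured above):

using System.Data.SqlClient;

// Sketch: cache the column ordinals once, then do typed reads per row,
// avoiding the per-row name lookup that reader["val"] performs.
static void Drain(SqlDataReader reader)
{
    int numOrdinal = reader.GetOrdinal("num");
    int valOrdinal = reader.GetOrdinal("val");
    while (reader.Read())
    {
        int num = reader.GetInt32(numOrdinal);
        string val = reader.GetString(valOrdinal);
    }
}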

Running the same query in SSMS took about 8 seconds to capture all 256K rows.

黑寡妇 2024-12-30 12:01:58

Your results from running exec sp_spaceused myTable provide a potential hint:

rows = 255,000
reserved = 1994320 KB
data = 1911088 KB
index_size = 82752 KB
unused = 480 KB

The important thing to note here is reserved = 1994320 KB, meaning your table takes roughly 1.9 GB. When reading fields that are not indexed (NVARCHAR(MAX) cannot be indexed), SQL Server must read the entire row into memory before restricting the columns, so you're easily running past the 1GB RAM limit.

As a simple test, delete the last (or first) 150k rows and try the query again to see what performance you get.
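
A sketch of that test (myTable is the placeholder name used in this answer; run it against a throwaway copy, since it destroys data; TOP without ORDER BY removes an arbitrary 150k rows, which is fine for a size test):

DELETE TOP (150000) FROM myTable;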

A few questions:

  • Does your table have a clustered index on the primary key (is it the id field or something else)?
  • Are you sorting on a column that is not indexed, such as the nvarchar(max) field?

In the best scenario for you, your PK is id and is also the clustered index, and you either have no ORDER BY or you ORDER BY id.

Assuming your nvarchar(max) field is named comments:

SELECT id, comments
FROM myTable
ORDER BY id

This will work fine, but it requires reading all the rows into memory (though it only makes one pass over the table). Since comments is NVARCHAR(MAX) and cannot be indexed, and the table is ~2GB, SQL will have to load the table into memory in parts.

Likely what is happening is you have something like this:

SELECT id, comments
FROM myTable
ORDER BY comment_date

Where comment_date is an additional field that is not indexed. In that case SQL would be unable to sort all the rows in memory, and it would end up paging the table in and out of memory several times, likely causing the problem you are seeing.

A simple solution in this case is to add an index to comment_date.
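
A sketch of that fix, using the hypothetical names from this answer:

-- Index the sort column so the ORDER BY no longer forces a full scan-and-sort
CREATE INDEX IX_myTable_comment_date ON myTable (comment_date);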

But suppose that is not possible because you only have read access to the database; then another solution is to copy the data you want into a table variable (a local temp table would work similarly) and sort that:

DECLARE @T TABLE
(
    id BIGINT,
    comments NVARCHAR(MAX),
    comment_date date
)

INSERT INTO @T SELECT id, comments, comment_date FROM myTable

SELECT id, comments
FROM @T
ORDER BY comment_date

If this doesn't help, additional information is required: please post your actual query along with your entire table definition and what the indexes are.

Beyond all of this, run the following after you restore backups, to rebuild indexes and statistics; you could just be suffering from corrupted statistics (which happens when you back up a fragmented database and then restore it to a new instance):

EXEC [sp_MSforeachtable] @command1="RAISERROR('UPDATE STATISTICS(''?'') ...',10,1) WITH NOWAIT UPDATE STATISTICS ? "

EXEC [sp_MSforeachtable] @command1="RAISERROR('DBCC DBREINDEX(''?'') ...',10,1) WITH NOWAIT DBCC DBREINDEX('?')"

EXEC [sp_MSforeachtable] @command1="RAISERROR('UPDATE STATISTICS(''?'') ...',10,1) WITH NOWAIT UPDATE STATISTICS ? "