实体框架缓存的查询计划性能随不同参数而降低
我有以下问题。
背景
我正在尝试使用 MVC3、EF4 和 jquery 在包含 450 万条记录的表上实现自动完成选择器。
这是表:
CREATE TABLE [dbo].[CONSTA] (
[afpCUIT] nvarchar(11) COLLATE Modern_Spanish_CI_AS NOT NULL,
[afpNombre] nvarchar(30) COLLATE Modern_Spanish_CI_AS NULL,
[afpGanancias] varchar(2) COLLATE Modern_Spanish_CI_AS NULL,
[afpIVA] varchar(2) COLLATE Modern_Spanish_CI_AS NULL,
[afpMonot] varchar(2) COLLATE Modern_Spanish_CI_AS NULL,
[afpIntSoc] varchar(1) COLLATE Modern_Spanish_CI_AS NULL,
[afpEmpl] varchar(1) COLLATE Modern_Spanish_CI_AS NULL,
[afpAct] varchar(2) COLLATE Modern_Spanish_CI_AS NULL,
CONSTRAINT [CONSTA_pk] PRIMARY KEY CLUSTERED ([afpCUIT])
)
ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [CONSTA_Nombre_idx] ON [dbo].[CONSTA]
([afpNombre])
WITH (
PAD_INDEX = OFF,
DROP_EXISTING = OFF,
STATISTICS_NORECOMPUTE = OFF,
SORT_IN_TEMPDB = OFF,
ONLINE = OFF,
ALLOW_ROW_LOCKS = OFF,
ALLOW_PAGE_LOCKS = OFF)
ON [PRIMARY]
GO
该表非常静态(它只需要每月批量更新)并且是只读的。
如果有人想下载记录 (54MB),请访问以下 URL:
http:// www.afip.gob.ar/genericos/cInscripcion/22102011.zip
以下是记录描述:
http://www.afip.gob.ar/genericos/cInscripcion/archivoCompleto.asp
这是应用程序的代码:
控制器:
public class AltaMasivaController : Controller
{
//
// GET: /AltaMasiva/
public ActionResult Index()
{
return View();
}
public JsonResult GetUsers(string query)
{
CENT2Entities db = new CENT2Entities();
bool isCUIT = true;
for(int j = 0; j < query.Length; j++)
if (! Char.IsDigit(query, j))
{
isCUIT = false;
break;
}
if (isCUIT)
{
// nvarchar search
var x = from u in db.CONSTA
where u.afpCUIT.StartsWith(query)
orderby u.afpNombre
select new { label = u.afpNombre.TrimEnd(), id = u.afpCUIT };
return Json(x.Take(50), JsonRequestBehavior.AllowGet);
}
else
{
// nvarchar search
var x = from u in db.CONSTA
where u.afpNombre.StartsWith(query)
orderby u.afpNombre
select new { label = u.afpNombre.TrimEnd(), id = u.afpCUIT };
return Json(x.Take(50), JsonRequestBehavior.AllowGet);
}
}
}
视图:
@{
viewbag.title = "index";
}
<h2>index</h2>
@html.textbox("user", "", new { style="width: 400px;" })
<script type="text/javascript">
$("input#user").autocomplete(
{
source: function (request, response)
{
// define a function to call your action (assuming usercontroller)
$.ajax(
{
url: '/altamasiva/getusers', type: "post", datatype: "json",
// query will be the param used by your action method
data: { query: request.term },
success: function(data){
response( $.map(data, function (item){ return { label: item.label + " (" + item.id + ")", value: item.label, id: item.id }; }));
}
})
},
minlength: 1, // require at least one character from the user
});
</script>
现在:
问题
如您所见,如果查询字符串仅包含数字,则代码遵循不同的路径。
当控制器参数的所有字符都是数字时(其中 u.afpCUIT.StartsWith(query) ),查询优化器“应该”执行聚集索引查找(它确实这样做了)并返回它找到的前 50 行。 当第一个“自动完成”字符串到达时(通常最多一两个字符),查询执行速度非常快,但是,当字符串长度增加时,性能显着下降(使用 9 或 9 个字符时,几乎需要 20 秒到 2 分钟)更多字符)。 令人惊讶的是,在“重新启动”SQL Server 服务后,如果初始字符串包含 10 个字符,它的性能也很好,但是当我们从“查询”字符串中删除字符时,性能会下降,完全相反。
为什么会发生这种情况?
当 SQL Server 编译第一个执行计划时,它会对其进行优化,以便在处理大型结果集时快速执行(反之亦然)。后续查询会缩小(或扩展)结果集,需要不同的执行计划...但是...EF 生成的 SQL 使用逗号参数来(精确地)避免语句重新编译...
通过执行清理执行计划缓存:
db.ExecuteStoreCommand("DBCC FREEPROCCACHE");
恢复性能达到出色的响应时间...但是...它会杀死所有数据库中的所有计划,从而降低所有其他缓存计划(通常执行正常)的性能。
对 EF sql 语句进行一些分析后,我在 EF 生成 sql 之前在查询分析器中执行 DBCC FREEPROCCACHE,结果生成了不同的执行计划,所有执行计划都在 250 毫秒范围内执行,与参数长度
DBCC FREEPROCCACHE
exec sp_executesql N'SELECT TOP (50)
[Project1].[C1] AS [C1],
[Project1].[C2] AS [C2],
[Project1].[afpCUIT] AS [afpCUIT]
FROM ( SELECT
[Extent1].[afpCUIT] AS [afpCUIT],
[Extent1].[afpNombre] AS [afpNombre],
1 AS [C1],
RTRIM([Extent1].[afpNombre]) AS [C2]
FROM [dbo].[CONSTA] AS [Extent1]
WHERE [Extent1].[afpCUIT] LIKE @p__linq__0 ESCAPE N''~''
) AS [Project1]
ORDER BY [Project1].[afpNombre] ASC',N'@p__linq__0 nvarchar(4000)',@p__linq__0=N'2023291%'
无关
:一个更优雅的替代品
db.ExecuteStoreCommand("DBCC FREEPROCCACHE");
?
令人惊讶的是,查询的第二条路径(其中 u.afpNombre.StartsWith(query) )没有受到相同问题的影响并且表现良好。显然,当字符串长度改变时,执行计划不会改变...
我在旧版本的 EF 中找到了一个 ObjectContext 参数:
System.Data.EntityClient.EntityCommand.EnablePlanCaching
但在 EF4 中找不到它,并且我不确定全局结果是否会是相同。
我真的很困惑这个问题,不知道真正的问题出在哪里
索引设计不好? 缺少分区? SQL SERVER 2008 Express 版本? EF 生成 SQL? 运气差?
任何帮助都会很棒。 提前致谢!
I have the following problem.
Background
I'm trying to implement an autocomplete selector with MVC3, EF4 and jquery over a table wit 4.5 million records.
This is the table:
CREATE TABLE [dbo].[CONSTA] (
[afpCUIT] nvarchar(11) COLLATE Modern_Spanish_CI_AS NOT NULL,
[afpNombre] nvarchar(30) COLLATE Modern_Spanish_CI_AS NULL,
[afpGanancias] varchar(2) COLLATE Modern_Spanish_CI_AS NULL,
[afpIVA] varchar(2) COLLATE Modern_Spanish_CI_AS NULL,
[afpMonot] varchar(2) COLLATE Modern_Spanish_CI_AS NULL,
[afpIntSoc] varchar(1) COLLATE Modern_Spanish_CI_AS NULL,
[afpEmpl] varchar(1) COLLATE Modern_Spanish_CI_AS NULL,
[afpAct] varchar(2) COLLATE Modern_Spanish_CI_AS NULL,
CONSTRAINT [CONSTA_pk] PRIMARY KEY CLUSTERED ([afpCUIT])
)
ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [CONSTA_Nombre_idx] ON [dbo].[CONSTA]
([afpNombre])
WITH (
PAD_INDEX = OFF,
DROP_EXISTING = OFF,
STATISTICS_NORECOMPUTE = OFF,
SORT_IN_TEMPDB = OFF,
ONLINE = OFF,
ALLOW_ROW_LOCKS = OFF,
ALLOW_PAGE_LOCKS = OFF)
ON [PRIMARY]
GO
The table is pretty static (it only needs a monthly batch update) and read only.
if somebody cares to download the records (54MB) this is the URL:
http://www.afip.gob.ar/genericos/cInscripcion/22102011.zip
and here is the record description:
http://www.afip.gob.ar/genericos/cInscripcion/archivoCompleto.asp
Here is the code of the app:
CONTROLLER:
public class AltaMasivaController : Controller
{
//
// GET: /AltaMasiva/
public ActionResult Index()
{
return View();
}
public JsonResult GetUsers(string query)
{
CENT2Entities db = new CENT2Entities();
bool isCUIT = true;
for(int j = 0; j < query.Length; j++)
if (! Char.IsDigit(query, j))
{
isCUIT = false;
break;
}
if (isCUIT)
{
// nvarchar search
var x = from u in db.CONSTA
where u.afpCUIT.StartsWith(query)
orderby u.afpNombre
select new { label = u.afpNombre.TrimEnd(), id = u.afpCUIT };
return Json(x.Take(50), JsonRequestBehavior.AllowGet);
}
else
{
// nvarchar search
var x = from u in db.CONSTA
where u.afpNombre.StartsWith(query)
orderby u.afpNombre
select new { label = u.afpNombre.TrimEnd(), id = u.afpCUIT };
return Json(x.Take(50), JsonRequestBehavior.AllowGet);
}
}
}
VIEW:
@{
viewbag.title = "index";
}
<h2>index</h2>
@html.textbox("user", "", new { style="width: 400px;" })
<script type="text/javascript">
$("input#user").autocomplete(
{
source: function (request, response)
{
// define a function to call your action (assuming usercontroller)
$.ajax(
{
url: '/altamasiva/getusers', type: "post", datatype: "json",
// query will be the param used by your action method
data: { query: request.term },
success: function(data){
response( $.map(data, function (item){ return { label: item.label + " (" + item.id + ")", value: item.label, id: item.id }; }));
}
})
},
minlength: 1, // require at least one character from the user
});
</script>
And now:
THE PROBLEM
As you can see, The code follows different paths if the query string contains only numbers.
When all the characters of the controller parameter are numbers (where u.afpCUIT.StartsWith(query) ), the query optimizer "is supposed to" perform a clustered index seek (which it does) and return the first 50 rows it finds.
When the first "autocomplete" string arrives (usually one or two characters at most) the query performs extraordinarily fast, but, when the lenght of the string increases, the performance degrades notably (it takes almost between 20 seconds to 2 minutes with 9 or more chars).
Surprisingly, after "restarting" the SQL Server Service, if the initial string contains 10 chars, it performs great too, but the performance degrades when we delete chars from the "query" string, the complete opposite.
Why is this happening?
When SQL server compiles the first execution plan, it optimizes it to perform really fast with a large result set (or viceversa). Subsequent queries, which narrows (or expands) the resultset, require a different execution plan ... BUT ... EF generated SQL uses commad parameters to (precisely) avoid statement recompiling ...
Cleaning the Execution Plan Cache by executing:
db.ExecuteStoreCommand("DBCC FREEPROCCACHE");
restores the performance to excellent response times ... BUT ... it kills al plans in all databases, thus degrading the performance of all other cached plans (which generally perform OK).
After doing some profiling on the EF sql statements, I executed DBCC FREEPROCCACHE in Query Analyzer prior to the sql EF generates, which turned out to generate different execution plans, all performing in the 250ms range, independently of the parameter length:
DBCC FREEPROCCACHE
exec sp_executesql N'SELECT TOP (50)
[Project1].[C1] AS [C1],
[Project1].[C2] AS [C2],
[Project1].[afpCUIT] AS [afpCUIT]
FROM ( SELECT
[Extent1].[afpCUIT] AS [afpCUIT],
[Extent1].[afpNombre] AS [afpNombre],
1 AS [C1],
RTRIM([Extent1].[afpNombre]) AS [C2]
FROM [dbo].[CONSTA] AS [Extent1]
WHERE [Extent1].[afpCUIT] LIKE @p__linq__0 ESCAPE N''~''
) AS [Project1]
ORDER BY [Project1].[afpNombre] ASC',N'@p__linq__0 nvarchar(4000)',@p__linq__0=N'2023291%'
THE QUESTION
Is there a more ellegant alternative to
db.ExecuteStoreCommand("DBCC FREEPROCCACHE");
?
Surprisingly, the second path of the query ( where u.afpNombre.StartsWith(query) ) are not affected by the same problem and performs great. Obviously, execution plans do not change when the lenght of the string changes...
I found an ObjectContext parameter in older versions of EF:
System.Data.EntityClient.EntityCommand.EnablePlanCaching
but I couldn't find it in EF4, and I'm not sure if the global results would be the same.
I'm really puzzled with this problem, and I don't know where the real problem lies
Poor index design?
Lack of partitions?
SQL SERVER 2008 Express edition?
EF generated SQL?
Lousy luck?
Any help would be great.
Thanx in advance!
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
有一种方法可以从 SQL Server 缓存中删除单个计划。
这里有详细解释:
http://sqlblog.com/blogs/kalen_delaney/archive/2007/09/29/geek-city-clearing-a-single-plan-from-cache.aspx
另外,您可以创建一个存储过程,并将其映射到实体框架而不是使用 LINQ2Entities,并通过这种方式对 SQL 语法进行特定更改,并确保它始终相同。
There's a way to remove a single plan from SQL Server's cache.
It's explained in detail here:
http://sqlblog.com/blogs/kalen_delaney/archive/2007/09/29/geek-city-clearing-a-single-plan-from-cache.aspx
Also, you can create a Stored Procedure, and map it with Entity Framework instead of using LINQ2Entities, and in this way make spcific changes to the SQL syntax, and make sure it's always the same.
正如您所指出的,SQL Server 编译计划以针对具有大型结果集的一个参数值进行优化。当结果集缩小时,查询性能不佳。
这种情况需要在查询中使用“选项(重新编译)”提示,因此将为收到的每个值重新编译查询。
使用实体框架来做到这一点并不那么容易。您将需要创建一个 DbCommandInterceptor 以在查询中包含选项(重新编译)。另一种选择是在 SQL Server 中创建计划指南以将“选项(重新编译)”添加到查询中。
您可以在此处找到有关 DbCommandInterceptor 的信息 - 添加查询提示调用表值函数
关于计划指南,您将需要与此类似的内容:
As you identified, SQL Server compiles the plan to be optimized for one parameter value with large result set. When the result set is narrowed, the query doesn't perform well.
This scenario requires the use of "option (recompile)" hint in the query, so the query will be recompiled for each value it receives.
It's not so easy to do this with entity framework. You will need to create a DbCommandInterceptor to include option (recompile) in the query. Another option is to create a plan guide in SQL Server to add the "option (recompile)" to the query.
You will find information about the DbCommandInterceptor here - Adding a query hint when calling Table-Valued Function
About the plan guide, you will need something similar to this: