需要使用 Linq 优化循环的帮助
免责声明:我对 linq 的经验很少。
我的工作任务之一是维护一个电子商务网站。昨天,我们的一位客户开始抱怨当他们尝试为谷歌创建提要文件时会发生超时。事实证明,如果用户要在其 feed 文件中放入超过 9,000 个项目,我们的代码至少需要一分钟才能执行。
我无法通过运行调试器找到问题的根源,所以我启动了一个探查器(ANTS)并让它做它的事情。它找到了我们问题的根源,即包含一些 linq 代码的 foreach 循环。代码如下:
var productMappings = GoogleProductMappingAccess.GetGoogleProductMappingsByID(context, productID);
List<google_Category> retCats = new List<google_Category>(numCategories);
int added = 0;
//this line was flagged by the profiler as taking 48.5% of total run time
foreach (google_ProductMapping pm in (from pm in productMappings orderby pm.MappingType descending select pm))
{
if (pm.GoogleCategoryId.HasValue && pm.GoogleCategoryId > 0)
{
//this line was flagged as 36% of the total time
retCats.Add(pm.google_Category);
}
else if (pm.GoogleCategoryMappingId.HasValue && pm.GoogleCategoryMappingId > 0)
{
retCats.Add(pm.google_CategoryMapping.google_Category);
}
else
{
continue;
}
if (++added >= numCategories)
{
break;
}
}
你们中更有经验的开发人员有什么想法吗?我正在尝试用 sql 替换所有 linq,但我不确定这是否是最好的做法(如果它是用 linq 编写的,那么一定有原因)。
Disclaimer: I have little experience with linq.
One of my tasks at my work is to maintain an e commerce web site. Yesterday, one of our customers started complaining of a timeout that would occur when they tried to create a feed file for google. Turns out, if the user has more than 9,000 items to put in their feed file, our code takes at least one minute to execute.
I couldn't find the source of the problem by running the debugger, so I fired up a profiler (ANTS) and let it do its thing. It found the source of our problem, a foreach loop that contains a bit of linq code. Here is the code:
var productMappings = GoogleProductMappingAccess.GetGoogleProductMappingsByID(context, productID);
List<google_Category> retCats = new List<google_Category>(numCategories);
int added = 0;
//this line was flagged by the profiler as taking 48.5% of total run time
foreach (google_ProductMapping pm in (from pm in productMappings orderby pm.MappingType descending select pm))
{
if (pm.GoogleCategoryId.HasValue && pm.GoogleCategoryId > 0)
{
//this line was flagged as 36% of the total time
retCats.Add(pm.google_Category);
}
else if (pm.GoogleCategoryMappingId.HasValue && pm.GoogleCategoryMappingId > 0)
{
retCats.Add(pm.google_CategoryMapping.google_Category);
}
else
{
continue;
}
if (++added >= numCategories)
{
break;
}
}
Do any of you more experienced devs have any ideas? I was toying with trying to replace all the linq with sql, but I am unsure if that is the best course of action here (if it was written with linq, there must be a reason for it).
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
如果您可以过滤掉您不想要的结果,那么您的查询应该会更快 - 您正在使用
orderby
因此,所有这些结果都会耗尽查询中的处理,因为它们都必须进行评估:您也可以应限制查询返回的结果数量,因为您最多只使用
numCategories
。因此,将 a 添加到您的查询中,而不是在 foreach 循环中进行检查。
If you can filter out the results you don't want anyway your query should be faster - you are using
orderby
hence all these results use up processing in your query since they all have to be evaluated:Also you should limit the number of results returned by the query since you only use up to
numCategories
. So add ato your query, instead of checking within the foreach loop.
retCats.Add(pm.google_Category);
花费如此长的时间的原因是您引用了一个延迟加载的对象,该对象会再次往返服务器。如果您可以重构它,只获取 Id 的本地副本而不是整个对象,那么这将加快该部分的速度。
如果您确实需要获取整个对象,请研究如何在获取 ProductMappings 时通过单个查询将其下拉。如何执行此操作取决于您在 SQL 上使用的 LINQ 包装器。
The reason
retCats.Add(pm.google_Category);
takes so long is because you are referecing a lazily loaded object which does another round trip to the server.If you can refactor that so you only take a local copy of the Id instead of the whole object it will speed that part up.
If you do need to take the whole object, then investigate how you can pull it down in a single query when getting the productMappings. How to do this will depend on what LINQ wrapper you are using on your SQL.
不知道你的数据库模式真的很难说。有几个想法:
1) 通过数据库引擎优化顾问运行查询。也许查询需要一些索引?
2)预处理这些信息并将其放入另一个表或文件中。这样当谷歌请求它时它就不会超时。
Not knowing your database schema it's really hard to tell. A couple of ideas:
1) Run the query through the Database Engine Tuning Advisor. Maybe the query needs some indexes?
2) pre-processing this information and putting it in another table or file. That way when google requests it it won't timeout.
这可能应该有效:
如果 GetGoogleProductMappingsByID 返回一个
IQueryable
(如果适用),效果最好。如果是这样,LINQ 会将整个语句转换为 T-SQL 命令,这将比内存中的 LINQ 快得多。请随意将 .ToList() 添加到最后一个语句,以使其具有与代码中相同的返回类型(并强制执行 LINQ 语句)。
检查 .HasValue 和 > 0没用。检查 ID > 0就够了。
有关详细信息:http://msdn.microsoft.com/en-us/library /2cf62fcy.aspx(运算符)
This should probably work:
It would work best if GetGoogleProductMappingsByID would return an
IQueryable
(if applicable). If so, LINQ will convert the entire statement into a T-SQL command and that would be far faster than in memory LINQ.Feel free to add a .ToList() to the last statement to get it into the same return type as in your code (and to force execution of the LINQ statement).
Checking for both .HasValue and > 0 is useless. Checking for Id > 0 is enough.
For more info: http://msdn.microsoft.com/en-us/library/2cf62fcy.aspx (operators)