使用 linq、外键关系或本地列表哪种模型最快?

发布于 2024-12-06 05:54:22 字数 2811 浏览 0 评论 0原文

一些基础知识

我有两张表,一张保存用户,一张保存登录日志。 用户表包含大约 15000 多个用户,登录表正在增长并达到 150000 多个帖子。 该数据库基于 SQL Server(非 Express)构建。

为了管理用户,我获得了一个从 ObjectDatasource 填充的 gridview(来自 Devexpress 的 ASPxGridView)。

在总结用户登录次数时,我应该了解哪些一般注意事项?

事情变得异常缓慢。

这是一张显示涉及的表的图片。 在此处输入图像描述

我尝试了一些方法。

DbDataContext db = new DbDataContext();

// Using foregin key relationship
foreach (var proUser in db.tblPROUsers)
{
    var count = proUser.tblPROUserLogins.Count;
    //...
}

执行时间:01:29.316(1分29秒)

// By storing a list in a local variable (I removed the FK relation)
var userLogins = db.tblPROUserLogins.ToList();
foreach (var proUser in db.tblPROUsers)
{
    var count = userLogins.Where(x => x.UserId.Equals(proUser.UserId)).Count();
    //...
}

执行时间:01:18.410(1分18秒)

// By storing a dictionary in a local variable (I removed the FK relation)
var userLogins = db.tblPROUserLogins.ToDictionary(x => x.UserLoginId, x => x.UserId);
foreach (var proUser in db.tblPROUsers)
{
    var count = userLogins.Where(x => x.Value.Equals(proUser.UserId)).Count();
    //...
}

执行时间:01:15.821( 1分15秒)

表现最好的模型实际上是字典。不过,我想知道我想听听的任何选项,以及在处理如此大量的数据时这种编码是否存在“不好”的地方。

谢谢

================================================== =======

更新了根据 BrokenGlass 示例的模型

// By storing a dictionary in a local variable (I removed the FK relation)
foreach (var proUser in db.tblPROUsers)
{
    var userId = proUser.UserId;
    var count = db.tblPROUserLogins.Count(x => x.UserId.Equals(userId));
    //...
}

执行时间:02:01.135(2 分 1 秒)

除此之外,我创建了一个存储简单类的列表

public class LoginCount
{
    public int UserId { get; set; }
    public int Count { get; set; }
}

,并且在总结方法

var loginCount = new List<LoginCount>();

// This foreach loop takes approx 30 secs
foreach (var login in db.tblPROUserLogins)
{
    var userId = login.UserId;

    // Check if available
    var existing = loginCount.Where(x => x.UserId.Equals(userId)).FirstOrDefault();
    if (existing != null)
        existing.Count++;
    else
        loginCount.Add(new LoginCount{UserId = userId, Count = 1});
}

// Calling it
foreach (var proUser in tblProUser)
{
    var user = proUser;
    var userId = user.UserId;

    // Count logins
    var count = 0;
    var loginCounter = loginCount.Where(x => x.UserId.Equals(userId)).FirstOrDefault();
    if(loginCounter != null)
        count = loginCounter.Count;
    //...
}

执行时间:00:36.841(36秒)

结论如此到目前为止,用 linq 进行总结很慢,但我已经到了!

Some basics

I have two tables, one holding the users and one holding a log with logins.
The user table holds something like 15000+ users, the login table is growing and is reaching 150000+ posts.
The database is built upon SQL Server (not express).

To administer the users I got a gridview (ASPxGridView from Devexpress) that I populate from an ObjectDatasource.

Is there any general do’s and donts I should know about when summarizing the number of logins a user made.

Things are getting strangely slow.

Here is a picture showing the involved tables.
enter image description here

I’ve tried a few things.

DbDataContext db = new DbDataContext();

// Using foregin key relationship
foreach (var proUser in db.tblPROUsers)
{
    var count = proUser.tblPROUserLogins.Count;
    //...
}

Execution time: 01:29.316 (1 minute and 29 seconds)

// By storing a list in a local variable (I removed the FK relation)
var userLogins = db.tblPROUserLogins.ToList();
foreach (var proUser in db.tblPROUsers)
{
    var count = userLogins.Where(x => x.UserId.Equals(proUser.UserId)).Count();
    //...
}

Execution time: 01:18.410 (1 minute and 18 seconds)

// By storing a dictionary in a local variable (I removed the FK relation)
var userLogins = db.tblPROUserLogins.ToDictionary(x => x.UserLoginId, x => x.UserId);
foreach (var proUser in db.tblPROUsers)
{
    var count = userLogins.Where(x => x.Value.Equals(proUser.UserId)).Count();
    //...
}

Execution time: 01:15.821 (1 minute and 15 seconds)

The model giving the best performance is actually the dictionary. However I you know of any options I'd like to hear about it, also if there's something "bad" with this kind of coding when handling such large amounts of data.

Thanks

========================================================

UPDATED With a model according to BrokenGlass example

// By storing a dictionary in a local variable (I removed the FK relation)
foreach (var proUser in db.tblPROUsers)
{
    var userId = proUser.UserId;
    var count = db.tblPROUserLogins.Count(x => x.UserId.Equals(userId));
    //...
}

Execution time: 02:01.135 (2 minutes and 1 second)

In addition to this I created a list storing a simple class

public class LoginCount
{
    public int UserId { get; set; }
    public int Count { get; set; }
}

And in the summarizing method

var loginCount = new List<LoginCount>();

// This foreach loop takes approx 30 secs
foreach (var login in db.tblPROUserLogins)
{
    var userId = login.UserId;

    // Check if available
    var existing = loginCount.Where(x => x.UserId.Equals(userId)).FirstOrDefault();
    if (existing != null)
        existing.Count++;
    else
        loginCount.Add(new LoginCount{UserId = userId, Count = 1});
}

// Calling it
foreach (var proUser in tblProUser)
{
    var user = proUser;
    var userId = user.UserId;

    // Count logins
    var count = 0;
    var loginCounter = loginCount.Where(x => x.UserId.Equals(userId)).FirstOrDefault();
    if(loginCounter != null)
        count = loginCounter.Count;
    //...
}

Execution time: 00:36.841 (36 seconds)

Conclusion so far, summarizing with linq is slow, but Im getting there!

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

长安忆 2024-12-13 05:54:22

如果您尝试构建一个执行相同操作的 SQL 查询并独立于您的应用程序(在 SQL Server Management Studio 中)执行它,也许这会很有用。类似于:

SELECT UserId, COUNT(UserLoginId)
FROM tblPROUserLogin
GROUP BY UserId

(注意:这仅选择 UserId。如果您想要来自 tblPROUser 的其他字段,则需要在这个基本的“顶部”进行简单的 JOIN查询。)

确保 {UserId, UserLoginId} 上有一个复合索引并且查询计划正在使用它。在索引中并按该顺序包含两个字段可确保您的查询可以在不触及 tblPROUserLogin 表的情况下运行:

在此处输入图像描述

然后进行基准测试,看看是否可以获得比 LINQ 代码明显更好的时间:

  • 如果是,那么您需要找到一种方法来“哄骗”LINQ 生成类似的查询。
  • 如果不是,那么您已经达到了前所未有的速度。

--- EDIT ---

以下 LINQ 代码段相当于上面的查询:

var db = new UserLoginDataContext();

db.Log = Console.Out;

var result =
    from user_login in db.tblPROUserLogins
    group user_login by user_login.UserId into g
    select new { UserId = g.Key, Count = g.Count() };

foreach (var row in result) {
    int user_id = row.UserId;
    int count = row.Count;
    // ...
}

在控制台中打印以下文本:

SELECT COUNT(*) AS [Count], [t0].[UserId]
FROM [dbo].[tblPROUserLogin] AS [t0]
GROUP BY [t0].[UserId]
-- Context: SqlProvider(Sql2008) Model: AttributedMetaModel Build: 4.0.30319.1

--- EDIT 2 ---

To拥有“整个”用户而不仅仅是 UserId,您可以执行以下操作:

var db = new UserLoginDataContext();

db.Log = Console.Out;

var login_counts =
    from user_login in db.tblPROUserLogins
    group user_login by user_login.UserId into g
    select new { UserId = g.Key, Count = g.Count() };

var result =
    from user in db.tblPROUsers
    join login_count in login_counts on user.UserId equals login_count.UserId
    select new { User = user, Count = login_count.Count };

foreach (var row in result) {
    tblPROUser user = row.User;
    int count = row.Count;
    // ...
}

控制台输出显示以下查询...

SELECT [t0].[UserId], [t0].[UserGuid], [t0].[CompanyId], [t0].[UserName], [t0].[UserPassword], [t2].[value] AS [Count]
FROM [dbo].[tblPROUser] AS [t0]
INNER JOIN (
    SELECT COUNT(*) AS [value], [t1].[UserId]
    FROM [dbo].[tblPROUserLogin] AS [t1]
    GROUP BY [t1].[UserId]
    ) AS [t2] ON [t0].[UserId] = [t2].[UserId]
-- Context: SqlProvider(Sql2008) Model: AttributedMetaModel Build: 4.0.30319.1

...如果您的索引正确,这应该非常有效:

在此处输入图像描述

Perhaps it would be useful if you tried to construct an SQL query that does the same thing and executing it independently of your application (in SQL Server Management Studio). Something like:

SELECT UserId, COUNT(UserLoginId)
FROM tblPROUserLogin
GROUP BY UserId

(NOTE: This just selects UserId. If you want other fields from tblPROUser, you'll need a simple JOIN "on top" of this basic query.)

Ensure there is a composite index on {UserId, UserLoginId} and it is being used by the query plan. Having both fields in the index and in that order ensures your query can run without touching the tblPROUserLogin table:

enter image description here

Then benchmark and see if you can get a significantly better time than your LINQ code:

  • If yes, then you'll need to find a way to "coax" the LINQ to generate a similar query.
  • If no, then you are already as fast as you'll ever be.

--- EDIT ---

The follwing LINQ snippet is equivalent to the query above:

var db = new UserLoginDataContext();

db.Log = Console.Out;

var result =
    from user_login in db.tblPROUserLogins
    group user_login by user_login.UserId into g
    select new { UserId = g.Key, Count = g.Count() };

foreach (var row in result) {
    int user_id = row.UserId;
    int count = row.Count;
    // ...
}

Which prints the following text in the console:

SELECT COUNT(*) AS [Count], [t0].[UserId]
FROM [dbo].[tblPROUserLogin] AS [t0]
GROUP BY [t0].[UserId]
-- Context: SqlProvider(Sql2008) Model: AttributedMetaModel Build: 4.0.30319.1

--- EDIT 2 ---

To have the "whole" user and not just UserId, you can do this:

var db = new UserLoginDataContext();

db.Log = Console.Out;

var login_counts =
    from user_login in db.tblPROUserLogins
    group user_login by user_login.UserId into g
    select new { UserId = g.Key, Count = g.Count() };

var result =
    from user in db.tblPROUsers
    join login_count in login_counts on user.UserId equals login_count.UserId
    select new { User = user, Count = login_count.Count };

foreach (var row in result) {
    tblPROUser user = row.User;
    int count = row.Count;
    // ...
}

And the console output shows the following query...

SELECT [t0].[UserId], [t0].[UserGuid], [t0].[CompanyId], [t0].[UserName], [t0].[UserPassword], [t2].[value] AS [Count]
FROM [dbo].[tblPROUser] AS [t0]
INNER JOIN (
    SELECT COUNT(*) AS [value], [t1].[UserId]
    FROM [dbo].[tblPROUserLogin] AS [t1]
    GROUP BY [t1].[UserId]
    ) AS [t2] ON [t0].[UserId] = [t2].[UserId]
-- Context: SqlProvider(Sql2008) Model: AttributedMetaModel Build: 4.0.30319.1

...which should be very efficient provided your indexes are correct:

enter image description here

苏佲洛 2024-12-13 05:54:22

第二种情况应该是迄今为止最快的,只要您删除ToList(),这样计数就可以在数据库端完成,而不是在内存中:

var userId = proUser.UserId;
var count = db.tblPROUserLogins.Count(x => x.UserId == userId);

此外,您还必须将由于 EF 无法处理对象的映射属性,因此首先将用户 id 转换为“普通”原始变量。

The second case should always be the fastest by far provided you drop the ToList() so counting can be done on the database side, not in memory:

var userId = proUser.UserId;
var count = db.tblPROUserLogins.Count(x => x.UserId == userId);

Also you have to put the user id into a "plain" primitive variable first since EF can't deal with mapping properties of an object.

格子衫的從容 2024-12-13 05:54:22

抱歉,因为我不在普通计算机上,所以盲目地这样做。只是几个问题,

  • 您在登录表中是否有用户 ID 的索引,
  • 您是否尝试过专门为此页面设计的视图?
  • 您是使用分页来获取用户,还是尝试一次获取所有计数?
  • 您是否运行了 sql profiler 并观察了实际发送的 sql?

这样的事情对你有用吗?

var allOfIt = from c in db.tblProUsers 
        select new {
             User  = c, 
             Count = db.tblProUserLogins.Count(l => l.UserId == c.UserId)
        }
        .Skip(pageSize * pageNumber)
        .Take(pageSize) // page size

Sorry, doing this blind since I'm not on my normal computer. Just a couple of questions

  • do you have an index on the user id in the logins table
  • have you tried a view specifically crafted for this page?
  • are you using paging to get the users, or trying to get all counts at once?
  • have you run sql profiler and watched the actual sql being sent?

Does something like this work for you?

var allOfIt = from c in db.tblProUsers 
        select new {
             User  = c, 
             Count = db.tblProUserLogins.Count(l => l.UserId == c.UserId)
        }
        .Skip(pageSize * pageNumber)
        .Take(pageSize) // page size
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文