Working around the GROUP BY clause limitation

Posted on 2024-07-14 07:44:36

I'm working on a social network web application, and I need to resend reminder emails to users who haven't activated their email addresses.

The problem is that when I investigated the DB I found that many email addresses are duplicated (apparently there was no validation of email uniqueness). What I need to do now is retrieve these fields from the DB: the username, the email activation code, and the email, so that I can resend the activation emails. For duplicated emails I need to return only one row (i.e. if I have user john with email [email protected] and user john1 with email [email protected] too, I want to retrieve only one of these johns, no matter which one). So I thought of ending the SQL query with GROUP BY Email.

The thing is that I can't select other fields that are not in the GROUP BY clause. The solution I have now is one that I don't like: I created a List, and every time I need to send an email to a user I iterate over the entire list to make sure this email is not already in it. If it's not there, I send the email and then add the address to the list.
Something like the following:

if (!EmailIsInList(email)) {
  SendActivationEmail(email);
  AddEmailToList(email);
}
else { DoNotSend(); }

I did get the problem solved this way, but I still don't like my solution. Any ideas?
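The list-based loop described above can be sketched with a set in place of the list, so that each "already sent?" check is a hash lookup rather than a scan of everything sent so far. This is only an illustration in Python, not the code from the application: `PendingUser`, its field names, and `send_once_per_email` are hypothetical, and treating addresses as case-insensitive is an assumption.

```python
from __future__ import annotations

from typing import Callable, Iterable, NamedTuple


class PendingUser(NamedTuple):
    # Hypothetical row shape; the field names are assumptions.
    username: str
    email: str
    activation_code: str


def send_once_per_email(
    users: Iterable[PendingUser],
    send: Callable[[PendingUser], None],
) -> list[PendingUser]:
    """Call `send` for the first user seen with each email address."""
    seen: set[str] = set()
    sent: list[PendingUser] = []
    for user in users:
        # Assumption: addresses that differ only in case or padding are duplicates.
        key = user.email.strip().lower()
        if key in seen:
            continue  # an earlier user with this address was already mailed
        seen.add(key)
        send(user)
        sent.append(user)
    return sent
```

The first user encountered for each address wins, so the order of the input rows decides which of the duplicate accounts receives the reminder.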

孤独岁月 2024-07-21 07:44:36

Input test data:

DECLARE @User TABLE (UserId int, 
UserName varchar(100), Email varchar(40), IsActivated bit)
INSERT INTO @User
SELECT 1, 'John', '[email protected]', 0 UNION
SELECT 2, 'Ann', '[email protected]', 0 UNION
SELECT 3, 'John2', '[email protected]', 1 UNION
SELECT 4, 'Bill', '[email protected]', 0 UNION
SELECT 5, 'Bill', '[email protected]', 0

DECLARE @Email TABLE (EmailId int, 
UserId int, Date datetime, Message varchar(1000))
INSERT INTO @Email
SELECT 1, 1, GETDATE(), '' UNION
SELECT 2, 2, GETDATE(), '' UNION
SELECT 3, 3, GETDATE(), '' UNION
SELECT 4, 4, GETDATE(), '' UNION
SELECT 5, 5, GETDATE(), ''

SELECT * FROM @User
SELECT * FROM @Email

As you can see, [email protected] has already been activated once, so we don't need him in the result set.
Now, the implementation with RANK() OVER:

SELECT M.UserID, M.UserName, M.Email, 
    M.IsActivated, M.EmailId, M.Date, M.Message 
FROM (
    SELECT RANK() OVER (PARTITION BY U.Email 
        ORDER BY U.IsActivated Desc, U.UserID ASC) AS N, 
        U.UserID, U.UserName, U.Email, U.IsActivated, 
        E.EmailId, E.Date, E.Message
    FROM @User U INNER JOIN @Email E ON U.UserID = E.UserID
)M WHERE M.N = 1 AND M.IsActivated = 0
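One caveat with RANK(): if the join ever yields two rows that tie on the ORDER BY columns (for example, a user with several rows in the email table), both get rank 1. ROW_NUMBER() always numbers rows 1, 2, 3 within a partition, so it returns at most one row per email. The sketch below illustrates that variant using Python's built-in sqlite3 module as a stand-in for SQL Server (it needs SQLite 3.25 or later for window functions). The table name, column names, and example.com addresses are made up for the illustration, since the addresses in the test data above are obscured.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, user_name TEXT,
                        email TEXT, is_activated INTEGER);
    -- Made-up rows: John2 has already activated john@example.com.
    INSERT INTO users VALUES
        (1, 'John',  'john@example.com', 0),
        (2, 'Ann',   'ann@example.com',  0),
        (3, 'John2', 'john@example.com', 1),
        (4, 'Bill',  'bill@example.com', 0),
        (5, 'Bill',  'bill@example.com', 0);
""")

ONE_PER_EMAIL = """
    SELECT user_id, user_name, email
    FROM (
        SELECT u.*,
               -- Activated rows sort first, so an activated address
               -- puts an activated row at n = 1 and is filtered out below.
               ROW_NUMBER() OVER (
                   PARTITION BY email
                   ORDER BY is_activated DESC, user_id ASC
               ) AS n
        FROM users AS u
    ) AS ranked
    WHERE n = 1 AND is_activated = 0
    ORDER BY user_id
"""

rows = conn.execute(ONE_PER_EMAIL).fetchall()
print(rows)
```

With these rows the query keeps Ann and the first Bill, and drops both John accounts because their shared address is already activated.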

江南烟雨〆相思醉 2024-07-21 07:44:36

If we assume that the same email address could be activated for usera and not activated for userb, then the following query returns one userid for each email address that has never been activated:

SELECT  MAX(userid),
        email
FROM    users AS u1
WHERE   activated = 'False'
AND NOT EXISTS (
        SELECT 1
        FROM   users AS u2
        WHERE  u2.email = u1.email
        AND    u2.activated = 'True'
        )
GROUP BY email

You really do want to make sure the email field is indexed. If it is indexed with a unique composite key of (email, userid), then this would be an index scan and should be pretty quick.
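The query above returns only the userid and email, while the question also asks for the username and activation code. One way to get them is to join the picked userids back to the users table. The sketch below tries this with Python's sqlite3 module as a stand-in for the real database. The `username` and `activation_code` columns, the 0/1 encoding of `activated`, the index name, and the sample rows are all assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (userid INTEGER PRIMARY KEY, username TEXT, email TEXT,
                        activation_code TEXT, activated INTEGER);
    -- Composite index suggested in the answer (the name is hypothetical).
    CREATE UNIQUE INDEX ix_users_email_userid ON users (email, userid);
    INSERT INTO users VALUES
        (1, 'john',  'john@example.com', 'a1', 0),
        (2, 'john1', 'john@example.com', 'b2', 0),
        (3, 'ann',   'ann@example.com',  'c3', 0),
        (4, 'bob',   'bob@example.com',  'd4', 0),
        (5, 'bob2',  'bob@example.com',  'e5', 1);
""")

QUERY = """
    SELECT u.username, u.activation_code, u.email
    FROM users AS u
    JOIN (
        -- One userid per email address that no account has activated.
        SELECT MAX(u1.userid) AS userid
        FROM users AS u1
        WHERE u1.activated = 0
          AND NOT EXISTS (
              SELECT 1
              FROM users AS u2
              WHERE u2.email = u1.email
                AND u2.activated = 1
          )
        GROUP BY u1.email
    ) AS picked ON picked.userid = u.userid
    ORDER BY u.userid
"""

recipients = conn.execute(QUERY).fetchall()
print(recipients)
```

Because MAX(userid) picks the newest account for each address, john1 is chosen over john here, and the bob address is skipped entirely because bob2 has already activated it.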

慢慢从新开始 2024-07-21 07:44:36

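I think you are making a major logical mistake. Email addresses are not unique now and never will be. Just because two users have the same email address doesn't mean they are the same person! People share email addresses all the time: a husband and wife might have the same address, and small offices sometimes have only one (this is often true of doctors' offices). Addresses also get reused when someone gives one up. So John Smith, who signed up with [email protected] in 2007, may no longer be active in your system because he moved to [email protected]. In the meantime Judy Smith, in a different state, signs up with [email protected]. You cannot assume that an email address will always be unique.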
