SQL:按电子邮件域名排序

发布于 2024-08-12 23:14:34 字数 3062 浏览 7 评论 0原文

按域名片段对包含电子邮件地址列的表进行排序的最短和/或高效的 SQL 语句是什么?

这本质上是忽略电子邮件地址中“@”之前的任何内容并且不区分大小写。让我们忽略这个域名的国际化域名。

目标:mySQL、MSSQL、Oracle

来自 TABLE1 的示例数据

id   name           email 
------------------------------------------
 1   John Doe       [email protected]
 2   Jane Doe       [email protected]
 3   Ali Baba       [email protected]
 4   Foo Bar        [email protected]
 5   Tarrack Ocama  [email protected]

通过电子邮件订购
从表 1 中选择 * 按电子邮件 ASC 排序

id   name           email 
------------------------------------------
 3   Ali Baba       [email protected]
 4   Foo Bar        [email protected]
 2   Jane Doe       [email protected]
 1   John Doe       [email protected]
 5   Tarrack Ocama  [email protected]

按域排序
从表 1 中选择 * 排序依据 ?????? ASC

id   name           email 
------------------------------------------
 5   Tarrack Ocama  [email protected]
 3   Ali Baba       [email protected]
 1   John Doe       [email protected]
 2   Jane Doe       [email protected]
 4   Foo Bar        [email protected]

编辑:
我并不是要求单个 SQL 语句可以在所有 3 个或更多 SQL 引擎上运行。欢迎任何贡献。 :)

What is the shortest and/or efficient SQL statement to sort a table with a column of email address by it's DOMAIN name fragment?

That's essentially ignoring whatever is before "@" in the email addresses and case-insensitive. Let's ignore the internationalized domain names for this one.

Target at: mySQL, MSSQL, Oracle

Sample data from TABLE1

id   name           email 
------------------------------------------
 1   John Doe       [email protected]
 2   Jane Doe       [email protected]
 3   Ali Baba       [email protected]
 4   Foo Bar        [email protected]
 5   Tarrack Ocama  [email protected]

Order By Email
SELECT * FROM TABLE1 ORDER BY EMAIL ASC

id   name           email 
------------------------------------------
 3   Ali Baba       [email protected]
 4   Foo Bar        [email protected]
 2   Jane Doe       [email protected]
 1   John Doe       [email protected]
 5   Tarrack Ocama  [email protected]

Order By Domain
SELECT * FROM TABLE1 ORDER BY ?????? ASC

id   name           email 
------------------------------------------
 5   Tarrack Ocama  [email protected]
 3   Ali Baba       [email protected]
 1   John Doe       [email protected]
 2   Jane Doe       [email protected]
 4   Foo Bar        [email protected]

EDIT:
I am not asking for a single SQL statement that will work on all 3 or more SQL engines. Any contribution are welcomed. :)

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(13

羁客 2024-08-19 23:14:35

试试这个

查询(对于 Sql Server):

select * from mytbl
order by SUBSTRING(email,(CHARINDEX('@',email)+1),1)

查询(对于 Oracle):

select * from mytbl
order by substr(email,INSTR(email,'@',1) + 1,1)

查询(对于 MySQL)

pygorex1 already answered

输出:

id name email< /强>

5   Tarrack Ocama   [email protected]
3   Ali Baba    [email protected]
1   John Doe    [email protected]
2   Jane Doe    [email protected]
4   Foo Bar [email protected]

Try this

Query(For Sql Server):

select * from mytbl
order by SUBSTRING(email,(CHARINDEX('@',email)+1),1)

Query(For Oracle):

select * from mytbl
order by substr(email,INSTR(email,'@',1) + 1,1)

Query(for MySQL)

pygorex1 already answered

Output:

id name email

5   Tarrack Ocama   [email protected]
3   Ali Baba    [email protected]
1   John Doe    [email protected]
2   Jane Doe    [email protected]
4   Foo Bar [email protected]
暗喜 2024-08-19 23:14:35

对于 MySQL:

select email, SUBSTRING_INDEX(email,'@',-1) AS domain from user order by domain desc;

对于不区分大小写:

select user_id, username, email, LOWER(SUBSTRING_INDEX(email,'@',-1)) AS domain from user order by domain desc;

For MySQL:

select email, SUBSTRING_INDEX(email,'@',-1) AS domain from user order by domain desc;

For case-insensitive:

select user_id, username, email, LOWER(SUBSTRING_INDEX(email,'@',-1)) AS domain from user order by domain desc;
寂寞笑我太脆弱 2024-08-19 23:14:35

如果您希望此解决方案能够扩展,那么您不应该尝试提取子列。随着表变得越来越大,每行函数的速度非常慢。

在这种情况下,正确的做法是将提取成本从 select(这种情况经常发生)转移到 insert/update,其中它发生的次数较少(在大多数普通数据库中)。通过仅在插入更新上产生成本,您可以大大提高数据库的整体效率,因为这是您进行操作的唯一时间点需要这样做(即,这是数据更改的唯一时间)。

为了实现此目的,请将电子邮件地址拆分为表中的两个不同的列:email_useremail_domain)。然后,您可以在插入/更新之前在应用程序中将其拆分,或者在数据库中使用触发器(或预先计算的列,如果您的 DBMS 支持)来自动执行此操作。

然后,您对 email_domain 进行排序,当您需要完整的电子邮件地址时,您可以使用 email_name|'@'|email_domain

或者,您可以保留完整的 email 列,并使用触发器仅复制 email_domain 中的域部分,然后您就无需担心连接列以获得完整的内容电子邮件。

如果您知道自己在做什么,出于性能原因从 3NF 恢复是完全可以接受的。在这种情况下,两列中的数据不能仅仅因为触发器不允许而不同步。这是用磁盘空间(相对便宜)换取性能的好方法(我们总是想要更多)。

而且,如果您根本不喜欢从 3NF 恢复,email_name/email_domain 解决方案将解决该问题。

这还假设您只想处理 a@b 形式的电子邮件地址 - 还有其他有效的电子邮件地址,但我不记得多年来在野外看到过其中任何一个。

If you want this solution to scale at all, you should not be trying to extract sub-columns. Per-row functions are notoriously slow as the table gets bigger and bigger.

The right thing to do in this case is to move the cost of extraction from select (where it happens a lot) to insert/update where it happens less (in most normal databases). By incurring the cost only on insert and update, you greatly increase the overall efficiency of the database, since that's the only point in time where you need to do it (i.e., it's the only time when the data changes).

In order to achieve this, split the email address into two distinct columns in the table, email_user and email_domain). Then you can either split it in your application before insertion/update or use a trigger (or pre-computed columns if your DBMS supports it) in the database to do it automatically.

Then you sort on email_domain and, when you want the full email address, you use email_name|'@'|email_domain.

Alternatively, you can keep the full email column and use a trigger to duplicate just the domain part in email_domain, then you never need to worry about concatenating the columns to get the full email address.

It's perfectly acceptable to revert from 3NF for performance reasons provided you know what you're doing. In this case, the data in the two columns can't get out of sync simply because the triggers won't allow it. It's a good way to trade disk space (relatively cheap) for performance (we always want more of that).

And, if you're the sort that doesn't like reverting from 3NF at all, the email_name/email_domain solution will fix that.

This is also assuming you just want to handle email addresses of the form a@b - there are other valid email addresses but I can't recall seeing any of them in the wild for years.

や莫失莫忘 2024-08-19 23:14:35

对于 SQL Server,您可以将计算列添加到表中,并将域提取到单独的字段中。如果将该列保留到表中,则可以像任何其他字段一样使用它,甚至可以在其上放置索引,以加快速度(如果您经常按域名查询):

ALTER TABLE Table1
  ADD DomainName AS 
     SUBSTRING(email, CHARINDEX('@', email)+1, 500) PERSISTED

所以现在您的表将有一个附加列“DomainName” ”,其中包含您的电子邮件地址中“@”符号之后的任何内容。

For SQL Server, you could add a computed column to your table with extracts the domain into a separate field. If you persist that column into the table, you can use it like any other field and even put an index on it, to speed things up, if you query by domain name a lot:

ALTER TABLE Table1
  ADD DomainName AS 
     SUBSTRING(email, CHARINDEX('@', email)+1, 500) PERSISTED

So now your table would have an additional column "DomainName" which contains anything after the "@" sign in your e-mail address.

篱下浅笙歌 2024-08-19 23:14:35

假设您确实必须满足 MySQL、Oracle 和 MSSQL 的要求……最有效的方法可能是将帐户名和域名存储在两个单独的字段中。您可以进行排序:

select id,name,email from table order by name

select id,name,email,account,domain from table order by email

select id,name,email,account,domain from table order by domain,account

正如唐尼指出的那样,字符串操作函数是非标准的..这就是为什么您必须保持数据冗余!

我已将帐户和域添加到第三个查询中,因为我似乎记得并非所有 DBMS 都会对不在所选字段中的字段进行查询排序。

Assuming you really must cater for MySQL, Oracle and MSSQL .. the most efficient way might be to store the account name and domain name in two separate fields. The you can do your ordering:

select id,name,email from table order by name

select id,name,email,account,domain from table order by email

select id,name,email,account,domain from table order by domain,account

as donnie points out, string manipulation functions are non standard .. that is why you will have to keep the data redundant!

I've added account and domain to the third query, since I seam to recall not all DBMSs will sort a query on a field that isn't in the selected fields.

牵你的手,一向走下去 2024-08-19 23:14:35

这适用于 Oracle:

select id,name,email,substr(email,instr(email,'@',1)+1) as domain
from table1
order by domain asc

This will work with Oracle:

select id,name,email,substr(email,instr(email,'@',1)+1) as domain
from table1
order by domain asc
陌若浮生 2024-08-19 23:14:35

对于 postgres,查询为:

SELECT * FROM table
ORDER BY SUBSTRING(email,(position('@' in email) + 1),252)

252 是允许的最长域(因为,电子邮件的最大长度为 254,包括本地部分,@< /code> 和域,

了解更多详细信息:有效电子邮件地址的最大长度是多少?

For postgres the query is:

SELECT * FROM table
ORDER BY SUBSTRING(email,(position('@' in email) + 1),252)

The value 252 is the longest allowed domain (since, the max length of an email is 254 including the local part, the @, and the domain.

See this for more details: What is the maximum length of a valid email address?

娇纵 2024-08-19 23:14:35

您将必须使用文本操作函数来解析域。然后按新列排序。

You are going to have to use the text manipulation functions to parse out the domain. Then order by the new column.

绿光 2024-08-19 23:14:35

MySQL,right()instr()

SQL Server, right()patindex()

Oracle,instr()substr()

而且,正如其他人所说,如果您有相当高的记录数,则将您的电子邮件字段包装在您的 where 子句中的函数中将使其如此RDBMS 无法使用该列上可能拥有的任何索引。因此,您可能需要考虑创建一个保存域的计算列。

MySQL, an intelligent combination of right() and instr()

SQL Server, right() and patindex()

Oracle, instr() and substr()

And, as said by someone else, if you have a decent to high record count, wrapping your email field in functions in you where clause will make it so the RDBMS can't use any index you might have on that column. So, you may want to consider creating a computed column which holds the domain.

寒尘 2024-08-19 23:14:35

如果您有数百万条记录,我建议您创建仅包含域名的新列。

If you have million records, I suggest you to create new column with domain name only.

苏佲洛 2024-08-19 23:14:35

我的建议是(对于mysql):

SELECT 
    LOWER(email) AS email,
    SUBSTRING_INDEX(email, '@', + 1) AS account,
 REPLACE(SUBSTRING_INDEX(email, '@', -1), CONCAT('.',SUBSTRING_INDEX(email, '.', -1)),'') -- 2nd part of mail - tld.
  AS domain,
    CONCAT('.',SUBSTRING_INDEX(email, '.', -1)) AS tld
FROM
********
ORDER BY domain, email ASC;

And then just add a WHERE...

My suggestion would be (for mysql):

SELECT 
    LOWER(email) AS email,
    SUBSTRING_INDEX(email, '@', + 1) AS account,
 REPLACE(SUBSTRING_INDEX(email, '@', -1), CONCAT('.',SUBSTRING_INDEX(email, '.', -1)),'') -- 2nd part of mail - tld.
  AS domain,
    CONCAT('.',SUBSTRING_INDEX(email, '.', -1)) AS tld
FROM
********
ORDER BY domain, email ASC;

And then just add a WHERE...

绅士风度i 2024-08-19 23:14:35

SQL Server 的原始答案对我不起作用......

这是 SQL Server 的版本......

select SUBSTRING(email,(CHARINDEX('@',email)+1),len(email)), count(*) 
from table_name 
group by SUBSTRING(email,(CHARINDEX('@',email)+1),len(email))
order by count(*) desc

The original answer for SQL Server didn't work for me....

Here is a version for SQL Server...

select SUBSTRING(email,(CHARINDEX('@',email)+1),len(email)), count(*) 
from table_name 
group by SUBSTRING(email,(CHARINDEX('@',email)+1),len(email))
order by count(*) desc
避讳 2024-08-19 23:14:35

更聪明地工作而不是更努力地工作:

SELECT REVERSE(SUBSTRING_INDEX(REVERSE(SUBSTRING(emails.email, POSITION('@' IN emails.email)+1)),'.',2)) FROM emails

work smarter not harder:

SELECT REVERSE(SUBSTRING_INDEX(REVERSE(SUBSTRING(emails.email, POSITION('@' IN emails.email)+1)),'.',2)) FROM emails
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文