NVARCHAR(?) 用于 SQL Server 中的电子邮件地址

发布于 2025-01-06 10:00:17 字数 744 浏览 3 评论 0原文

对于电子邮件地址,我应该在 SQL Server 中为各列提供多少空间。

我在维基百科上找到了这个定义:

http://en.wikipedia.org/wiki/Email_address

电子邮件地址的格式为 local-part@domain,其中 local-part 的长度最多为 64 个字符,并且域名可以 最多 253 个字符 - 但最多 256 个字符 正向或反向路径的长度限制整个电子邮件地址 不超过 254 个字符

这一个:

http://askville.amazon。 com/maximum-length-allowed-email-address/AnswerViewer.do?requestId=1166932

所以目前,电子邮件地址允许的总字符数为 64(本地 部分)+ 1(“@”符号)+ 255(域部分)= 320

将来他们可能会增加本地部分限制 最多 128 个字符。总共有 384 个字符。

有什么想法吗?

For Email addresses, how much space should I give the columns in SQL Server.

I found this definition on Wikipedia:

http://en.wikipedia.org/wiki/Email_address

The format of email addresses is local-part@domain where the
local-part may be up to 64 characters long and the domain name may
have a maximum of 253 characters - but the maximum 256 characters
length of a forward or reverse path restricts the entire email address
to be no more than 254 characters

And this one:

http://askville.amazon.com/maximum-length-allowed-email-address/AnswerViewer.do?requestId=1166932

So for now, total characters allowed for e-mail address is 64 (local
part) + 1("@" sign) + 255 (domain part) = 320

It's possible that in the future they'll increase the local-part limit
to 128 characters. which would make total of 384 characters.

Any thoughts?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

楠木可依 2025-01-13 10:00:17

根据你后面的计算,我一直使用 320。允许更多*不需要花费任何费用,除非人们滥用它并在其中塞入垃圾。允许更少的可能会让您付出代价,因为如果用户拥有合法的更长的电子邮件地址,您将会感到沮丧,而现在您必须返回并更新架构、代码、参数等。在我曾经使用的系统(电子邮件服务提供商)中,我自然遇到的最长电子邮件地址大约有 120 个字符 - 很明显,他们只是为了笑而制作一个很长的电子邮件地址。

* 严格来说并不正确,因为内存授予估计是基于不同宽度列填充一半的假设,因此存储相同数据的较宽列可能会导致某些查询的性能特征截然不同。

我也争论过 NVARCHAR 对于电子邮件地址是否是必需的。我还没有遇到过带有 Unicode 字符的电子邮件地址 - 我知道标准支持它们,但许多现有系统不支持,如果那是您的电子邮件地址,那将非常令人沮丧。

虽然 NVARCHAR 确实会花费双倍的空间,但使用 SQL Server 2008 R2,您可以从 Unicode 压缩中受益,它基本上将 NVARCHAR 列中的所有非 Unicode 字符视为ASCII,这样你就可以取回那些额外的字节。当然,压缩仅在 Enterprise+ 中可用...

减少空间需求的另一种方法是对所有观察到的域名使用中央查找表,并将 LocalPartDomainID 存储为用户,并且每个唯一的域名仅存储一次。是的,这会使编程变得更加麻烦,但如果您有 80,000 个 hotmail.com 地址,则成本为 80,0000 x 4 字节,而不是 80,000 x 11 字节(或压缩后更少)。如果存储或 I/O 是您的瓶颈,而不是 CPU,那么这绝对是一个值得研究的选项。

我在这里写过:

I've always used 320 based on your latter calculation. It doesn't cost you anything to allow more*, unless people abuse it and stuff junk in there. It could cost you to allow less, as you'll have a frustrating users if they have legitimately longer e-mail addresses and now you'll have to go back and update schema, code, parameters etc. In the system I used to work with (an e-mail service provider), the longest e-mail address I came across naturally was about 120 characters - and it was clear they were just making a long e-mail address for grins.

* Not strictly true, since memory grant estimates are based on the assumption that varying-width columns are half-populated, so a wider column storing the same data can have lead to vastly different performance characteristics of certain queries.

And I've debated whether NVARCHAR is necessary for e-mail address. I've yet to come across an e-mail address with Unicode characters - I know the standard supports them, but so many existing systems do not, it would be pretty frustrating if that was your e-mail address.

And while it's true that NVARCHAR costs double the space, with SQL Server 2008 R2 you can benefit from Unicode compression, which basically treats all non-Unicode characters in an NVARCHAR column as ASCII, so you get those extra bytes back. Of course compression is only available in Enterprise+...

Another way to reduce space requirements is to use a central lookup table for all observed domain names, and store LocalPart and DomainID with the user, and store each unique domain name only once. Yes this makes for more cumbersome programming, but if you have 80,000 hotmail.com addresses, the cost is 80,0000 x 4 bytes instead of 80,000 x 11 bytes (or less with compression). If storage or I/O is your bottleneck, and not CPU, this is definitely an option worth investigating.

I wrote about this here:

鯉魚旗 2025-01-13 10:00:17

我猜想 VARCHAR(320) 是基于 ASCII 的域名和电子邮件地址的正常限制。但是我们不会很快就会看到 unicode 域名出现吗?

http://en.wikipedia.org/wiki/Internationalized_domain_name

也许 NVARCHAR(320) 就是我们想要的应该开始使用?

I guess VARCHAR(320) would be the normal limit for an ASCII-based domain name and email address. But won't we start seeing unicode domain names appearing sometime soon?

http://en.wikipedia.org/wiki/Internationalized_domain_name

Maybe NVARCHAR(320) is what we should start using?

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文