How to check for duplicate email addresses in PHP, considering Gmail (user.name+label@gmail.com)

Posted on 2024-08-09 00:19:12

How can I check for duplicate email addresses in PHP while taking Gmail's automatic labels and punctuation into account?

For example, I want these addresses to be detected as duplicates:

         [email protected]
        [email protected]
   [email protected]
  [email protected]

Despite what Daniel A. White claims: in Gmail, you can place dots anywhere before the '@' (and before the label), as many as you like. [email protected] and [email protected] are in fact the same user.

Comments (7)

樱花坊 2024-08-16 00:19:12
$email_parts    = explode('@', $email);

// check if there is a "+" and return the string before
$before_plus    = strstr($email_parts[0], '+', TRUE);
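// strstr() returns false when there is no "+"; compare strictly so that a
// local part such as "0" before the "+" is not mistaken for "not found"
$before_at      = ($before_plus !== false) ? $before_plus : $email_parts[0];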

// remove "."
$before_at      = str_replace('.', '', $before_at);

$email_clean    = $before_at.'@'.$email_parts[1];
坦然微笑 2024-08-16 00:19:12

Strip the address to the basic form before comparing. Make a function normalise() that will strip the label, then remove all dots. Then you can compare the addresses via:

normalise(address1) == normalise(address2)

If you have to do this very often, store the addresses in their normalised form as well, so you don't have to convert them again for every comparison.
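
A minimal sketch of what such a normalise() function could look like in PHP. It assumes that only gmail.com and googlemail.com addresses get the label and dot treatment and that every other domain is merely lowercased. The helper is_duplicate(), the $seen index, and the sample addresses are hypothetical, added only to illustrate storing the normalised form.

```php
<?php
// Sketch only: reduce an address to a canonical form for comparison.
// Assumption: label/dot stripping applies only to gmail.com / googlemail.com.
function normalise(string $address): string
{
    $address = strtolower(trim($address));
    $at = strrpos($address, '@');
    if ($at === false) {
        return $address; // not an address we understand; leave it alone
    }
    $local  = substr($address, 0, $at);
    $domain = substr($address, $at + 1);

    if ($domain === 'gmail.com' || $domain === 'googlemail.com') {
        // strip the "+label" suffix, if any
        $plus = strpos($local, '+');
        if ($plus !== false) {
            $local = substr($local, 0, $plus);
        }
        // remove the dots from the local part and unify the domain
        $local  = str_replace('.', '', $local);
        $domain = 'gmail.com';
    }
    return $local . '@' . $domain;
}

// Hypothetical helper: stored addresses are indexed by their normalised
// form, so a new address is normalised once and looked up by key.
function is_duplicate(string $candidate, array $seen): bool
{
    return isset($seen[normalise($candidate)]);
}

$seen = [];
foreach (['user.name@gmail.com'] as $stored) {
    $seen[normalise($stored)] = true;
}

var_dump(normalise('User.Name+label@gmail.com') === normalise('username@googlemail.com')); // bool(true)
var_dump(is_duplicate('u.s.e.r.name+shop@gmail.com', $seen)); // bool(true)
var_dump(is_duplicate('user.name@example.com', $seen));       // bool(false)
```

In a database-backed application the same idea would probably mean keeping the normalised address in its own column, next to the address the user actually typed.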

烏雲後面有陽光 2024-08-16 00:19:12

This answer is an improvement on @powtac's answer. I needed this function to defeat multiple signups from the same person using Gmail.

if ( ! function_exists('normalize_email'))
{
    /**
     * to normalize emails to a base format, especially for gmail
     * @param $email
     * @return string
     */
    function normalize_email($email) {
        // lowercase first so the domain check below is case-insensitive
        $email = strtolower($email);
        $parts = explode('@', $email);

        // leave anything that is not a simple local@domain string untouched
        if (count($parts) !== 2) {
            return $email;
        }

        // normalize gmail addresses
        if (in_array($parts[1], ['gmail.com', 'googlemail.com'], true)) {
            // keep only the part before "+" (if any), then remove "."
            $before_plus    = strstr($parts[0], '+', true);
            $local          = ($before_plus !== false) ? $before_plus : $parts[0];
            $before_at      = str_replace('.', '', $local);

            // ensure only @gmail.com addresses are used
            $email    = $before_at.'@gmail.com';
        }

        return $email;
    }
}
甜心小果奶 2024-08-16 00:19:12

Perhaps this would be better titled "How to normalize Gmail addresses in PHP, considering ([email protected])"

You have two technical solutions above. I'll go a different route and ask why you're trying to do this. It doesn't feel right to me. Are you trying to prevent someone from registering multiple times at your site using different e-mail addresses? This will only prevent one special case of that.

I have my own domain, example.com, and any e-mail that goes to any address at that domain goes to my single mailbox. Do you now want to add a check that normalizes anything at my example.com to a single address on your end?

According to the official e-mail address format, the addresses you are trying to match as the same are different.

明媚如初 2024-08-16 00:19:12

Email address parsing is really, really hard to do correctly without breaking things and annoying users.

First, I would question whether you really need to do this. Why do you have multiple email addresses with different sub-addresses?

If you are sure you need to do this, first read RFC 822 (rfc0822), then modify this email address parsing regex to extract all parts of the email, and recombine them excluding the label.

Slightly more practically, the Email Address Wikipedia page has a section on this part of the address format, Sub-addressing.

The code powtac posted looks like it should work. As long as you're not using it in an automated manner to delete accounts or anything, it should be fine.

Note that the "automated labeler" isn't a Gmail-specific feature; Gmail simply popularised it. Other mail servers support this feature, some using + as the separator, others using -. If you are going to special-case spaces in Gmail addresses, remember to consider the googlemail.com domain as well.
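
To illustrate the point about different separators, here is a rough table-driven sketch. The rule table is an assumption made up for this example: mail.example.org is a hypothetical domain standing in for a server that uses "-" as the separator, and the function name canonical_address() is invented here. Addresses at unknown domains are returned unchanged apart from lowercasing, since guessing their rules could merge distinct mailboxes.

```php
<?php
// Sketch only: per-domain sub-addressing rules. The table is illustrative;
// check each provider's documentation before relying on any entry.
const SUBADDRESS_RULES = [
    'gmail.com'        => ['separator' => '+', 'strip_dots' => true,  'canonical' => 'gmail.com'],
    'googlemail.com'   => ['separator' => '+', 'strip_dots' => true,  'canonical' => 'gmail.com'],
    // hypothetical domain whose server uses "-" as the label separator
    'mail.example.org' => ['separator' => '-', 'strip_dots' => false, 'canonical' => 'mail.example.org'],
];

function canonical_address(string $address, array $rules = SUBADDRESS_RULES): string
{
    $address = strtolower(trim($address));
    $at = strrpos($address, '@');
    if ($at === false) {
        return $address;
    }
    $local  = substr($address, 0, $at);
    $domain = substr($address, $at + 1);

    if (!isset($rules[$domain])) {
        return $address; // unknown domain: do not guess, compare as-is
    }
    $rule = $rules[$domain];

    // cut the label off at the first separator, if there is one
    $pos = strpos($local, $rule['separator']);
    if ($pos !== false) {
        $local = substr($local, 0, $pos);
    }
    if ($rule['strip_dots']) {
        $local = str_replace('.', '', $local);
    }
    return $local . '@' . $rule['canonical'];
}

echo canonical_address('user.name+news@googlemail.com'), "\n"; // username@gmail.com
echo canonical_address('user-lists@mail.example.org'), "\n";   // user@mail.example.org
echo canonical_address('first.last+tag@example.com'), "\n";    // first.last+tag@example.com
```

Keeping the rules in data rather than in code makes it easy to start with Gmail only and add other providers once their behaviour has been confirmed.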

终弃我 2024-08-16 00:19:12

I have extended the Zend validator like this.

<?php
class My_Validate_EmailAddress extends Zend_Validate_EmailAddress
{
    public function isValid($value)
    {
        $valid = parent::isValid($value);
        if ($valid
                && in_array($this->_hostname, array('gmail.com', 'googlemail.com'))
                && substr_count($this->_localPart, '.') > 1) {
            $this->_error(parent::INVALID_HOSTNAME);
            $valid = false;
        }
        return $valid;
    }
}

Gmail addresses with more than one "dot" symbol in the local part are treated as invalid. In some cases this is not a logical solution, but it works for me.

一杆小烟枪 2024-08-16 00:19:12
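function normalize($input) {
     // split at the last "@" so the dots in the domain are preserved
     $at = strrpos($input, '@');
     if ($at === false) return $input;
     // drop the "+label" suffix, then remove dots from the local part only
     $local = preg_replace('/\+.*$/', '', substr($input, 0, $at));
     return str_replace('.', '', $local) . substr($input, $at);
}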