Matching URL-encoded email addresses in C#

Posted on 2024-10-26 21:14:45

I did some searching but couldn't figure out why my solution isn't working. Basically, I need to take a string (which is HTML code), parse it, and look for mailto links (which I then want to replace as part of an obfuscation scheme). Here is what I have so far:

    string text = "<p>Some Person<br /> Person's Position<br />p. 123-456-7890<br /> e. <a  title=\"Email Some Person\" target=\"_blank\" href=\"mailto:someperson%40domain.com\">someperson@domain.com</a></p>";
    text = Server.UrlDecode(text);
    string safeEmails = Regex.Replace(text, "(<a href=\"mailto:)(.*?)(%40)(.*?)(\">)(.*?)(</a>)", "<a class=\"mailme\" href=\"$2*$4\">$6</a>");
    Response.Write( Server.HtmlDecode(safeEmails));

The text comes out of a WYSIWYG text editor (Telerik RadEditor, for those familiar), and for all intents and purposes I can't control what it produces.

Basically I need to find and replace any:

<a href="mailto:someone%40domain.com">someone@domain.com</a>

With:

<a class="mailme" href="[email protected]">someone@domain.com</a>

Some background: I am attempting to create a mailto link that will avoid detection by harvesters. The problem is that I receive a string with the e-mail as a standard mailto link. I cannot control the incoming string, so the mailto will always be an unprotected mailto. My goal is to find all of them, obfuscate them, then use JavaScript to "fix" the links so that human visitors can easily use them. I am open to new approaches as well as modifications to the above code.

Comments (1)

暮倦 2024-11-02 21:14:45

You could use a regex or the HTML Agility Pack to find and obfuscate all your mailto links. If you want good obfuscation, try reading "Ten methods to obfuscate e-mail addresses compared".

EDIT:
Sorry, from the first version of your question I didn't realize you were having trouble getting your regex to work. Since you're using a WYSIWYG text editor, I think the HTML that comes out of it should be pretty "regular", so you may be fine using a regex.
You can try changing your Replace line like this:

    // Non-greedy matching keeps each replacement confined to a single link.
    string safeEmails = Regex.Replace(text, "href=\"mailto:[^\"]*\">(.*?)</a>", "class=\"mailme\" href=\"$1\">$1</a>");
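
For reference, two things in the question's code would likely keep its pattern from ever matching: the pattern expects `<a href=` with nothing in between, while the sample link has `title` and `target` attributes before `href`; and `Server.UrlDecode` runs before the regex, so the `%40` the pattern looks for has already become `@`. Below is a minimal sketch of a looser pattern, not a definitive implementation. The class name `MailtoObfuscator` and method `Obfuscate` are made up for illustration, and it assumes double-quoted `href` values with no other percent-encoding in the part before the `@`. The `user*domain` form follows the `$2*$4` replacement in the question.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, named for illustration only.
public static class MailtoObfuscator
{
    // Matches an <a> tag whose href is a mailto link, allowing other
    // attributes before and after href. The "@" may be literal or
    // still URL-encoded as %40. Assumes double-quoted attribute values.
    private static readonly Regex MailtoLink = new Regex(
        "<a\\b[^>]*?\\bhref=\"mailto:(?<user>[^\"@%]+)(?:%40|@)(?<domain>[^\"]+)\"[^>]*>(?<text>.*?)</a>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Rewrites every matched link as <a class="mailme" href="user*domain">text</a>.
    // Other attributes on the original tag (title, target, ...) are dropped.
    public static string Obfuscate(string html)
    {
        return MailtoLink.Replace(
            html,
            "<a class=\"mailme\" href=\"${user}*${domain}\">${text}</a>");
    }
}
```

Calling `MailtoObfuscator.Obfuscate(text)` on the raw editor output (without decoding it first) should turn the sample link's `href` into `someperson*domain.com`. The visible link text is left untouched, so if it also contains the address it would still need separate handling.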