C# code to linkify URLs in a string

Posted on 2024-07-18 00:26:09

Does anyone have any good C# code (and regular expressions) that will parse a string and "linkify" any URLs that may be in the string?

Comments (7)

七婞 2024-07-25 00:26:09

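It's a pretty simple task; you can achieve it with Regex and a ready-to-go regular expression from:

Something like:

// requires: using System.Text.RegularExpressions;
// The ^ and $ anchors are dropped so URLs are matched anywhere in the text,
// and the whole URL is wrapped in group 1 so that $1 is the full match.
html = Regex.Replace(html,
                     @"((http|https|ftp)\://[a-zA-Z0-9\-\.]+" +
                     @"\.[a-zA-Z]{2,3}(:[a-zA-Z0-9]*)?/?" +
                     @"([a-zA-Z0-9\-\._\?\,\'/\\\+&%\$#\=~])*)",
                     "<a href=\"$1\">$1</a>");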

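As a variation on the same Regex.Replace idea, here is a minimal sketch that uses a match evaluator instead of a `$1` substitution, so the URL can be HTML-encoded before it goes into the attribute. The class name `SimpleLinkifier` and the deliberately loose pattern are assumptions made for illustration; they are not the expression from the answer above.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public static class SimpleLinkifier
{
    // Loose pattern (an assumption for this sketch): a scheme followed by any
    // run of characters that are not whitespace, quotes, or angle brackets.
    private static readonly Regex UrlPattern = new Regex(
        @"\b(?:https?|ftp)://[^\s<>""']+",
        RegexOptions.IgnoreCase | RegexOptions.Compiled);

    public static string Linkify(string text)
    {
        if (string.IsNullOrEmpty(text)) return text;

        return UrlPattern.Replace(text, match =>
        {
            // Encode the URL so characters such as & cannot break the markup.
            string url = WebUtility.HtmlEncode(match.Value);
            return "<a href=\"" + url + "\">" + url + "</a>";
        });
    }
}
```

Calling `SimpleLinkifier.Linkify("see http://example.com/a?b=1&c=2 now")` wraps the URL in an anchor and writes the ampersand as `&amp;`. Text outside the matched URLs is left untouched, so it would still need its own encoding.
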
You may also be interested not only in creating links but in shortening URLs. Here is a good article on this subject:

See also:

你列表最软的妹 2024-07-25 00:26:09

Well, after a lot of research on this, and several attempts to fix cases where:

  1. people enter http://www.sitename.com and www.sitename.com in the same post
  2. URLs involve parentheses, like (http://www.sitename.com) and http://msdn.microsoft.com/en-us/library/aa752574(vs.85).aspx
  3. URLs are long, like: http://www.amazon.com/gp/product/b000ads62g/ref=s9_simz_gw_s3_p74_t1?pf_rd_m=atvpdkikx0der&pf_rd_s=center-2&pf_rd_r=04eezfszazqzs8xfm9yd&pf_rd_t=101&pf_rd_p=470938631&pf_rd_i=507846

we are now using this HtmlHelper extension. I thought I would share it and get any comments:

    private static Regex regExHttpLinks = new Regex(@"(?<=\()\b(https?://|www\.)[-A-Za-z0-9+&@#/%?=~_()|!:,.;]*[-A-Za-z0-9+&@#/%=~_()|](?=\))|(?<=(?<wrap>[=~|_#]))\b(https?://|www\.)[-A-Za-z0-9+&@#/%?=~_()|!:,.;]*[-A-Za-z0-9+&@#/%=~_()|](?=\k<wrap>)|\b(https?://|www\.)[-A-Za-z0-9+&@#/%?=~_()|!:,.;]*[-A-Za-z0-9+&@#/%=~_()|]", RegexOptions.Compiled | RegexOptions.IgnoreCase);

    public static string Format(this HtmlHelper htmlHelper, string html)
    {
        if (string.IsNullOrEmpty(html))
        {
            return html;
        }

        html = htmlHelper.Encode(html);
        html = html.Replace(Environment.NewLine, "<br />");

        // replace periods on numeric values that appear to be valid domain names
        var periodReplacement = "[[[replace:period]]]";
        html = Regex.Replace(html, @"(?<=\d)\.(?=\d)", periodReplacement);

        // create links for matches
        var linkMatches = regExHttpLinks.Matches(html);
        for (int i = 0; i < linkMatches.Count; i++)
        {
            var temp = linkMatches[i].ToString();

            if (!temp.Contains("://"))
            {
                temp = "http://" + temp;
            }

            // The URL is not lower-cased: paths and query strings can be case-sensitive.
            html = html.Replace(linkMatches[i].ToString(), String.Format("<a href=\"{0}\" title=\"{0}\">{1}</a>", temp.Replace(".", periodReplacement), linkMatches[i].ToString().Replace(".", periodReplacement)));
        }

        // Clear out period replacement
        html = html.Replace(periodReplacement, ".");

        return html;
    }
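
One possible alternative to looping over the matches and calling `html.Replace` is to let `Regex.Replace` do the substitution in a single pass with a match evaluator. Each match is then visited exactly once, so a link that has already been generated is never scanned again and no period placeholder is needed. This is only a sketch: the class name `SinglePassLinkifier` is hypothetical and the pattern is a simplified stand-in, not the full expression above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SinglePassLinkifier
{
    // Simplified pattern, assumed for illustration only. The last character
    // class excludes sentence punctuation such as '.' and ','.
    private static readonly Regex Links = new Regex(
        @"\b(?:https?://|www\.)[-A-Za-z0-9+&@#/%?=~_|!:,.;]*[-A-Za-z0-9+&@#/%=~_|]",
        RegexOptions.IgnoreCase | RegexOptions.Compiled);

    // Expects text that has already been HTML-encoded by the caller.
    public static string Linkify(string encodedHtml)
    {
        if (string.IsNullOrEmpty(encodedHtml)) return encodedHtml;

        // Regex.Replace walks the input once, left to right, and matches
        // never overlap, so generated markup is not matched a second time.
        return Links.Replace(encodedHtml, m =>
        {
            // Bare "www." matches get a default scheme for the href only.
            string href = m.Value.Contains("://") ? m.Value : "http://" + m.Value;
            return string.Format("<a href=\"{0}\" title=\"{0}\">{1}</a>", href, m.Value);
        });
    }
}
```

With the input `Visit http://www.sitename.com and www.sitename.com today`, both forms become links that point at `http://www.sitename.com`, while the visible text keeps whatever the user typed.
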
迎风吟唱 2024-07-25 00:26:09
protected string Linkify( string SearchText ) {
    // this will find links like:
    // http://www.mysite.com
    // as well as any links with other characters directly in front of it like:
    // href="http://www.mysite.com"
    // you can then use your own logic to determine which links to linkify
    Regex regx = new Regex( @"\b(((\S+)?)(@|mailto\:|(news|(ht|f)tp(s?))\://)\S+)\b", RegexOptions.IgnoreCase );
    SearchText = SearchText.Replace( "&nbsp;", " " ); // normalize non-breaking space entities to plain spaces
    MatchCollection matches = regx.Matches( SearchText );

    foreach ( Match match in matches ) {
        if ( match.Value.StartsWith( "http" ) ) { // if it starts with anything else then dont linkify -- may already be linked!
            SearchText = SearchText.Replace( match.Value, "<a href='" + match.Value + "'>" + match.Value + "</a>" );
        }
    }

    return SearchText;
}
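
The "may already be linked" check can also be pushed into the pattern itself. The sketch below assumes a negative lookbehind that skips any URL directly preceded by a quote, `=` or `>`, which covers `href="..."` attributes and existing anchor text. The class name `UnlinkedOnlyLinkifier` and the pattern are illustrative assumptions, and the heuristic will also skip plain text such as `x=http://...`.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UnlinkedOnlyLinkifier
{
    // (?<![""'=>]) rejects URLs that directly follow a quote, '=' or '>',
    // i.e. URLs that already sit inside an attribute or an <a> element.
    private static readonly Regex PlainUrl = new Regex(
        @"(?<![""'=>])\bhttps?://[^\s<>""']+",
        RegexOptions.IgnoreCase | RegexOptions.Compiled);

    public static string Linkify(string text)
    {
        if (string.IsNullOrEmpty(text)) return text;

        // A match evaluator replaces each match in place, so an identical URL
        // elsewhere in the text (for example inside an href) is not touched.
        return PlainUrl.Replace(text,
            m => "<a href='" + m.Value + "'>" + m.Value + "</a>");
    }
}
```

Because the replacement is tied to each match position, this avoids the situation where `SearchText.Replace(match.Value, ...)` also rewrites the same URL inside an existing `href`.
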
破晓 2024-07-25 00:26:09

It's not that easy, as you can read in this blog post by Jeff Atwood. It's especially hard to detect where a URL ends.

For example, is the trailing parenthesis part of the URL or not?

  • http://en.wikipedia.org/wiki/PCTools(CentralPointSoftware)
  • a URL in parentheses (http://en.wikipedia.org) more text

In the first case, the parentheses are part of the URL. In the second case, they are not!
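
One common heuristic for the two cases above is to match greedily and then trim a trailing `)` only when the candidate contains more closing than opening parentheses. The sketch below assumes that heuristic; it is not taken from the blog post, the names `ParenAwareLinkifier` and `UrlLength` are made up for illustration, and it will still guess wrong on genuinely ambiguous input.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ParenAwareLinkifier
{
    // Loose candidate pattern (an assumption): scheme plus non-space characters.
    private static readonly Regex Candidate = new Regex(
        @"\bhttps?://[^\s<>""]+", RegexOptions.IgnoreCase | RegexOptions.Compiled);

    // Returns how many leading characters of the candidate belong to the URL.
    public static int UrlLength(string candidate)
    {
        int end = candidate.Length;
        while (end > 0)
        {
            char last = candidate[end - 1];
            if (last == ')')
            {
                // Keep a closing parenthesis only if it is balanced by an
                // opening one inside the part of the URL kept so far.
                int open = 0, close = 0;
                for (int i = 0; i < end; i++)
                {
                    if (candidate[i] == '(') open++;
                    else if (candidate[i] == ')') close++;
                }
                if (close <= open) break;
                end--;
            }
            else if (last == '.' || last == ',' || last == ';' || last == ':' || last == '!' || last == '?')
            {
                end--; // trailing sentence punctuation is treated as text
            }
            else
            {
                break;
            }
        }
        return end;
    }

    public static string Linkify(string text)
    {
        return Candidate.Replace(text, m =>
        {
            int length = UrlLength(m.Value);
            string url = m.Value.Substring(0, length);
            string rest = m.Value.Substring(length); // trimmed characters stay as plain text
            return "<a href=\"" + url + "\">" + url + "</a>" + rest;
        });
    }
}
```

With this sketch the Wikipedia URL keeps its balanced `(CentralPointSoftware)` suffix, while the `)` after `(http://en.wikipedia.org` is left outside the link.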

征棹 2024-07-25 00:26:09

I found the following regular expression:
http://daringfireball.net/2010/07/improved_regex_for_matching_urls

It looks very good to me. Jeff Atwood's solution doesn't handle many cases. josefresno's seems to me to handle all cases, but when I tried to understand it (in case of any support requests), my brain boiled.

究竟谁懂我的在乎 2024-07-25 00:26:09

Here is a class:

public class TextLink
{
    #region Properties

    // Verbatim string with an escaped dot, so "www." matches a literal period only.
    public const string BeginPattern = @"((http|https)://)?(www\.)?";

    public const string MiddlePattern = @"([a-z0-9\-]*\.)+[a-z]+(:[0-9]+)?";

    public const string EndPattern = @"(/\S*)?";

    public static string Pattern { get { return BeginPattern + MiddlePattern + EndPattern; } }

    public static string ExactPattern { get { return string.Format("^{0}$", Pattern); } }

    public string OriginalInput { get; private set; }

    public bool Valid { get; private set; }

    private bool _isHttps;

    private string _readyLink;

    #endregion

    #region Constructor

    public TextLink(string input)
    {
        this.OriginalInput = input;

        var text = Regex.Replace(input, @"(^\s)|(\s$)", "", RegexOptions.IgnoreCase);

        Valid = Regex.IsMatch(text, ExactPattern);

        if (Valid)
        {
            _isHttps = Regex.IsMatch(text, "^https:", RegexOptions.IgnoreCase);
            // clear begin:
            // Anchored at the start so only the leading scheme/www prefix is stripped.
            _readyLink = Regex.Replace(text, "^" + BeginPattern, "", RegexOptions.IgnoreCase);
            // HTTPS
            if (_isHttps)
            {
                _readyLink = "https://www." + _readyLink;
            }
            // Default
            else
            {
                _readyLink = "http://www." + _readyLink;
            }
        }
    }

    #endregion

    #region Methods

    public override string ToString()
    {
        return _readyLink;
    }

    #endregion
}

Use it in this method:

public static string ReplaceUrls(string input)
{
    // Treat null input as an empty string.
    var result = Regex.Replace(input ?? string.Empty, TextLink.Pattern, match =>
    {
        var textLink = new TextLink(match.Value);
        return textLink.Valid ?
            string.Format("<a href=\"{0}\" target=\"_blank\">{1}</a>", textLink, textLink.OriginalInput) :
            textLink.OriginalInput;
    });
    return result;
}

Test cases:

[TestMethod]
public void RegexUtil_TextLink_Parsing()
{
    Assert.IsTrue(new TextLink("smthing.com").Valid);
    Assert.IsTrue(new TextLink("www.smthing.com/").Valid);
    Assert.IsTrue(new TextLink("http://smthing.com").Valid);
    Assert.IsTrue(new TextLink("http://www.smthing.com").Valid);
    Assert.IsTrue(new TextLink("http://www.smthing.com/").Valid);
    Assert.IsTrue(new TextLink("http://www.smthing.com/publisher").Valid);

    // port
    Assert.IsTrue(new TextLink("http://www.smthing.com:80").Valid);
    Assert.IsTrue(new TextLink("http://www.smthing.com:80/").Valid);
    // https
    Assert.IsTrue(new TextLink("https://smthing.com").Valid);

    Assert.IsFalse(new TextLink("").Valid);
    Assert.IsFalse(new TextLink("smthing.com.").Valid);
    Assert.IsFalse(new TextLink("smthing.com-").Valid);
}

[TestMethod]
public void RegexUtil_TextLink_ToString()
{
    // default
    Assert.AreEqual("http://www.smthing.com", new TextLink("smthing.com").ToString());
    Assert.AreEqual("http://www.smthing.com", new TextLink("http://www.smthing.com").ToString());
    Assert.AreEqual("http://www.smthing.com/", new TextLink("smthing.com/").ToString());

    Assert.AreEqual("https://www.smthing.com", new TextLink("https://www.smthing.com").ToString());
}
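
For comparison, part of this validation can be delegated to `System.Uri` instead of a hand-written pattern. The sketch below is a different technique from the `TextLink` class above, and the names `UrlNormalizer` and `TryNormalize` are hypothetical. It adds a default `http://` scheme when none is present, accepts only http and https, and does not force a `www.` prefix, so its results will not match the test cases above one for one.

```csharp
using System;

public static class UrlNormalizer
{
    // Returns true and a normalized absolute URL when the input parses as
    // an http or https address; otherwise returns false.
    public static bool TryNormalize(string input, out string normalized)
    {
        normalized = null;
        if (string.IsNullOrWhiteSpace(input)) return false;

        string candidate = input.Trim();
        if (!candidate.Contains("://"))
        {
            candidate = "http://" + candidate; // assume http when no scheme is given
        }

        Uri uri;
        if (!Uri.TryCreate(candidate, UriKind.Absolute, out uri)) return false;
        if (uri.Scheme != Uri.UriSchemeHttp && uri.Scheme != Uri.UriSchemeHttps) return false;

        normalized = uri.AbsoluteUri;
        return true;
    }
}
```

`TryNormalize("smthing.com", out url)` succeeds and yields an absolute http URL, while an empty string or an `ftp://` address is rejected. How strictly `Uri` treats edge cases such as a trailing dot or hyphen in the host is worth checking before relying on it.
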
瘫痪情歌 2024-07-25 00:26:09

This works for me:

str = Regex.Replace(str,
                @"((http|ftp|https):\/\/[\w\-_]+(\.[\w\-_]+)+([\w\-\.,@?^=%&:/~\+#]*[\w\-\@?^=%&/~\+#])?)",
                "<a target='_blank' href='$1'>$1</a>");