C# model-view-controller

Strip HTML tags?

Posted 2024-12-10 07:09:11

How can I strip this text

<html>

<body>      

<h1>My First Heading</h1>

<p>My first paragraph.</p>
<[email protected]>
</body>
</html>

to look like

My First Heading
My first paragraph.
<[email protected]>

Using this function

public static string StripHTML(this string htmlText)
{
    var reg = new Regex("<(.|\n)*?>", RegexOptions.IgnoreCase);
    return reg.Replace(htmlText, "");
}

I get the following, where the <[email protected]> line has been stripped along with the real tags:

My First Heading
My first paragraph.

Comments (2)

终止放荡 2024-12-17 07:09:11

Use the Html Agility Pack for this kind of task. It is faster than any regular expression, and it supports LINQ.
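A minimal sketch of what that suggestion could look like, assuming the HtmlAgilityPack NuGet package is referenced. The class and method names (HapTextExtractor, ExtractText) are hypothetical and only serve the illustration. It loads the markup, then uses LINQ to collect the non-blank text nodes. It does not address how the parser treats an angle-bracketed email address such as the one in the question.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack; // assumption: NuGet package "HtmlAgilityPack" is installed

public static class HapTextExtractor
{
    // Hypothetical helper: returns the visible text of the markup, one text node per line.
    public static string ExtractText(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // LINQ over the parsed tree: keep text nodes, decode entities, drop blank ones.
        var lines = doc.DocumentNode
            .Descendants()
            .Where(n => n.NodeType == HtmlNodeType.Text)
            .Select(n => HtmlEntity.DeEntitize(n.InnerText).Trim())
            .Where(t => t.Length > 0);

        return string.Join(Environment.NewLine, lines);
    }

    public static void Main()
    {
        string html = "<html><body><h1>My First Heading</h1><p>My first paragraph.</p></body></html>";
        Console.WriteLine(ExtractText(html));
    }
}
```

Collecting text nodes one by one, rather than reading DocumentNode.InnerText directly, keeps each block of text on its own line.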

一抹淡然 2024-12-17 07:09:11

using System;
using System.Text.RegularExpressions;
using HtmlAgilityPack;

class Program
{
    static string input = " Your html goes here  ";

    static void Main(string[] args)
    {
        // Unwrap anything of the form <email address> first, so that neither
        // approach below mistakes it for a tag.
        string modified_html = emas(input);

        // Approach 1: let Html Agility Pack parse the markup and return its text.
        HtmlDocument doc = new HtmlDocument();
        doc.LoadHtml(modified_html);
        string test1 = doc.DocumentNode.InnerText;
        Console.WriteLine(test1);

        // Approach 2: strip the remaining tags with the regex from the question.
        var reg = new Regex("<(.|\n)*?>", RegexOptions.IgnoreCase);
        Console.WriteLine(reg.Replace(modified_html, ""));

        Console.Read();
    }

    // Replaces "<address>" with "address" for every email address found in the text.
    public static string emas(string text)
    {
        string stripped = text;

        const string MatchEmailPattern =
            @"(([\w-]+\.)+[\w-]+|([a-zA-Z]{1}|[\w-]{2,}))@"
            + @"((([0-1]?[0-9]{1,2}|25[0-5]|2[0-4][0-9])\.([0-1]?[0-9]{1,2}|25[0-5]|2[0-4][0-9])\."
            + @"([0-1]?[0-9]{1,2}|25[0-5]|2[0-4][0-9])\.([0-1]?[0-9]{1,2}|25[0-5]|2[0-4][0-9])){1}|"
            + @"([a-zA-Z]+[\w-]+\.)+[a-zA-Z]{2,4})";
        Regex rx = new Regex(MatchEmailPattern, RegexOptions.Compiled | RegexOptions.IgnoreCase);

        // Unwrap each match that appears between angle brackets.
        foreach (Match match in rx.Matches(text))
        {
            stripped = stripped.Replace("<" + match.Value + ">", match.Value);
        }

        return stripped;
    }
}
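As a possible lighter-weight alternative to the email pre-pass above, the tag pattern itself can be narrowed so that it only matches tokens shaped like HTML tags. The sketch below is an illustration under stated assumptions, not a full HTML parser: the names TagStripper and StripTags are hypothetical, and user@example.com is a placeholder address, since the address in the question is obfuscated. It does not handle comments, doctype declarations, script content, tag names containing hyphens or colons, or attribute values that contain ">".

```csharp
using System;
using System.Text.RegularExpressions;

public static class TagStripper
{
    // Matches "<", an optional "/", a name (a letter followed by letters or digits),
    // optional whitespace-led attributes, an optional "/", and ">".
    // "<user@example.com>" is left alone because "@" follows the name.
    private static readonly Regex TagPattern =
        new Regex(@"</?[a-zA-Z][a-zA-Z0-9]*(\s[^>]*)?/?>", RegexOptions.Compiled);

    // Hypothetical helper: removes tag-shaped tokens and keeps everything else.
    public static string StripTags(string html)
    {
        return TagPattern.Replace(html, "");
    }

    public static void Main()
    {
        // Placeholder input; the address is an assumption, not the one from the question.
        string html = "<h1>My First Heading</h1>\n<p>My first paragraph.</p>\n<user@example.com>";
        Console.WriteLine(StripTags(html));
        // Prints:
        // My First Heading
        // My first paragraph.
        // <user@example.com>
    }
}
```

Unlike the pre-pass, this keeps the angle brackets around the address, which is the output the question asks for.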
