How to strip or escape HTML tags in Android

Posted on 2024-11-17 14:28:42

PHP has a strip_tags function that strips HTML and PHP tags from a string.

Does Android have a way to strip or escape HTML?

Comments (8)

那片花海 2024-11-24 14:28:42

The solutions in the answer linked to by @sparkymat generally require either regex - which is an error-prone approach - or installing a third-party library such as jsoup or jericho. A better solution on Android devices is just to make use of the Html.fromHtml() function:

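// Requires: import android.os.Build; import android.text.Html;
@SuppressWarnings("deprecation") // the one-argument fromHtml is deprecated since API 24
public String stripHtml(String html) {
    if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.N) {
        return Html.fromHtml(html, Html.FROM_HTML_MODE_LEGACY).toString();
    } else {
        return Html.fromHtml(html).toString();
    }
}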

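This uses Android's built-in HTML parser to build a Spanned representation of the input, which carries the formatting as spans rather than as HTML tags. Converting that Spanned back to a String then discards the spans and leaves only the plain text.

Html.fromHtml behaviour changed in Android N (API 24): the single-argument overload was deprecated in favour of the one that takes a flags argument, which is why the code checks the SDK version. See the documentation for more info.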
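To illustrate why the regex approach mentioned above is considered error-prone, here is a minimal plain-Java sketch. The class name NaiveTagStripper and the pattern are my own illustration, not code from the linked answer. It handles simple markup but fails as soon as a ">" appears inside an attribute value, and it never decodes entities such as &amp;.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; not part of Android or of the answer above.
final class NaiveTagStripper {
    // Naive pattern: "<", then any run of characters that are not ">", then ">".
    private static final Pattern TAG = Pattern.compile("<[^>]*>");

    static String stripWithRegex(String html) {
        return TAG.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        // Simple markup is handled as expected.
        System.out.println(stripWithRegex("<b>Hello</b> world"));
        // A ">" inside an attribute value ends the match too early,
        // and the "&amp;" entity is left undecoded.
        System.out.println(stripWithRegex("<a title=\"a > b\">Tom &amp; Jerry</a>"));
    }
}
```

The second call leaves part of the attribute behind in the output, which is the kind of edge case a real parser such as the one behind Html.fromHtml deals with for you.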

吾性傲以野 2024-11-24 14:28:42

Sorry for the late post, but I think this might help others.

To just strip the HTML tags:

Html.fromHtml(htmltext).toString()

This replaces the HTML markup with plain text, but the resulting string is not formatted cleanly: it still contains line breaks and surrounding whitespace. Hence I used:

Html.fromHtml(htmltext).toString().replaceAll("\n", "").trim()

This first removes every newline character and then trims leading and trailing whitespace. You can remove other unwanted characters in the same way.
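If you would rather keep single line breaks instead of deleting all of them, the cleanup can be sketched in plain Java as below. The class and method names are hypothetical, and the handling of U+FFFC assumes that Html.fromHtml leaves an object-replacement character where an image was; check that against your own input before relying on it.

```java
// Hypothetical helper; class and method names are illustrative only.
final class PlainTextCleaner {
    // Tidies text such as the result of Html.fromHtml(...).toString().
    static String tidy(String text) {
        return text
                // Assumption: U+FFFC marks where an <img> used to be.
                .replace("\uFFFC", "")
                // Turn non-breaking spaces into regular spaces.
                .replace('\u00A0', ' ')
                // Drop spaces and tabs around each line break.
                .replaceAll("[ \\t]*\\n[ \\t]*", "\n")
                // Collapse runs of blank lines into one line break.
                .replaceAll("\\n{2,}", "\n")
                // Remove leading and trailing whitespace.
                .trim();
    }
}
```

You would call it as PlainTextCleaner.tidy(Html.fromHtml(htmltext).toString()).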

明月松间行 2024-11-24 14:28:42

Alternatively, you can use Html.escapeHtml(String) if you are targeting API 16 or above.

If you also need to support versions below API 16, you can use the class below by calling HtmlUtils.escapeHtml(String); I pulled it from the source of Html.escapeHtml(String).

public class HtmlUtils {

    public static String escapeHtml(CharSequence text) {
        StringBuilder out = new StringBuilder();
        withinStyle(out, text, 0, text.length());
        return out.toString();
    }

    private static void withinStyle(StringBuilder out, CharSequence text,
                                    int start, int end) {
        for (int i = start; i < end; i++) {
            char c = text.charAt(i);

            if (c == '<') {
                out.append("&lt;");
            } else if (c == '>') {
                out.append("&gt;");
            } else if (c == '&') {
                out.append("&amp;");
            } else if (c >= 0xD800 && c <= 0xDFFF) {
                if (c < 0xDC00 && i + 1 < end) {
                    char d = text.charAt(i + 1);
                    if (d >= 0xDC00 && d <= 0xDFFF) {
                        i++;
                        // Combine the surrogate pair; OR-ing the parts gives wrong values above U+1FFFF.
                        int codepoint = Character.toCodePoint(c, d);
                        out.append("&#").append(codepoint).append(";");
                    }
                }
            } else if (c > 0x7E || c < ' ') {
                out.append("&#").append((int) c).append(";");
            } else if (c == ' ') {
                while (i + 1 < end && text.charAt(i + 1) == ' ') {
                    out.append("&nbsp;");
                    i++;
                }

                out.append(' ');
            } else {
                out.append(c);
            }
        }
    }
}

I am using this class and it works fine.
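To make the non-ASCII branch of the class above easier to follow, here is a small plain-Java sketch of just that idea: each character above 0x7E, including characters stored as surrogate pairs, is written as a decimal numeric character reference. The class name NumericEntityDemo is hypothetical, and unlike the class above this sketch does not escape <, > or &, and it does not drop unpaired surrogates.

```java
// Hypothetical demo class; it illustrates only the numeric-reference step.
final class NumericEntityDemo {
    static String toNumericEntities(CharSequence text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            // codePointAt joins a valid surrogate pair into one code point.
            int cp = Character.codePointAt(text, i);
            if (cp > 0x7E) {
                out.append("&#").append(cp).append(';');
            } else {
                out.append((char) cp);
            }
            // Advance by one char, or by two for a surrogate pair.
            i += Character.charCount(cp);
        }
        return out.toString();
    }
}
```

For example, an input containing U+1F600 (stored as the pair \uD83D\uDE00) comes out as &#128512;.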

静若繁花 2024-11-24 14:28:42

An alternative using the newer method (API 16+). It already returns a String, so no toString() call is needed:

android.text.Html.escapeHtml(your_html);
海风掠过北极光 2024-11-24 14:28:42

Html.fromHtml can be extremely slow for large html strings.

Here's how you can do it quickly and easily with jsoup:

Add this line to your gradle file:

implementation 'org.jsoup:jsoup:1.11.3'

Check the latest jsoup version here:
https://jsoup.org/download

Add this line to your code:

String text = Jsoup.parse(htmlStr).text();

To learn how to preserve line breaks, see the question "How do I preserve line breaks when using jsoup to convert html to plain text?"

只为守护你 2024-11-24 14:28:42

Spanned spanned;
if (android.os.Build.VERSION.SDK_INT >= android.os.Build.VERSION_CODES.N) {
    spanned = Html.fromHtml(textToShare, Html.FROM_HTML_MODE_LEGACY);
} else {
    spanned = Html.fromHtml(textToShare);
}
tv.setText(spanned.toString());

奢华的一滴泪 2024-11-24 14:28:42

This is dead simple with jsoup:

public static String html2text(String html) {
   return Jsoup.parse(html).text();
}
忘羡 2024-11-24 14:28:42

As it has not been mentioned yet: the backwards-compatible way to do this is to use the HtmlCompat utility class from AndroidX and simply call fromHtml (pass 0 if you do not need any specific flags):

HtmlCompat.fromHtml(inputString, 0).toString()

Under the hood it already does the required API checks for you:

if (Build.VERSION.SDK_INT >= 24) {
   return Html.fromHtml(source, flags);
}
return Html.fromHtml(source);

So for the input

<a href="https://www.stackoverflow.com">Click me!</a>

you will receive only the string 'Click me!' as output.
