� characters appear when using character_limiter() and strip_tags() with the UTF-8 character set

Posted on 2024-12-09 07:04:58 · 2026 characters · 0 views · 0 comments

I'm getting � characters when I combine Codeigniter's character_limiter() with PHP's native strip_tags(). Here is the code I'm using:

<?php echo character_limiter(strip_tags($block->body), 60); ?>

$block->body is an HTML string stored in the database. I do not get this unexpected output if I use only one of the functions. It looks like this:

enter image description here

This is what the HTML looks like:

enter image description here

I didn't paste the actual HTML because the string would be modified by posting it here, see below

Here is the Codeigniter function character_limiter:

function character_limiter($str, $n = 500, $end_char = '&#8230;')
{
    if (strlen($str) < $n)
    {
        return $str;
    }

    $str = preg_replace("/\s+/", ' ', str_replace(array("\r\n", "\r", "\n"), ' ', $str));

    if (strlen($str) <= $n)
    {
        return $str;
    }

    $out = "";
    foreach (explode(' ', trim($str)) as $val)
    {
        $out .= $val.' ';

        if (strlen($out) >= $n)
        {
            $out = trim($out);
            return (strlen($out) == strlen($str)) ? $out : $out.$end_char;
        }
    }
}

I figured out that there was some invisible character or something that may have been causing this, because when I pasted the HTML into a text editor, then back into the "HTML source editor" in the image (which is just TinyMCE), then saved it, the weird characters disappeared.

I am using the utf-8 character set across the board (everywhere possible). The original data did come from a dump of an unknown database, and was imported with an SQL client. However, when I saved the existing string (in the CMS), nothing changed.

I can't connect the dots between these two functions causing this output when used together, and I do not get the � characters normally. I only see this output when I use:

character_limiter(strip_tags($html))

What could be causing this, and how can I prevent it?

Note: I definitely want to use the character_limiter function, or a variation of it. It makes an ellipsis at the end of the string if its length is longer than the second param. Using it alone (without strip_tags) works perfectly fine (no weird characters).

Update: For anyone who can't reproduce this, I put an SQL file online that demos the issue. I am importing it with MySQL Query Browser. It seems I only get this output when the HTML comes from the database. Here is the link (ignore the content, it's the client's fault): http://wesleymurch.com/test/test1.sql

Comments (1)

被你宠の有点坏 2024-12-16 07:04:58

� is the Unicode replacement character. It is displayed in place of a character that is unknown or cannot be rendered, for example an invalid byte sequence.
In PHP this is usually solved by using the multibyte string functions instead of the byte-based ones.
Use mb_substr together with strip_tags, for example:

mb_substr(strip_tags($text), 0, 300, 'UTF-8'); // or whatever your charset is

Alternatively, you could modify the CodeIgniter function to use the multibyte string functions.
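
To illustrate why byte-based functions can end up producing �, here is a minimal sketch comparing them with their mb_* counterparts. The sample string and the variable names are made up for the demonstration; they are not taken from the question's data.

```php
<?php
// "café" is 4 characters but 5 bytes in UTF-8, because "é" is encoded as 0xC3 0xA9.
$word = "café";

$byteLength = strlen($word);              // counts bytes
$charLength = mb_strlen($word, 'UTF-8');  // counts characters

// Cutting by bytes stops in the middle of "é" and leaves a lone lead byte (0xC3).
$byteCut = substr($word, 0, 4);
// Cutting by characters keeps "é" intact.
$charCut = mb_substr($word, 0, 4, 'UTF-8');

var_dump($byteLength, $charLength);
var_dump(bin2hex($byteCut));                     // hex dump of the damaged string
var_dump(mb_check_encoding($byteCut, 'UTF-8'));  // the byte cut is no longer valid UTF-8
var_dump(mb_check_encoding($charCut, 'UTF-8'));  // the character cut still is
```

A browser that receives the damaged string as UTF-8 cannot decode the dangling byte and shows � in its place.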

UPDATE

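// Multibyte-aware variant of CodeIgniter's character_limiter().
// Assumes mbstring's internal encoding is UTF-8 (see mb_internal_encoding()).
function character_limiter($str, $n = 500, $end_char = '…')
{
    // Count characters, not bytes.
    if (mb_strlen($str) < $n)
    {
        return $str;
    }

    // Collapse line breaks and runs of whitespace into single spaces.
    $str = mb_ereg_replace("\s+", ' ', str_replace(array("\r\n", "\r", "\n"), ' ', $str));

    if (mb_strlen($str) <= $n)
    {
        return $str;
    }

    // Append whole words until the limit is reached, so no word is cut in half.
    $out = "";
    foreach (explode(' ', trim($str)) as $val)
    {
        $out .= $val.' ';

        if (mb_strlen($out) >= $n)
        {
            $out = trim($out);
            // Add the ellipsis only if something was actually cut off.
            return (mb_strlen($out) == mb_strlen($str)) ? $out : $out.$end_char;
        }
    }
}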
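
As a variation on the same idea, the following is a rough sketch of a standalone helper that strips tags and limits the text in one step. The function name utf8_excerpt() and its parameters are hypothetical and not part of CodeIgniter. It uses preg_replace with the "u" modifier instead of mb_ereg_replace, on the assumption (not confirmed in this thread) that a byte-oriented whitespace regex is one place where a multibyte sequence could get damaged.

```php
<?php
// Hypothetical helper, not part of CodeIgniter: strip tags, then limit by
// characters at a word boundary. Assumes the input is meant to be UTF-8.
function utf8_excerpt(string $html, int $limit = 60, string $endChar = '…'): string
{
    $text = strip_tags($html);

    // The "u" modifier makes PCRE treat pattern and subject as UTF-8, so \s
    // matches whole characters instead of individual bytes. With "u",
    // preg_replace() returns null when the subject is not valid UTF-8.
    $collapsed = preg_replace('/\s+/u', ' ', $text);
    if ($collapsed === null) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }
    $text = trim($collapsed);

    if (mb_strlen($text, 'UTF-8') <= $limit) {
        return $text;
    }

    // Append whole words until the character limit is reached.
    $out = '';
    foreach (explode(' ', $text) as $word) {
        $out .= $word . ' ';
        if (mb_strlen($out, 'UTF-8') >= $limit) {
            break;
        }
    }
    $out = rtrim($out);

    // Add the end marker only if something was cut off.
    return $out === $text ? $out : $out . $endChar;
}

echo utf8_excerpt("<p>Crème brûlée à la carte, s'il vous plaît</p>", 10), "\n";
// prints "Crème brûlée…"
```

Throwing on invalid input is a design choice for the sketch: it surfaces badly encoded rows from the imported dump instead of silently printing �.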