PHP token replaces HTML entities

Posted on 2024-12-04 16:50:36

I want to turn certain words/strings into links when they are found in the text. I have a piece of code from php.net which does that, but it also removes the beginning and end of the tags in <a href="http://www.domain.com/index.php" title="Home">go to homepage</a>. Can you help solve this?

Here's the piece of code:

<?php

$str_in =   '<p>Hi there worm! You have a disease!</p><a href="http://www.domain.com/index.php" title="Home">go to homepage</a>';
$replaces=      array(
                'worm' => 'http://www.domain.com/index.php/worm.html',
                'disease' => 'http://www.domain.com/index.php/disease.html'
                );

function addLinks($str_in, $replaces)
{
  $str_out = '';
  $tok = strtok($str_in, '<>');
  $must_replace = (substr($str_in, 0, 1) !== '<');
  while ($tok !== false) {
    if ($must_replace) {
      foreach ($replaces as $tag => $href) {
        if (preg_match('/\b' . $tag . '\b/i', $tok)) {
          $tok = preg_replace(
                                '/\b(' . $tag . ')\b/i',
                                '<a title="' . $tag . '" href="' . $href . '">\1</a>',
                                $tok,
                                1);
          unset($replaces[$tag]);
        }
      }
    } else {
      $tok = "<$tok>";
    }
    $str_out .= $tok;
    $tok = strtok('<>');
    $must_replace = !$must_replace;
  }
  return $str_out;
}

echo addLinks($str_in, $replaces);

The result is:

Hi there worm! You have a disease!

a href="http://www.domain.com/index.php" title="Home"/a

The "worm" and "disease" words are transformed into links as desired, but the rest...

Thanks a lot!
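
For readers wondering why the tags get mangled: strtok() never returns empty tokens, so the empty text between </p> and <a ...> is skipped and the $must_replace toggle falls out of step. From that point on the tag is treated as text and the text as a tag. Below is a minimal sketch of one way around this, assuming preg_split() with PREG_SPLIT_DELIM_CAPTURE is acceptable. The name addLinksSplit is hypothetical and not part of the original snippet, and because it still relies on a regex to find tags it only suits simple markup.

```php
<?php
// Hypothetical helper (not the original php.net snippet): split the input
// into tags and text with preg_split() instead of strtok(), so no state
// toggle is needed and empty text between adjacent tags cannot desync it.
function addLinksSplit(string $html, array $replaces): string
{
    // Keep the tags themselves in the result; drop empty pieces.
    $parts = preg_split(
        '/(<[^>]*>)/',
        $html,
        -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
    );

    $insideAnchor = false;
    $out = '';
    foreach ($parts as $part) {
        if ($part[0] === '<') {
            // A tag: copy it through unchanged, remembering whether we are
            // currently inside an existing <a>...</a>.
            if (preg_match('/^<a\b/i', $part)) {
                $insideAnchor = true;
            } elseif (preg_match('/^<\/a\s*>/i', $part)) {
                $insideAnchor = false;
            }
            $out .= $part;
            continue;
        }

        if (!$insideAnchor) {
            foreach ($replaces as $word => $href) {
                $pattern = '/\b(' . preg_quote($word, '/') . ')\b/i';
                // Link only the first occurrence, as the original code does.
                $part = preg_replace_callback(
                    $pattern,
                    function ($m) use ($word, $href) {
                        return '<a title="' . htmlspecialchars($word)
                            . '" href="' . htmlspecialchars($href) . '">'
                            . $m[1] . '</a>';
                    },
                    $part,
                    1,
                    $count
                );
                if ($count > 0) {
                    unset($replaces[$word]);
                }
            }
        }
        $out .= $part;
    }
    return $out;
}
```

Calling addLinksSplit($str_in, $replaces) with the values from the question should leave the "go to homepage" link intact while still linking "worm" and "disease".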

Comments (2)

旧竹 2024-12-11 16:50:36

This pair of functions should do what you want without the problems that come with parsing HTML with regex or str_replace.

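function process($node, $replaceRules)
{
    if($node->hasChildNodes()) {
        // Copy the children first: childNodes is a live list and would
        // change underneath us while nodes are being replaced.
        $nodes = array();
        foreach ($node->childNodes as $childNode) {
            $nodes[] = $childNode;
        }
        foreach ($nodes as $childNode) {
            if ($childNode instanceof DOMText) {
                // Only text nodes are rewritten; tags and attributes are never touched.
                $text = preg_replace(
                    array_keys($replaceRules),
                    array_values($replaceRules),
                    $childNode->wholeText);
                $node->replaceChild(new DOMText($text),$childNode);
            }
            else {
                // Recurse into child elements.
                process($childNode, $replaceRules);
            }
        }
    }
}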

function addLinks($str_in, $replaces)
{
    // Turn each word => URL pair into a regex => replacement pair.
    $replaceRules = array();
    foreach($replaces as $k=>$v) {
        $k = '/\b(' . preg_quote($k, '/') . ')\b/i';
        $v = '<a href="' . $v . '">$1</a>';
        $replaceRules[$k] = $v;
    }

    $doc = new DOMDocument;
    $doc->loadHTML($str_in);
    process($doc->documentElement, $replaceRules);
    // The inserted <a> markup is escaped as text on save; decode it back into tags.
    return html_entity_decode($doc->saveHTML());
}

Note:
No need to worry if the HTML is not well structured (as in your example); however, the output will be well structured.
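
Two side effects of this approach are worth checking against your own content. The short sketch below uses a made-up one-line input: loadHTML() wraps a fragment in html and body elements, and html_entity_decode() applied to the whole output also decodes entities that were deliberately escaped in the source text.

```php
<?php
// Illustrative input only; not taken from the question.
$doc = new DOMDocument();
$doc->loadHTML('<p>1 &lt; 2</p>');

// saveHTML() returns a full document and keeps the text escaped.
$saved = $doc->saveHTML();

// Decoding the whole string turns the escaped "<" back into a raw "<",
// which is what makes the inserted links work, but it also affects
// any other escaped text in the page.
$decoded = html_entity_decode($saved);

echo $saved, "\n", $decoded, "\n";
```

If the wrapper or the blanket decoding is a problem, building real element nodes instead of inserting markup as text avoids both.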

Credit where it’s due:
The recursive process() function, which does most of the real work, comes directly from Lukáš Lalinský’s answer to How to replace text in HTML. The addLinks() function is just a use case tailored to fit your question.

得不到的就毁灭 2024-12-11 16:50:36

Not sure why you've got that large construction, when something like:

$pattern = '/(' . implode('|', array_map('preg_quote', array_keys($replaces))) . ')/';
$str_out = preg_replace_callback($pattern, function ($m) use ($replaces) { return $replaces[$m[1]]; }, $str_in);

would accomplish about the same thing. Of course, using regexes to process HTML is a hazardous process. You should use DOM with some xpath to do this more reliably.
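
The DOM-plus-XPath suggestion could look roughly like the sketch below. It is only an illustration under a few assumptions: addLinksDom and the wrapper id "wrap" are hypothetical names, the input is plain ASCII or Latin-1 (loadHTML() needs extra care for UTF-8), and every occurrence of a word is linked rather than only the first. Text already inside an <a> element is skipped via the XPath predicate, and real <a> elements are created, so no entity decoding is needed afterwards.

```php
<?php
// Hypothetical helper: link words in text nodes using DOMDocument + DOMXPath.
function addLinksDom(string $html, array $replaces): string
{
    if ($replaces === array()) {
        return $html;
    }

    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // silence warnings on sloppy markup
    // Wrap the fragment so it can be found again after parsing.
    $doc->loadHTML('<div id="wrap">' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // One case-insensitive pattern for all words; keys lowercased for lookup.
    $words = array_change_key_case($replaces, CASE_LOWER);
    $quoted = array_map(function ($w) {
        return preg_quote($w, '/');
    }, array_keys($words));
    $pattern = '/\b(' . implode('|', $quoted) . ')\b/i';

    $xpath = new DOMXPath($doc);
    // Text nodes that are not inside an existing link, script or style.
    $textNodes = $xpath->query(
        '//text()[not(ancestor::a) and not(ancestor::script) and not(ancestor::style)]'
    );

    foreach (iterator_to_array($textNodes) as $textNode) {
        // Odd-indexed pieces are the captured words, even-indexed ones plain text.
        $pieces = preg_split($pattern, $textNode->nodeValue, -1, PREG_SPLIT_DELIM_CAPTURE);
        if (count($pieces) < 2) {
            continue; // nothing to link in this node
        }
        $fragment = $doc->createDocumentFragment();
        foreach ($pieces as $i => $piece) {
            if ($i % 2 === 1) {
                $a = $doc->createElement('a');
                $a->setAttribute('href', $words[strtolower($piece)]);
                $a->appendChild($doc->createTextNode($piece));
                $fragment->appendChild($a);
            } elseif ($piece !== '') {
                $fragment->appendChild($doc->createTextNode($piece));
            }
        }
        $textNode->parentNode->replaceChild($fragment, $textNode);
    }

    // Serialize only the children of the wrapper, not the html/body shell.
    $wrap = $xpath->query('//div[@id="wrap"]')->item(0);
    $out = '';
    foreach ($wrap->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}
```

With the $str_in and $replaces from the question, addLinksDom($str_in, $replaces) should link "worm" and "disease" and leave the existing "go to homepage" anchor as it was; exact whitespace in the serialized output may differ between libxml versions.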
