PHP token replaces HTML entities

Posted on 2024-12-04 16:50:36

I want to turn certain words/strings into links when they are found in the text. I have a piece of code from php.bet which does that, but it also removes the opening and closing angle brackets from the tags in <a href="http://www.domain.com/index.php" title="Home">go to homepage</a>. Can you help solve this?

Here's the piece of code:

<?php

$str_in =   '<p>Hi there worm! You have a disease!</p><a href="http://www.domain.com/index.php" title="Home">go to homepage</a>';
$replaces=      array(
                'worm' => 'http://www.domain.com/index.php/worm.html',
                'disease' => 'http://www.domain.com/index.php/disease.html'
                );

function addLinks($str_in, $replaces)
{
  $str_out = '';
  $tok = strtok($str_in, '<>');
  $must_replace = (substr($str_in, 0, 1) !== '<');
  while ($tok !== false) {
    if ($must_replace) {
      foreach ($replaces as $tag => $href) {
        if (preg_match('/\b' . $tag . '\b/i', $tok)) {
          $tok = preg_replace(
                                '/\b(' . $tag . ')\b/i',
                                '<a title="' . $tag . '" href="' . $href . '">\1</a>',
                                $tok,
                                1);
          unset($replaces[$tag]);
        }
      }
    } else {
      $tok = "<$tok>";
    }
    $str_out .= $tok;
    $tok = strtok('<>');
    $must_replace = !$must_replace;
  }
  return $str_out;
}

echo addLinks($str_in, $replaces);

The result is:

Hi there worm! You have a disease!
a href="http://www.domain.com/index.php" title="Home"/a

The words "worm" and "disease" are transformed into links as desired, but the rest...

Thanks a lot!
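
A likely explanation for the output above, offered as a sketch rather than a full diagnosis: strtok() skips empty tokens, so it returns nothing for the empty text between </p> and <a ...>. The loop in addLinks() flips $must_replace on every token, so from that point on tags are handled as text and text as tags. The helper name tokens() and the short sample string are made up for illustration.

```php
<?php
// Hypothetical helper: collect every token strtok() yields for a string,
// using the same '<>' delimiters as the question's addLinks().
function tokens(string $s): array
{
    $out = [];
    for ($tok = strtok($s, '<>'); $tok !== false; $tok = strtok('<>')) {
        $out[] = $tok;
    }
    return $out;
}

// Two adjacent tags ("</p><a>") have an empty string between them, but
// strtok() does not report it, so a flag that alternates tag/text on
// every token falls out of step at that point.
print_r(tokens('<p>text</p><a>link</a>'));
```

If the list holds six entries with no empty string between "/p" and "a", that would account for the unwrapped a href=... and /a in the result.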

旧竹 2024-12-11 16:50:36

This pair of functions should do what you want without the problems that come with parsing HTML with regex or str_replace.

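function process($node, $replaceRules)
{
    if($node->hasChildNodes()) {
        // Copy the child list first: childNodes is live and changes as nodes are replaced.
        $nodes = array();
        foreach ($node->childNodes as $childNode) {
            $nodes[] = $childNode;
        }
        foreach ($nodes as $childNode) {
            if ($childNode instanceof DOMText) {
                // Apply every pattern => replacement rule to this text node only.
                $text = preg_replace(
                    array_keys($replaceRules),
                    array_values($replaceRules),
                    $childNode->wholeText);
                // The inserted markup is stored as plain text and is escaped on output.
                $node->replaceChild(new DOMText($text),$childNode);
            }
            else {
                // Recurse into element nodes.
                process($childNode, $replaceRules);
            }
        }
    }
}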

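function addLinks($str_in, $replaces)
{
    // Turn each word => URL pair into a regex pattern => link replacement rule.
    $replaceRules = array();
    foreach($replaces as $k=>$v) {
        // preg_quote() keeps regex metacharacters in a keyword from altering the pattern.
        $k = '/\b(' . preg_quote($k, '/') . ')\b/i';
        $v = '<a href="' . $v . '">$1</a>';
        $replaceRules[$k] = $v;
    }

    $doc = new DOMDocument;
    $doc->loadHTML($str_in);
    process($doc->documentElement, $replaceRules);
    // saveHTML() escapes the inserted <a> markup; decoding turns it back into
    // tags, along with every other entity in the document.
    return html_entity_decode($doc->saveHTML());
}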

Note:
No need to worry if the HTML is not well structured (as in your example); however, the output will be well structured.

Credit where it’s due:
The recursive process() function, which does most of the real work, comes directly from Lukáš Lalinský’s answer to How to replace text in HTML. The addLinks() function is just a use case tailored to fit your question.

得不到的就毁灭 2024-12-11 16:50:36

Not sure why you've got that large construction, when something like:

$pattern = '/(' . implode('|', array_map('preg_quote', array_keys($replaces))) . ')/';
$str_out = preg_replace_callback($pattern, fn($m) => $replaces[$m[1]], $str_in);

would accomplish about the same thing. Of course, using regexes to process HTML is a hazardous process. You should use DOM with some xpath to do this more reliably.
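
As a rough illustration of the DOM-plus-XPath suggestion, here is a minimal sketch. The function name addLinksDom(), the wrapper <div id="wrap">, and the choice to link only the first occurrence of each keyword (mirroring the question's code) are assumptions made for this example, not part of either answer. It assumes ASCII input; other encodings would need an explicit encoding hint for loadHTML().

```php
<?php
// Hypothetical helper: link the first occurrence of each keyword, touching
// only text nodes that are not already inside an <a> element.
function addLinksDom(string $html, array $replaces): string
{
    $doc = new DOMDocument();
    $prev = libxml_use_internal_errors(true);   // keep parser warnings quiet
    // Wrap the fragment in a known container so it can be serialized
    // afterwards without the <html>/<body> that loadHTML() adds.
    $doc->loadHTML('<div id="wrap">' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($prev);

    $xpath = new DOMXPath($doc);
    foreach ($replaces as $word => $href) {
        $pattern = '/\b(' . preg_quote((string) $word, '/') . ')\b/i';
        // Query again for each keyword, because earlier edits change the tree.
        $texts = $xpath->query('//div[@id="wrap"]//text()[not(ancestor::a)]');
        foreach ($texts as $text) {
            // Split into [before, match, after] around the first match only.
            $parts = preg_split($pattern, $text->nodeValue, 2, PREG_SPLIT_DELIM_CAPTURE);
            if ($parts === false || count($parts) < 3) {
                continue;   // keyword not in this text node
            }
            [$before, $match, $after] = $parts;

            // Build a real <a> element instead of pasting markup into text.
            $link = $doc->createElement('a');
            $link->setAttribute('title', (string) $word);
            $link->setAttribute('href', $href);
            $link->appendChild($doc->createTextNode($match));

            $parent = $text->parentNode;
            $parent->insertBefore($doc->createTextNode($before), $text);
            $parent->insertBefore($link, $text);
            $parent->insertBefore($doc->createTextNode($after), $text);
            $parent->removeChild($text);
            break;   // one link per keyword
        }
    }

    // Serialize only the children of the wrapper element.
    $out = '';
    $wrap = $xpath->query('//div[@id="wrap"]')->item(0);
    foreach ($wrap->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

$str_in = '<p>Hi there worm!</p><a href="http://www.domain.com/index.php" title="Home">go to homepage</a>';
echo addLinksDom($str_in, ['worm' => 'http://www.domain.com/index.php/worm.html']);
```

Because the links are created as elements, no html_entity_decode() pass is needed, and the not(ancestor::a) test keeps keywords inside existing links from being wrapped a second time.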
