Convert hashtags to hyperlinks without partially matching htmlentities

Posted 2024-09-01 02:22:25

I want to replace all occurrences of #word with an HTML link. I have written a preg_replace() call for this:

$text = preg_replace('~#([\p{L}|\p{N}]+)~u', '<a href="/?aranan=$1">#$1</a>', $text);

The problem is that this regular expression also matches HTML character codes such as &#039; and therefore corrupts the output.

I need to exclude alphanumeric substrings which are preceded by &#, but I do not know how to do that using regular expressions.

天冷不及心凉 2024-09-08 02:22:25

'~(?<!&)#([\p{L}|\p{N}]+)~u'

That's a negative lookbehind assertion: http://www.php.net/manual/en/regexp.reference.assertions.php

Matches # only if not preceded by &
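
A minimal sketch of the lookbehind in use. The sample input is an assumption made for illustration; the pattern and the replacement string are taken verbatim from this thread.

```php
<?php
// Sample input (hypothetical): one numeric character reference, one real hashtag.
$text = 'It&#039;s #sunny';

// (?<!&) rejects any "#" that directly follows "&", so "&#039;" is left alone.
$pattern = '~(?<!&)#([\p{L}|\p{N}]+)~u';

$linked = preg_replace($pattern, '<a href="/?aranan=$1">#$1</a>', $text);

echo $linked;
// It&#039;s <a href="/?aranan=sunny">#sunny</a>
```

Note that the lookbehind only inspects the single character before the #, so a hashtag written directly after a literal ampersand would also be skipped.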

君勿笑 2024-09-08 02:22:25

http://gskinner.com/RegExr/

Use this online regular expression builder. It explains every flag you might want to use, and it highlights the matches in your sample text.

And yes, use [a-zA-Z]

番薯 2024-09-08 02:22:25

Use a SKIP-FAIL subpattern to match whole sequences which should not be replaced. Write your hash-prefixed multibyte-safe word subpattern to match any substrings which were not disqualified. This will eliminate pattern ambiguity and ensure replacement accuracy.
A character class does not need a pipe to separate the two character ranges. The curly braces can be removed too.
In the replacement, if you are generating HTML <a> elements, then properly URL-encode the href value and HTML-encode the printed link text.

Code: (Demo)

$text = '#Test &#039; #039foo "#bär"';

echo preg_replace_callback(
    '~&#\d+;(*SKIP)(*FAIL)|#([\pL\pN]+)~u',
    fn($m) => sprintf(
        '<a href="/?%s">#%s</a>',
        http_build_query(['aranan' => $m[1]]),
        htmlentities($m[1])
    ),
    $text
);

Unrendered output:

<a href="/?aranan=Test">#Test</a> &#039; <a href="/?aranan=039foo">#039foo</a> "<a href="/?aranan=b%C3%A4r">#b&auml;r</a>"

Rendered HTML:

#Test ' #039foo "#bär"
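
The remark about the pipe in the character class can be checked with a short sketch. The input string and the variable names are hypothetical, chosen only to expose the difference between the two classes.

```php
<?php
// Hypothetical input: a hashtag immediately followed by a pipe character.
$text = '#php|regex';

// Inside [...] the pipe is a literal character, not alternation,
// so this class also accepts "|".
preg_match('~#([\p{L}|\p{N}]+)~u', $text, $withPipe);

// Short property syntax without the pipe: letters and numbers only.
preg_match('~#([\pL\pN]+)~u', $text, $withoutPipe);

echo $withPipe[1], "\n";    // php|regex
echo $withoutPipe[1], "\n"; // php
```

Dropping the pipe is therefore more than a style cleanup: it stops the tag from swallowing a literal | in the text.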

伴我老 2024-09-08 02:22:25

You need to add an [A-Za-z] rule to your regex so that it restricts itself to letters only and does not match digits.

~~I will edit this later with an example.~~
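
Since no example was posted, here is a rough sketch of the letters-only idea, assuming the sample input below (it is not from the question). It shows the trade-off rather than recommending the approach.

```php
<?php
// Hypothetical input: an entity, a tag with a non-ASCII letter, a tag with a digit.
$text = '&#039; #bär #php8';

// ASCII letters only; this byte-oriented class does not need the "u" flag.
$linked = preg_replace('~#([A-Za-z]+)~', '<a href="/?aranan=$1">#$1</a>', $text);

echo $linked;
// &#039; <a href="/?aranan=b">#b</a>är <a href="/?aranan=php">#php</a>8
```

The decimal entity survives, but tags are cut off at the first non-ASCII letter or digit, and a hexadecimal reference such as &#x27; would still be hit because x is a letter.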
