Regex expert needed

Posted 2024-08-13 10:20:01

I'm trying to write a script that parses a block of HTML and matches words against a given glossary of terms. If it finds a match, it wraps the term in <a class="tooltip"></a> and provides a definition.

It's working okay -- except for two major shortcomings:

It matches text that is in attributes
It matches text that is already inside an <a> tag, creating a nested link.

Is there any way to have my regular expression match only words that are not in attributes, and not in <a> tags?

Here's the code I'm using, in case it's relevant:

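$search = array();
foreach (Glossary::map() as $term => $def) {
  // Escape regex metacharacters so the term is matched literally.
  $search[] = '/\b(' . preg_quote($term, '/') . ')\b/i';
  self::$lookup[strtoupper($term)] = $def;
}

// Run every glossary pattern over the content; replace() wraps each match.
return preg_replace_callback($search, array($this, 'replace'), $this->content);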
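For what it's worth, here is a rough sketch of one way this could be approached, not a definitive fix. Instead of making a single regex aware of HTML, it splits the content into tags and text with `preg_split`, copies tags through untouched (so attributes are never searched), and skips text while inside an `<a>` element. The function name `link_glossary_terms`, the plain `$glossary` array standing in for `Glossary::map()`, and the use of a `title` attribute to carry the definition are all assumptions made for illustration. It also assumes PHP 7.4+ and reasonably well-formed markup; an attribute value containing `>` or the contents of `<script>` blocks would confuse it, and a real HTML parser would be safer there.

```php
<?php
// Hypothetical helper (not part of the original class): wraps glossary terms
// found in text that is outside tags and outside existing <a> elements.
function link_glossary_terms(string $html, array $glossary): string
{
    if ($glossary === []) {
        return $html;
    }

    // Look up definitions by upper-cased term, as the original code does.
    $lookup = [];
    foreach ($glossary as $term => $def) {
        $lookup[strtoupper((string) $term)] = $def;
    }

    // One combined pattern, longest terms first so longer phrases win.
    $terms = array_map('strval', array_keys($glossary));
    usort($terms, fn($a, $b) => strlen($b) <=> strlen($a));
    $quoted = array_map(fn($t) => preg_quote($t, '/'), $terms);
    $pattern = '/\b(' . implode('|', $quoted) . ')\b/i';

    // Even indexes are text chunks, odd indexes are the captured tags.
    $parts = preg_split('/(<[^>]*>)/', $html, -1, PREG_SPLIT_DELIM_CAPTURE);

    $anchorDepth = 0;
    $out = '';
    foreach ($parts as $i => $part) {
        if ($i % 2 === 1) {
            // A tag: track whether we are inside <a>, then copy it unchanged.
            if (preg_match('/^<a[\s>]/i', $part)) {
                $anchorDepth++;
            } elseif (preg_match('/^<\/a\s*>/i', $part)) {
                $anchorDepth = max(0, $anchorDepth - 1);
            }
            $out .= $part;
        } elseif ($anchorDepth > 0) {
            // Text inside an existing link: leave it alone to avoid nesting.
            $out .= $part;
        } else {
            // Plain text outside links: wrap each glossary term.
            $out .= preg_replace_callback($pattern, function ($m) use ($lookup) {
                // Assumption: the definition is exposed via a title attribute.
                $def = htmlspecialchars($lookup[strtoupper($m[1])], ENT_QUOTES);
                return '<a class="tooltip" title="' . $def . '">' . $m[1] . '</a>';
            }, $part);
        }
    }
    return $out;
}
```

Because all terms go into one alternation and the text is scanned once, a term that happens to appear inside an inserted definition is not matched again, which can happen when an array of patterns is applied one after another.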
