Regular expression to replace < or > with &gt; or &lt; inside HTML tags

Posted on 2024-10-01 03:14:45

For example:

<html>
<head></head>
<body>
<div>
<h1>-----> hello! ----< </h1>
</div>
</body>

I want to replace the > and < inside the h1 tag with the corresponding &gt; and &lt;

Which is the correct pattern?

Thanks in advance!

Comments (3)

忘年祭陌 2024-10-08 03:14:45

I agree with the commenter who asked "Why is this broken HTML being generated in the first place?" If you represent documents like this, you will keep running into exactly the problems you have now. There are two valid situations:

  • You have some data (not HTML-escaped), e.g. a bunch of strings in PHP
  • You have an HTML document, containing tags and text that is HTML-escaped

So when you generate the HTML document from your source data (strings, database), you need to escape that data (e.g. by using htmlspecialchars, as another answerer correctly pointed out).
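
A minimal sketch of that flow, assuming the heading text is held as a plain PHP string. The variable names `$title`, `$safe` and `$page` are illustrative and do not come from the question. The text is escaped once, at the point where it is placed into the markup:

```php
<?php
// Raw, unescaped data as it might come from a string or a database.
$title = '-----> hello! ----< ';

// Escape at output time: htmlspecialchars() converts <, >, & and quotes
// into their HTML entities.
$safe = htmlspecialchars($title, ENT_QUOTES, 'UTF-8');

// Only now combine the escaped text with the real tags.
$page = "<html>\n<head></head>\n<body>\n<div>\n<h1>{$safe}</h1>\n</div>\n</body>\n</html>\n";

echo $page;
```

With this ordering there is never a string that mixes tags and unescaped text, so no regular expression is needed afterwards.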

You need to avoid, at all costs, ending up with a string like yours, which mixes HTML tags with non-escaped text.

For example, suppose your text contained <b>text</b> and you wanted exactly that to be displayed in the HTML document, i.e. you wanted the angle brackets to be visible rather than the text shown in bold (e.g. you were writing a document about how to write HTML). Once tags and unescaped text are mixed in one document, you have no way to tell that text apart from actual HTML code.
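
To make the ambiguity concrete, here is a small sketch (all variable names are illustrative). Escaping the raw text before it is wrapped in markup keeps the two cases distinct; once text and tags share one unescaped string, nothing marks which brackets were meant literally:

```php
<?php
// Text that should be shown literally, angle brackets included.
$literal = '<b>text</b>';

// Escaped before insertion: the browser displays the characters <b>text</b>.
$escaped = '<p>' . htmlspecialchars($literal, ENT_QUOTES, 'UTF-8') . '</p>';

// Inserted without escaping: byte-for-byte identical to markup written
// by hand, so the browser renders "text" in bold.
$mixed = '<p>' . $literal . '</p>';
$handWritten = '<p><b>text</b></p>';

var_dump($mixed === $handWritten); // bool(true): the original intent is lost
echo $escaped, "\n";               // <p>&lt;b&gt;text&lt;/b&gt;</p>
```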

呆° 2024-10-08 03:14:45

You could throw it at tidy (see the docs) and see if it can fix the errors. A lot better than trying to do the "right thing" on your own with regex.

$html = <<<EOT
<html>
<head></head>
<body>
<div>
<h1>-----> hello! ----< </h1>
</div>
</body>
EOT;

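// Options passed to tidy; see the tidy documentation for the full list
$config = array(
  'clean'                       => true,
  'drop-proprietary-attributes' => true,
  'output-xhtml'                => false,
  'show-body-only'              => false,
  'wrap'                        => '0'
);

// Parse the markup, then let tidy repair what it can
$tidy = new tidy();
$tidy->parseString($html, $config, 'utf8');
$tidy->cleanRepair();

// Print the repaired document
echo tidy_get_output($tidy);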

You may need to enable the tidy extension in your PHP environment first.
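
One way to check this before running the snippet is sketched below. The helper name `tidyAvailable` is hypothetical; `extension_loaded()` and `class_exists()` are standard PHP functions:

```php
<?php
// Returns true when the tidy extension is loaded in this PHP runtime.
function tidyAvailable(): bool
{
    return extension_loaded('tidy') && class_exists('tidy');
}

if (tidyAvailable()) {
    echo "tidy is available\n";
} else {
    // How to enable it depends on how PHP was installed.
    echo "tidy is not loaded; enable the extension for this PHP installation\n";
}
```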

淡看悲欢离合 2024-10-08 03:14:45

I would pass it through tidy.
