Get the text between a custom opening HTML tag and its closing tag

Posted 2024-09-03 20:10:31 · 241 characters · 7 views · 0 comments

$data = "<Data>hello</Data>";
preg_match_all("/\<Data\>[.]+\<\/Data\>/", $data, $match);
print_r($match);

This returns:

Array ( [0] => Array ( ) )

So I am guessing that a match is not made?

护你周全 2024-09-10 20:10:32

preg_match_all("#<Data>.+</Data>#", $data, $match);

If you wanted to use / as the delimiter:

preg_match_all("/<Data>.+<\/Data>/", $data, $match);

The main problem was that a . inside a character class matches a literal period. Also, using a different delimiter eliminates escaping. Note that you don't have to escape < either way. If you want to be able to extract the inner value, use:

preg_match_all("#<Data>(.+)</Data>#", $data, $match);

"hello" will now be in $match[1][0] in your example ($match[1] holds every captured value). Note that regex is not suited for parsing XML, so switch to a real parser for anything non-trivial.


纵性 2024-09-10 20:10:32

You are using [] and . incorrectly.

Try this:

$data = "<Data>hello</Data>";
preg_match_all("/\<Data\>.+\<\/Data\>/", $data, $match);
print_r($match);

When you use [], you are defining a list of possible characters; in your case that list contained only the literal . character. If you want . to mean any character, you have to use it outside of [].
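To make the difference concrete, here is a small sketch (the variable names are mine, not from the question) that runs the pattern with and without the brackets. With [.]+ only a run of literal dots between the tags matches; with a bare .+ the original input matches.

```php
<?php
// [.] is a character class containing only a literal dot,
// so [.]+ matches one or more dots and nothing else.
$withClass = preg_match('/<Data>[.]+<\/Data>/', '<Data>hello</Data>');
$dotsOnly  = preg_match('/<Data>[.]+<\/Data>/', '<Data>...</Data>');

// Outside a class, . matches any character except a newline.
$bareDot   = preg_match('/<Data>.+<\/Data>/', '<Data>hello</Data>');

var_dump($withClass, $dotsOnly, $bareDot); // int(0), int(1), int(1)
```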


三生殊途 2024-09-10 20:10:32

<?php

$data = "<Data>hello</Data>";
preg_match_all('#<Data>(.+)</Data>#', $data, $match);
print_r($match);

?>

The output: (as seen on ideone.com)

Array
(
    [0] => Array
        (
            [0] => <Data>hello</Data>
        )

    [1] => Array
        (
            [0] => hello
        )

)

[...] is a character class definition. You use (...) to capture.

References

regular-expressions.info/Character Class and Groups

Special note on reluctant matching

Since you're using preg_match_all, it should be noted that you're currently matching greedily. That is, there is only one match in, say, <Data>hello</Data><Data>how are you</Data> (see on ideone.com).

If you want both <Data> elements, then you must use reluctant matching '#<Data>(.+?)</Data>#' (see on ideone.com).

To illustrate:

----A--Z----A----Z----
    ^^^^^^^^^^^^^^
        A.*Z

There is only one A.*Z match in the above input.
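The same effect can be reproduced with the two-element input mentioned above. This is only a sketch; $greedy and $reluctant are names I picked for the two result arrays.

```php
<?php
$data = "<Data>hello</Data><Data>how are you</Data>";

// Greedy: .+ runs on to the last </Data>, so there is a single match.
preg_match_all('#<Data>(.+)</Data>#', $data, $greedy);

// Reluctant: .+? stops at the first </Data> it can, so each element matches.
preg_match_all('#<Data>(.+?)</Data>#', $data, $reluctant);

print_r($greedy[1]);    // one element: hello</Data><Data>how are you
print_r($reluctant[1]); // two elements: hello, how are you
```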

Special note on regex to parse HTML/XML

It's a pain. If at all possible, use a proper HTML/XML parser. There are plenty for PHP.
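As one possible illustration, PHP's DOM extension can do the extraction without any regex. The sketch below assumes the dom extension is enabled and that the input is well-formed XML; the <root> wrapper is a hypothetical addition of mine, needed only because an XML document must have exactly one root element.

```php
<?php
// Assumption: the fragments are wrapped in a hypothetical <root> element.
$xml = "<root><Data>hello</Data><Data>how are you</Data></root>";

$doc = new DOMDocument();
$doc->loadXML($xml);

// Collect the text content of every <Data> element.
$values = [];
foreach ($doc->getElementsByTagName('Data') as $node) {
    $values[] = $node->textContent;
}

print_r($values); // hello, how are you
```

Real-world HTML is often not well-formed XML, so a different loader or a dedicated HTML parser may be a better fit there.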


是你 2024-09-10 20:10:32

Inside character classes a dot is just a dot.

<?php  

    $data = "<Data>hello</Data>";
    preg_match_all("/\<Data\>.+\<\/Data\>/", $data, $match);
    print_r($match);

?>

Will yield:

Array
(
    [0] => Array
        (
            [0] => <Data>hello</Data>
        )

)


紫南 2024-09-10 20:10:32

Try this. You don't need the brackets around the .

"/\<Data\>.+\<\/Data\>/"


一个人的旅程 2024-09-10 20:10:32

/<Data>([^<>]+)<\/Data>/

$data = "<Data>hello</Data>";
preg_match_all("/<Data>([^<>]+)<\/Data>/", $data, $match);

print_r($match);
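A negated character class is another way to keep a match from running past the closing tag. One detail worth knowing: inside a class, ^ negates only in the first position, so in a class written as [^<^>] the second ^ is a literal caret and values containing ^ are rejected. The sketch below compares the two spellings on an input I made up for the purpose.

```php
<?php
$data = "<Data>hello</Data><Data>a^b</Data>";

// [^<>]+ cannot cross a tag boundary, so each element is matched separately.
preg_match_all('/<Data>([^<>]+)<\/Data>/', $data, $plain);

// In [^<^>] the second ^ is a literal caret, so the value a^b is skipped.
preg_match_all('/<Data>([^<^>]+)<\/Data>/', $data, $caret);

print_r($plain[1]); // hello, a^b
print_r($caret[1]); // hello
```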
