RegExp PHP: get the text between multiple span tags

Posted on 2024-10-09 21:57:35

I don't speak English very well, so please forgive any mistakes.

On the site I have a div box with some information about a game:

<span class="noteline">Developer:</span> 
<span class="subline">Gameloft</span> 
<span class="noteline">Genre:</span> 
<span class="subline">Racing/Arcade</span> 
<span class="noteline">Release year:</span> 
<span class="subline">2010</span> 

I need to get the text between <span class="subline"> and its closing tag </span>.

preg_match("/\<span\sclass=\"subline\"\>(.*)<\/span\>/imsU", $source, $matches);

The code above works, but it only captures the first "subline", the one with the text "Gameloft".

I also need the sublines with the text Racing/Arcade and 2010.

Maybe something like this (which doesn't work):

for developer = preg_match("/*(\<span\sclass=\"subline\"\>){1}*(.*)*(<\/span\>){1}*/imsU", $source, $matches);
for genre = preg_match("/*(\<span\sclass=\"subline\"\>){2}*(.*)*(<\/span\>){2}*/imsU", $source, $matches);

Something like that.

Anyway. Thanks for any help.

Comments (3)

淡笑忘祈一世凡恋 2024-10-16 21:57:35

An alternative to regexps would be to use phpQuery or QueryPath, which simplifies it to:

foreach ( qp($source)->find("span.subline") as $span ) {
    print $span->text();
}
锦上情书 2024-10-16 21:57:35

Regular expressions are a poor fit for parsing HTML: they are difficult to get right and tend to break on edge cases.

I don't know if there's an easier way but this should work with the markup you describe:

<?php

$fragment = '<span class="noteline">Developer:</span>
<span class="subline">Gameloft</span>
<span class="noteline">Genre:</span>
<span class="subline">Racing/Arcade</span>
<span class="noteline">Release year:</span>
<span class="subline">2010</span>';

libxml_use_internal_errors(TRUE);
$dom = new DOMDocument();
$dom->loadHTML($fragment);
$xml = simplexml_import_dom($dom);
libxml_use_internal_errors(FALSE);

foreach($xml->xpath("//span[@class='subline']") as $item){
    echo (string)$item . PHP_EOL;
}

This assumes the attribute is exactly class="subline", so it will miss elements that carry multiple classes. (I'm new to XPath, so improvements are welcome.)
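
One possible way around the multiple-classes limitation is the usual XPath 1.0 idiom of padding the class attribute with spaces and searching for the padded class name. The sketch below is only an illustration under that assumption: the helper name `sublineTexts` is hypothetical, and it uses `DOMXPath` directly rather than SimpleXML.

```php
<?php

// Hypothetical helper (not from the original answer): returns the text of
// every <span> whose class list contains "subline", even among other classes.
function sublineTexts(string $html): array
{
    // Silence libxml warnings about the fragment, then restore the old setting.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Pad the normalized class attribute with spaces so " subline " only
    // matches a whole class token, not a substring such as "sublinex".
    $query = "//span[contains(concat(' ', normalize-space(@class), ' '), ' subline ')]";

    $texts = [];
    foreach ((new DOMXPath($dom))->query($query) as $node) {
        $texts[] = trim($node->textContent);
    }
    return $texts;
}

$fragment = '<span class="noteline">Developer:</span>
<span class="subline highlight">Gameloft</span>
<span class="noteline">Genre:</span>
<span class="subline">Racing/Arcade</span>
<span class="noteline">Release year:</span>
<span class="subline">2010</span>';

echo implode(', ', sublineTexts($fragment)), PHP_EOL; // Gameloft, Racing/Arcade, 2010
```

The extra "highlight" class in the sample fragment is made up to exercise the multi-class case; the rest of the markup is the one from the question.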

浅暮の光 2024-10-16 21:57:35

Try this:

preg_match_all("/<span class=\"subline\".*span>/", $html, $matches);

preg_match_all("/<span class=\"noteline\".*span>/", $html, $matches);

I tried the above code this way:

<?php 

$html = '<span class="noteline">Developer:</span> 
<span class="subline">Gameloft</span> 
<span class="noteline">Genre:</span> 
<span class="subline">Racing/Arcade</span> 
<span class="noteline">Release year:</span> 
<span class="subline">2010</span>';

preg_match_all("/<span class=\"subline\".*span>/", $html, $matches1);

preg_match_all("/<span class=\"noteline\".*span>/", $html, $matches2);

print_r($matches1);
echo "<br>";
print_r($matches2);

?>

The output I got was this:

Array ( [0] => Array ( [0] => Gameloft [1] => Racing/Arcade [2] => 2010 ) )
Array ( [0] => Array ( [0] => Developer: [1] => Genre: [2] => Release year: ) ) 
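
Note that the patterns above have no capture group, so each entry in $matches[0] is the whole tag (for example <span class="subline">Gameloft</span>). A browser renders those tags, which is presumably why only the text is visible in the printed output. If only the inner text is wanted, a capture group with a non-greedy quantifier is one option. The following is a minimal sketch, assuming the markup looks exactly like the sample in the question (a single class attribute in double quotes, no nested spans); the helper name `spanTexts` is hypothetical.

```php
<?php

// Hypothetical helper: collects the inner text of every
// <span class="$class">...</span> found in $html.
function spanTexts(string $html, string $class): array
{
    // (.*?) is non-greedy, so each match stops at the nearest </span>;
    // the "s" flag lets "." cross newlines, "i" ignores tag-name case.
    $pattern = '/<span\s+class="' . preg_quote($class, '/') . '"\s*>(.*?)<\/span>/is';
    preg_match_all($pattern, $html, $matches);

    // $matches[0] holds the full tags, $matches[1] only the captured text.
    return array_map('trim', $matches[1]);
}

$html = '<span class="noteline">Developer:</span>
<span class="subline">Gameloft</span>
<span class="noteline">Genre:</span>
<span class="subline">Racing/Arcade</span>
<span class="noteline">Release year:</span>
<span class="subline">2010</span>';

// Pair each label with its value; this assumes the two lists have equal
// length and appear in the same order, as in the sample markup.
$labels = array_map(fn ($label) => rtrim($label, ':'), spanTexts($html, 'noteline'));
$info = array_combine($labels, spanTexts($html, 'subline'));

echo $info['Genre'], PHP_EOL; // Racing/Arcade
```

This also shows why the original preg_match call returned only "Gameloft": preg_match stops after the first match, while preg_match_all keeps scanning the rest of the string.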