Iterating over table rows up to a known element with PHP Simple HTML DOM Parser

Posted on 2024-12-10 08:11:54

OK, I'm trying to build an XML feed from this HTML table using PHP Simple HTML DOM Parser.

<table>
<tr><td colspan="5"><strong>Saturday October 15 2011</strong></td></tr>

<tr><td>Team 1</td>     <td>vs</td>     <td>Team 7</td> <td>3:00 pm</td></tr>
<tr><td>Team 2</td>     <td>vs</td>     <td>Team 12</td>    <td>3:00 pm</td></tr>
<tr><td>Team 3</td>     <td>vs</td>     <td>Team 8</td> <td>3:00 pm</td></tr>
<tr><td>Team 4</td>     <td>vs</td>     <td>Team 10</td>    <td>3:00 pm</td></tr>
<tr><td>Team 5</td>     <td>vs</td>     <td>Team 11</td>    <td>3:00 pm</td></tr>

<tr><td colspan="5"><strong>Monday October 17 2011</strong></td></tr>

<tr><td>Team 6</td>     <td>vs</td>     <td>Team 9</td> <td>7:45 pm</td></tr>

<tr><td colspan="5"><strong>Saturday October 22 2011</strong></td></tr>

<tr><td>Team 7</td>     <td>vs</td>     <td>Team 12</td>    <td>3:00 pm</td></tr>
<tr><td>Team 1</td>     <td>vs</td>     <td>Team 2</td> <td>3:00 pm</td></tr>
<tr><td>Team 8</td>     <td>vs</td>     <td>Team 4</td> <td>3:00 pm</td></tr>
<tr><td>Team 3</td>     <td>vs</td>     <td>Team 6</td> <td>3:00 pm</td></tr>
<tr><td>Team 9</td>     <td>vs</td>     <td>Team 5</td> <td>3:00 pm</td></tr>
<tr><td>Team 10</td>        <td>vs</td>     <td>Team 11</td>    <td>3:00 pm</td></tr>
</table>

What I am aiming to do is extract each date and then the rows that follow it, up until the next date, so that I can build an XML node like this for each date:

<matchday date="Saturday October 15 2011">
    <fixture>
        <hometeam>Team 1</hometeam>
        <awayteam>Team 7</awayteam>
        <kickoff>3:00 pm</kickoff>
    </fixture>
    <fixture>
        <hometeam>Team 2</hometeam>
        <awayteam>Team 12</awayteam>
        <kickoff>3:00 pm</kickoff>
    </fixture>
</matchday>

So far I have extracted each of the dates from the HTML and built their respective XML nodes:

$dateNodes = $html->find('table tr td[colspan="5"] strong');

foreach($dateNodes as $date){
    echo '<matchday date="'.trim($date->innertext).'">';
    // FIXTURES

    // END FIXTURES
    echo '</matchday>';
}

How would I go about getting the team names and kickoff time for each fixture up until the next matchday date?


Comments (1)

梦萦几度 2024-12-17 08:11:54

Instead of SimpleHtmlDom (which I believe is a craptaculous library), you can use an XSLT transformation and PHP's native XSLT processor:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes" method="xml"/>
  <xsl:template match="/">
    <matchdays>
      <xsl:for-each select="table/tr[td[@colspan=5]]">
        <matchday>
          <xsl:attribute name="date">
            <xsl:value-of select="td/strong"/>
          </xsl:attribute>
          <xsl:for-each select="following-sibling::tr[
            not(td[@colspan]) and 
            preceding-sibling::tr[td[@colspan]][1] = current()
          ]">
            <fixture>
              <hometeam><xsl:value-of select="td[1]"/></hometeam>
              <awayteam><xsl:value-of select="td[3]"/></awayteam>
              <kickoff><xsl:value-of select="td[4]"/></kickoff>
            </fixture>
          </xsl:for-each>                   
        </matchday>
      </xsl:for-each>
    </matchdays>
  </xsl:template>   
</xsl:stylesheet>

Then just use the code given in the example at http://php.net/manual/en/xsltprocessor.transformtoxml.php to transform your HTML into XML:

$xml = new DOMDocument;
$xml->load('YourSourceFile.xml');
$xsl = new DOMDocument;
$xsl->load('YourStyleSheet.xsl');
$proc = new XSLTProcessor;
$proc->importStyleSheet($xsl);
echo $proc->transformToXML($xml);

Demo at Codepad
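
Note that DOMDocument::load() expects well-formed XML, so the snippet above only works if the table is saved as valid XML. A minimal sketch of the same transformation applied to real HTML follows, assuming the dom and xsl extensions are available. The sample markup and the variable names are illustrative, and two details differ from the stylesheet above: the rows are selected with //tr because loadHTML() wraps the fragment in html and body elements, and the date row is matched with generate-id() (node identity) rather than by comparing string values, so two matchdays with the same date text are not merged.

```php
<?php
// Sketch only: apply an XSLT 1.0 stylesheet to HTML parsed with loadHTML().
// Sample data and variable names are made up for illustration.

$html = <<<'HTML'
<table>
<tr><td colspan="5"><strong>Saturday October 15 2011</strong></td></tr>
<tr><td>Team 1</td><td>vs</td><td>Team 7</td><td>3:00 pm</td></tr>
<tr><td>Team 2</td><td>vs</td><td>Team 12</td><td>3:00 pm</td></tr>
<tr><td colspan="5"><strong>Monday October 17 2011</strong></td></tr>
<tr><td>Team 6</td><td>vs</td><td>Team 9</td><td>7:45 pm</td></tr>
</table>
HTML;

$xslt = <<<'XSL'
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes" method="xml"/>
  <xsl:template match="/">
    <matchdays>
      <xsl:for-each select="//tr[td[@colspan='5']]">
        <matchday date="{normalize-space(td/strong)}">
          <!-- fixture rows whose nearest preceding date row is this one -->
          <xsl:for-each select="following-sibling::tr[
            not(td[@colspan]) and
            generate-id(preceding-sibling::tr[td[@colspan]][1]) = generate-id(current())
          ]">
            <fixture>
              <hometeam><xsl:value-of select="td[1]"/></hometeam>
              <awayteam><xsl:value-of select="td[3]"/></awayteam>
              <kickoff><xsl:value-of select="td[4]"/></kickoff>
            </fixture>
          </xsl:for-each>
        </matchday>
      </xsl:for-each>
    </matchdays>
  </xsl:template>
</xsl:stylesheet>
XSL;

// Collect parser warnings instead of printing them; real pages are rarely clean.
libxml_use_internal_errors(true);
$source = new DOMDocument;
$source->loadHTML($html);
libxml_clear_errors();

$stylesheet = new DOMDocument;
$stylesheet->loadXML($xslt);

$proc = new XSLTProcessor;
$proc->importStylesheet($stylesheet);

$result = $proc->transformToXml($source);
echo $result;
```

For a page on disk, loadHTMLFile() can replace loadHTML(); the rest stays the same.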


In addition to using XSLT, you can also do it with PHP's native DOM extension:

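// Parse the source page with the HTML parser (tolerates non-XML markup).
$xml = new DOMDocument;
$xml->loadHTMLFile('YourHtmlFile.xml');
$xp = new DOMXPath($xml);
// Output document; the version string must be '1.0', not '1,0'.
$new = new DOMDocument('1.0', 'utf-8');
$new->appendChild($new->createElement('matchdays'));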
foreach ($xp->query('//table/tr/td[@colspan=5]/strong') as $gameDate) {
    $matchDay = $new->createElement('matchday');
    $matchDay->setAttribute('date', $gameDate->nodeValue);
    foreach ($xp->query(
        sprintf(
            '//tr[
                not(td[@colspan]) and
                preceding-sibling::tr[td[@colspan]][1]/td/strong/text() = "%s"
            ]',
            $gameDate->nodeValue
        )
    ) as $gameData) {
        $tds = $gameData->getElementsByTagName('td');
        $fixture = $matchDay->appendChild($new->createElement('fixture'));
        $fixture->appendChild($new->createElement(
            'hometeam', $tds->item(0)->nodeValue)
        );
        $fixture->appendChild($new->createElement(
            'awayteam', $tds->item(2)->nodeValue)
        );
        $fixture->appendChild($new->createElement(
            'kickoff', $tds->item(3)->nodeValue)
        );
    }
    $new->documentElement->appendChild($matchDay);
}
$new->formatOutput = true;
echo $new->saveXML();

Demo at Codepad
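
The DOM version above runs one XPath query per date and matches rows by the date text, so two matchdays sharing the same text would each pick up the other's fixtures. A possible alternative is a single pass over the rows, starting a new matchday whenever a colspan row appears. This is only a sketch under the same assumptions about the table layout (date rows carry a colspan cell, fixture rows have four cells); the function name buildMatchdays is hypothetical.

```php
<?php
// Sketch only: one pass over the rows; a colspan row opens a new <matchday>.
// buildMatchdays() is an illustrative helper, not part of any library.

function buildMatchdays(string $html): DOMDocument
{
    // Collect parser warnings instead of printing them.
    libxml_use_internal_errors(true);
    $source = new DOMDocument;
    $source->loadHTML($html);
    libxml_clear_errors();

    $out = new DOMDocument('1.0', 'utf-8');
    $out->formatOutput = true;
    $root = $out->appendChild($out->createElement('matchdays'));
    $matchday = null; // the <matchday> currently being filled

    foreach ((new DOMXPath($source))->query('//table//tr') as $row) {
        $cells = $row->getElementsByTagName('td');
        if ($cells->length === 0) {
            continue; // header or empty row
        }
        if ($cells->item(0)->hasAttribute('colspan')) {
            // Date row: start a new matchday.
            $matchday = $root->appendChild($out->createElement('matchday'));
            $matchday->setAttribute('date', trim($cells->item(0)->textContent));
            continue;
        }
        if ($matchday === null || $cells->length < 4) {
            continue; // fixture before any date, or unexpected layout
        }
        $fixture = $matchday->appendChild($out->createElement('fixture'));
        foreach (['hometeam' => 0, 'awayteam' => 2, 'kickoff' => 3] as $name => $index) {
            // createTextNode() escapes characters such as "&" in team names.
            $element = $fixture->appendChild($out->createElement($name));
            $element->appendChild(
                $out->createTextNode(trim($cells->item($index)->textContent))
            );
        }
    }

    return $out;
}

// Usage with a tiny made-up table:
$sample = '<table>'
    . '<tr><td colspan="5"><strong>Saturday October 15 2011</strong></td></tr>'
    . '<tr><td>Team 1</td><td>vs</td><td>Team 7</td><td>3:00 pm</td></tr>'
    . '</table>';
echo buildMatchdays($sample)->saveXML();
```

Because each fixture is attached to whichever matchday was opened last, the result does not depend on the date strings being unique.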
