Getting incorrect results with Simple_HTML_Dom

Posted on 2024-07-29 18:20:49

I am trying to scrape this web page: http://www.acttab.com.au/interbet/venues?day=today

Here is my code:

function FindRaceRows($html) {
    foreach ($rows = $html->find(
        'tr[bgcolor="#ffffff"], tr[bgcolor="#cccccc"]') as
        $row);
        {
        echo $row->plaintext . "END ROW<br />\n";

        foreach ($row->find('td[align=center]') as $cell) {

            //echo $cell->bgcolor;

            //black
            if ($cell->bgcolor == "#000000") {
                echo "Already run";
            }

            //blue
            if ($cell->bgcolor == "#0000ff") {
                echo "Next race for type";
            }

            //green
            if ($cell->bgcolor == "#00cc00") {
                echo "Still to jump";
            }

            //Red
            if ($cell->bgcolor == "#cc0000") {
                echo "Next race for meeting";
            }

            foreach ($cell->find('a') as $tag); {
                $link = $tag->href;

                $eventIx = strpos($link, "mting=");

                if ($eventIx != -1) {
                    $event = substr($link, $eventIx + 6);
                    //echo $event."<br />\n";
                    $url =
                        "http://www.acttab.com.au/interbet/odds?mting="
                        . $event;

                    echo $url . "<br />\n";
                }
            }
        }
    }
}

$url = "http://www.acttab.com.au/interbet/venues?day=today";
$html = file_get_html($url);

FindRaceRows($html);

But it is not separating the rows. I get a whole bunch of rows in the $row variable.

Here is some of the output: (notice how "END ROW" does not appear at the end of each row)

AR MORPHETTVILLE FINE/DEAD R2@ 1:10pm 1 2 3 4 5 6 7 8   BR DOOMBEN FINE/GOOD R3@ 1:30pm 1 2 3 4 5 6 7 8   CR TOOWOOMBA FINE/GOOD R1@ 5:08pm 1 2 3 4 5 6 7   CT OTAKI NZ FINE/HVY R8@ 1:01pm 1 2 3 4 5 6 7 8 9 10   DR TOWNSVILLE FINE/GOOD R3@ 1:15pm 1 2 3 4 5 6 7 8   DT TE RAPA NZ FINE/SLOW R6@ 1:15pm 1 2 3 4 5 6 7 8   MR MOONEE VALLEY OCAST/DEAD R2@ 1:05pm 1 2 3 4 5 6 7 8   NR NEWCASTLE FINE/SLOW R3@ 1:35pm 1 2 3 4 5 6 7 8   SR RANDWICK FINE/HVY R3@ 1:20pm 1 2 3 4 5 6 7 8   VR DONALD FINE/DEAD R3@ 1:25pm 1 2 3 4 5 6 7 8   XR BELMONT FINE/DEAD R1@ 2:25pm 1 2 3 4 5 6 7 8     HARNESS MEETINGS AT GLOBE DERBY FINE/GOOD R1@ 6:13pm 1 2 3 4 5 6 7 8 9 10   BT ALBION PARK FINE/GOOD R1@ 5:23pm 1 2 3 4 5 6 7 8 9 10   MT BALLARAT OCAST/GOOD R1@ 7:02pm 1 2 3 4 5 6 7 8   NT PARKES FINE/FAST R1@ 5:12pm 1 2 3 4 5 6   ST NEWCASTLE FINE/FAST R1@ 6:35pm 1 2 3 4 5 6 7 8   XT GLOUCESTER PARK FINE/GOOD R1@ 8:45pm 1 2 3 4 5     GREYHOUND MEETINGS MD THE MEADOWS FINE/GOOD R1@ 7:20pm 1 2 3 4 5 6 7 8 9 10 11   ND THE GARDENS FINE/GOOD R1@ 5:04pm 1 2 3 4 5 6 7 8   SD WENTWORTH PARK FINE/GOOD R1@ 7:27pm 1 2 3 4 5 6 7 8 9 10   XD CANNINGTON FINE/GOOD R1@ 9:05pm 1 2 3 4 5 6   END ROW
`http://www.acttab.com.au/interbet/odds?mting=XD06000`
BR DOOMBEN FINE/GOOD R3@ 1:30pm 1 2 3 4 5 6 7 8   CR TOOWOOMBA FINE/GOOD R1@ 5:08pm 1 2 3 4 5 6 7   CT OTAKI NZ FINE/HVY R8@ 1:01pm 1 2 3 4 5 6 7 8 9 10   DR TOWNSVILLE FINE/GOOD R3@ 1:15pm 1 2 3 4 5 6 7 8   DT TE RAPA NZ FINE/SLOW R6@ 1:15pm 1 2 3 4 5 6 7 8   MR MOONEE VALLEY OCAST/DEAD R2@ 1:05pm 1 2 3 4 5 6 7 8   NR NEWCASTLE FINE/SLOW R3@ 1:35pm 1 2 3 4 5 6 7 8   SR RANDWICK FINE/HVY R3@ 1:20pm 1 2 3 4 5 6 7 8   VR DONALD FINE/DEAD R3@ 1:25pm 1 2 3 4 5 6 7 8   XR BELMONT FINE/DEAD R1@ 2:25pm 1 2 3 4 5 6 7 8     HARNESS MEETINGS AT GLOBE DERBY FINE/GOOD R1@ 6:13pm 1 2 3 4 5 6 7 8 9 10   BT ALBION PARK FINE/GOOD R1@ 5:23pm 1 2 3 4 5 6 7 8 9 10   MT BALLARAT OCAST/GOOD R1@ 7:02pm 1 2 3 4 5 6 7 8   NT PARKES FINE/FAST R1@ 5:12pm 1 2 3 4 5 6   ST NEWCASTLE FINE/FAST R1@ 6:35pm 1 2 3 4 5 6 7 8   XT GLOUCESTER PARK FINE/GOOD R1@ 8:45pm 1 2 3 4 5     GREYHOUND MEETINGS MD THE MEADOWS FINE/GOOD R1@ 7:20pm 1 2 3 4 5 6 7 8 9 10 11   ND THE GARDENS FINE/GOOD R1@ 5:04pm 1 2 3 4 5 6 7 8   SD WENTWORTH PARK FINE/GOOD R1@ 7:27pm 1 2 3 4 5 6 7 8 9 10   XD CANNINGTON FINE/GOOD R1@ 9:05pm 1 2 3 4 5 6   END ROW


Answer by 场罚期间, posted 2024-08-05 18:20:49

The problem is not with Simple_HTML_Dom but is in fact with your code.

You have put a semicolon (;) after the header of two of your foreach loops: the outer one over the rows and the inner one over the links.

A foreach header must not be followed by a semicolon (;) before the opening brace. Consider the following example:

$array = array(1,2,3,4);
foreach($array as $value) {
    echo $value;
}

The output of the above will be "1234".

Now let's see what happens if we put a semicolon (;) after the foreach header:

$array = array(1,2,3,4);
foreach($array as $value); {
    echo $value;
}

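The output of the above will be "4". The semicolon is an empty statement, and it becomes the entire loop body, so PHP runs the loop to completion without doing anything. The braces that follow are an ordinary block that executes once, after the loop has finished, and by then $value holds the last value from the array. In your code this means the body of each affected loop runs once instead of once per row or per link.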
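To make the difference easier to see side by side, here is a minimal sketch that collects the values each form of the loop actually processes. The function names collectCorrect and collectWithStraySemicolon are made up for this illustration and are not part of Simple_HTML_Dom or of your code.

```php
<?php
// Hypothetical helper names, used only for this illustration.

// Correct form: the braces are the loop body, so it runs once per element.
function collectCorrect(array $items): array
{
    $seen = [];
    foreach ($items as $value) {
        $seen[] = $value;
    }
    return $seen;
}

// Buggy form: the semicolon is an empty statement and is the whole loop body.
// The braces are a separate block that runs once, after the loop has ended.
function collectWithStraySemicolon(array $items): array
{
    $seen = [];
    foreach ($items as $value); {
        $seen[] = $value; // $value still holds the last element here
    }
    return $seen;
}

echo implode(',', collectCorrect([1, 2, 3, 4])), "\n";            // 1,2,3,4
echo implode(',', collectWithStraySemicolon([1, 2, 3, 4])), "\n"; // 4
```

The same thing happens with $row and $tag in your function: only the last matched element reaches the block. Removing the semicolon after each of those two foreach headers should restore per-row and per-link processing.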
