Getting wrong results using Simple_HTML_Dom
I am trying to scrape this web page: http://www.acttab.com.au/interbet/venues?day=today
Here is my code:
    function FindRaceRows($html) {
        foreach ($rows = $html->find('tr[bgcolor="#ffffff"], tr[bgcolor="#cccccc"]') as $row);
        {
            echo $row->plaintext . "END ROW<br />\n";
            foreach ($row->find('td[align=center]') as $cell) {
                //echo $cell->bgcolor;
                //black
                if ($cell->bgcolor == "#000000") {
                    echo "Already run";
                }
                //blue
                if ($cell->bgcolor == "#0000ff") {
                    echo "Next race for type";
                }
                //green
                if ($cell->bgcolor == "#00cc00") {
                    echo "Still to jump";
                }
                //Red
                if ($cell->bgcolor == "#cc0000") {
                    echo "Next race for meeting";
                }
                foreach ($cell->find('a') as $tag); {
                    $link = $tag->href;
                    $eventIx = strpos($link, "mting=");
                    if ($eventIx != -1) {
                        $event = substr($link, $eventIx + 6);
                        //echo $event."<br />\n";
                        $url = "http://www.acttab.com.au/interbet/odds?mting=" . $event;
                        echo $url . "<br />\n";
                    }
                }
            }
        }
    }

    $url = "http://www.acttab.com.au/interbet/venues?day=today";
    $html = file_get_html($url);
    FindRaceRows($html);
But it is not separating each row. I get a whole bunch of rows in the row variable.
Here is some of the output: (notice how "END ROW" does not appear at the end of each row)
AR MORPHETTVILLE FINE/DEAD R2@ 1:10pm 1 2 3 4 5 6 7 8 BR DOOMBEN FINE/GOOD R3@ 1:30pm 1 2 3 4 5 6 7 8 CR TOOWOOMBA FINE/GOOD R1@ 5:08pm 1 2 3 4 5 6 7 CT OTAKI NZ FINE/HVY R8@ 1:01pm 1 2 3 4 5 6 7 8 9 10 DR TOWNSVILLE FINE/GOOD R3@ 1:15pm 1 2 3 4 5 6 7 8 DT TE RAPA NZ FINE/SLOW R6@ 1:15pm 1 2 3 4 5 6 7 8 MR MOONEE VALLEY OCAST/DEAD R2@ 1:05pm 1 2 3 4 5 6 7 8 NR NEWCASTLE FINE/SLOW R3@ 1:35pm 1 2 3 4 5 6 7 8 SR RANDWICK FINE/HVY R3@ 1:20pm 1 2 3 4 5 6 7 8 VR DONALD FINE/DEAD R3@ 1:25pm 1 2 3 4 5 6 7 8 XR BELMONT FINE/DEAD R1@ 2:25pm 1 2 3 4 5 6 7 8 HARNESS MEETINGS AT GLOBE DERBY FINE/GOOD R1@ 6:13pm 1 2 3 4 5 6 7 8 9 10 BT ALBION PARK FINE/GOOD R1@ 5:23pm 1 2 3 4 5 6 7 8 9 10 MT BALLARAT OCAST/GOOD R1@ 7:02pm 1 2 3 4 5 6 7 8 NT PARKES FINE/FAST R1@ 5:12pm 1 2 3 4 5 6 ST NEWCASTLE FINE/FAST R1@ 6:35pm 1 2 3 4 5 6 7 8 XT GLOUCESTER PARK FINE/GOOD R1@ 8:45pm 1 2 3 4 5 GREYHOUND MEETINGS MD THE MEADOWS FINE/GOOD R1@ 7:20pm 1 2 3 4 5 6 7 8 9 10 11 ND THE GARDENS FINE/GOOD R1@ 5:04pm 1 2 3 4 5 6 7 8 SD WENTWORTH PARK FINE/GOOD R1@ 7:27pm 1 2 3 4 5 6 7 8 9 10 XD CANNINGTON FINE/GOOD R1@ 9:05pm 1 2 3 4 5 6 END ROW
`http://www.acttab.com.au/interbet/odds?mting=XD06000`
BR DOOMBEN FINE/GOOD R3@ 1:30pm 1 2 3 4 5 6 7 8 CR TOOWOOMBA FINE/GOOD R1@ 5:08pm 1 2 3 4 5 6 7 CT OTAKI NZ FINE/HVY R8@ 1:01pm 1 2 3 4 5 6 7 8 9 10 DR TOWNSVILLE FINE/GOOD R3@ 1:15pm 1 2 3 4 5 6 7 8 DT TE RAPA NZ FINE/SLOW R6@ 1:15pm 1 2 3 4 5 6 7 8 MR MOONEE VALLEY OCAST/DEAD R2@ 1:05pm 1 2 3 4 5 6 7 8 NR NEWCASTLE FINE/SLOW R3@ 1:35pm 1 2 3 4 5 6 7 8 SR RANDWICK FINE/HVY R3@ 1:20pm 1 2 3 4 5 6 7 8 VR DONALD FINE/DEAD R3@ 1:25pm 1 2 3 4 5 6 7 8 XR BELMONT FINE/DEAD R1@ 2:25pm 1 2 3 4 5 6 7 8 HARNESS MEETINGS AT GLOBE DERBY FINE/GOOD R1@ 6:13pm 1 2 3 4 5 6 7 8 9 10 BT ALBION PARK FINE/GOOD R1@ 5:23pm 1 2 3 4 5 6 7 8 9 10 MT BALLARAT OCAST/GOOD R1@ 7:02pm 1 2 3 4 5 6 7 8 NT PARKES FINE/FAST R1@ 5:12pm 1 2 3 4 5 6 ST NEWCASTLE FINE/FAST R1@ 6:35pm 1 2 3 4 5 6 7 8 XT GLOUCESTER PARK FINE/GOOD R1@ 8:45pm 1 2 3 4 5 GREYHOUND MEETINGS MD THE MEADOWS FINE/GOOD R1@ 7:20pm 1 2 3 4 5 6 7 8 9 10 11 ND THE GARDENS FINE/GOOD R1@ 5:04pm 1 2 3 4 5 6 7 8 SD WENTWORTH PARK FINE/GOOD R1@ 7:27pm 1 2 3 4 5 6 7 8 9 10 XD CANNINGTON FINE/GOOD R1@ 9:05pm 1 2 3 4 5 6 END ROW
Comments (1)
The problem is not with `Simple_HTML_Dom` but with your code: you have put a semicolon (`;`) after each of your two `foreach` declarations.

When using `foreach`, you must not put a semicolon (`;`) after the declaration. Consider a `foreach` loop that echoes `$value` for each element of an array holding 1, 2, 3 and 4: its output is "1234". Put a semicolon (`;`) after the declaration and the output becomes "4".

The reason is that PHP still executes the loop, but the semicolon ends the statement, so the braces are not part of the loop. The block in braces runs once, after the loop has finished, and by then `$value` holds the last value available in the loop.
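To see the difference concretely, here is a minimal sketch. The array `[1, 2, 3, 4]` and the helper names `withoutSemicolon` and `withSemicolon` are assumptions chosen to match the outputs quoted above, not code taken from the original answer; the helpers build a string rather than echoing inside the loop so the two results are easy to compare.

```php
<?php
// Hypothetical helpers for illustration only; the input array is an
// assumption inferred from the quoted outputs "1234" and "4".

// Correct form: the braces are the loop body, so it runs once per element.
function withoutSemicolon(array $values): string
{
    $out = '';
    foreach ($values as $value) {
        $out .= $value;
    }
    return $out;
}

// Buggy form: the semicolon is an empty statement and becomes the loop body.
// The braces are then an ordinary block that runs once, after the loop ends,
// when $value still holds the last element.
function withSemicolon(array $values): string
{
    $out = '';
    foreach ($values as $value);
    {
        $out .= $value;
    }
    return $out;
}

echo withoutSemicolon([1, 2, 3, 4]), "\n"; // prints 1234
echo withSemicolon([1, 2, 3, 4]), "\n";    // prints 4
```

Applied to the code in the question, this means deleting the `;` that follows `as $row)` and the one that follows `as $tag)`, so that each pair of braces becomes the body of its loop.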