Can't get XPath to output only some TDs
It's probably really easy if you know how, but I don't, and after spending hours Googling it I have to ask some "real" programmers, as I'm obviously not one.
I can't seem to find a tutorial or a code example that'll work for me. Let's say I just wanted to output the "EuroDiesel 10" TR (scroll halfway down to find it) and then I only want data from TD number 1 and 9. How would I go about doing that?
I also want to add the output data to a SQL DB with a date stamp and update it once a day. I assume this can be done with a cron job; is that correct? Should I make a job for each price list I want to harvest data from, or could I do it all in a single script (the sites are very different)?
First of all I just need the correct data. This is what I got so far.
<?php
$dom = new DOMDocument;
$date = date("j. F, Y");
libxml_use_internal_errors(true);
$dom->loadHTMLFile('http://www3.statoil.com/mar/kbh00438.nsf/UNID/8C81E46A6EC8BA3BC12578C0002FFF5A?OpenDocument');
libxml_use_internal_errors(false);
$xpath = new DOMXPath($dom);
$aTag = $xpath->query('//p[@class="text"]');
foreach($aTag as $val) {
echo $date, '', $val->plaintext. "". utf8_decode(trim($val->nodeValue, "")) . "<br />\n";
}
?>
I hope you guys can help me out, just learning here...
Thanks!
Art
Comments (1)
As for the XPath, I think
/html/body/form/table/tbody/tr[normalize-space(td[1]) = 'EuroDiesel 10']/td[position() = 1 or position() = 9]
should do. Then access $val->textContent instead of nodeValue.
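To make that concrete, here is a minimal sketch of the same predicate run against a small stand-in table. The markup, the cell values and the $cells variable are made up for illustration and are not taken from the real price list. It also assumes a relative //tr path instead of the absolute one: as far as I know, libxml's HTML parser does not add a tbody element the way a browser's inspector shows it, so a path containing tbody only matches if the page source really has one.

```php
<?php
// Hypothetical stand-in for the price list page (NOT the real markup).
$html = <<<'HTML'
<html><body><form><table>
<tr><td>Product</td><td>a</td><td>b</td><td>c</td><td>d</td><td>e</td><td>f</td><td>g</td><td>Price</td></tr>
<tr><td> EuroDiesel 10 </td><td>a</td><td>b</td><td>c</td><td>d</td><td>e</td><td>f</td><td>g</td><td>9.87</td></tr>
<tr><td>Other product</td><td>a</td><td>b</td><td>c</td><td>d</td><td>e</td><td>f</td><td>g</td><td>1.23</td></tr>
</table></form></body></html>
HTML;

$dom = new DOMDocument;
libxml_use_internal_errors(true);   // keep parser warnings out of the output
$dom->loadHTML($html);
libxml_clear_errors();
libxml_use_internal_errors(false);

$xpath = new DOMXPath($dom);

// Select the row whose first cell reads "EuroDiesel 10" (ignoring
// surrounding whitespace), then keep only its 1st and 9th cells.
$query = "//tr[normalize-space(td[1]) = 'EuroDiesel 10']"
       . "/td[position() = 1 or position() = 9]";

$cells = [];
foreach ($xpath->query($query) as $td) {
    $cells[] = trim($td->textContent);
}

// One line per run: date stamp followed by the selected cells.
echo date('Y-m-d'), ' ', implode(' | ', $cells), "\n";
```

For the real page you would swap loadHTML($html) back to the loadHTMLFile() call from the question. If the query comes back empty, checking whether the page source actually contains tbody is probably the first thing to try.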