Extracting data from HTML using PHP

Posted on 2024-09-18 13:33:16

Here is what I am looking for:

I have a link that returns some data in HTML format:

http://www.118.com/people-search.mvc...0&pageNumber=1

The data comes in the following format:

<div class="searchResult regular"> 

Bird John

56 Leathwaite Road
London
SW11 6RS

020 7228 5576

I want my PHP page to request the URL above and extract the data from the resulting HTML page based on the tags above, mapped as follows:
h2 = Name
address = Address
telephoneNumber = Phone Number

and then display them in a table.

This is what I have so far. It works to an extent, but it only shows the text of the HTML page:

<?php
// Fetch a URL with cURL and return the response body as a string.
function get_content($url)
{
    $ch = curl_init();

    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    // Return the body from curl_exec() instead of printing it,
    // which replaces the output-buffering workaround.
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

    $string = curl_exec($ch);
    curl_close($ch);

    return $string;
}

// Fetch and print result pages 1 to 4.
for ($page = 1; $page <= 4; $page++) {
    $content = get_content("http://www.118.com/people-search.mvc?Supplied=true&Name=william&Location=Crabtree&pageSize=50&pageNumber=" . $page);
    echo $content;
}

?>

Comments (1)

梦醒时光 2024-09-25 13:33:16

You need to use a DOM parser such as Simple HTML DOM or similar.

Then read the page into a DOM object and parse it using the appropriate selectors:

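<?php
// Assumes simple_html_dom.php from the Simple HTML DOM library sits next to this script.
require_once 'simple_html_dom.php';

$html = file_get_html("http://www.118.com/people-search.mvc...0&pageNumber=1");

// The selector has to be a quoted string; 'div.searchResult' matches each result block.
foreach ($html->find('div.searchResult') as $div) {
    // parse div contents here to extract name and address etc.
}
$html->clear();
unset($html);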

For more info, see the Simple HTML DOM documentation.
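
If you would rather not add a library, the same idea can be sketched with PHP's built-in DOMDocument and DOMXPath. This is only a rough sketch: the sample markup below is an assumption based on the question's description (an h2 for the name, an address element, and an element with class telephoneNumber), and the helper names extract_results, first_text and render_table are made up for illustration. The real page may be structured differently, so the queries would need adjusting.

```php
<?php
// Sample markup: an ASSUMPTION modelled on the question's description.
// The real page's structure may differ.
$sampleHtml = <<<HTML
<div class="searchResult regular">
  <h2>Bird John</h2>
  <address>56 Leathwaite Road London SW11 6RS</address>
  <div class="telephoneNumber">020 7228 5576</div>
</div>
HTML;

// Hypothetical helper: text of the first node matching $query under $context.
function first_text(DOMXPath $xpath, string $query, DOMNode $context): string
{
    $node = $xpath->query($query, $context)->item(0);
    // Collapse runs of whitespace into single spaces.
    return $node ? trim(preg_replace('/\s+/', ' ', $node->textContent)) : '';
}

// Hypothetical helper: one array per result block with name, address, phone.
function extract_results(string $html): array
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Match divs whose class list contains the word "searchResult".
    $blocks = '//div[contains(concat(" ", normalize-space(@class), " "), " searchResult ")]';
    $phone  = './/*[contains(concat(" ", normalize-space(@class), " "), " telephoneNumber ")]';

    $rows = [];
    foreach ($xpath->query($blocks) as $div) {
        $rows[] = [
            'name'    => first_text($xpath, './/h2', $div),
            'address' => first_text($xpath, './/address', $div),
            'phone'   => first_text($xpath, $phone, $div),
        ];
    }
    return $rows;
}

// Hypothetical helper: render the rows as an HTML table.
function render_table(array $rows): string
{
    $out = "<table>\n<tr><th>Name</th><th>Address</th><th>Phone Number</th></tr>\n";
    foreach ($rows as $row) {
        $out .= '<tr><td>' . htmlspecialchars($row['name']) . '</td><td>'
              . htmlspecialchars($row['address']) . '</td><td>'
              . htmlspecialchars($row['phone']) . "</td></tr>\n";
    }
    return $out . "</table>\n";
}

$rows = extract_results($sampleHtml);
echo render_table($rows);
```

To use it on the live pages, pass the string returned by get_content() to extract_results() in place of $sampleHtml, and merge the rows from each page before rendering.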
