Extracting data from HTML using PHP

Posted 2024-09-18 13:33:16


Here is what I am looking for:

I have a link that displays some data in HTML format:

http://www.118.com/people-search.mvc...0&pageNumber=1

The data comes in the following format:

<div class="searchResult regular">

Bird John

56 Leathwaite Road
London
SW11 6RS

020 7228 5576

I want my PHP page to fetch the URL above and extract the data from the resulting HTML page, based on the tags above, as follows:
h2 = Name
address = Address
telephoneNumber = Phone Number

and then display them in a table.

I have the following so far. It works to an extent, but it only outputs the fetched HTML page as-is instead of extracting the fields:

<?php
// Fetch a URL with cURL and return the response body as a string.
function get_content($url)
{
    $ch = curl_init();

    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    // Make curl_exec() return the body instead of printing it,
    // so no output buffering is needed.
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

    $string = curl_exec($ch);
    curl_close($ch);

    return $string;
}

// Fetch and print result pages 1 to 4.
for ($page = 1; $page <= 4; $page++) {
    $content = get_content("http://www.118.com/people-search.mvc?Supplied=true&Name=william&Location=Crabtree&pageSize=50&pageNumber=" . $page);
    echo $content;
}

?>


梦醒时光 2024-09-25 13:33:16

You need to use a DOM parser such as Simple HTML DOM or similar.

Then read the file into a DOM object and parse it using the appropriate selectors:

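// Assumes the Simple HTML DOM library file is available at this path.
require_once 'simple_html_dom.php';

$html = new simple_html_dom("http://www.118.com/people-search.mvc...0&pageNumber=1");

// The selector must be a quoted string; this matches each result div.
foreach ($html->find('div.searchResult') as $div) {
    // parse div contents here to extract name and address etc.
}
$html->clear();
unset($html);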

For more information, see the Simple HTML DOM documentation.
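
For readers who would rather not add a third-party library, the same idea can be sketched with PHP's built-in DOMDocument and DOMXPath classes. This is a different technique from the Simple HTML DOM approach above, not the answerer's method. The sample markup is hypothetical: the question only names h2, address, and telephoneNumber, so treating telephoneNumber as a class name (and the exact nesting) is an assumption, as are the helper names parse_results, first_text, and render_table.

```php
<?php
// Hypothetical sample markup. The real page structure is not shown in full,
// so the element names and the telephoneNumber class are assumptions.
$sampleHtml = <<<HTML
<div class="searchResult regular">
  <h2>Bird John</h2>
  <address>56 Leathwaite Road London SW11 6RS</address>
  <div class="telephoneNumber">020 7228 5576</div>
</div>
HTML;

// Return the whitespace-collapsed text of the first node matching $query
// inside $context, or an empty string when nothing matches.
function first_text(DOMXPath $xpath, string $query, DOMNode $context): string
{
    $nodes = $xpath->query($query, $context);
    if ($nodes === false || $nodes->length === 0) {
        return '';
    }
    return trim(preg_replace('/\s+/', ' ', $nodes->item(0)->textContent));
}

// Extract name, address, and phone number from every search result block.
function parse_results(string $html): array
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so collect parser warnings quietly.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    // Match divs whose class list contains the word "searchResult".
    $resultQuery = '//div[contains(concat(" ", normalize-space(@class), " "), " searchResult ")]';
    $phoneQuery  = './/*[contains(concat(" ", normalize-space(@class), " "), " telephoneNumber ")]';

    $rows = [];
    foreach ($xpath->query($resultQuery) as $div) {
        $rows[] = [
            'name'    => first_text($xpath, './/h2', $div),
            'address' => first_text($xpath, './/address', $div),
            'phone'   => first_text($xpath, $phoneQuery, $div),
        ];
    }
    return $rows;
}

// Render the extracted rows as an HTML table, escaping every cell.
function render_table(array $rows): string
{
    $out = "<table>\n<tr><th>Name</th><th>Address</th><th>Phone Number</th></tr>\n";
    foreach ($rows as $row) {
        $out .= '<tr><td>' . htmlspecialchars($row['name']) . '</td>'
              . '<td>' . htmlspecialchars($row['address']) . '</td>'
              . '<td>' . htmlspecialchars($row['phone']) . "</td></tr>\n";
    }
    return $out . "</table>\n";
}

$rows = parse_results($sampleHtml);
echo render_table($rows);
```

To use it against the live site, the string returned by get_content() would be passed to parse_results() in place of $sampleHtml, and the two XPath queries adjusted to whatever the page's actual markup turns out to be.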
