How do I access Simple HTML DOM selectors?

Posted on 2024-11-04 03:37:11

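I can access some of the 'class' items with a call like

$ret = $html->find('articleINfo');

and then print the first key of the returned array.

However, I also need other tags, such as a span with id="firstArticle_0", and I cannot seem to find them.

$ret = $html->find('#span=id[ etc ]');

In some cases something is returned, but it is not an array, or it is an array with empty keys.

Unfortunately, I cannot use var_dump to inspect the object, because var_dump produces 1000 pages of unreadable junk. The HTML looks like this.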

<div id="articlething"> 
    <p class="byline">By Lord Byron and <a href="www.marriedtothesea.com">Alister Crowley</a></p> 
    <p> 
    <span class="location">GEORGIA MOUNTAINS, Canada</span> | 
    <span class="timestamp">Fri Apr 29, 2011 11:27am EDT</span> 
    </p> 
</div> 
<span id="midPart_0"></span><span class="mainParagraph"><p><span        class="midLocation">TUSCALOOSA, Alabama</span> - Who invented cheese? Everyone wants to know. They held a big meeting. Tom Cruise is a scientologist. </p> 

</span><span id="midPart_1"></span><p>The president and his family visited Chuck-e-cheese in the morning </p><span id="midPart_2"></span><p>In Russia, 900 people were lost in the balls.</p><span id="midPart_3">



燃情 2024-11-11 03:37:11

Simple HTML DOM makes it easy to find a span with a specific class.

If you want all spans with class=location:

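// load the library (adjust the path to wherever simple_html_dom.php lives)
include 'simple_html_dom.php';

// create the HTML DOM from the page URL
$html = file_get_html($iUrl);

// get all span elements with class=location
$aObj = $html->find('span[class=location]');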

Then do something like:

foreach($aObj as $key=>$oValue)
{
   echo $key.": ".$oValue->plaintext."<br />";
}

It worked for me using your example. My output was:

label=span, class=location: Found 1

0: GEORGIA MOUNTAINS, Canada

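Hope that helps. Simple HTML DOM is great for what it does and easy to use once you get the hang of it. Keep trying and you will build up a number of examples that you can reuse over and over. I've scraped some pretty crazy pages, and it gets easier and easier.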
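The same find() call also covers the id-based spans from the question. An id can be selected with '#midPart_0' or 'span[id=midPart_0]', and passing an index as the second argument, as in find('#midPart_0', 0), returns a single element (or null) instead of an array. Note that the midPart_N spans in the sample markup are empty, so their plaintext is an empty string; the text sits in the element that follows each span. The following is only a minimal sketch under some assumptions: simple_html_dom.php is on the include path, the helper name text_after_marker_spans is made up for illustration, and the markup is a trimmed-down version of the sample.

```php
<?php
// Assumption: simple_html_dom.php (the Simple HTML DOM library) is on the include path.
require_once 'simple_html_dom.php';

// Hypothetical helper: map each span whose id starts with $prefix
// to the text of the element that directly follows it.
function text_after_marker_spans($markup, $prefix)
{
    $result = array();
    $html = str_get_html($markup);
    if (!$html) {
        return $result; // parsing failed, nothing to report
    }
    // [id^=value] matches ids that start with the given value.
    foreach ($html->find('span[id^=' . $prefix . ']') as $span) {
        $next = $span->next_sibling(); // null when the span is the last child
        $result[$span->id] = $next ? trim($next->plaintext) : '';
    }
    $html->clear(); // release the parser's memory
    return $result;
}

// Trimmed-down version of the markup from the question.
$markup = '<div>'
        . '<span id="midPart_1"></span><p>The president and his family visited</p>'
        . '<span id="midPart_2"></span><p>In Russia, 900 people were lost</p>'
        . '</div>';

foreach (text_after_marker_spans($markup, 'midPart_') as $id => $text) {
    echo $id . ': ' . $text . "\n";
}
```

Printing a few properties such as $span->tag, $span->id and $span->outertext is also a more readable way to inspect a result than var_dump.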



深居我梦 2024-11-11 03:37:11

Try phpQuery. It works well for me and is very easy to use: http://code.google.com/p/phpquery/


紫南 2024-11-11 03:37:11

The PHP Simple HTML DOM Parser documentation is spotty when it comes to reading Open Graph meta tags. Here is what seems to work for me:

<?php
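// load the library (adjust the path to wherever simple_html_dom.php lives)
include 'simple_html_dom.php';

// grab the contents of the page
$summary = file_get_html($url);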

// Get image possibilities (for example)

$img = array();

// First, if the webpage has an og:image meta tag, it's easy:
if ($summary->find('meta[property=og:image]')) {
  foreach ($summary->find('meta[property=og:image]') as $e) {
    $img[] = $e->attr['content'];
  }
}
?>
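The same pattern can be wrapped in a small function so it works for any Open Graph property. Because find() returns an empty array when nothing matches, the loop can run without a separate existence check. This is a rough sketch, not a definitive implementation: the function name og_values and the example.com URLs are invented for illustration, and simple_html_dom.php is assumed to be on the include path.

```php
<?php
// Assumption: simple_html_dom.php (the Simple HTML DOM library) is on the include path.
require_once 'simple_html_dom.php';

// Hypothetical helper: collect the content of every <meta property="og:NAME"> tag.
function og_values($markup, $name)
{
    $values = array();
    $html = str_get_html($markup);
    if (!$html) {
        return $values; // parsing failed
    }
    // find() returns an empty array when there is no match,
    // so the loop body simply never runs in that case.
    foreach ($html->find('meta[property=og:' . $name . ']') as $meta) {
        if (isset($meta->attr['content'])) {
            $values[] = $meta->attr['content'];
        }
    }
    $html->clear(); // release the parser's memory
    return $values;
}

// Made-up sample page with two og:image tags and one og:title tag.
$markup = '<html><head>'
        . '<meta property="og:image" content="https://example.com/a.png">'
        . '<meta property="og:image" content="https://example.com/b.png">'
        . '<meta property="og:title" content="Sample page">'
        . '</head><body></body></html>';

print_r(og_values($markup, 'image'));
print_r(og_values($markup, 'title'));
```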


