PHP, XML, accessing attributes

Posted on 2024-11-04 22:29:24

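I'm having some trouble accessing attributes in my XML. My code is below. Initially I had two loops, and that worked with no problems.

I would first get the image names, then use a second loop to get the story headline and story details, and then insert everything into the database. I want to tidy up the code and use only one loop. My image name is stored in the Href attribute.

Sample XML layout (http://pastie.org/1850682). The XML layout is a bit messy, which is why I used two loops.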

$xml = new SimpleXMLElement('entertainment/Showbiz.xml', null, true);

    // Get story images
    //$i=0;
    //$image = $xml->xpath('NewsItem/NewsComponent/NewsComponent/NewsComponent/NewsComponent/NewsComponent/ContentItem');
  //  foreach($image as $imageNode){
    //  $attributeArray = $imageNode->attributes(); 
    //  if ($attributeArray != ""){
    //      $imageArray[$i] = $attributeArray;
    //      $i++;
    //  }
    //}

// Get story header & detail
$i=0;
$story = $xml->xpath('NewsItem/NewsComponent/NewsComponent/NewsComponent');
foreach($story as $contentItem){
    //$dbImage = $imageArray[$i]['Href'];
    foreach($contentItem->xpath('ContentItem/DataContent/nitf/body/body.head/hedline/hl1') as $headline){
        $strDetail = "";
        foreach($contentItem->xpath('ContentItem/DataContent/nitf/body/body.content/p') as $detail){
            $strDetail .= '<p>'.$detail.'</p>';
            foreach($contentItem->xpath('NewsComponent/NewsComponent/ContentItem') as $imageNode){
                $dbImage = $imageNode->attributes();    
            }
        }

        $link = getUnique($headline);

        $sql = "INSERT INTO tablename (headline, detail, image, link) VALUES ('".mysql_real_escape_string($headline)."', '".mysql_real_escape_string($strDetail)."', '".mysql_real_escape_string($dbImage)."', '".$link."')";
        if (mysql_query($sql, $db) or die(mysql_error())){
            echo "Loaded ";
        }else{
            echo "Not Loaded "; 
        }

    }
    $i++;
}

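I think I'm close to getting it working. I tried putting a few echo statements in the fourth nested foreach loop, but nothing was printed, so that loop is not being executed. I've been at this for a few hours and have googled as well, but I just can't work it out.

If all else fails, I'll just go back to using two loops.

Regards,
Stephen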

Comments (1)

污味仙女 2024-11-11 22:29:24

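This was pretty difficult to follow. I've simplified the structure so we can see the parts of the hierarchy we care about.

[Image: simplified XML hierarchy]

It appears that the NewsComponent with a Duid attribute defines/contains one complete news piece. Of its two children, the first child NewsComponent contains the summary and text, while the second child NewsComponent contains the image.

Your initial XPath query is for 'NewsItem/NewsComponent/NewsComponent/NewsComponent', which is the first NewsComponent child (the one with the body text). You can't find the image from that point because the image isn't within that NewsComponent; you've gone one level too deep. (I was tipped off by the fact that I got a PHP Notice: Undefined variable: dbImage.) So move your initial XPath query back up one level, and add that extra level to the subsequent XPath queries where needed.

From this: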

$story = $xml->xpath('NewsItem/NewsComponent/NewsComponent/NewsComponent');
foreach($story as $contentItem){
  foreach($contentItem->xpath('ContentItem/DataContent/nitf/body/body.head/hedline/hl1') as $headline){
    foreach($contentItem->xpath('ContentItem/DataContent/nitf/body/body.content/p') as $detail){
      foreach($contentItem->xpath('NewsComponent/NewsComponent/ContentItem') as $imageNode){ /* ... */ }}}}

to this:

$story = $xml->xpath('NewsItem/NewsComponent/NewsComponent');
foreach($story as $contentItem){
  foreach($contentItem->xpath('NewsComponent/ContentItem/DataContent/nitf/body/body.head/hedline/hl1') as $headline){
    foreach($contentItem->xpath('NewsComponent/ContentItem/DataContent/nitf/body/body.content/p') as $detail){
      foreach($contentItem->xpath('NewsComponent/NewsComponent/NewsComponent/ContentItem') as $imageNode){ /* ... */ }}}}
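
To see why the starting node matters, here is a minimal sketch. The document in it is made up and heavily simplified (its nesting is an assumption modeled on the description above, not the real feed). A relative XPath only searches below the node it is called on, so the image path finds nothing when it is run from the text component.

```php
<?php
// Hypothetical stand-in for the feed: one story (Duid) with a text
// component and a nested image component.
$xml = new SimpleXMLElement(<<<XML
<NewsML>
  <NewsItem>
    <NewsComponent>
      <NewsComponent Duid="story-1">
        <NewsComponent>
          <ContentItem><DataContent>Body text</DataContent></ContentItem>
        </NewsComponent>
        <NewsComponent>
          <NewsComponent>
            <NewsComponent>
              <ContentItem Href="photo1.jpg"/>
              <ContentItem/>
            </NewsComponent>
          </NewsComponent>
        </NewsComponent>
      </NewsComponent>
    </NewsComponent>
  </NewsItem>
</NewsML>
XML
);

// One level too deep: $deep[0] is the text component, which has no images below it.
$deep = $xml->xpath('NewsItem/NewsComponent/NewsComponent/NewsComponent');
$fromText = $deep[0]->xpath('NewsComponent/NewsComponent/ContentItem');

// One level up: the story node contains both the text and the image branch.
$stories = $xml->xpath('NewsItem/NewsComponent/NewsComponent');
$fromStory = $stories[0]->xpath('NewsComponent/NewsComponent/NewsComponent/ContentItem');

echo count($fromText), "\n";  // 0
echo count($fromStory), "\n"; // 2
```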

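However, the image still doesn't work after that. Because you're using loops (sometimes unnecessarily), $dbImage ends up being overwritten with an empty value. The first ContentItem has the Href attribute, which gets assigned to $dbImage. But the loop then moves on to the next ContentItem, which has no attributes, and that overwrites $dbImage with an empty value. I'd recommend modifying that XPath query to find only ContentItems that have an Href attribute, like this:

->xpath('NewsComponent/NewsComponent/NewsComponent/ContentItem[@Href]')

That should do it.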
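
The overwrite and the predicate fix can be reproduced in isolation. This is only a sketch: the two-item fragment is hypothetical and much flatter than the real feed.

```php
<?php
// Hypothetical fragment: the first ContentItem has an Href, the second has none.
$component = new SimpleXMLElement(
    '<NewsComponent><ContentItem Href="photo1.jpg"/><ContentItem/></NewsComponent>'
);

// Looping over every ContentItem keeps only the value from the last one,
// and the last one has no Href, so the result is an empty string.
$last = null;
foreach ($component->xpath('ContentItem') as $item) {
    $last = (string) $item['Href'];
}

// The [@Href] predicate selects only ContentItems that carry the attribute.
$withHref = $component->xpath('ContentItem[@Href]');
$image = (string) $withHref[0]['Href'];

var_dump($last);  // string(0) ""
var_dump($image); // string(10) "photo1.jpg"
```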


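Other thoughts

Refactor to clean up this code where possible.

As I mentioned, you are sometimes looping and nesting when you don't need to, which makes the code harder to follow and can introduce logic bugs (like the image one). It seems that the structure of this file will always be consistent. If so, you can drop some of the loops and go straight for the pieces of data you're looking for. You could do something like this: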

// Get story headline, detail, and image in a single pass
$stories = $xml->xpath('/NewsML/NewsItem/NewsComponent/NewsComponent');
foreach ($stories as $story) {
    // The first hl1 under the text component is the headline
    $headlineItem = $story->xpath('NewsComponent/ContentItem/DataContent/nitf/body/body.head/hedline/hl1');
    $headline = (string) $headlineItem[0];

    // implode() casts each <p> element to its text content
    $detailItems = $story->xpath('NewsComponent/ContentItem/DataContent/nitf/body/body.content/p');
    $strDetail = '<p>' . implode('</p><p>', $detailItems) . '</p>';

    // Only ContentItems that carry an Href attribute
    $imageItem = $story->xpath('NewsComponent/NewsComponent/NewsComponent/ContentItem[@Href]');
    $imageAtts = $imageItem[0]->attributes();
    $dbImage = (string) $imageAtts['Href'];

    $link = getUnique($headline);

    $sql = "INSERT INTO tablename (headline, detail, image, link) VALUES ('".mysql_real_escape_string($headline)."', '".mysql_real_escape_string($strDetail)."', '".mysql_real_escape_string($dbImage)."', '".$link."')";
    if (mysql_query($sql, $db) or die(mysql_error())) {
        echo "Loaded ";
    } else {
        echo "Not Loaded ";
    }
}
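
The version above assumes every story has both a headline and an image. If that assumption might not hold, one option is a small guard around the XPath lookups. In this sketch `firstMatch` is a hypothetical helper name and the sample fragment is made up; it is not part of the original code.

```php
<?php
/**
 * Hypothetical helper: return the first XPath match as a string,
 * or null when the query matches nothing (or fails).
 */
function firstMatch(SimpleXMLElement $node, $path)
{
    $matches = $node->xpath($path);
    return empty($matches) ? null : (string) $matches[0];
}

// Made-up story fragment with a headline but no image branch.
$story = new SimpleXMLElement(
    '<NewsComponent><NewsComponent><ContentItem><hl1>Title</hl1></ContentItem></NewsComponent></NewsComponent>'
);

$headline = firstMatch($story, 'NewsComponent/ContentItem/hl1');
$image = firstMatch($story, 'NewsComponent/NewsComponent/NewsComponent/ContentItem/@Href');

if ($headline === null || $image === null) {
    // Skip incomplete stories instead of indexing into an empty array.
    echo "Skipped incomplete story\n";
}
```

Selecting `@Href` directly in the path returns the attribute itself, so the helper can treat elements and attributes the same way.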
