PHP SimpleXML: trying to process a fairly complex file

Posted on 2024-11-01 09:16:16 · 7317 characters · 3 views · 0 comments

The file I have to work with has the following structure:

<?xml version="1.0" encoding="UTF-8" ?>
<FormattedReport xmlns = 'urn:crystal-reports:schemas' xmlns:xsi = 'http://www.w3.org/2000/10/XMLSchema-instance'>
    <FormattedAreaPair Level="0" Type="Report">
    <FormattedAreaPair Level="1" Type="Details">
    <FormattedArea Type="Details">
        <FormattedSections>
        <FormattedSection SectionNumber="0">
        <FormattedReportObjects>
        <FormattedReportObject xsi:type="CTFormattedField" Type="xsd:long" FieldName="{tblCon.ManifestNR}"><ObjectName>ManifestNR1</ObjectName>
        <FormattedValue>1,907</FormattedValue>
        <Value>1907.00</Value>
        </FormattedReportObject>
        <FormattedReportObject xsi:type="CTFormattedField" Type="xsd:timeInstant" FieldName="{tblCon.ShippingDate}"><ObjectName>ShippingDate1</ObjectName>
        <FormattedValue>14/04/2011</FormattedValue>
        <Value>2011-04-14T00:00:00</Value>
        </FormattedReportObject>
        ... so on and so forth ...
        </FormattedReportObjects>
        </FormattedSection>
        </FormattedSections>
        </FormattedArea>
        </FormattedAreaPair>
    <FormattedAreaPair Level="1" Type="Details">
    <FormattedArea Type="Details">
        <FormattedSections>
        <FormattedSection SectionNumber="0">
        <FormattedReportObjects>
        <FormattedReportObject xsi:type="CTFormattedField" Type="xsd:long" FieldName="{tblCon.ManifestNR}"><ObjectName>ManifestNR1</ObjectName>
        <FormattedValue>1,907</FormattedValue>
        <Value>1907.00</Value>
        </FormattedReportObject>
        <FormattedReportObject xsi:type="CTFormattedField" Type="xsd:timeInstant" FieldName="{tblCon.ShippingDate}"><ObjectName>ShippingDate1</ObjectName>
        <FormattedValue>14/04/2011</FormattedValue>
        <Value>2011-04-14T00:00:00</Value>
        </FormattedReportObject>
        ... so on and so forth ...
        </FormattedReportObjects>
        </FormattedSection>
        </FormattedSections>
        </FormattedArea>
        </FormattedAreaPair>
 <FormattedAreaPair Level="1" Type="Details">
    <FormattedArea Type="Details">
        <FormattedSections>
        <FormattedSection SectionNumber="0">
        <FormattedReportObjects>
        <FormattedReportObject xsi:type="CTFormattedField" Type="xsd:long" FieldName="{tblCon.ManifestNR}"><ObjectName>ManifestNR1</ObjectName>
        <FormattedValue>1,907</FormattedValue>
        <Value>1907.00</Value>
        </FormattedReportObject>
        <FormattedReportObject xsi:type="CTFormattedField" Type="xsd:timeInstant" FieldName="{tblCon.ShippingDate}"><ObjectName>ShippingDate1</ObjectName>
        <FormattedValue>14/04/2011</FormattedValue>
        <Value>2011-04-14T00:00:00</Value>
        </FormattedReportObject>
        ... so on and so forth ...
        </FormattedReportObjects>
        </FormattedSection>
        </FormattedSections>
        </FormattedArea>
        </FormattedAreaPair>
    <FormattedAreaPair Level="1" Type="Details">
    <FormattedArea Type="Details">
        <FormattedSections>
        <FormattedSection SectionNumber="0">
        <FormattedReportObjects>
        <FormattedReportObject xsi:type="CTFormattedField" Type="xsd:long" FieldName="{tblCon.ManifestNR}"><ObjectName>ManifestNR1</ObjectName>
        <FormattedValue>1,907</FormattedValue>
        <Value>1907.00</Value>
        </FormattedReportObject>
        <FormattedReportObject xsi:type="CTFormattedField" Type="xsd:timeInstant" FieldName="{tblCon.ShippingDate}"><ObjectName>ShippingDate1</ObjectName>
        <FormattedValue>14/04/2011</FormattedValue>
        <Value>2011-04-14T00:00:00</Value>
        </FormattedReportObject>
        ... so on and so forth ...
        </FormattedReportObjects>
        </FormattedSection>
        </FormattedSections>
        </FormattedArea>
        </FormattedAreaPair>
        </FormattedAreaPair>
        </FormattedReport>

So what I'm trying to do is call a PHP function that will parse the XML and eventually store it in an SQL DB.

For example:

ManifestNR: 1903
ShippingDate: 12/04/2011
CarrierID: TNT03
TrackingRef: 234234232
... etc for each record ...

So I set about trying to do this using DOM and then stumbled across SimpleXML. I've read several tutorials and searched implementations here, but I just can't seem to access the data in the final nodes (or any other data, to be honest). Is SimpleXML a no-no with these kinds of structures?

The latest PHP I'm using is:

<?php

if (file_exists('tracking.xml')) {
    $xml = simplexml_load_file('tracking.xml');

  //  print_r($xml);

   foreach( $xml as $FormattedReport->FormattedAreaPair->FormattedAreaPair ) 
        {
        foreach($FormattedReport as $node->FormattedArea->FormattedSections->FormattedSection->FormattedReportObjects)
        echo $node->FormattedReportObject->Value;
        }

} else {
    exit('Failed to open xml');
}
?>

I've tried to strip it right back to basics, but still no luck. Doesn't echo a result.

Thanks for your time guys!

SOLVED

For anyone in similar circumstances, here's a bit of direction.

  1. Ignore the root node; that's your default $variable when you import the XML string/file.
  2. If you have nested groups, iterate over the path to the repeated node first, like so: foreach ($xml->FormattedAreaPair->FormattedAreaPair as $parentnode)
  3. Using your parent node, loop through all the children.
  4. If you have an attribute field, access it as follows: (string) $node['FieldName']
  5. Compare the retrieved attribute with a string and then handle the result.
  6. Stop pulling your hair out.

    <?php

    if (file_exists('tracking.xml')) {
        $xml = simplexml_load_file('tracking.xml');

        //print_r($xml);
        // $xml is already the root <FormattedReport>, so paths start at its children.
        foreach ($xml->FormattedAreaPair->FormattedAreaPair as $parentnode) {
            foreach ($parentnode->FormattedArea->FormattedSections->FormattedSection->FormattedReportObjects->FormattedReportObject as $node) {
                //echo "FormattedValue: ".$node->FormattedValue."<br />";
                switch ((string) $node['FieldName']) {
                    case '{tblCon.ManifestNR}':
                        echo 'Manifest: '.$node->FormattedValue."<br />";
                        break;
                    case '{tblCon.ShippingDate}':
                        echo 'Shipping Date: '.$node->FormattedValue."<br />";
                        break;
                    case '{tblCon.CarrierID}':
                        echo 'Carrier ID: '.$node->FormattedValue."<br />";
                        break;
                    case '{tblCon.CustConRefTX}':
                        echo 'Customer Reference: '.$node->FormattedValue."<br />";
                        break;
                    case '{tblCon.ServiceCodeTX}':
                        echo 'Service Code: '.$node->FormattedValue."<br />";
                        break;
                    case '{tblCon.TotalWeightNR}':
                        echo 'Total Weight: '.$node->FormattedValue."<br />";
                        break;
                    case '{tblCon.ValueNR}':
                        echo 'Value: '.$node->FormattedValue."<br />";
                        break;
                    case '{tblCon.TotalVolumeNR}':
                        echo 'Total Volume: '.$node->FormattedValue."<br />";
                        break;
                    case '{tblCon.GoodsDesc}':
                        echo 'Goods Description: '.$node->FormattedValue."<br />";
                        break;
                    case '{tblConAddr.ReceiverNameTX}':
                        echo 'Receiver Name: '.$node->FormattedValue."<br />";
                        break;
                    case '{@SalesOrder}':
                        echo 'Sales Order: '.$node->FormattedValue."<br />";
                        break;
                    case '{@TrackingReference}':
                        echo 'Tracking Reference: '.$node->FormattedValue."<br />";
                        break;
                }
            }
            echo "---------------------------- <br />";
        }
    } else {
        exit('Failed to open xml');
    }

    ?>
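
Since the stated goal is to end up with rows for an SQL table, the same loop can collect each record into an associative array instead of echoing it. The following is only a sketch under assumptions: the inline sample is a cut-down copy of the report, and the function name collectRecords and the column names in $columns are made up for illustration. It reads <Value> rather than <FormattedValue> because the raw value is usually easier to store.

```php
<?php
// Hypothetical inline sample that mirrors the report structure from the question.
$sample = <<<'XML'
<?xml version="1.0" encoding="UTF-8" ?>
<FormattedReport xmlns="urn:crystal-reports:schemas" xmlns:xsi="http://www.w3.org/2000/10/XMLSchema-instance">
  <FormattedAreaPair Level="0" Type="Report">
    <FormattedAreaPair Level="1" Type="Details">
      <FormattedArea Type="Details">
        <FormattedSections>
          <FormattedSection SectionNumber="0">
            <FormattedReportObjects>
              <FormattedReportObject xsi:type="CTFormattedField" Type="xsd:long" FieldName="{tblCon.ManifestNR}">
                <ObjectName>ManifestNR1</ObjectName>
                <FormattedValue>1,907</FormattedValue>
                <Value>1907.00</Value>
              </FormattedReportObject>
              <FormattedReportObject xsi:type="CTFormattedField" Type="xsd:timeInstant" FieldName="{tblCon.ShippingDate}">
                <ObjectName>ShippingDate1</ObjectName>
                <FormattedValue>14/04/2011</FormattedValue>
                <Value>2011-04-14T00:00:00</Value>
              </FormattedReportObject>
            </FormattedReportObjects>
          </FormattedSection>
        </FormattedSections>
      </FormattedArea>
    </FormattedAreaPair>
  </FormattedAreaPair>
</FormattedReport>
XML;

// Assumed mapping from FieldName attributes to column names (names are made up).
$columns = [
    '{tblCon.ManifestNR}'   => 'manifest_nr',
    '{tblCon.ShippingDate}' => 'shipping_date',
];

// Builds one associative array per Level="1" block, keyed by the mapped column names.
function collectRecords(SimpleXMLElement $xml, array $columns): array
{
    $records = [];
    foreach ($xml->FormattedAreaPair->FormattedAreaPair as $pair) {
        $record = [];
        $objects = $pair->FormattedArea->FormattedSections->FormattedSection
            ->FormattedReportObjects->FormattedReportObject;
        foreach ($objects as $node) {
            $field = (string) $node['FieldName'];
            if (isset($columns[$field])) {
                // <Value> holds the raw value; <FormattedValue> is the display form.
                $record[$columns[$field]] = (string) $node->Value;
            }
        }
        $records[] = $record;
    }
    return $records;
}

$records = collectRecords(simplexml_load_string($sample), $columns);
print_r($records);
```

Each element of $records can then be bound to a prepared INSERT statement; unmapped fields are simply skipped, so the switch with one case per field is replaced by a lookup table.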

Comments (3)

天邊彩虹 2024-11-08 09:16:16

The examples in the Manual should suffice (Example #4 in particular). You seem like a sufficiently clever fellow. The problem is that you're doing it wrong.

example.php

<?php
$xmlstr = <<<XML
<?xml version='1.0' standalone='yes'?>
<movies>
 <movie>
  <title>PHP: Behind the Parser</title>
  <characters>
   <character>
    <name>Ms. Coder</name>
    <actor>Onlivia Actora</actor>
   </character>
   <character>
    <name>Mr. Coder</name>
    <actor>El ActÓr</actor>
   </character>
  </characters>
  <plot>
   So, this language. It's like, a programming language. Or is it a
   scripting language? All is revealed in this thrilling horror spoof
   of a documentary.
  </plot>
  <great-lines>
   <line>PHP solves all my web problems</line>
  </great-lines>
  <rating type="thumbs">7</rating>
  <rating type="stars">5</rating>
 </movie>
</movies>
XML;
?>

Example #4

<?php
include 'example.php';

$xml = new SimpleXMLElement($xmlstr);

/* For each <character> node, we echo a separate <name>. */
foreach ($xml->movie->characters->character as $character) {
   echo $character->name, ' played by ', $character->actor, PHP_EOL;
}

?>

Notice that when using the foreach construct you need to specify the path to the nodes of a certain type. The second item in the foreach is just an (empty) variable that you use to store the current node in the iteration.
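
As an alternative to spelling out the whole path in the foreach, the repeated nodes can also be selected with XPath. This is a sketch, not part of the manual example: the sample is a trimmed, hypothetical version of the report from the question, and the prefix cr is an arbitrary name chosen here. Because the report declares a default namespace, XPath only matches its elements once a prefix has been registered for that namespace.

```php
<?php
// Trimmed-down, hypothetical sample using the same default namespace as the report.
$sample = <<<'XML'
<FormattedReport xmlns="urn:crystal-reports:schemas">
  <FormattedAreaPair Level="0" Type="Report">
    <FormattedAreaPair Level="1" Type="Details">
      <FormattedArea Type="Details">
        <FormattedSections>
          <FormattedSection SectionNumber="0">
            <FormattedReportObjects>
              <FormattedReportObject FieldName="{tblCon.ManifestNR}">
                <FormattedValue>1,907</FormattedValue>
                <Value>1907.00</Value>
              </FormattedReportObject>
            </FormattedReportObjects>
          </FormattedSection>
        </FormattedSections>
      </FormattedArea>
    </FormattedAreaPair>
  </FormattedAreaPair>
</FormattedReport>
XML;

$xml = simplexml_load_string($sample);

// 'cr' is an arbitrary prefix bound to the document's default namespace.
$xml->registerXPathNamespace('cr', 'urn:crystal-reports:schemas');

$values = [];
foreach ($xml->xpath('//cr:FormattedReportObject') as $node) {
    // Property and attribute access on the matched nodes works as in the foreach version.
    $values[(string) $node['FieldName']] = (string) $node->FormattedValue;
}
print_r($values);
```

The path-based foreach keeps the grouping per Details block, while the XPath query flattens every field into one list, so which one fits depends on whether record boundaries matter.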

守望孤独 2024-11-08 09:16:16

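How to access namespaced attributes such as i:nil (XMLSchema-instance) with SimpleXML:

XML:

<item i:nil="true"/>

PHP:

// attributes('i', true) selects attributes by prefix. Compare the string value:
// a plain (bool) cast is true whenever the attribute exists, even for i:nil="false".
$isNil = (string) $item->attributes('i', true)->nil === 'true';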
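
Applied to the report from the question, the same idea reads the xsi:type attribute. This is a minimal sketch assuming a cut-down sample; it passes the namespace URI instead of the prefix, which keeps working if a file happens to bind that namespace to a different prefix.

```php
<?php
// Hypothetical fragment carrying the namespace declarations from the report's root element.
$sample = <<<'XML'
<FormattedReportObjects xmlns="urn:crystal-reports:schemas" xmlns:xsi="http://www.w3.org/2000/10/XMLSchema-instance">
  <FormattedReportObject xsi:type="CTFormattedField" Type="xsd:long" FieldName="{tblCon.ManifestNR}">
    <Value>1907.00</Value>
  </FormattedReportObject>
</FormattedReportObjects>
XML;

$xml  = simplexml_load_string($sample);
$node = $xml->FormattedReportObject;

// Unprefixed attributes are reachable with array syntax.
$fieldName = (string) $node['FieldName'];

// Prefixed attributes need attributes(); with the second argument left at its
// default (false), the first argument is treated as a namespace URI.
$xsiType = (string) $node->attributes('http://www.w3.org/2000/10/XMLSchema-instance')->type;

echo $fieldName, ' / ', $xsiType, PHP_EOL;
```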

酷炫老祖宗 2024-11-08 09:16:16

The file I was dealing with was ~1GB, so I couldn't load the XML file into SimpleXML all at once.
Here's the CI (CodeIgniter) controller I made to parse the Crystal Reports XML.

<?php

class Parse_crystal_reports_xml extends CI_Controller {

    function index(){
        $base_path = "/path/to/xml/";
        $xml_file = "xml_file.xml";
        $file_header = '<?xml version="1.0" encoding="UTF-8" ?>';
        // Every record starts with this tag, so it doubles as the split marker.
        $separator = '<FormattedAreaPair Level="1" Type="Details">';
        $xml_data = explode($separator, str_replace($file_header, '', file_get_contents($base_path.$xml_file)));
        // The xmlns:xsi declaration sits on the root element, which is cut away,
        // so prefixed names are stripped before each block is parsed.
        $bad_names = array('xsi:','xsd:');
        foreach($xml_data as $block_num => $block) : 
            // Block 0 is everything before the first record (root and Level="0" wrapper).
            if(!$block_num) : continue; endif;
            $fields = new SimpleXMLElement(str_replace($bad_names, '', $file_header."\n".$separator.$block));
            $temp_array = array();
            foreach($fields->FormattedArea->FormattedSections->FormattedSection->FormattedReportObjects->FormattedReportObject as $field_num => $field) :
                // print_r($field);
                $temp_array[$this->make_slug($field['FieldName'])] = $this->clean_word((string)$field->FormattedValue);
            endforeach;
            // print_r($fields);
            print_r($temp_array);
            // Debug stop after the first record; remove it to process every block.
            die;
        endforeach;
    }

    function make_slug($string){
        return strtolower(trim(preg_replace('/\W+/', '_', $string), '_'));
    }

    function clean_word($string){
        return trim(preg_replace('/\s+/', ' ', $string));
    }
}
?>
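
The controller above still pulls the whole file into memory with file_get_contents before splitting it. A different technique, not the one used in this answer, is to stream the file with XMLReader and hand each Details block to SimpleXML. The following is a sketch under assumptions: the function name readDetailBlocks and the two-record sample file are made up for illustration.

```php
<?php
// Hypothetical helper: streams the file and yields one Level="1" block at a time.
function readDetailBlocks(string $path): Generator
{
    $reader = new XMLReader();
    if (!$reader->open($path)) {
        throw new RuntimeException("Cannot open $path");
    }
    $doc  = new DOMDocument();
    $more = $reader->read();
    while ($more) {
        if ($reader->nodeType === XMLReader::ELEMENT
            && $reader->localName === 'FormattedAreaPair'
            && $reader->getAttribute('Level') === '1') {
            // expand() builds a DOM copy of this subtree only; import it so
            // SimpleXML can wrap it.
            yield simplexml_import_dom($doc->importNode($reader->expand(), true));
            $more = $reader->next(); // skip past the subtree just handled
        } else {
            $more = $reader->read();
        }
    }
    $reader->close();
}

// Small stand-in for the real report, written to a temporary file.
$sample = <<<'XML'
<FormattedReport xmlns="urn:crystal-reports:schemas">
  <FormattedAreaPair Level="0" Type="Report">
    <FormattedAreaPair Level="1" Type="Details"><FormattedArea Type="Details"><FormattedSections><FormattedSection SectionNumber="0"><FormattedReportObjects>
      <FormattedReportObject FieldName="{tblCon.ManifestNR}"><FormattedValue>1,907</FormattedValue></FormattedReportObject>
    </FormattedReportObjects></FormattedSection></FormattedSections></FormattedArea></FormattedAreaPair>
    <FormattedAreaPair Level="1" Type="Details"><FormattedArea Type="Details"><FormattedSections><FormattedSection SectionNumber="0"><FormattedReportObjects>
      <FormattedReportObject FieldName="{tblCon.ManifestNR}"><FormattedValue>1,908</FormattedValue></FormattedReportObject>
    </FormattedReportObjects></FormattedSection></FormattedSections></FormattedArea></FormattedAreaPair>
  </FormattedAreaPair>
</FormattedReport>
XML;

$tmp = tempnam(sys_get_temp_dir(), 'cr');
file_put_contents($tmp, $sample);

$seen = [];
foreach (readDetailBlocks($tmp) as $block) {
    $objects = $block->FormattedArea->FormattedSections->FormattedSection
        ->FormattedReportObjects->FormattedReportObject;
    foreach ($objects as $field) {
        $seen[] = (string) $field->FormattedValue;
    }
}
unlink($tmp);
print_r($seen);
```

Only one block is expanded at a time, and because the namespace declarations travel with the expanded node there is no need to strip the xsi: prefixes.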