Merging multiple XML files into a single file of a different format

Posted on 2024-11-02 20:33:00

I have 1200+ XML files in the same format that I need to merge into a single XML file with a different format. The individual files are all located in a single directory. The server I am working on has SimpleXML, and I've tried a few different merge examples I found online (http://www.nicolaskuttler.com/post/merging-and-splitting-xml-files-with-simplexml/, for one), but when I view the 'merged' XML file, only the first XML file has been added to it. None of my several attempts has merged more than one file.

Format of the individual files:

<?xml version="1.0" encoding="UTF-8"?>
<pr:press_release xmlns:alf="http://www.alfresco.org" xmlns:chiba="http://chiba.sourceforge.net/xforms" xmlns:ev="http://www.w3.org/2001/xml-events" xmlns:pr="http://www.bowl.com/pr" xmlns:xf="http://www.w3.org/2002/xforms" xmlns:xhtml="http://www.w3.org/1999/xhtml" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
        <pr:headline>TITLE</pr:headline>
        <pr:title>TITLE</pr:title>
        <pr:contact_info xsi:nil="true"/>
        <pr:department>DEPT</pr:department>
        <pr:body>BODY</pr:body>
        <pr:launch_date>YYYY-MM-DD</pr:launch_date>
        <pr:expiration_date>YYYY-MM-DD</pr:expiration_date>
        <pr:category>CATEGORY</pr:category>
        <pr:tags>KEYWORDS</pr:tags>
</pr:press_release>

Format needed for new file:

<?xml version="1.0" encoding="utf-8"?>
<contents>
  <content>
    <title>TITLE</title>
    <summary></summary>
    <body>
      <root>
        <date></date>
        <author></author>
        <department></department>
        <location></location>
        <story>BODY</story>
      </root>
    </body>
  </content>
</contents>

Code used to merge two files:

<?php
        $file1 = '1027coachintermediate.xml';
        $file2 = '1027coachelite.xml';
        $fileout = 'fileout.xml';

        $xml1 = simplexml_load_file( $file1 );
        $xml2 = simplexml_load_file( $file2 );

        // loop through the FOO and add them and their attributes to xml1
        foreach( $xml2->FOO as $foo ) {
                $new = $xml1->addChild( 'FOO' , $foo );
                foreach( $foo->attributes() as $key => $value ) {
                        $new->addAttribute( $key, $value );
                }
        }

        $fh = fopen( $fileout, 'w') or die ( "can't open file $fileout" );
        fwrite( $fh, $xml1->asXML() );
        fclose( $fh );
?>

Comments (2)

掩于岁月 2024-11-09 20:33:00

If this is a one-time task then you could concatenate all the files together and then run a simple XSLT process on the concatenated file.

1) Shell script to concatenate files

for file in "$XMLDIR"/*.xml
  do
        grep -v "xml version" "$file" >> big_concat_file.xml
  done

2) Hand-edit the concatenated file to wrap everything in a single root element.

<document>
   <pr:press_release>
       ....
   </pr:press_release>
   <pr:press_release>
       ...
   </pr:press_release>
</document>

3) Run XSLT file on concatenated file
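
The answer does not include the stylesheet itself, so here is a rough sketch of what step 3 might look like when driven from PHP. It assumes the xsl extension (XSLTProcessor) is available and that the wrapper element is named document as in step 2. The constant PRESS_RELEASE_XSL, the function transformConcatenated, and the field mapping (pr:launch_date to date, pr:body to story) are guesses made for illustration, not something the question specifies.

```php
<?php
// Hypothetical stylesheet: one <content> element per pr:press_release.
// The launch_date -> date mapping is an assumption.
const PRESS_RELEASE_XSL = <<<'XSL'
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:pr="http://www.bowl.com/pr"
    exclude-result-prefixes="pr">
  <xsl:output method="xml" encoding="utf-8" indent="yes"/>
  <xsl:template match="/document">
    <contents>
      <xsl:apply-templates select="pr:press_release"/>
    </contents>
  </xsl:template>
  <xsl:template match="pr:press_release">
    <content>
      <title><xsl:value-of select="pr:title"/></title>
      <summary/>
      <body>
        <root>
          <date><xsl:value-of select="pr:launch_date"/></date>
          <author/>
          <department><xsl:value-of select="pr:department"/></department>
          <location/>
          <story><xsl:value-of select="pr:body"/></story>
        </root>
      </body>
    </content>
  </xsl:template>
</xsl:stylesheet>
XSL;

/** Apply the stylesheet to the concatenated, wrapped XML and return the result. */
function transformConcatenated(string $concatenatedXml): string
{
    $source = new DOMDocument();
    if (!$source->loadXML($concatenatedXml)) {
        throw new RuntimeException('concatenated input is not well-formed XML');
    }

    $stylesheet = new DOMDocument();
    $stylesheet->loadXML(PRESS_RELEASE_XSL);

    $processor = new XSLTProcessor(); // needs the xsl extension
    $processor->importStylesheet($stylesheet);

    $result = $processor->transformToXml($source);
    if (!is_string($result)) {
        throw new RuntimeException('XSLT transformation failed');
    }
    return $result;
}
```

Calling transformConcatenated(file_get_contents('big_concat_file.xml')) and writing the returned string to disk would complete the one-off conversion. A standalone tool such as xsltproc could run the same stylesheet if PHP is not required.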

话少心凉 2024-11-09 20:33:00

Not really sure where you made the error, but below is a script that should help you merge the files according to your spec:

<?php
$files = array( 'in1.xml', 'in2.xml');

$xml = new SimpleXMLElement(<<<XML
<?xml version="1.0" encoding="utf-8"?>
<contents>
</contents>
XML
);

foreach( $files as $filename) {
    $xml_int = simplexml_load_file( $filename );
    $conts = $xml_int->children('pr',true);
    $content = $xml->addChild( 'content'); // add content
    $content->addChild( 'title',$conts->title); // add first title
    // add the rest of the content insides
    // ...
}
echo $xml->asXML();
?>

Output:

<?xml version="1.0" encoding="utf-8"?>
<contents><content><title>TITLE1</title></content><content><title>TITLE2</title></content></contents>

See http://pl.php.net/manual/en/simplexml.examples-basic.php for more info.
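
To flesh out the "add the rest of the content insides" part of the script above, here is a minimal sketch that maps the remaining fields and picks up every file in a directory with glob() instead of a hard-coded list. The function names appendPressRelease and mergeDirectory are made up for illustration, and the mapping of pr:launch_date to date and pr:body to story is an assumption, since the question only shows where TITLE and BODY go. Text is set by property assignment rather than through the second argument of addChild(), because assignment escapes a bare & in the value.

```php
<?php
/**
 * Append one <content> element built from a parsed pr:press_release document.
 * The field mapping (launch_date -> date, body -> story) is an assumption.
 */
function appendPressRelease(SimpleXMLElement $contents, SimpleXMLElement $release): SimpleXMLElement
{
    // The children are in the "pr" namespace, so $release->title alone finds nothing.
    $pr = $release->children('pr', true);

    $content = $contents->addChild('content');
    $content->title = (string) $pr->title; // assignment escapes &, < and >
    $content->addChild('summary');         // left empty, as in the target format

    $root = $content->addChild('body')->addChild('root');
    $root->date = (string) $pr->launch_date;
    $root->addChild('author');
    $root->department = (string) $pr->department;
    $root->addChild('location');
    $root->story = (string) $pr->body;

    return $content;
}

/** Merge every *.xml file found in $dir into a single <contents> document. */
function mergeDirectory(string $dir): SimpleXMLElement
{
    $contents = new SimpleXMLElement('<?xml version="1.0" encoding="utf-8"?><contents/>');

    foreach (glob(rtrim($dir, '/') . '/*.xml') ?: [] as $filename) {
        $release = simplexml_load_file($filename);
        if ($release === false) {
            error_log("skipping unparsable file: $filename");
            continue;
        }
        appendPressRelease($contents, $release);
    }

    return $contents;
}
```

Something like mergeDirectory('/path/to/xml')->asXML('fileout.xml') would then write the merged document, where the directory path is a placeholder.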

Another question is whether you really want to keep the whole XML document in memory. Instead, you can append each $content->asXML() to the output file one at a time.
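
The append-as-you-go idea could look roughly like the sketch below. The function name mergeToFile is hypothetical, and only title and story are filled in to keep it short. Each <content> element is built under a throwaway parent because asXML() called on a child element returns just that element, without an XML declaration, which is what makes it safe to write the pieces out one after another.

```php
<?php
/**
 * Convert press releases one at a time and append each <content> to $outPath,
 * so only a single input document is held in memory at once.
 * Returns the number of files written.
 */
function mergeToFile(array $files, string $outPath): int
{
    $out = fopen($outPath, 'w');
    if ($out === false) {
        throw new RuntimeException("can't open file $outPath");
    }
    fwrite($out, "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<contents>\n");

    $written = 0;
    foreach ($files as $filename) {
        $release = simplexml_load_file($filename);
        if ($release === false) {
            continue; // skip files that fail to parse
        }
        $pr = $release->children('pr', true);

        // Throwaway parent: asXML() on the child omits the XML declaration.
        $holder  = new SimpleXMLElement('<contents/>');
        $content = $holder->addChild('content');
        $content->title = (string) $pr->title;
        $root = $content->addChild('body')->addChild('root');
        $root->story = (string) $pr->body;
        // ... the remaining fields would be added the same way

        fwrite($out, $content->asXML() . "\n");
        $written++;
    }

    fwrite($out, "</contents>\n");
    fclose($out);
    return $written;
}
```

With 1200 small press releases the in-memory version is probably fine too; streaming mostly pays off if the bodies are large or the file count keeps growing.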
