DomDoc/SimpleXML/XSLT: Parsing to add an auto-incrementing id attribute to each unique child element of an element

Posted on 2024-11-26 15:30:24

I have been troubleshooting this for a while, and I am kind of new to programming. Even when I find an error, it's very difficult to figure out how to correct it. Right now, I am trying to figure out how I have used xpath wrong because someone told me that I am using xpath wrong. I hope someone can give me a jump start by telling me what I am doing wrong, specifically with iterating, if I am doing anything wrong. This is my last night to work on this project, and I really want to finish it if I can. So, I could really use help. Here is the code I am using, with comments:

$xml = @simplexml_load_file("original.xml"); //Loading the original file, dubbed original.xml.
$array_key_target_parent = count($xml->xpath('/doc/*'); //Puts all of the children of <doc> into an _iterable_ array.
$key_targets = foreach($array_key_target_parent;){
  foreach($array_key_target_parent as $single_target){ // I tried foreach($array_key_target_parent[$i]).  It doesn't work, so don't even go there.
    $current_target = current($single_target);
    count($xml->xpath('/doc/$current_target/*');
  }
} */ ////Puts the targets for keying into iterable arrays.  =>1 makes the array start from 1, so the id's will be right.


/* At this point, we have multiple elements that we want to key, each having a unique name.  There's <element_type1a> and <element_type1b>, etc.  We want each one to have its own id set.  So, we have to embed iteration within iteration. */
foreach($key_target){ //This will ensure that every unique element that we want to key gets its key set.
  {
  $id = current($key_target=>1); //This allows us to reset the id to 1 (=>1), each time the key algorithm starts for a new element.
  foreach($key_target as $id){ //I tried for($i=0, $key_target[$i]; $i>$key_target; $i++), and it didn't work, so don't even go there.
    addAttribute('id', '$id');
  }
}  //Adds an 'id' attribute and a unique number to each target.

$xml->asXML("new.xml"); //saves the output as a new xml document, new.xml

I also have a generic XML file:

<doc>
    <info_type1>
        <element_type1a>not_unique_data</element_type1a>
        <element_type1b>unique_data</element_type1b>
        <element_type2a>not_unique_data</element_type2a>
        <element_type2b>not_unique_data</element_type2b>
        <element_type2c lang="fr">not_unique_data</element_type2c>
        <!-- ... -->
        <element_typeNxM>unique_data</element_typeNxM>
    </info_type1>
    <info_type2>
        <element_type1a>repeat_data(info_type1_element1a)</element_type1a>
        <element_type2a>not_unique_data</element_type2a>
    </info_type2>
    <!-- ... -->
    <info_typeN>
        <descendants></descendants>
    </info_typeN>
</doc>

Desired output:

<datatables>
    <table id="element_type1">
        <element_type1a id="1">unique_data</element_type1a>
        <element_type1b id="2">unique_data</element_type1b>
        <!-- ... -->
        <element_type1N id="M">unique_data</element_type1N>
    </table>
    <table id="element_type2">
        <element_type2a id="1">unique_data</element_type2a>
        <element_type2b id="2">unique_data</element_type2b>
        <!-- ... -->
        <element_type2N id="M">unique_data</element_type2N>
    </table>
    <table id="element_type2_fr">
        <element_type2a lang="fr" id="1">unique_data</element_type2a>
        <element_type2b lang="fr" id="2">unique_data</element_type2b>
        <!-- ... (there are five languages) -->
        <element_type2N lang="fr" id="M">unique_data</element_type2N>
    </table>
    <!-- ... -->
    <table id="element_typeN">
        <descendants></descendants>
    </table>
</datatables>

and

<intermediary_tables>
    <table id="intermediary_table_type1xtype2">
        <element id="1">
            <type1ID>1</type1ID>
            <type2ID>1</type2ID>
        </element>
        <element id="2">
            <type1ID>1</type1ID>
            <type2ID>2</type2ID>
        </element>
        <element id="3">
            <type1ID>2</type1ID>
            <type2ID>1</type2ID>
        </element>
        <element id="4">
            <type1ID>2</type1ID>
            <type2ID>2</type2ID>
        </element>
        <!-- ... -->
        <element id="N">
            <type1ID>M</type1ID>
            <type2ID>Z</type2ID>
        </element>
    </table>

    <table id="intermediary_table_typeMxtypeN">
        <descendants></descendants>
    </table>
</intermediary_tables>

I have also seen many very similar questions asked, and I have some resources that I gathered from them and read:

These are the most useful links:

And I found that none of the applications of the questions were able to produce the result I am trying to achieve. The exception, though, is the capcourse.com link. It's geared towards a graduated CS audience, and it seems like they're doing the same thing, except the ID's they are using aren't autoincrementing. The algorithm they use is extremely complex, and they haven't commented their code at all. They're using a namespace within their namespace for some reason, and even though it's the closest I can find, I can't reproduce it in the slightest.


Update

Real-world extract from an XML document that I would like to parse to change the data structure:

<?xml version="1.0"?>
<!DOCTYPE catalog [
<!ELEMENT catalog (entry*)>
<!ELEMENT entry (ent_seq, country*, artist+, info?, title+)><!-- Entries consist of the name of the album, artist, and more information about the CD.  Each entry must contain an artist and an album title. -->
<!ELEMENT ent_seq (#PCDATA)><!-- A unique numeric sequence, showing the entry number -->
<!ELEMENT title (#PCDATA)><!-- The title of the album/the album name. -->
<!ELEMENT artist (band+, name, nickname*)><!-- The name of the band, and if there was a famous artist, his name and nickname.  Must contain a band element. -->
<!ELEMENT band (#PCDATA)><!-- The name of the band. -->
<!ELEMENT name (#PCDATA)><!-- The name of any famous artist in the band. -->
<!ELEMENT nickname (#PCDATA)><!-- The nickname of the popular artist that precedes the nickname element, from the band. -->
<!ELEMENT country (#PCDATA)><!-- Specifies countries where the album was released -->
<!ELEMENT company (name, country)><!-- Company/producer info.  The company's name is in the name element, and the country where the company originated is in the country element. -->
<!ELEMENT name (#PCDATA)><!-- The name of the producer -->
<!ELEMENT country (#PCDATA)><!-- The country where the company does its primary business -->
<!ELEMENT year (#PCDATA)><!-- The year of the album's release -->
<!ELEMENT info (link*, bibl*)><!-- Additional info, including links and bibliography information -->
<!ELEMENT link (#PCDATA)><!-- Links where people can read more about the album -->
<!ELEMENT bibl (#PCDATA)><!-- Bibliography text about the artist -->
]>
<catalog>
  <cd>
    <ent_seq>1</ent_seq>
    <title>For Your Love</title>
    <artist>
      <name>The Yardbirds</name>
      <name>Eric Clapton</name>
      <nickname>Slowhand</nickname>
    </artist>
    <country>USA</country>
    <country>UK</country>
    <company>
      <name>Sweet Music</name>
      <country>USA</country>
    </company>
    <year>1965</year>
    <info>
      <link>http://en.wikipedia.org/wiki/For_Your_Love</link>
    </info>
  </cd>
  <cd>
    <ent_seq>2</ent_seq>
    <title>Splish Splash</title>
    <artist>
      <name>Roberto Carlos</name>
      <nickname>The King</nickname>
    </artist>
    <country>USA</country>
    <country>Brazil</country>
    <country>Italy</country>
    <company>
      <name>Sweet Music</name>
      <country>Brazil</country>
    </company>
    <year>1965</year>
  </cd>
  <cd>
    <ent_seq>3</ent_seq>
    <title>How Great Thuo Art</title>
    <artist>
      <name>Elvis Presley</name>
      <nickname>The King</nickname>
      <nickname>The King of Rock 'n Roll</nickname>
    </artist>
    <country>USA</country>
    <country>Canada</country>
    <country>UK</country>
    <company>
      <name>Felton Jarvis</name>
      <country>USA</country>
    </company>
    <year>1965</year>
  </cd>
  <cd>
    <ent_seq>4</ent_seq>
    <title>Big Willie style</title>
    <artist>
      <band>Will Smith</band>
      <name>Will Smith</name>
    </artist>
    <country>USA</country>
    <company>Columbia</company>
    <year>1997</year>
  </cd>
  <cd>
    <ent_seq>5</ent_seq>
    <title>Empire Burlesque</title>
    <artist>
      <band>Bob Dylan and Boby Rockhammer</band>
      <name>Bob Dylan</name>
      <name>Boby Rockhammer</name>
    </artist>
    <country>USA</country>
    <country>India</country>
    <company>Columbia</company>
    <year>1985</year>
  </cd>
  <cd>  <!-- Update part 1: New Entry -->
    <ent_seq>6</ent_seq>
    <title>Merry Christmas</title>
    <title>White Christmas</title>
    <artist>
      <name>Bing Crosby</name>
    </artist>
    <country>USA</country>
    <company>MCA Records</company>
    <year>1995</year>
  </cd> <!-- End update part 1-->
</catalog>

Real-world example of desired output sample:

<datatable>
  <table id="album title">
    <title id="1">For Your Love</title>
    <title id="2">Splish Splash</title>
    <title id="3">How Great Thuo Art</title>
    <title id="4">Big Willie style</title>
    <title id="5">Empire Burlesque</title>
    <title id="6">Merry Christmas</title> <!-- Update part 2: New output -->
    <title id="7">White Christmas</title> <!-- Update part 2: New output -->
  </table>
  <table id="Band Name">
    <artist id="1">The Yardbirds</artist>
    <artist id="2">Roberto Carlos</artist>
    <artist id="3">Elvis Presley</artist>
    <artist id="4">Will Smith</artist>
    <artist id="5">Bob Dylan and Boby Rockhammer</artist>
    <artist id="6"></artist> <!-- Update part 2: New output -->
  </table>
  <table id="artist name">
    <artist id="1">Eric Clapton</artist>
    <artist id="2">Roberto Carlos</artist>
    <artist id="3">Elvis Presley</artist>
    <artist id="4">Will Smith</artist>
    <artist id="5">Bob Dylan</artist>
    <artist id="6">Boby Rockhammer</artist>
    <artist id="7">Bing Crosby</artist> <!-- Update part 2: New output -->
  </table>
  <table id="nickname">
    <nickname id="1">Slowhand</nickname>
    <nickname id="2">The King</nickname>
    <nickname id="3">The King of Rock 'n Roll</nickname>
  </table>
</datatable>

and

<intermediarytable>
  <table id="artist by band name">
    <entry id="1">
      <band_id>1</band_id>
      <artist_id>1</artist_id>
    </entry>
    <entry id="2">
      <band_id>2</band_id>
      <artist_id>2</artist_id>
    </entry>
    <entry id="3">
      <band_id>3</band_id>
      <artist_id>3</artist_id>
    </entry>
    <entry id="4">
      <band_id>4</band_id>
      <artist_id>4</artist_id>
    </entry>
    <entry id="5">
      <band_id>5</band_id>
      <artist_id>5</artist_id>
    </entry>
    <entry id="6">
      <band_id>5</band_id>
      <artist_id>6</artist_id>
    </entry>
    <entry id="7">
      <band_id>6</band_id>
      <artist_id>7</artist_id>
    </entry>
  </table>
  <table id="artist by nickname">
    <entry id="1">
      <artist_id>1</artist_id>
      <nickname_id>1</nickname_id>
    </entry>
    <entry id="2">
      <artist_id>2</artist_id>
      <nickname_id>2</nickname_id>
    </entry>
    <entry id="3">
      <artist_id>2</artist_id>
      <nickname_id>3</nickname_id>
    </entry>
    <entry id="4">
      <artist_id>3</artist_id>
      <nickname_id>3</nickname_id>
    </entry>
  </table>
</intermediarytable>

--UPDATE-- There's an issue in which two elements share the same entry ID

In another XML doc I have,

<entry id="1">
  <word>blue</word>
  <word>beryl</word>
  <word lang="SP">azul</word>
</entry>

and I want the output to be

Data Tables:

<table id="en">
  <word lang="en" id="0">blue</word>
  <word lang="en" id="1">beryl</word>
</table>
<table id="sp">
  <word lang="sp" id="0">azul</word>
</table>

Intermediary Table:

<table id="translation id">
  <en_sp id="0"> <!-- en_sp means English-to-Spanish -->
    <en>0</en>
    <sp>0</sp>
  </en_sp>
  <en_sp id="1">
    <en>1</en>
    <sp>0</sp>
  </en_sp>
</table>

Comments (2)

北城半夏 2024-12-03 15:30:24

--UPDATES

Assuming an xml like this:

<catalog>
  <cd>
    <entry id="1">
      <word>blue</word>
      <word>beryl</word>
      <word lang="SP">azul</word>
    </entry>
  </cd>
  <cd>
    <entry id="2">
  ...

try this:

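$super = array();
$url = "original.xml";
if ($xml = @simplexml_load_file($url, 'SimpleXMLElement', LIBXML_NOCDATA)) {
  foreach ($xml->cd as $cd) {
    foreach ($cd->entry as $entry) {
      // Attribute values are objects; cast to string before using them as keys.
      $id = (string)$entry['id'];
      foreach ($entry->word as $word) {
        // Words without a lang attribute are treated as English.
        $lang = isset($word['lang']) ? (string)$word['lang'] : 'EN';
        // Group words by entry id, then by language.
        $super[$id][$lang][] = (string)$word;
      }
    }
  }
}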

display using:

print "<pre>";
print_r($super);
print "</pre>";

Note: this is another approach. The main thing to understand when working with XML objects, and with arrays in general, is that you can store data in a structured parent -> child hierarchy. Here I have created an array with $super[$id][$lang][] = (string)$word; where $id is the parent of $lang, which in turn is the parent of $word. This produces an array like this:

Array
(
    [1] => Array
        (
            [EN] => Array
                (
                    [0] => blue
                    [1] => beryl
                )

            [SP] => Array
                (
                    [0] => azul
                )

        )
      ...

Other things to consider:

  1. How to read the attributes of matched tags, such as id or lang. In my example I used $entry['id'], but $cd->entry['id'] is also valid.

  2. How to convert a SimpleXML object into a string, as in (string)$word, so that you can reuse it as an array index or value.
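The two points above can be tried out on the word example from the question. This is only a minimal sketch: it assumes the XML is loaded from an inline string with simplexml_load_string instead of original.xml, and the variable names are illustrative.

```php
<?php
// Hypothetical inline sample; the question loads original.xml instead.
$xml = simplexml_load_string(
    '<catalog><cd><entry id="1">'
    . '<word>blue</word><word>beryl</word><word lang="SP">azul</word>'
    . '</entry></cd></catalog>'
);

// First <entry> of the first <cd>.
$entry = $xml->cd->entry;

// 1. Attributes are read with array syntax; cast to string before reuse.
$id = (string) $entry['id'];

// 2. Element content stays a SimpleXMLElement until it is cast.
$words = [];
foreach ($entry->word as $word) {
    // A missing lang attribute falls back to 'EN', as in the code above.
    $lang = isset($word['lang']) ? (string) $word['lang'] : 'EN';
    $words[$lang][] = (string) $word;
}

var_dump($id);
print_r($words);
```

Without the (string) casts, $id and the stored words would remain SimpleXMLElement objects, which cannot be used as array keys.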


From what I can see in your examples:

<catalog>
  <cd>
    <ent_seq>1</ent_seq>
    <title>For Your Love</title>
    <artist>
    ...

try this

$super = array();
$url = "original.xml";
if ($xml = @simplexml_load_file($url, 'SimpleXMLElement', LIBXML_NOCDATA)) {
  $xml_array = @json_decode(@json_encode($xml), 1);
  foreach ($xml_array['cd'] as $val) {
    $key = $val['ent_seq'];
    if (is_array($val)) {
      foreach ($val as $k1 => $v1) {
        if (is_array($v1)) {
          switch ($k1) {
            case 'artist':
              foreach ($v1 as $k2 => $v2) {
                if (is_array($v2)) {
                  foreach ($v2 as $v3) {
                    $super[$k2][$key] = $v3;
                  }
                }
                else {
                  $super[$k2][$key] = $v2;
                }
              }
              break;
          }
        }
        else {
          switch ($k1) {
            case 'title':
              $super[$k1][$key] = $v1;
              break;
          }
        }
      }
    }
  }
}

display the results iterating through the array like this:

foreach ($super as $key => $val) {
  echo "<table id='{$key}'>\n";
  foreach ($val as $key2 => $val2) {
    echo "<$key id='$key2'> " . $val2 . " </$key>\n";
  }
  echo "</table>\n";
}
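Note that echo writes the values unescaped, so a title containing & or < would produce broken markup. One possible alternative, assuming $super has the tag => [id => value] shape built above, is to assemble the tables with DOMDocument. The sample $super and the datatable root element name below are illustrative only.

```php
<?php
// Illustrative data in the same shape the loop above produces.
$super = [
    'title'    => [1 => 'For Your Love', 2 => 'Splish Splash'],
    'nickname' => [1 => 'Slowhand', 2 => "The King of Rock 'n Roll"],
];

$doc = new DOMDocument('1.0', 'UTF-8');
$doc->formatOutput = true;
$root = $doc->appendChild($doc->createElement('datatable'));

foreach ($super as $tag => $rows) {
    // One <table id="..."> per collected tag name.
    $table = $root->appendChild($doc->createElement('table'));
    $table->setAttribute('id', $tag);
    foreach ($rows as $id => $value) {
        $row = $table->appendChild($doc->createElement($tag));
        $row->setAttribute('id', (string) $id);
        // Text nodes are escaped (&, <, >) when the document is serialized.
        $row->appendChild($doc->createTextNode($value));
    }
}

$out = $doc->saveXML();
echo $out;
```

To write the result to a file such as new.xml instead of printing it, $doc->save('new.xml') can be used.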

in order to have a better view of the array struct you can print it like this:

print "<pre>";
print_r($super);
print "</pre>";

this will display an array like this:

Array
(
    [title] => Array
        (
            [1] => For Your Love
            [2] => Splish Splash
            [3] => How Great Thuo Art
            [4] => Big Willie style
            [5] => Empire Burlesque
        )

    [name] => Array
        (
            [1] => Eric Clapton
            [2] => Roberto Carlos
            [3] => Elvis Presley
            [4] => Will Smith
            [5] => Boby Rockhammer
        )

    [nickname] => Array
        (
            [1] => Slowhand
            [2] => The King
            [3] => The King of Rock 'n Roll
        )

    [band] => Array
        (
            [4] => Will Smith
            [5] => Bob Dylan and Boby Rockhammer
        )

)

Note: as you can see, I used switch-case because your XML tags are not always consistent, and in some places they share a name, as with <company><name> and <artist><name>. You can add your own cases.

As it stands, it works well for the fields you want to grab, as in the example.
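One caveat: because the array is keyed by ent_seq, a second name or title inside the same cd overwrites the first (Bob Dylan and The Yardbirds are missing from the name list above). A possible alternative, sketched under the assumption that ids should simply count up within each table and that repeated values should be stored once, uses XPath to tell artist/name apart from company/name. The inline sample is a trimmed copy of the catalog, and the $paths labels are illustrative.

```php
<?php
// Trimmed, illustrative copy of the catalog from the question.
$source = <<<XML
<catalog>
  <cd>
    <title>Splish Splash</title>
    <artist>
      <name>Roberto Carlos</name>
      <nickname>The King</nickname>
    </artist>
    <company><name>Sweet Music</name></company>
  </cd>
  <cd>
    <title>How Great Thuo Art</title>
    <artist>
      <name>Elvis Presley</name>
      <nickname>The King</nickname>
      <nickname>The King of Rock 'n Roll</nickname>
    </artist>
  </cd>
  <cd>
    <title>Merry Christmas</title>
    <title>White Christmas</title>
    <artist><name>Bing Crosby</name></artist>
  </cd>
</catalog>
XML;
$xml = simplexml_load_string($source);

// Table label => XPath. //cd/artist/name does not match company/name.
$paths = [
    'title'    => '//cd/title',
    'name'     => '//cd/artist/name',
    'nickname' => '//cd/artist/nickname',
];

$tables = [];
foreach ($paths as $label => $path) {
    $tables[$label] = [];
    $id = 1; // the counter restarts for every table
    foreach ($xml->xpath($path) as $node) {
        $value = (string) $node;
        // Store each distinct value once, so "The King" gets a single id.
        if (!in_array($value, $tables[$label], true)) {
            $tables[$label][$id++] = $value;
        }
    }
}

print_r($tables);
```

The resulting $tables array has the same tag => [id => value] shape as $super, so it can be displayed with the same loop.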

毁梦 2024-12-03 15:30:24

Just to clarify, you are trying to take an input XML document, convert it to another (differently formatted) xml document using XSL/T, and then take the resulting XML and store it in your MySQL database?

I'm new to Stack Overflow, so I'm not sure how to add a comment to the original post.
