Split an XML file into multiple files, each containing 500 <OFFER> elements
I have a large (1 GB) XML file that I need to split into smaller files. I want each smaller file to contain 500 of the <OFFER> elements.
Here is a small snippet of the large XML file:
<?xml version="1.0"?><RESULT>
<header>
<site>http://www.thomascook.fr</site>
<marque>ThomasCook France</marque>
<logo>http://www.example.com/example.gif</logo>
</header>
<OFFER>
<IFF>5810</IFF>
<TO>TCF</TO>
<COUNTRY>Chypre</COUNTRY>
<REGION>Chypre du Sud</REGION>
<HOTELNAME>Elias Beach &amp; Country Club</HOTELNAME>
<DESCRIPTION>....</DESCRIPTION>
<TYPE>Sejour</TYPE>
<STARS>5.0</STARS>
<THEMAS>Plage directe;Special enfant;Bien-Etre-Fitness</THEMAS>
<THUMBNAIL>http://example.com/example.jpg</THUMBNAIL>
<URL>http://example.com/example.html</URL>
<DATE>
<BROCHURE>TCFB</BROCHURE>
<DURATION>7</DURATION>
<DURATION_VAR>6_6-9</DURATION_VAR>
<BOARD>Demi-pension</BOARD>
<DEPARTURE>27.2.2011</DEPARTURE>
<RETURN>6.3.2011</RETURN>
<DEPARTURE_CITY>PAR</DEPARTURE_CITY>
<ARRIVAL_CITY>LCA</ARRIVAL_CITY>
<PRICE>790</PRICE>
<URL>http://example.com/other-example.html</URL>
</DATE>
</OFFER>
<OFFER>
(etc)
</OFFER>
How can I do this?
Comments (2)
From your question, I understand that you want to split a big XML file into multiple small files. The best tool for this is VTD-XML: http://vtd-xml.sourceforge.net/
Sample code: the following code will split the big XML file based on an XPath of the form TopTag/ChildTag.
As a programming question, this is just a matter of StAX programming.
After every 500 elements, make the necessary calls to end the element and the document, close the file, open a new file, start the new document, and continue. If you have a program that can write one file with StAX, writing many is not very different.
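A minimal sketch of that approach, using the StAX event API that ships with the JDK (javax.xml.stream). The class name OfferSplitter, the output file names offers-N.xml, and the choice to wrap each chunk in a fresh <RESULT> root while dropping the <header> block are assumptions made for illustration; they are not part of the original question. The input is also assumed to be well-formed XML.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper class; the name and file layout are illustrative only.
class OfferSplitter {
    private static final XMLEventFactory EVENTS = XMLEventFactory.newInstance();

    /** Streams the input and writes files holding at most perFile OFFER elements each. */
    static int split(Path input, Path outDir, int perFile)
            throws IOException, XMLStreamException {
        Files.createDirectories(outDir);
        int files = 0;      // number of output files opened so far
        int inCurrent = 0;  // complete OFFER elements in the current file
        int depth = 0;      // nesting depth inside the current OFFER (0 = outside)
        OutputStream out = null;
        XMLEventWriter writer = null;
        try (InputStream in = Files.newInputStream(input)) {
            XMLEventReader reader = XMLInputFactory.newInstance().createXMLEventReader(in);
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                boolean offerStart = depth == 0 && event.isStartElement()
                        && "OFFER".equals(event.asStartElement().getName().getLocalPart());
                if (depth == 0 && !offerStart) {
                    continue; // assumption: anything outside an OFFER (e.g. header) is skipped
                }
                if (writer == null) {
                    // Open the next file and start a new document with its own root.
                    files++;
                    out = Files.newOutputStream(outDir.resolve("offers-" + files + ".xml"));
                    writer = XMLOutputFactory.newInstance().createXMLEventWriter(out, "UTF-8");
                    writer.add(EVENTS.createStartDocument("UTF-8", "1.0"));
                    writer.add(EVENTS.createStartElement("", "", "RESULT"));
                }
                if (event.isStartElement()) depth++;
                if (event.isEndElement()) depth--;
                writer.add(event); // copy the event unchanged
                if (depth == 0) {  // an OFFER has just been closed
                    inCurrent++;
                    if (inCurrent == perFile) {
                        finish(writer, out);
                        writer = null;
                        inCurrent = 0;
                    }
                }
            }
            reader.close();
        }
        if (writer != null) {
            finish(writer, out); // last, partially filled file
        }
        return files;
    }

    // Ends the root element and the document, then closes the file.
    private static void finish(XMLEventWriter writer, OutputStream out)
            throws IOException, XMLStreamException {
        writer.add(EVENTS.createEndElement("", "", "RESULT"));
        writer.add(EVENTS.createEndDocument());
        writer.flush();
        writer.close(); // does not close the underlying stream
        out.close();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("usage: java OfferSplitter <input.xml> <output-dir>");
            return;
        }
        int n = split(Paths.get(args[0]), Paths.get(args[1]), 500);
        System.out.println("Wrote " + n + " file(s)");
    }
}
```

Because the reader pulls one event at a time, memory use should stay roughly constant regardless of the input size, which is the main reason to prefer streaming over a DOM for a 1 GB file. If every output file also needs the <header> block, one option is to buffer those events in a list when they are first read and replay them after each new root element.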