Splitting an XML document into chunks

Posted 2024-07-08 19:27:27

I have a large XML document that needs to be processed 100 records at a time.

The processing is done within a Windows Service written in C#.

The structure is as follows:

<docket xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="docket.xsd">
    <order>
        <Date>2008-10-13</Date>
        <orderNumber>050758023</orderNumber>
        <ParcelID/>
        <CustomerName>sddsf</CustomerName>
        <DeliveryName>dsfd</DeliveryName>
        <Address1>sdf</Address1>
        <Address2>sdfsdd</Address2>
        <Address3>sdfdsfdf</Address3>
        <Address4>dffddf</Address4>
        <PostCode/>

    </order>
    <order>
        <Date>2008-10-13</Date>
        <orderNumber>050758023</orderNumber>
        <ParcelID/>
        <CustomerName>sddsf</CustomerName>
        <DeliveryName>dsfd</DeliveryName>
        <Address1>sdf</Address1>
        <Address2>sdfsdd</Address2>
        <Address3>sdfdsfdf</Address3>
        <Address4>dffddf</Address4>
        <PostCode/>

    </order>

    .....

    .....

</docket>

There could be thousands of orders in a docket.

I need to split it into chunks of 100 order elements.

However, each chunk of 100 orders still needs to be wrapped in the parent "docket" node and keep the same namespace declarations, etc.

Is this possible?

z祗昰~ 2024-07-15 19:27:27

Another naive solution, this time for .NET 2.0. It should give you an idea of how to approach the problem. It uses XPath expressions instead of LINQ to XML. On my dev box it splits a 100-order docket into 10 dockets in under a second.

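    // Requires: using System; using System.Collections.Generic; using System.Xml;
    public List<XmlDocument> ChunkDocket(XmlDocument docket, int chunkSize)
    {
        // A non-positive chunk size would never advance the loop below.
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException("chunkSize");

        List<XmlDocument> newDockets = new List<XmlDocument>();
        int orderCount = docket.SelectNodes("/docket/order").Count;
        int chunkStart = 0;

        while (chunkStart < orderCount)
        {
            XmlDocument newDocket = new XmlDocument();
            // Shallow-import the original root so the new <docket> keeps its
            // attributes, including the xmlns:xsi declaration.
            XmlNode root = newDocket.ImportNode(docket.DocumentElement, false);
            newDocket.AppendChild(root);

            // position() is 1-based: selects orders chunkStart+1 .. chunkStart+chunkSize.
            XmlNodeList chunk = docket.SelectNodes(String.Format(
                "/docket/order[position() > {0} and position() <= {1}]",
                chunkStart, chunkStart + chunkSize));
            chunkStart += chunkSize;

            foreach (XmlNode c in chunk)
            {
                // Nodes belong to their owner document, so copy each order across.
                root.AppendChild(newDocket.ImportNode(c, true));
            }

            newDockets.Add(newDocket);
        }

        return newDockets;
    }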

陈独秀 2024-07-15 19:27:27

Naive and iterative, but it works [EDIT: requires .NET 3.5 or later, since it uses LINQ to XML]

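    // Requires: using System; using System.Collections.Generic; using System.Linq; using System.Xml.Linq;
    public List<XDocument> ChunkDocket(XDocument docket, int chunkSize)
    {
        // A non-positive chunk size would make the loop below spin forever.
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException("chunkSize");

        var newDockets = new List<XDocument>();
        var d = new XDocument(docket);            // work on a copy; the input stays untouched
        var orders = d.Root.Elements("order");    // lazy query, re-evaluated on every use

        while (orders.Any())
        {
            // Reuse the original root's name and attributes (including xmlns:xsi);
            // attributes that already have a parent are cloned when added.
            var newDocket = new XDocument(new XElement(d.Root.Name, d.Root.Attributes()));
            var chunk = orders.Take(chunkSize).ToList();
            chunk.Remove();                       // detach the chunk from the working copy
            newDocket.Root.Add(chunk);            // detached elements are moved, not cloned
            newDockets.Add(newDocket);
        }

        return newDockets;
    }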

瑾兮 2024-07-15 19:27:27

If the reason for processing 100 orders at a time is performance, e.g. opening one big file takes too much time and too many resources, you can use XmlReader to process the order elements one at a time without degrading performance, because XmlReader streams through the file instead of loading it whole.

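// Requires: using System.Xml;
using (XmlReader reader = XmlReader.Create(@"c:\foo\Doket.xml"))
{
    while (reader.Read())
    {
        // Match start tags only; the closing </order> reports the same LocalName.
        if (reader.NodeType == XmlNodeType.Element && reader.LocalName == "order")
        {
            // read each child element and its value from the reader,
            // or deserialize the order element with an XmlSerializer and an Order class
        }
    }
}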
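
Since the question asks for chunks of 100 rather than single orders, the streaming idea can also be combined with chunking. What follows is only a minimal sketch and not part of the original answer. It assumes .NET 3.5 or later (LINQ to XML) and that every order is a direct child of the root element; the names DocketChunker and StreamChunks are made up for illustration. Each chunk gets a root with the same name and attributes as the source root, including the xmlns:xsi declaration.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper, named for illustration only.
public static class DocketChunker
{
    // Streams <order> elements from the reader and lazily yields one small
    // document per chunkSize orders. Because this is an iterator, nothing
    // (including the argument check) runs until the result is enumerated.
    public static IEnumerable<XDocument> StreamChunks(XmlReader reader, int chunkSize)
    {
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException("chunkSize");

        // Position on the root element and remember its name and attributes.
        reader.MoveToContent();
        XName rootName = XNamespace.Get(reader.NamespaceURI) + reader.LocalName;
        var rootAttributes = new List<XAttribute>();
        if (reader.MoveToFirstAttribute())
        {
            do
            {
                // Unprefixed attributes (and a default xmlns declaration) live
                // in no namespace; prefixed ones use the reader's namespace URI.
                XNamespace ns = reader.Prefix.Length == 0
                    ? XNamespace.None
                    : XNamespace.Get(reader.NamespaceURI);
                rootAttributes.Add(new XAttribute(ns + reader.LocalName, reader.Value));
            } while (reader.MoveToNextAttribute());
            reader.MoveToElement();   // back from the last attribute to the root
        }

        if (reader.IsEmptyElement) yield break;   // empty root: no orders at all
        reader.Read();                            // step inside the root

        var orders = new List<XElement>(chunkSize);
        while (!reader.EOF)
        {
            // Assumption: orders are direct children of the root (depth 1).
            if (reader.NodeType == XmlNodeType.Element
                && reader.Depth == 1
                && reader.LocalName == "order")
            {
                // ReadFrom consumes the whole <order> subtree and leaves the
                // reader on the node after it, so no extra Read() is needed.
                orders.Add((XElement)XNode.ReadFrom(reader));
                if (orders.Count == chunkSize)
                {
                    yield return new XDocument(new XElement(rootName, rootAttributes, orders));
                    orders = new List<XElement>(chunkSize);
                }
            }
            else
            {
                reader.Read();
            }
        }

        // Emit the final, possibly shorter, chunk.
        if (orders.Count > 0)
        {
            yield return new XDocument(new XElement(rootName, rootAttributes, orders));
        }
    }
}
```

It would be called as foreach (XDocument chunk in DocketChunker.StreamChunks(reader, 100)) inside a using block around XmlReader.Create(path). As long as the caller does not hold on to earlier chunks, only about one chunk of orders is in memory at a time.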