How to load and merge a dataset of XML documents

Posted on 2024-10-25 04:54:37

I would like to consume a dataset of XML documents, and merge them into a single document containing only distinct elements.

To illustrate, a dataset such as:

r, x
-- -------------------------------
1, <root><a>111</a></root>
2, <root><a>222</a><b>222</b></root>
3, <root><c>333</c></root>

would result in:

<a>111</a><b>222</b><c>333</c>

The <a> element from r=2 is not merged, since we already have an <a> element from r=1. I only need to merge new elements, starting with r=1 and going forward.

I am able to iterate over the list, but I am having difficulty comparing and merging. The code below fails to identify <a>222</a> as a duplicate. Is it possibly comparing the element values as well?

    using (SqlDataReader dsReader = cmd.ExecuteReader())
    {
        XDocument baseDoc = new XDocument();
        XDocument childDoc = new XDocument();

        while (dsReader.Read())
        {
            // this is the base doc, merge forward from here
            if (dsReader["r"].ToString() == "1")
            {
                baseDoc = XDocument.Parse(dsReader["x"].ToString());
                SqlContext.Pipe.Send("start:" + baseDoc.ToString());
            }
            // this is a child doc, do merge operation
            else
            {
                childDoc = XDocument.Parse(dsReader["x"].ToString());

                // find elements only present in child
                var childOnly = (childDoc.Descendants("root").Elements()).Except(baseDoc.Descendants("root").Elements());
                foreach (var e in childOnly)
                {
                    baseDoc.Root.Add(e);
                }
            }
        }
    }

执手闯天涯 2024-11-01 04:54:38

I am a bit confused about the baseDoc and childDoc usage in your code. I hope I understood your question correctly. Here is my proposal:

    using (SqlDataReader dsReader = cmd.ExecuteReader())
    {
        XElement result = new XElement("root");
        while (dsReader.Read())
        {
            // Read source
            XDocument srcDoc = XDocument.Parse(dsReader["x"].ToString());

            // Construct result element
            foreach (XElement baseElement in srcDoc.Descendants("root").Elements())
            {
                if (result.Element(baseElement.Name) == null)   // skip already added nodes
                {
                    result.Add(new XElement(baseElement.Name, baseElement.Value));
                }
            }
        }

        // Construct result string from sub-elements (to avoid "<root>..</root>" in output)
        string str = "";
        foreach (XElement element in result.Elements())
        {
            str += element.ToString();
        }

        // send the result
        SqlContext.Pipe.Send("start:" + str);
    }

Note that my code ignores the r numbering. It processes the rows in the order the SQL data reader returns them. If the rows are not sorted by "r", they must be sorted before my code runs.
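
On the question of why `Except` did not catch <a>222</a>: as far as I know, `XElement` does not override `Equals`, so `Except` without a comparer falls back to reference equality, and elements parsed from two different documents never compare equal. Below is a minimal sketch of the same merge with a name-based comparer passed to `Except`. The `ElementNameComparer` and `MergeSketch` names are made up for illustration, and the in-memory list of pairs is an assumption standing in for the `SqlDataReader` rows. It also sorts by r explicitly, to cover the case where the rows arrive unsorted.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical comparer: two elements count as equal when their names match.
// Element values are deliberately ignored.
sealed class ElementNameComparer : IEqualityComparer<XElement>
{
    public bool Equals(XElement x, XElement y)
    {
        return x?.Name == y?.Name;
    }

    public int GetHashCode(XElement e)
    {
        return e.Name.GetHashCode();
    }
}

static class MergeSketch
{
    // rows: (r, x) pairs standing in for the rows read from the data reader.
    public static string Merge(IEnumerable<KeyValuePair<int, string>> rows)
    {
        var result = new XElement("root");
        var byName = new ElementNameComparer();

        // Sort by r so that earlier rows win when names collide.
        foreach (var row in rows.OrderBy(p => p.Key))
        {
            XElement src = XElement.Parse(row.Value);

            // Elements whose name is not yet present in the result.
            // ToList() finishes the query before result is modified.
            List<XElement> fresh = src.Elements()
                                      .Except(result.Elements(), byName)
                                      .ToList();

            foreach (XElement e in fresh)
            {
                result.Add(new XElement(e)); // add a copy, keep the source intact
            }
        }

        // Concatenate the children only, so "<root>" is not part of the output.
        return string.Concat(
            result.Elements().Select(e => e.ToString(SaveOptions.DisableFormatting)));
    }

    static void Main()
    {
        // Sample data from the question, deliberately out of order.
        var rows = new List<KeyValuePair<int, string>>
        {
            new KeyValuePair<int, string>(3, "<root><c>333</c></root>"),
            new KeyValuePair<int, string>(1, "<root><a>111</a></root>"),
            new KeyValuePair<int, string>(2, "<root><a>222</a><b>222</b></root>"),
        };

        Console.WriteLine(Merge(rows)); // prints <a>111</a><b>222</b><c>333</c>
    }
}
```

Unlike the loop above, this sketch copies each element whole (attributes and nested children included) rather than only its name and text value. If only the text is wanted, `new XElement(e.Name, e.Value)` can be used instead.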
