C#: Finding and replacing XML nodes
Edit: I decided to take the LINQ to XML approach (see the answer below) that was recommended and everything works EXCEPT that I can't replace out the changed records with the records from the incremental file. I managed to make the program work by just removing the full file node and then adding in the incremental node. Is there a way to just swap them instead? Also, while this solution is very nice, is there any way to shrink down memory usage without losing the LINQ code? This solution may still work, but I would be willing to sacrifice time to lower memory usage.
I'm trying to take two XML files (a full file and an incremental file) and merge them together. The XML file looks like this:
    <List>
      <Records>
        <Person id="001" recordaction="add">
          ...
        </Person>
      </Records>
    </List>
The recordaction attribute can also be "chg" for changes or "del" for deletes. The basic logic of my program is:
1) Read the full file into an XmlDocument.
2) Read the incremental file into an XmlDocument, select the nodes using XmlDocument.SelectNodes(), place those nodes into a dictionary for easier searching.
3) Select all the nodes in the full file, loop through and check each against the dictionary containing the incremental records. If recordaction="chg" or "del" add the node to a list, then delete all the nodes from the XmlNodeList that are in that list. Finally, add recordaction="chg" or "add" records from the incremental file into the full file.
4) Save the XML file.
I'm having some serious problems with step 3. Here's the code for that function:
    private void ProcessChanges(XmlNodeList nodeList, Dictionary<string, XmlNode> dictNodes)
    {
        XmlNode lastNode = null;
        XmlNode currentNode = null;
        List<XmlNode> nodesToBeDeleted = new List<XmlNode>();

        // If node from full file matches to incremental record and is change or delete,
        // mark full record to be deleted.
        foreach (XmlNode fullNode in fullDocument.SelectNodes("/List/Records/Person"))
        {
            dictNodes.TryGetValue(fullNode.Attributes[0].Value, out currentNode);
            if (currentNode != null)
            {
                if (currentNode.Attributes["recordaction"].Value == "chg"
                    || currentNode.Attributes["recordaction"].Value == "del")
                {
                    nodesToBeDeleted.Add(currentNode);
                }
            }
            lastNode = fullNode;
        }

        // Delete marked records
        for (int i = nodeList.Count - 1; i >= 0; i--)
        {
            if (nodesToBeDeleted.Contains(nodeList[i]))
            {
                nodeList[i].ParentNode.RemoveChild(nodesToBeDeleted[i]);
            }
        }

        // Add in the incremental records to the new full file for records marked add or change.
        foreach (XmlNode weeklyNode in nodeList)
        {
            if (weeklyNode.Attributes["recordaction"].Value == "add"
                || weeklyNode.Attributes["recordaction"].Value == "chg")
            {
                fullDocument.InsertAfter(weeklyNode, lastNode);
                lastNode = weeklyNode;
            }
        }
    }
The XmlNodeList being passed in is just all of the incremental records that were selected from the incremental file, and the dictionary holds those same nodes keyed on the id, so I don't have to loop through all of the incremental records each time. Right now the program is dying at the "Delete marked records" stage with an index-out-of-bounds error. I'm pretty sure the "Add in the incremental records" part doesn't work either. Any ideas? Suggestions on making this more efficient would also be nice. I could potentially run into a problem because it reads a 250MB file, which balloons to 750MB in memory, so I was wondering if there is an easier way to go node-by-node through the full file. Thanks!
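For reference, the out-of-bounds error most likely comes from the delete loop: it counts over nodeList (the incremental records) but indexes nodesToBeDeleted with the same i, and the two lists are unrelated in length. The marked nodes are also the incremental ones rather than the full-file ones, and nodes from one XmlDocument cannot be inserted into another without ImportNode. Below is a minimal sketch of how step 3 might look with those points addressed. The class name XmlDocumentMerge is hypothetical, fullDocument is passed as a parameter instead of being a field, and the sketch assumes every Person carries id and recordaction attributes and that appending to the end of Records is acceptable.

```csharp
using System.Collections.Generic;
using System.Xml;

public static class XmlDocumentMerge
{
    // Hypothetical rework of ProcessChanges; not the asker's final code.
    public static void ProcessChanges(XmlDocument fullDocument, XmlNodeList nodeList,
                                      Dictionary<string, XmlNode> dictNodes)
    {
        XmlNode recordsNode = fullDocument.SelectSingleNode("/List/Records");
        List<XmlNode> nodesToBeDeleted = new List<XmlNode>();

        // Mark the FULL-file node (not the incremental one) when the matching
        // incremental record is a change or a delete.
        foreach (XmlNode fullNode in recordsNode.SelectNodes("Person"))
        {
            XmlNode incremental;
            if (dictNodes.TryGetValue(fullNode.Attributes["id"].Value, out incremental))
            {
                string action = incremental.Attributes["recordaction"].Value;
                if (action == "chg" || action == "del")
                {
                    nodesToBeDeleted.Add(fullNode);
                }
            }
        }

        // Iterate the deletion list itself, so no index can run out of bounds.
        foreach (XmlNode node in nodesToBeDeleted)
        {
            recordsNode.RemoveChild(node);
        }

        // Nodes owned by the incremental document must be imported before
        // they can be attached to the full document.
        foreach (XmlNode weeklyNode in nodeList)
        {
            string action = weeklyNode.Attributes["recordaction"].Value;
            if (action == "add" || action == "chg")
            {
                recordsNode.AppendChild(fullDocument.ImportNode(weeklyNode, true));
            }
        }
    }
}
```

With this shape, changed records end up at the end of Records rather than in their original position. If position matters, recordsNode.ReplaceChild(imported, fullNode) could be used for the "chg" case instead.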
Here's an example of how you might accomplish it with LINQ-to-XML. No dictionary is needed:
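The listing that belonged here is not part of this copy of the answer. What follows is only a minimal sketch of how such a merge might look with LINQ to XML, not the original code. The class and method names (XmlMerge, Merge) are hypothetical; personToChange and person follow the names used in the EDIT note. It assumes every Person has an id attribute, and that a "chg" record with no match in the full file should simply be added.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class XmlMerge
{
    // Hypothetical helper: applies the incremental records to fullDoc in place.
    public static void Merge(XDocument fullDoc, XDocument incrementalDoc)
    {
        XElement records = fullDoc.Root.Element("Records");

        foreach (XElement person in incrementalDoc.Descendants("Person"))
        {
            string id = (string)person.Attribute("id");
            string action = (string)person.Attribute("recordaction");

            // Linear scan of the full file for a record with the same id.
            XElement personToChange = records.Elements("Person")
                .FirstOrDefault(p => (string)p.Attribute("id") == id);

            switch (action)
            {
                case "add":
                    // Copy the element so the incremental document is left untouched.
                    records.Add(new XElement(person));
                    break;
                case "chg":
                    if (personToChange != null)
                        personToChange.ReplaceWith(new XElement(person)); // swap in place
                    else
                        records.Add(new XElement(person)); // assumption: treat as add
                    break;
                case "del":
                    if (personToChange != null)
                        personToChange.Remove();
                    break;
            }
        }
    }
}
```

Usage would be along the lines of loading both files with XDocument.Load, calling XmlMerge.Merge(fullDoc, incrementalDoc), and then fullDoc.Save on the output path. Because each lookup scans the Records element, a very large full file may be slow with this approach; building a lookup keyed on id first is one way around that.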
Please let me know if you have any problems running it and I'll edit and fix it. I'm pretty sure it's correct, but don't have VS available at the moment.
EDIT: fixed the "chg" case to use personToChange.ReplaceWith(person) rather than personToChange = person. The latter doesn't replace anything, as it just shifts the reference away from the underlying document.
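On the memory question from the edit: one option that keeps LINQ to XML for the per-record work is to stream the full file with XmlReader, materialize one Person at a time with XNode.ReadFrom, and write the result out with XmlWriter. The sketch below is an illustration under assumptions, not a tested solution for the 250MB file: the names StreamingXmlMerge and Merge are hypothetical, the incremental file is assumed small enough to hold in memory with unique ids, and the List/Records wrapper is written by hand, so any attributes on those two elements would be lost.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

public static class StreamingXmlMerge
{
    // Hypothetical helper: reads the full file record by record and writes the merged result.
    public static void Merge(TextReader fullInput, XDocument incrementalDoc, TextWriter output)
    {
        // Incremental records keyed by id (assumes ids are present and unique).
        Dictionary<string, XElement> pending = incrementalDoc.Descendants("Person")
            .ToDictionary(p => (string)p.Attribute("id"));

        using (XmlReader reader = XmlReader.Create(fullInput))
        using (XmlWriter writer = XmlWriter.Create(output, new XmlWriterSettings { Indent = true }))
        {
            writer.WriteStartElement("List");
            writer.WriteStartElement("Records");

            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "Person")
                {
                    // ReadFrom loads just this element and leaves the reader on the
                    // node that follows it, so no extra Read() is needed here.
                    XElement person = (XElement)XNode.ReadFrom(reader);
                    string id = (string)person.Attribute("id");

                    XElement update;
                    if (id != null && pending.TryGetValue(id, out update))
                    {
                        pending.Remove(id);
                        if ((string)update.Attribute("recordaction") == "del")
                            continue; // deleted: write nothing
                        update.WriteTo(writer); // changed: write the incremental record
                    }
                    else
                    {
                        person.WriteTo(writer); // untouched: copy through
                    }
                }
                else
                {
                    reader.Read();
                }
            }

            // Whatever is left had no match in the full file; write all but deletes.
            foreach (XElement rest in pending.Values)
            {
                if ((string)rest.Attribute("recordaction") != "del")
                    rest.WriteTo(writer);
            }

            writer.WriteEndElement(); // Records
            writer.WriteEndElement(); // List
        }
    }
}
```

For files on disk, a StreamReader over the full file and a StreamWriter over a new output file can be passed in. Only one Person element from the full file is held in memory at a time, at the cost of writing a second file rather than editing in place.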