Reading XML with proper error handling (line numbers, original text, etc.)

Posted on 2025-01-03 17:26:43

I want to read a fairly large XML file. It's small enough to fit in memory, but still very big. When the XML is read, it is validated against an XSD. This, however, does not prevent business errors from occurring when the data is processed further in the system. When such business errors occur (after XSD validation), I want to be able to report the line and column numbers of the start and end positions of the offending element in my XML. It would also be user-friendly to show the input XML as it was read from the file.

Using xsd.exe, I generated all the data classes, and I read the XML with:

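  using System;
  using System.Collections.Generic;
  using System.IO;
  using System.Linq;
  using System.Xml;
  using System.Xml.Schema;
  using System.Xml.Serialization;

  // Wrapper method added so the fragment is complete; its name is illustrative.
  // Place it inside a class of your choice.
  public static ImportRoot ReadImport(string content)
  {
    using (var reader = new StringReader(content))
    {
      var errors = new List<string>();
      var settings = new XmlReaderSettings();
      settings.Schemas.Add("urn:import-schema", "Import.xsd");
      // Collect every schema validation message instead of stopping at the first.
      settings.ValidationEventHandler += (o, args) => errors.Add(args.Message);
      settings.ValidationType = ValidationType.Schema;

      using (XmlReader xr = XmlReader.Create(reader, settings))
      {
        var xs = new XmlSerializer(typeof(ImportRoot));
        var result = (ImportRoot) xs.Deserialize(xr);
        // Fail only after the whole document has been read and validated.
        if (errors.Any())
          throw new Exception(string.Join("\n\n", errors));
        return result;
      }
    }
  }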

However, I can't seem to find the meta-information I'm looking for. I've checked the XDocument class as well. There, elements seem to have a Value property that is a string, but that is still not all the information I want to display.

梦纸 2025-01-10 17:26:43

Line number information is not available when reading from a StringReader. If you use a StreamReader over a FileStream, you will be able to get line numbers.

The additional metadata you are looking for is called the "post-schema-validation infoset" (PSVI).

树深时见影 2025-01-10 17:26:43

In your ValidationEventHandler, look at the args.Exception property. It is of type XmlSchemaException, which carries the line number, among other details.

You could keep all the errors and then parse them afterwards.

var errors = new List<ValidationEventArgs>();
....
settings.ValidationEventHandler += (o, args) => errors.Add(args);
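
As a rough sketch of how the collected events could be turned into messages with positions: ValidationEventArgs exposes Severity, Message and Exception, and XmlSchemaException exposes LineNumber and LinePosition. The class name ValidationReport, the helper names, the message layout, and passing the schema as a string are all assumptions made for illustration, not part of the original code.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Schema;

public static class ValidationReport
{
    // Hypothetical helper: formats each collected event together with its position.
    public static List<string> FormatValidationErrors(IEnumerable<ValidationEventArgs> errors)
    {
        return errors
            .Select(e => string.Format(
                "{0} at line {1}, column {2}: {3}",
                e.Severity,
                e.Exception.LineNumber,    // 0 when no line information is available
                e.Exception.LinePosition,
                e.Message))
            .ToList();
    }

    // Hypothetical helper: validates xml against xsd (strings used for illustration only)
    // and returns the raw events instead of just their messages.
    public static List<ValidationEventArgs> Validate(string xml, string xsd)
    {
        var errors = new List<ValidationEventArgs>();
        var settings = new XmlReaderSettings { ValidationType = ValidationType.Schema };
        using (var schemaReader = XmlReader.Create(new StringReader(xsd)))
        {
            // A null namespace means "use the targetNamespace declared in the schema".
            settings.Schemas.Add(null, schemaReader);
        }
        settings.ValidationEventHandler += (o, args) => errors.Add(args);

        using (var xr = XmlReader.Create(new StringReader(xml), settings))
        {
            while (xr.Read()) { } // reading the document drives validation
        }
        return errors;
    }
}
```

Keeping the ValidationEventArgs objects rather than only the message strings means the position can still be inspected after reading has finished.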

Business validation errors can be handled by implementing them as custom XSLT functions. See this article. Once you have a function that implements IXsltContextFunction, you can examine the XPathNavigator in the Invoke method for a hint about where in the document you are.

Once you have the hint you can compare it with each line in the original document.
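
For comparison, here is a different technique from the XSLT-function approach above, sketched under the assumption that the document is loaded with the XDocument class mentioned in the question. LoadOptions.SetLineInfo makes each node record where it started, and the position is read through IXmlLineInfo. The names ElementLocator, LoadWithLineInfo and DescribePosition are hypothetical. This gives only the start position of an element; the end position is not recorded by this API.

```csharp
using System.Xml;
using System.Xml.Linq;

public static class ElementLocator
{
    // Hypothetical helper: parses the XML and asks LINQ to XML to record line information.
    public static XDocument LoadWithLineInfo(string xml)
    {
        // PreserveWhitespace keeps insignificant whitespace so the tree stays close to the input.
        return XDocument.Parse(xml, LoadOptions.SetLineInfo | LoadOptions.PreserveWhitespace);
    }

    // Hypothetical helper: describes where an element starts in the source text.
    public static string DescribePosition(XElement element)
    {
        // XObject implements IXmlLineInfo explicitly, so a cast is required.
        var info = (IXmlLineInfo)element;
        return info.HasLineInfo()
            ? string.Format("line {0}, column {1}", info.LineNumber, info.LinePosition)
            : "position unknown";
    }
}
```

A business rule that rejects an element could then include DescribePosition(element) in its error message and, since the original text is still in memory, show the corresponding source line next to it.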

I did something like that a couple of years ago (apart from the line numbers) and it worked very nicely, even for large XML documents.
