Check for well-formed XML without a try/catch?

Posted on 2024-07-25 01:46:53

Does anyone know how I can check whether a string contains well-formed XML without using something like XmlDocument.LoadXml() in a try/catch block? I've got input that may or may not be XML, and I want code that recognises that the input may not be XML without relying on a try/catch, both for speed and on the general principle that non-exceptional circumstances shouldn't raise exceptions. I currently have code that does this:

private bool IsValidXML(string value)
{
    try
    {
        // Check we actually have a value
        if (string.IsNullOrEmpty(value) == false)
        {
            // Try to load the value into a document
            XmlDocument xmlDoc = new XmlDocument();

            xmlDoc.LoadXml(value);

            // If we managed with no exception then this is valid XML!
            return true;
        }
        else
        {
            // A blank value is not valid xml
            return false;
        }
    }
    catch (System.Xml.XmlException)
    {
        return false;
    }
}

But it seems like something that shouldn't require the try/catch. The exception is causing merry hell during debugging because every time I check a string the debugger will break here, 'helping' me with my pesky problem.

Comments (11)

不必你懂 2024-08-01 01:46:54

I disagree that the problem is the debugger. In general, exceptions should be avoided for non-exceptional cases. This means that if someone is looking for a method like IsWellFormed(), which returns true/false based on whether the input is well-formed XML, exceptions should not be thrown within its implementation, regardless of whether they are caught and handled.

Exceptions are expensive and should not be encountered during normal, successful execution. An example is writing a method that checks for the existence of a file by calling File.Open and catching the exception when the file doesn't exist. This would be a poor implementation. Instead, File.Exists() should be used (and hopefully its implementation does not simply put a try/catch around some method that throws an exception if the file doesn't exist; I'm sure it doesn't).

玻璃人 2024-08-01 01:46:54

Just my 2 cents: there are various questions about this around, and most people agree on the "garbage in, garbage out" principle. I don't disagree with that, but I found the following quick and dirty solution useful, especially when you deal with XML data from third parties who are not easy to communicate with. It doesn't avoid try/catch, but it uses it with finer granularity, so it helps in cases where the number of invalid XML characters is not that large. I used XmlTextReader and its ReadChars() method for each parent element; ReadChars() is one of the methods that does not perform well-formedness checks, unlike ReadInnerXml/ReadOuterXml. So it's a combination of Read() and ReadChars() whenever Read() stumbles upon a parent node. Of course, this only works because I can assume that the basic structure of the XML is okay and that only the contents (values) of certain nodes may contain special characters that haven't been replaced with their &..; equivalents. (I found an article about this somewhere, but can't find the source link at the moment.)

小猫一只 2024-08-01 01:46:54

I'm using this function for verifying strings/fragments:

<Runtime.CompilerServices.Extension()>
Public Function IsValidXMLFragment(ByVal xmlFragment As String, Optional Strict As Boolean = False) As Boolean
    IsValidXMLFragment = True

    Dim NameTable As New Xml.NameTable

    Dim XmlNamespaceManager As New Xml.XmlNamespaceManager(NameTable)
    XmlNamespaceManager.AddNamespace("xsd", "http://www.w3.org/2001/XMLSchema")
    XmlNamespaceManager.AddNamespace("xsi", "http://www.w3.org/2001/XMLSchema-instance")

    Dim XmlParserContext As New Xml.XmlParserContext(Nothing, XmlNamespaceManager, Nothing, Xml.XmlSpace.None)

    Dim XmlReaderSettings As New Xml.XmlReaderSettings
    XmlReaderSettings.ConformanceLevel = Xml.ConformanceLevel.Fragment
    XmlReaderSettings.ValidationType = Xml.ValidationType.Schema
    If Strict Then
        XmlReaderSettings.ValidationFlags = (XmlReaderSettings.ValidationFlags Or XmlSchemaValidationFlags.ProcessInlineSchema)
        XmlReaderSettings.ValidationFlags = (XmlReaderSettings.ValidationFlags Or XmlSchemaValidationFlags.ReportValidationWarnings)
    Else
        XmlReaderSettings.ValidationFlags = XmlSchemaValidationFlags.None
        XmlReaderSettings.ValidationFlags = (XmlReaderSettings.ValidationFlags Or XmlSchemaValidationFlags.AllowXmlAttributes)
    End If

    AddHandler XmlReaderSettings.ValidationEventHandler, Sub() IsValidXMLFragment = False
    AddHandler XmlReaderSettings.ValidationEventHandler, AddressOf XMLValidationCallBack

    Dim XmlReader As Xml.XmlReader = Xml.XmlReader.Create(New IO.StringReader(xmlFragment), XmlReaderSettings, XmlParserContext)
    While XmlReader.Read
        'Read entire XML
    End While
End Function

I'm using this function for verifying files:

Public Function IsValidXMLDocument(ByVal Path As String, Optional Strict As Boolean = False) As Boolean
    IsValidXMLDocument = IO.File.Exists(Path)
    If Not IsValidXMLDocument Then Exit Function

    Dim XmlReaderSettings As New Xml.XmlReaderSettings
    XmlReaderSettings.ConformanceLevel = Xml.ConformanceLevel.Document
    XmlReaderSettings.ValidationType = Xml.ValidationType.Schema
    If Strict Then
        XmlReaderSettings.ValidationFlags = (XmlReaderSettings.ValidationFlags Or XmlSchemaValidationFlags.ProcessInlineSchema)
        XmlReaderSettings.ValidationFlags = (XmlReaderSettings.ValidationFlags Or XmlSchemaValidationFlags.ReportValidationWarnings)
    Else
        XmlReaderSettings.ValidationFlags = XmlSchemaValidationFlags.None
        XmlReaderSettings.ValidationFlags = (XmlReaderSettings.ValidationFlags Or XmlSchemaValidationFlags.AllowXmlAttributes)
    End If
    XmlReaderSettings.CloseInput = True

    AddHandler XmlReaderSettings.ValidationEventHandler, Sub() IsValidXMLDocument = False
    AddHandler XmlReaderSettings.ValidationEventHandler, AddressOf XMLValidationCallBack

    Using FileStream As New IO.FileStream(Path, IO.FileMode.Open)
        Using XmlReader As Xml.XmlReader = Xml.XmlReader.Create(FileStream, XmlReaderSettings)
            While XmlReader.Read
                'Read entire XML
            End While
        End Using
    End Using
End Function

山有枢 2024-08-01 01:46:54

In addition, when only verifying the syntactic correctness of the XML string (when there is no need to resolve an external schema), I think setting XmlResolver = null may be a good idea. This both avoids network access during parsing and improves security (it prevents malicious XML content from directing the code to access bad sites). Code follows (requires C# 2.0 or higher):

public static bool IsValidXml(string candidateString)
{
    try
    {
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.XmlResolver = null;
        XmlDocument document = new XmlDocument();
        document.XmlResolver = null;
        document.Load(XmlReader.Create(new MemoryStream(Encoding.UTF8.GetBytes(candidateString)), settings));
        return true;
    }
    catch (XmlException)
    {
        return false;
    }
}

A more concise version using var and object initializers (both available since C# 3.0, so this also compiles with C# 6.0 or higher):

public static bool IsValidXml(string candidateString)
{
    try
    {
        var settings = new XmlReaderSettings { XmlResolver = null };
        var document = new XmlDocument() { XmlResolver = null };
        document.Load(XmlReader.Create(new MemoryStream(Encoding.UTF8.GetBytes(candidateString)), settings));
        return true;
    }
    catch (XmlException)
    {
        return false;
    }
}
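
As a rough sketch of the same idea without building a DOM: the helper below (a hypothetical name, not taken from the answer above) streams through the input with XmlReader, sets XmlResolver to null as suggested, and additionally sets DtdProcessing.Prohibit so that any DOCTYPE is reported as an error. It assumes .NET 4.0 or later and still relies on catching XmlException, since the reader has no non-throwing mode for well-formedness errors.

```csharp
using System;
using System.IO;
using System.Xml;

public static class XmlWellFormedness
{
    // Hypothetical helper for illustration only.
    public static bool IsWellFormedXml(string candidate)
    {
        // Null and empty input can never be a document, so skip the parser entirely.
        if (string.IsNullOrEmpty(candidate)) return false;

        var settings = new XmlReaderSettings
        {
            XmlResolver = null,                     // never fetch external resources
            DtdProcessing = DtdProcessing.Prohibit  // any DOCTYPE raises XmlException
        };

        try
        {
            // StringReader avoids the UTF-8 byte round trip used above;
            // the reader walks the input without allocating a document tree.
            using (var reader = XmlReader.Create(new StringReader(candidate), settings))
            {
                while (reader.Read()) { }
            }
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }
}
```

With these settings a document that carries a DOCTYPE is rejected even if it is otherwise well-formed, which may or may not be what you want for your input.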

初与友歌 2024-08-01 01:46:54

My two cents. This was pretty simple and follows some common conventions since it's about parsing...

public bool TryParse(string s, out XmlDocument result)
{
    try {
        result = new XmlDocument();
        result.LoadXml(s);
        return true;
    } catch (XmlException) {
        // Follow the TryParse convention: no partially loaded document on failure
        result = null;
        return false;
    }
}

水中月 2024-08-01 01:46:53

I don't know a way of validating without the exception, but you can change the debugger settings to only break for XmlException if it's unhandled - that should solve your immediate issues, even if the code is still inelegant.

To do this, go to Debug / Exceptions... / Common Language Runtime Exceptions and find System.Xml.XmlException, then make sure only "User-unhandled" is ticked (not Thrown).

世界如花海般美丽 2024-08-01 01:46:53

Steve,

We had a third party that sometimes accidentally sent us JSON instead of XML. Here is what I implemented:

public static bool IsValidXml(string xmlString)
{
    Regex tagsWithData = new Regex("<\\w+>[^<]+</\\w+>");

    //Light checking
    if (string.IsNullOrEmpty(xmlString) || tagsWithData.IsMatch(xmlString) == false)
    {
        return false;
    }

    try
    {
        XmlDocument xmlDocument = new XmlDocument();
        xmlDocument.LoadXml(xmlString);
        return true;
    }
    catch (Exception e1)
    {
        return false;
    }
}

[TestMethod()]
public void TestValidXml()
{
    string xml = "<result>true</result>";
    Assert.IsTrue(Utility.IsValidXml(xml));
}

[TestMethod()]
public void TestIsNotValidXml()
{
    string json = "{ \"result\": \"true\" }";
    Assert.IsFalse(Utility.IsValidXml(json));
}
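
One caveat with the regex pre-check above: it only matches an element with no attributes and with text content, so well-formed input such as <result/> or <a b="1">x</a> is rejected before the parser ever sees it. A looser pre-filter, sketched here with a hypothetical LooksLikeXml helper, relies only on the fact that the first non-whitespace character of any well-formed document is '<'. It cannot prove well-formedness; it just lets obvious non-XML such as JSON skip the exception path, and anything it accepts still has to go through a real parse.

```csharp
public static class XmlSniffer
{
    // Hypothetical helper: a cheap filter, not a well-formedness check.
    public static bool LooksLikeXml(string text)
    {
        if (string.IsNullOrEmpty(text)) return false;

        foreach (char c in text)
        {
            // Skip leading whitespace and a byte order mark, if present.
            if (char.IsWhiteSpace(c) || c == '\uFEFF') continue;

            // The first significant character decides the outcome.
            return c == '<';
        }

        // Only whitespace was found.
        return false;
    }
}
```

In the method above, this check could stand in for the regex test, with the try/catch around LoadXml left in place for input that passes the filter but is still malformed.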

一腔孤↑勇 2024-08-01 01:46:53

That's a reasonable way to do it, except that the IsNullOrEmpty is redundant (LoadXml can figure that out fine). If you do keep IsNullOrEmpty, do if(!string.IsNullOrEmpty(value)).

Basically, though, your debugger is the problem, not the code.

烟柳画桥 2024-08-01 01:46:53

Add the [System.Diagnostics.DebuggerStepThrough] attribute to the IsValidXml method. This stops the debugger from breaking on the XmlException inside it, which means you can turn on breaking on first-chance exceptions and this particular method will not be debugged.
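
A minimal sketch of where the attribute goes, assuming a static wrapper class (XmlChecks is a made-up name) around the same LoadXml-in-try/catch approach from the question. The attribute changes only how the debugger treats the method; the exception is still thrown and caught at run time.

```csharp
using System.Diagnostics;
using System.Xml;

public static class XmlChecks
{
    // The attribute asks the debugger to step through this method
    // rather than stop inside it.
    [DebuggerStepThrough]
    public static bool IsValidXml(string value)
    {
        if (string.IsNullOrEmpty(value)) return false;

        try
        {
            new XmlDocument().LoadXml(value);
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }
}
```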

未央 2024-08-01 01:46:53

Be careful when using XmlDocument, because it is possible to load an element along the lines of <0>some text</0> using
XmlDocument doc = (XmlDocument)JsonConvert.DeserializeXmlNode(object) without an exception being thrown.

Numeric element names are not valid XML, and in my case no error occurred until I tried to write xmlDoc.InnerText to a SQL Server column of the xml data type.

This is how I validate now, so that an exception gets thrown:
XmlDocument tempDoc = (XmlDocument)JsonConvert.DeserializeXmlNode(formData.ToString(), "data");
doc.LoadXml(tempDoc.InnerXml);
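
For the specific problem of element names, a check that throws nothing is possible. The sketch below uses XmlConvert.IsStartNCNameChar and XmlConvert.IsNCNameChar (available since .NET 4.0) to test a candidate name character by character. IsValidElementName is a hypothetical helper; it assumes a name without a namespace prefix and does not handle characters outside the Basic Multilingual Plane.

```csharp
using System.Xml;

public static class XmlNames
{
    // Hypothetical helper: true if 'name' could serve as a non-prefixed element name.
    public static bool IsValidElementName(string name)
    {
        if (string.IsNullOrEmpty(name)) return false;

        // The first character has stricter rules: digits, '-' and '.' are not allowed.
        if (!XmlConvert.IsStartNCNameChar(name[0])) return false;

        for (int i = 1; i < name.Length; i++)
        {
            if (!XmlConvert.IsNCNameChar(name[i])) return false;
        }
        return true;
    }
}
```

Running JSON property names through a check like this before conversion would flag a key such as "0" early, rather than when the XML reaches the database.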

谁与争疯 2024-08-01 01:46:53

The XmlTextReader class is an implementation of XmlReader, and provides a fast, performant parser. It enforces the rules that XML must be well-formed. It is neither a validating nor a non-validating parser since it does not have DTD or schema information. It can read text in blocks, or read characters from a stream.

And an example from another MSDN article to which I have added code to read the whole contents of the XML stream.

string str = "<ROOT>AQID</ROOT>";
XmlTextReader r = new XmlTextReader(new StringReader(str));
try
{
  while (r.Read())
  {
  }
}
finally
{
  r.Close();
}

source: http://bytes.com/topic/c-sharp/answers/261090-check-wellformedness-xml
