Using XSD for XML validation in Java

Posted 2024-10-10 02:42:57

I have the following class:

package com.somedir.someotherdir;

import java.util.logging.Level;
import java.util.logging.Logger;

import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;

public class SchemaValidator
{
 private static Logger _logger = Logger.getLogger(SchemaValidator.class.getName());

 /**
  * @param file - the relative path to and the name of the XML file to be validated
  * @return true if validation succeeded, false otherwise
  */
 public final static boolean validateXML(String file)
 {
  try
  {
   SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
   Schema schema = factory.newSchema();
   Validator validator = schema.newValidator();
   validator.validate(new StreamSource(file));
   return true;
  }
  catch (Exception e)
  {
   _logger.log(Level.WARNING, "SchemaValidator: failed validating " + file + ". Reason: " + e.getMessage(), e);
   return false;
  }
 }
}

I would like to know whether I should use schema.newValidator("dir/to/schema.xsd") after all, or whether the current version is all right. I read that there is some DoS vulnerability; maybe someone could provide more info on that? Also, does the path have to be absolute, or can it be relative?
Most of the XMLs to be validated each have their own XSD, so I'd like to read the schema that is mentioned in the XML itself (xs:noNamespaceSchemaLocation="schemaname.xsd").
The validation is done only during startup or manual reload (server software).

Comments (2)

乱世争霸 2024-10-17 02:42:57

Do you mean the XML DTD DoS attack? If so, there are some good articles on the net:

XML Denial of Service Attacks and Defenses http://msdn.microsoft.com/en-us/magazine/ee335713.aspx

From IBM developerWorks. "Tip: Configure SAX parsers for secure processing":

Entity resolution opens a number of potential security holes in XML.[...]
- The site where the external DTD is hosted can log the communication. [...]
- The site that hosts the DTD can slow the parsing [...] It can also stop the parse completely by serving a malformed DTD.
- If the remote site changes the DTD, it can use default attribute values to inject new content into the document[...] It can change the content of the document by redefining entity references.

Though I am not sure it applies directly to your program, it can give some clues for further investigation.
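
To make those clues a bit more concrete, here is a minimal sketch of how the javax.xml.validation classes from the question could be locked down. It assumes the JDK's built-in JAXP implementation (Java 7u40 or later), where the XMLConstants.ACCESS_EXTERNAL_DTD and XMLConstants.ACCESS_EXTERNAL_SCHEMA properties are recognized; other implementations may reject them. The class name HardenedValidation and the idea of passing the schema and document as strings are illustrative assumptions, not something taken from the articles above.

```java
import java.io.StringReader;

import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;

// Hypothetical helper: validates an XML string against an explicitly supplied XSD string
// while refusing to load any external DTD or schema.
public class HardenedValidation {

    public static boolean isValid(String xsd, String xml) {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            // Ask the implementation to apply its processing limits (entity expansion and so on).
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            // An empty protocol list means: no external DTDs or schemas may be fetched.
            factory.setProperty(XMLConstants.ACCESS_EXTERNAL_DTD, "");
            factory.setProperty(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");

            // The schema is given explicitly instead of being taken from hints in the document.
            Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));

            Validator validator = schema.newValidator();
            validator.setProperty(XMLConstants.ACCESS_EXTERNAL_DTD, "");
            validator.setProperty(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");

            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            // Like validateXML in the question, any failure is reported as "not valid".
            // Real code would probably log e and distinguish configuration errors.
            return false;
        }
    }
}
```

With a schema that declares a single element count of type xs:int, isValid(xsd, "<count>5</count>") should return true and isValid(xsd, "<count>five</count>") should return false. If the schemas legitimately need local files, the property value "file" instead of "" would allow only that protocol.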

随梦而飞# 2024-10-17 02:42:57

As I interpret it, the javax.xml.validation.Schema object returned by SchemaFactory.newSchema() will try to fetch other schemas referenced in the xml/xsd files being validated, as indicated in the corresponding xsi:schemaLocation attributes. This implies that:

  1. If your schemas refer to schemas hosted on the internet, the Schema object will try to fetch them at runtime. As far as I'm aware, the default Schema implementation does not cache those schemas. The W3C has already reported on bad coding practices resulting in a de facto DDoS of their website (up to 130M DTD requests per day!).
  2. If you are going to validate external, uncontrolled XML files, then you are also exposed to the Schema trying to fetch other schemas from possibly malicious XML sources.

For more evil attack vectors, take a look at sign's previous answer.

To avoid this pitfall, you can store all external resources locally and use the SchemaFactory.setResourceResolver method to instruct the Schema how to fetch them.
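
A rough sketch of what such a resolver could look like, under a few assumptions: the class name LocalSchemaResolver is made up, the in-memory map stands in for whatever local storage you use (files, classpath resources), and lookups are keyed on the literal schemaLocation value, which is what the JDK's built-in implementation passes as systemId. The LSInput is obtained from the DOM Load/Save implementation so that the interface does not have to be implemented by hand.

```java
import java.io.StringReader;
import java.util.Map;

import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;

import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSResourceResolver;

// Hypothetical resolver that serves referenced schemas from a local map
// (schemaLocation value -> schema text) instead of fetching them.
public class LocalSchemaResolver implements LSResourceResolver {
    private final Map<String, String> localSchemas;
    private final DOMImplementationLS ls;

    public LocalSchemaResolver(Map<String, String> localSchemas) {
        this.localSchemas = localSchemas;
        try {
            this.ls = (DOMImplementationLS)
                    DOMImplementationRegistry.newInstance().getDOMImplementation("LS");
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("No DOM Load/Save implementation available", e);
        }
    }

    @Override
    public LSInput resolveResource(String type, String namespaceURI, String publicId,
                                   String systemId, String baseURI) {
        String content = (systemId == null) ? null : localSchemas.get(systemId);
        if (content == null) {
            // Returning null hands the request back to the default resolution,
            // which may still go to the network unless that is restricted elsewhere.
            return null;
        }
        LSInput input = ls.createLSInput();
        input.setPublicId(publicId);
        input.setSystemId(systemId);
        input.setBaseURI(baseURI);
        input.setCharacterStream(new StringReader(content));
        return input;
    }

    // Convenience wrapper for illustration: builds the schema with the resolver installed
    // and reports any failure as "not valid".
    public static boolean isValid(String mainXsd, Map<String, String> localSchemas, String xml) {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            factory.setResourceResolver(new LocalSchemaResolver(localSchemas));
            Schema schema = factory.newSchema(new StreamSource(new StringReader(mainXsd)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }
}
```

For example, if the main schema contains an xs:include with schemaLocation="types.xsd", a map with the key "types.xsd" lets the factory build the schema without touching the file system or the network. Note that the resolver has to be set on the factory before newSchema(...) is called.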