Backslashes in an XSD regex with xjc (ant) and JAXB validation
I have the following regex type in my xsd file:
<xsd:simpleType name="Host">
    <xsd:restriction base="xsd:string">
        <xsd:pattern
            value="\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b">
        </xsd:pattern>
    </xsd:restriction>
</xsd:simpleType>
When generating from this in ant via xjc, I am getting the following exception:
[xjc] [ERROR] InvalidRegex: Pattern value '\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b' is not a valid regular expression. The reported error was: 'This expression is not supported in the current option setting.' at column '2'.
[xjc] line 10 of file:/.../src/META-INF/portscan.xsd
I can fix this by changing every backslash (\) to a double backslash (\\):
<xsd:simpleType name="Host">
    <xsd:restriction base="xsd:string">
        <xsd:pattern
            value="\\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\b">
        </xsd:pattern>
    </xsd:restriction>
</xsd:simpleType>
But then, when validation runs during marshalling, I get the following exception:
Caused by - class org.xml.sax.SAXParseException: cvc-pattern-valid: Value '80.245.120.45' is not facet-valid with respect to pattern '\\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\b' for type 'Host'.
Obviously, the double backslash (\\) is what causes the validation to fail. But how can I encode a single backslash so that xjc works?
Edit:
Ah well, found the answer now: it seems "\b" isn't supported in xjc regexps. Leaving them out fixed the issue; it now generates without error and seems to work at runtime. Yay! :)
Though does anyone know if this is secure without the word boundaries? Maybe there's an alternative?
The regex flavor defined in the XML Schema specification does not support word boundaries.
In your case, the word boundaries are not needed. Pattern facets in XML schema types always require the regular expression to match the entire string, as if the regex started with a start-of-string anchor ^ or \A and ended with an end-of-string anchor $ or \z. Because XML schema regexes always match the whole string, you cannot use these anchors in your regexes either.
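To see the whole-string matching in practice, here is a minimal sketch using the JDK's built-in javax.xml.validation API. It assumes a hypothetical one-element schema (a single <host> element of type Host) and a made-up class name, HostPatternDemo; the pattern is the one from the question with the \b removed. It is not the asker's actual portscan.xsd or JAXB setup, only an illustration of how the pattern facet behaves.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class HostPatternDemo {
    // Octet expression from the question; the \b word boundaries are left out.
    private static final String OCTET = "(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)";

    // Hypothetical minimal schema (assumption): one <host> element of type Host.
    // "\\." in Java source becomes a single-backslash \. in the schema text.
    private static final String XSD =
        "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>"
      + "<xsd:element name='host' type='Host'/>"
      + "<xsd:simpleType name='Host'>"
      + "<xsd:restriction base='xsd:string'>"
      + "<xsd:pattern value='" + OCTET + "\\." + OCTET + "\\." + OCTET + "\\." + OCTET + "'/>"
      + "</xsd:restriction>"
      + "</xsd:simpleType>"
      + "</xsd:schema>";

    // Returns true if <host>value</host> validates against the schema above.
    // The value is inserted unescaped, which is fine for this demo only.
    static boolean isValidHost(String value) throws Exception {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
        Validator validator = schema.newValidator();
        try {
            validator.validate(new StreamSource(new StringReader("<host>" + value + "</host>")));
            return true;
        } catch (SAXException e) {
            // Without a custom ErrorHandler, validation errors are thrown.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isValidHost("80.245.120.45"));   // true
        System.out.println(isValidHost("x80.245.120.45"));  // false
        System.out.println(isValidHost("80.245.120.456"));  // false
    }
}
```

The last two values each contain a substring that the pattern would accept, yet both are rejected, which is the implicit anchoring described above doing the job the word boundaries were meant to do.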