XSLT: Obtaining or matching hashes for base64 encoded data
I need a way to compute the hash of the base64-encoded data in the XML node //note/resource/data, or otherwise match that data to the hash value in the node //note/content/en-note//en-media/@hash
See below for the full XML file
Please suggest a way to {obtain|match} using XSLT
4aaafc3e14314027bb1d89cf7d59a06c
{from|with}
R0lGODlhEAAQAPMAMcDAwP/crv/erbigfVdLOyslHQAAAAECAwECAwECAwECAwECAwECAwECAwEC
AwECAyH/C01TT0ZGSUNFOS4wGAAAAAxtc09QTVNPRkZJQ0U5LjAHgfNAGQAh/wtNU09GRklDRTku
MBUAAAAJcEhZcwAACxMAAAsTAQCanBgAIf8LTVNPRkZJQ0U5LjATAAAAB3RJTUUH1AkWBTYSQXe8
fQAh+QQBAAAAACwAAAAAEAAQAAADSQhgpv7OlDGYstCIMqsZAXYJJEdRQRWRrHk2I9t28CLfX63d
ZEXovJ7htwr6dIQB7/hgJGXMzFApOBYgl6n1il0Mv5xuhBEGJAAAOw==
This sample XML file has obviously been trimmed for brevity. The actual file may contain more than one image per note, hence the need to obtain or match hashes.
The XML file:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE en-export SYSTEM "http://xml.evernote.com/pub/evernote-export.dtd">
<en-export export-date="20091029T063411Z" application="Evernote/Windows" version="3.0">
<note>
<title>A title here</title>
<content><![CDATA[
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE en-note SYSTEM "http://xml.evernote.com/pub/enml.dtd">
<en-note bgcolor="#FFFFFF">
<p>Some text here (followed by the picture)
<p><en-media hash="4aaafc3e14314027bb1d89cf7d59a06c" type="image/gif" border="0" width="16" height="16" alt="A picture"/></p>
<p>Some more text here (preceded by the picture)
</en-note>
]]></content>
<created>20090925T063154Z</created>
<note-attributes>
<author/>
</note-attributes>
<resource>
<data encoding="base64">
R0lGODlhEAAQAPMAMcDAwP/crv/erbigfVdLOyslHQAAAAECAwECAwECAwECAwECAwECAwECAwEC
AwECAyH/C01TT0ZGSUNFOS4wGAAAAAxtc09QTVNPRkZJQ0U5LjAHgfNAGQAh/wtNU09GRklDRTku
MBUAAAAJcEhZcwAACxMAAAsTAQCanBgAIf8LTVNPRkZJQ0U5LjATAAAAB3RJTUUH1AkWBTYSQXe8
fQAh+QQBAAAAACwAAAAAEAAQAAADSQhgpv7OlDGYstCIMqsZAXYJJEdRQRWRrHk2I9t28CLfX63d
ZEXovJ7htwr6dIQB7/hgJGXMzFApOBYgl6n1il0Mv5xuhBEGJAAAOw==
</data>
<mime>image/gif</mime>
<resource-attributes>
<file-name>clip_image001.gif</file-name>
</resource-attributes>
</resource>
</note>
</en-export>
Implemented solution
This uses the concept of the solution suggested by Jackem. The main difference is that I avoid creating my own Java class (and with it an extra dependency). I do the processing within the XSLT, since it is straightforward enough, and only reference classes that ship with the standard Java libraries.
Jackem's solution is more correct because it does not lose the leading zero in some hashes; however, I found it much easier to take care of that elsewhere with a little basic hackery.
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
...
xmlns:md5="java.security.MessageDigest"
xmlns:bigint="java.math.BigInteger"
exclude-result-prefixes="md5 bigint">
...
<xsl:for-each select="resource">
<xsl:variable name="md5inst" select="md5:getInstance('MD5')" />
<xsl:value-of select="md5:update($md5inst, $b64bin)" />
<xsl:variable name="imgmd5bytes" select="md5:digest($md5inst)" />
<xsl:variable name="imgmd5bigint" select="bigint:new(1, $imgmd5bytes)" />
<xsl:variable name="imgmd5str" select="bigint:toString($imgmd5bigint, 16)" />
<!-- NOTE: $imgmd5str loses the leading zero from imgmd5bytes (if there is one) -->
</xsl:for-each>
...
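The leading-zero caveat in the comment above can be reproduced outside the stylesheet. The following is a minimal Java sketch, not the code used in the question: the class and method names (`HexPadding`, `unpaddedHex`, `paddedHex`, `md5`) are hypothetical. It shows that `BigInteger.toString(16)` drops leading zeros, and that left-padding to two hex digits per digest byte restores them.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class, for illustration only.
final class HexPadding {

    // Mirrors bigint:toString($imgmd5bigint, 16): leading zeros are lost.
    static String unpaddedHex(byte[] digest) {
        return new BigInteger(1, digest).toString(16);
    }

    // Left-pads with zeros so each digest byte yields two hex digits.
    static String paddedHex(byte[] digest) {
        int width = digest.length * 2;
        return String.format("%0" + width + "x", new BigInteger(1, digest));
    }

    // Computes the raw MD5 digest of the given bytes.
    static byte[] md5(byte[] data) {
        try {
            return MessageDigest.getInstance("MD5").digest(data);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    public static void main(String[] args) {
        // The MD5 of "a" happens to start with a zero nibble.
        byte[] digest = md5("a".getBytes(StandardCharsets.UTF_8));
        System.out.println(unpaddedHex(digest)); // 31 characters
        System.out.println(paddedHex(digest));   // 32 characters
    }
}
```

The same padding could also be done on the XSLT side by prepending zeros until the string is 32 characters long.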
P.S. See the sibling question for my implementation of the base64-to-image-file conversion.
This question is a subquestion of another question I have asked previously.
Comments (3)
For your related question about doing the base64 decoding in XSLT, you have accepted an answer which uses Saxon and Java extensions. So I assume you are OK with using those.
In that case, you can create an extension in Java for computing the MD5 sum:
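The answer's original code block did not survive on this page. As a stand-in, here is a minimal sketch of what such an extension class might look like. The class name `Md5Extension`, the method name `md5Hex`, and the `byte[]` parameter type are assumptions, not the answerer's code. The class is package-private here to keep the sketch self-contained; to be callable from Saxon it would typically need to be declared public and placed on the classpath.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical extension class; names are illustrative, not from the answer.
final class Md5Extension {

    // Returns the MD5 of the data as 32 lowercase hex digits,
    // keeping any leading zeros.
    public static String md5Hex(byte[] data) {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
        StringBuilder hex = new StringBuilder(32);
        for (byte b : md.digest(data)) {
            // Mask to an unsigned value so each byte formats as two digits.
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }
}
```

Formatting byte by byte avoids the leading-zero loss that comes with converting the digest through `BigInteger.toString(16)`.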
From your XSLT 2.0 stylesheet, which you run with Saxon, you can then just call that extension. Assuming you already have the base64-decoded data in the variable data (for example from the extension function saxon:base64Binary-to-octets, as in the linked answer):

The 4aaaf... is the MD5 of the binary data you get when you decode the base64-encoded data. I don't think you have any choice but to decode the contents of the <data> element and run it through an MD5 implementation, which is obviously outside the scope of an XSL transformation. Presumably, the result of the XSLT will be processed by some other code, which can extract and verify the images.
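The decode-then-hash check could be done in a post-processing step roughly as follows. This is a sketch only, assuming the text of the data element and the en-media hash attribute have already been extracted as strings. The class and method names (`ResourceHashCheck`, `md5HexOfBase64`, `matches`) are hypothetical.

```java
import java.math.BigInteger;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Hypothetical post-processing helper, for illustration only.
final class ResourceHashCheck {

    // Decodes base64 text (line breaks and surrounding whitespace allowed)
    // and returns the MD5 of the decoded bytes as 32 hex digits.
    static String md5HexOfBase64(String base64Text) {
        // The MIME decoder ignores characters outside the base64 alphabet,
        // so the line-wrapped element content can be passed in as-is.
        byte[] binary = Base64.getMimeDecoder().decode(base64Text);
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(binary);
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    // True when the resource data hashes to the value in en-media/@hash.
    static boolean matches(String base64Text, String expectedHash) {
        return md5HexOfBase64(base64Text).equalsIgnoreCase(expectedHash.trim());
    }

    public static void main(String[] args) {
        // "aGVsbG8gd29ybGQ=" is "hello world"; split over two lines on purpose.
        System.out.println(md5HexOfBase64("\naGVsbG8g\nd29ybGQ=\n"));
    }
}
```

With more than one image per note, each resource's computed hash can be compared against every en-media hash in that note to pair them up.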
How about this (add commons-codec to your classpath):