How to extract multiple byte sequences from a file containing multiple XML documents

Posted on 2025-02-07 16:51:12

I have a .dll file that contains a file.blob. This blob consists of multiple concatenated chunks like this:

navbar\groupopenbutton[0].svg<?xml version='1.0' encoding='UTF-8'?>
<svg x="0px" y="0px" viewBox="0 0 32 32" version="1.1" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" xml:space="preserve" id="Layer_2" tag="Element1">
  <g tag="Element">
    <rect x="0" y="0" width="32" height="32" rx="0" ry="0" fill="#AFAFAF" class="Paint" tag="Paint" />
  </g>
</svg>    l  navbar\groupopenbutton_glyph.png‰PNG

   
IHDR   
      u0   gAMA  ±
üa     pHYs  ¼  ¼•¼rI  IDAT8OÝ“;Ja…ИJ›X)‰
ðш… HŠ)æýÀ"E
eÀZ[ÁX‰«Ð
$;ÉFÄ5X™ï2‡iEÈCþó˜ ÷fÜ?#MÓ«<ÏO$˜EÑXÒ¹ªªzq¿$Iòl¡ÑΔ^-S­Á ¾Á#¾ A°«¸
îtHaFqÎï±ìnP
`,¹ÆȲì^J60­M$+Ër‹-|2»……F;3¦¥ïû}ÕjPÜ#ø‚¿FŠßEQì+nƒÂy†?Fβ»A¡‚w’k¦Ä$˜×ú³™
ŒÆ2µÐhg¼[Ïó6U«Áf¶  îá“øÀúv·A8„F>¬‘ìnðÈSÞùLòÏàÜ
n
^¨2ñL    IEND®B`‚#     

This data describes an icon from one of my DevExpress skins.

As we can see, each chunk is an XML (SVG) document followed by a block of binary data, which is a PNG image. Note that the binary data doesn't belong to any tag; it sits outside the XML structure. Taken together, a chunk basically says "this is the group open button from the navbar, it has a viewBox of a particular size, and below are the bytes of the icon itself". The file.blob consists of many such chunks, which makes it tedious to read and parse.
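For the image part, one possible approach relies only on the PNG format itself rather than on this particular container, whose layout is not known from the sample. A PNG starts with a fixed 8-byte signature and is a sequence of chunks (4-byte big-endian length, 4-byte type, data, 4-byte CRC) ending with an IEND chunk. The sketch below walks those chunks to find where an embedded image begins and ends. The helper name carve_png is made up for illustration, and CRCs are not verified.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def carve_png(blob: bytes, start: int = 0):
    """Return (begin, end) offsets of the first PNG at or after `start`, or None."""
    begin = blob.find(PNG_SIGNATURE, start)
    if begin < 0:
        return None
    pos = begin + len(PNG_SIGNATURE)
    while pos + 12 <= len(blob):
        # Chunk header: 4-byte big-endian data length, then 4-byte chunk type.
        length, chunk_type = struct.unpack(">I4s", blob[pos:pos + 8])
        # Skip header (8 bytes), data (length bytes) and CRC (4 bytes).
        pos += 12 + length
        if pos > len(blob):
            return None  # chunk claims more data than the blob holds
        if chunk_type == b"IEND":
            return begin, pos
    return None
```

Slicing blob[begin:end] then gives the exact bytes of one image, with no text decoding involved, so none of the "weird" bytes are lost.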

My question is: what kind of data storage is this, and how do I read and parse it correctly? Ideally, I want a hashmap with the name (navbar\groupopenbutton) as the key and the binary data as the value. I don't think I can use regex, because I would have to treat everything as binary so as not to lose the data behind those "weird" symbols, and it would also be extremely slow. Any ideas?
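As a rough illustration of the name-to-bytes map, here is a minimal Python sketch that applies a regex directly to the raw bytes, so nothing has to be decoded as text. It rests on assumptions that the sample suggests but does not confirm: each entry name is a run of path-like ASCII characters ending in .svg or .png, and the payload (either `<?xml` or the PNG signature) follows it immediately. The names split_blob and ENTRY_NAME are hypothetical. If the container turns out to store explicit lengths, reading those would be more robust than scanning.

```python
import re

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

# Assumption: a name is path-like ASCII ending in .svg/.png, directly followed
# by the start of its payload (an XML declaration or the PNG signature).
ENTRY_NAME = re.compile(
    rb"([\w\\/\[\].-]+\.(?:svg|png))(?=<\?xml|" + re.escape(PNG_SIGNATURE) + rb")"
)

def split_blob(blob: bytes) -> dict:
    """Map each entry name to the raw bytes of its payload."""
    matches = list(ENTRY_NAME.finditer(blob))
    entries = {}
    for i, m in enumerate(matches):
        # A payload can extend at most to the start of the next entry name.
        limit = matches[i + 1].start() if i + 1 < len(matches) else len(blob)
        payload = blob[m.end():limit]
        name = m.group(1).decode("ascii")
        if name.endswith(".png"):
            end = payload.rfind(b"IEND")
            if end >= 0:
                payload = payload[:end + 8]  # IEND type (4 bytes) + CRC (4 bytes)
        else:
            end = payload.rfind(b"</svg>")
            if end >= 0:
                payload = payload[:end + len(b"</svg>")]
        entries[name] = payload
    return entries
```

Usage would be something like split_blob(open("file.blob", "rb").read()). One caveat of this heuristic: if a framing byte just before a name happens to be a letter, digit, or other path-like character, it will be picked up as part of the name.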

Maybe there is something in the Assembly class that I could use, or some special Stream that can parse this more cleanly?

Thanks.
