Update
The original question no longer fits this problem, so I'm leaving it as is to show what I tried and learned, and for background. It's clear that this is not just a "Base64 variation"; it is a bit more involved.
Background:
I program in Python 3.x, mainly for use with the open source program Blender. I'm a novice/amateur-level programmer, but I understand the big concepts fairly well.
I've read these articles relevant to my question.
Problem:
I have a binary file which contains 3D mesh data (lists of floats and lists of integers): the x, y, z coordinates of each vertex (floats) and the indices of the vertices that make up the faces of the mesh (integers). The file is organized in an XML-like way...
<SomeFieldLabel and header like info>**thensomedatabetween**</SomeFieldLabel>
Here is the example from the "Vertices" field
<Vertices vertex_count="42816" base64_encoded_bytes="513792" check_value="4133547451">685506bytes of b64 encoded data
</Vertices>
- There are 685506 bytes of data between "Vertices" and "/Vertices"
- Those bytes consist only of a-z, A-Z, 0-9, and +,/ which is standard for base64
- When I grab those bytes and use the standard base64 decode in Python, I get 513792 bytes back out
- If vertex_count="42816" can be believed, there should be 42816*12 bytes needed to represent x,y,z for each vertex. 42816*12 = 513792. Excellent.
- Now, if I try to unpack my decoded bytes as 32-bit floats, I get garbage... so something is amiss.
I'm thinking there is an extra cryptographic step somewhere. Perhaps there is a translation table, a rotation cipher, or some kind of stream cipher? It's strange that the number of bytes is correct but the results are not, which should limit the possibilities. Any ideas? Here are two example files with the file extension changed to *.mesh. I don't want to publicly out this file format; I just want to write an importer for Blender so I can use the models.
Here are two example files. I have extracted the raw binary (not b64 decoded) from the Vertices and Facets fields, and provided the bounding box information from a "Viewer" for this type of file provided by the company.
Example File 1
- Unmodified file
- Vertices binary: http://dl.dropbox.com/u/2586482/Mesh%20Data%20Demo/model2_base64vertices.data
- Facets binary
- Decrypted data: a .zip containing the decrypted Vertices field and the decrypted Facets field (mesh2.vertices and mesh2.faces respectively). It also contains an .stl mesh file which can be viewed/opened in many applications.
Example File 2
- Unmodified file
- Vertices binary: http://dl.dropbox.com/u/2586482/Mesh%20Data%20Demo/model3_base64vertices.data
- Facets binary
- Bounding Box: Min[-4.6, -40.3, -7.3] Max[7.5, -23.1, 2.6]
Notes About the Vertices field
- The header specifies the vertex_count
- The header specifies base64_encoded_bytes which is the # of bytes BEFORE base64 encoding takes place
- The header specifies a "check_value" whose significance is yet to be determined
- The data in the field only contains the standard base64 characters
- After standard base64 decoding the output data has... length = vertex_count*12 = base64_encoded_bytes. Occasionally there are 4 extra bytes in the b64 output?
- The ratio of encoded/decoded bytes is 4/3, which is also typical of base64
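The checks in these notes can be sketched in a few lines of Python. This is only an illustration of the length check and the float unpacking: the function name `decode_vertices` is made up for this example, and little-endian byte order is an assumption. On the real files it still yields garbage floats, as described, because whatever extra transformation is involved is not handled here. As a side note, a strict 4/3 ratio on 513792 bytes gives 685056 characters, so the quoted 685506 presumably includes line breaks or is a typo; that is a guess.

```python
import base64
import struct


def decode_vertices(b64_text, vertex_count):
    """Base64-decode a Vertices payload and unpack it as float32 triples.

    Assumes little-endian floats, 3 per vertex (x, y, z).
    """
    # By default b64decode discards characters outside the base64
    # alphabet, so embedded newlines do not need to be stripped first.
    raw = base64.b64decode(b64_text)
    expected = vertex_count * 12  # 3 floats * 4 bytes per vertex
    if len(raw) < expected:
        raise ValueError(f"decoded {len(raw)} bytes, expected at least {expected}")
    # Ignore trailing bytes (the occasional 4 extra bytes noted above).
    floats = struct.unpack(f"<{vertex_count * 3}f", raw[:expected])
    return [floats[i:i + 3] for i in range(0, len(floats), 3)]
```

Calling it with the field contents and the `vertex_count` from the header returns a list of `(x, y, z)` tuples, or raises `ValueError` if the payload is shorter than the header implies.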
Notes about the Facets field
- The header specifies a facet_count
- The header specifies base64_encoded_bytes, which is the # of bytes BEFORE base64 encoding takes place
- The ratio of base64_encoded_bytes/facet_count seems to vary quite a bit, from 1.1 to about 1.2. We would expect a ratio of 12 if they were encoded as 3x4byte integers corresponding to the vertex indices. So either this field is compressed or the model is saved with triangle strips, or both :-/
More Snooping
I opened up the viewer.exe (in a hex editor) which is provided by the company to view these files (also where I got the bounding box info). Here are some snippets which I found interesting and could further the search.
f_LicenseClient...Ì[email protected][email protected][email protected][email protected]_bLoadXXXXXXInternalEncrypted...¼[email protected]_strSiteKey....í†......
In LoadXXXXXXInternalEncrypted and SaveXXXXXXInternalEncrypted I've blocked out the company name with XX. It looks like we definitely have some encryption beyond a simple base64 table variation.
SaveEncryptedModelToStream.................Self...pUx....Model...ˆÃC....Stream....
This to me looks like a function definition on how to save an encrypted model.
DefaultEncryptionMethod¼!@........ÿ.......€...€ÿÿ.DefaultEncryptionKey€–†....ÿ...ÿ.......€....ÿÿ.DefaultIncludeModelData –†....ÿ...ÿ.......€...€ÿÿ.DefaultVersion.@
Ahhh...now that is interesting. A default encryption key. Notice there are 27 bytes between each of those descriptors and they always end with "ÿÿ." Here is 24 bytes excluding "ÿÿ." To me, this is a 192 bit key...but who knows if all 24 of those bytes correspond to the key? Any thoughts?
80 96 86 00 18 00 00 FF 18 00 00 FF 01 00 00 00 00 00 00 80 01 00 00 00
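One way to look at these candidate bytes more closely is to load them in Python and view them as 32-bit words. This is only an inspection aid: the little-endian word split is an assumption, and nothing here establishes which bytes, if any, belong to a key.

```python
import struct

# The 24 bytes copied from the hex editor, as listed above.
candidate = bytes.fromhex(
    "80 96 86 00 18 00 00 FF 18 00 00 FF "
    "01 00 00 00 00 00 00 80 01 00 00 00"
)

print(len(candidate), "bytes =", len(candidate) * 8, "bits")

# View the same bytes as six little-endian unsigned 32-bit words.
words = struct.unpack("<6I", candidate)
for index, word in enumerate(words):
    print(f"word {index}: 0x{word:08X}")
```

Seen this way, the second and third words are identical and several of the others are small integers, which may be worth keeping in mind before treating all 24 bytes as key material.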
Code Snippets
To save space in this thread, I put this script in my Dropbox for download. It reads through the file, extracts basic info from the vertices and facets fields, and prints out a bunch of stuff. You can uncomment the end to have it save a data block into a separate file for easier analysis.
basic_mesh_read.py
This is the code I used to try all "reasonable" variations on the standard base64 library.
try_all_b64_tables.py
Comments (1)
I am not sure why you think the results are not floating point numbers. The vertices data in the "decrypted data" you gave contains "f2 01 31 41" as its first 4 bytes. Given an LSB-first (little-endian) byte order, that corresponds to the bit pattern "413101f2", which is the IEEE 754 representation of the float value 11.062973. All the 4-byte values in that file are in that same range, so I assume they all are float values.
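The conversion can be checked quickly with Python's `struct` module. This is a small sketch using only the four bytes quoted above; the variable names are just for illustration.

```python
import struct

# First four bytes of the decrypted vertices data, as quoted above.
raw = bytes.fromhex("f2 01 31 41")

# Reversing the bytes gives the big-endian bit pattern.
bit_pattern = raw[::-1].hex()

# "<f" reads the bytes as a little-endian IEEE 754 single-precision float.
(value,) = struct.unpack("<f", raw)

print(bit_pattern, value)
```

The same format string with a count (for example `"<3f"` on 12 bytes) reads a whole x, y, z triple at once.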