Creating a binary file in UNIX

Posted on 2024-12-14 12:47:28

This question was out there for a while and I thought I should offer some bonus points if I can get it to work.

What did I do…

Recently at work, I wrote a parser that converts a binary file into a readable format. The binary file isn't an ASCII file with 10101010 characters; it has been encoded in binary. So if I do a cat on the file, I get the following -

[jaypal~/Temp/GTP]$ cat T20111017153052.NEW 
==?sGTP?ղ?N????W????&Xx1?T?&Xx1?;
?d@#e?
      ?0H????????|?X?@@(?ղ??VtPOC01
cceE??k@9??W傇??R?K?i2??d@#e???&Xx1&Xx??!?
blackberrynet?/??!

??!

??#ripassword??W傅?W傆??0H??
                            #R??@Vtc@@(?ղ??n?POC01

So I used the hexdump utility to display the file's content as follows and redirected it to a file. Now I had my output file, which was a text file containing hex values.

[jaypal~/Temp/GTP]$ hexdump -C T20111017153052.NEW 
00000000  3d 3d 01 f8 73 47 54 50  02 f1 d5 b2 be 4e e4 d7  |==..sGTP.....N..|
00000010  00 01 01 00 01 80 00 cc  57 e5 82 00 00 00 00 00  |........W.......|
00000020  00 00 00 00 00 00 00 00  87 d3 f5 13 00 00 00 00  |................|
00000030  00 00 00 00 00 00 00 00  00 00 00 00 00 01 00 10  |................|
00000040  01 01 0f 00 00 00 00 00  26 58 78 31 00 b3 54 c5  |........&Xx1..T.|
00000050  26 58 78 31 00 b4 3b 0a  00 00 ad 64 13 40 01 03  |&Xx1..;....d.@..|
00000060  23 16 65 f3 01 01 0b 91  30 19 48 99 f2 ff ff ff  |#.e.....0.H.....|
00000070  ff ff ff 02 00 7c 00 dc  01 58 00 a0 40 40 28 02  |.....|...X..@@(.|
00000080  f1 d5 b2 b8 ca 56 74 50  4f 43 30 31 00 00 00 00  |.....VtPOC01....|
00000090  00 04 0a 63 63 07 00 00  00 00 00 00 00 00 00 00  |...cc...........|
000000a0  00 00 00 65 45 00 00 b4  fb 6b 40 00 39 11 16 cd  |...eE....k@.9...|
000000b0  cc 57 e5 82 87 d3 f5 52  85 a1 08 4b 00 a0 69 02  |.W.....R...K..i.|
000000c0  32 10 00 90 00 00 00 00  ad 64 00 00 02 13 40 01  |2........d....@.|

After tons of awk, sed and cut, the script converted the hex values into readable text. To do so, I used offset positioning, which marks the start and end position of each parameter to be converted. The resulting file after all the conversions looks like this:

[jaypal:~/Temp/GTP] cat textfile.txt 
Beginning of DB Package Identifier: ==
Total Package Length: 508
Offset to Data Record Count field: 115
Data Source: GTP
Timestamp: 2011-10-25
Matching Site Processor ID: 1
DB Package format version: 1
DB Package Resolution Type: 0
DB Package Resolution Value: 1
DB Package Resolution Cause Value: 128
Transport Protocol: 0
SGSN IP Address: 220.206.129.47
GGSN IP Address: 202.4.210.51
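The offset-based decoding described above can be sketched in plain shell. This is only an illustration, not the actual parser: the offsets and widths used here (length at byte 2, two bytes, most significant byte first; SGSN address at byte 23; GGSN address at byte 40) are inferred by matching this text output against the hex text of the generated file shown later in this post, not taken from a format specification, and `field` and `hex_to_ip` are made-up helper names.

```shell
#!/bin/sh
# First 48 bytes of the package as plain hex, copied from the hex text in the
# "Problem" section of this post.
hex='3d3d01fc7347545002f1d6553c9f499c'
hex="${hex}00010100018000dcce812f0000000000"
hex="${hex}0000000000000000ca04d23300000000"

# field OFFSET LENGTH: print LENGTH bytes starting at byte OFFSET, as hex digits.
field() {
    printf '%s\n' "$hex" | cut -c"$(( $1 * 2 + 1 ))-$(( ($1 + $2) * 2 ))"
}

# hex_to_ip HEX8: print four hex bytes as a dotted-quad address.
hex_to_ip() {
    printf '%d.%d.%d.%d\n' \
        "0x$(printf '%s\n' "$1" | cut -c1-2)" \
        "0x$(printf '%s\n' "$1" | cut -c3-4)" \
        "0x$(printf '%s\n' "$1" | cut -c5-6)" \
        "0x$(printf '%s\n' "$1" | cut -c7-8)"
}

# Offsets below are assumptions read off the dump, not a documented layout.
echo "Total Package Length: $(( 0x$(field 2 2) ))"
echo "Offset to Data Record Count field: $(( 0x$(field 4 1) ))"
echo "SGSN IP Address: $(hex_to_ip "$(field 23 4)")"
echo "GGSN IP Address: $(hex_to_ip "$(field 40 4)")"
```

Run as-is, it should print the same four values as the corresponding lines of textfile.txt above.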

Why did I do it

I am a test engineer, and manually validating binary files was a major pain. I had to work through the offsets by hand, convert the values with a calculator, and validate them against Wireshark and the GUI.

Now the question part

I wish to do the reverse of what I did. This was my plan -

  • Have an easy-to-read input text file containing Parameter : Value pairs.
  • The user can simply put values next to the parameters (e.g. Date would be a parameter, and the user can give the date they want the data file to have).
  • The script will cut out all relevant information (the user-provided values) from the input text file and convert it into hex values.
  • Once the file has been converted into hex values, I wish to encode it back into binary.
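The value-to-hex step in the plan above could look roughly like this. It is a sketch rather than the actual script: the helper names are hypothetical, and the field widths and the most-significant-byte-first order are assumptions read off the dumps in this post.

```shell
#!/bin/sh
# num_to_hex VALUE NBYTES: unsigned integer as NBYTES bytes of hex,
# most significant byte first (big-endian is an assumption here).
num_to_hex() {
    printf "%0$(( $2 * 2 ))x\n" "$1"
}

# ip_to_hex A.B.C.D: dotted-quad address as four hex bytes.
ip_to_hex() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address on dots into $1..$4
    IFS=$old_ifs
    printf '%02x%02x%02x%02x\n' "$1" "$2" "$3" "$4"
}

# str_to_hex STRING: ASCII text as hex bytes.
str_to_hex() {
    printf '%s' "$1" | od -An -v -tx1 | tr -d ' \n'
    echo
}

num_to_hex 508 2            # Total Package Length
num_to_hex 115 1            # Offset to Data Record Count field
str_to_hex GTP              # Data Source
ip_to_hex 220.206.129.47    # SGSN IP Address
```

Concatenating the pieces in field order would give the kind of continuous hex text that the last step then has to turn into bytes.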

First three steps are done

Problem

Once my script converts the input text file into a text file with hex values, I get a file like the following (notice I can do cat on it).

[visdba@hw-diam-test01 ParserDump]$ cat temp_file | sed 's/.\{32\}/&\n/g' | sed 's/../& /g'
3d 3d 01 fc 73 47 54 50 02 f1 d6 55 3c 9f 49 9c
00 01 01 00 01 80 00 dc ce 81 2f 00 00 00 00 00
00 00 00 00 00 00 00 00 ca 04 d2 33 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 10
01 01 0f 00 00 07 04 ea 00 00 ff ff 00 00 14 b7
00 00 ff ff 00 00 83 ec 00 00 83 62 54 14 59 00
60 38 34 f5 01 01 0b 58 62 70 11 60 f6 ff ff ff
ff ff ff 02 00 7c 00 d0 01 4c 00 b0 40 40 28 02
f1 d6 55 38 cb 2b 23 50 4f 43 30 31 00 00 00 00
00 04 0a 63 63 07 00 00 00 00 00 00 00 00 00 00

My intention is to encode this converted file into binary, so that when I do cat on the file, I get a bunch of garbage values.

[jaypal~/Temp/GTP]$ cat temp.file 
==?sGTP?ղ?N????W????&Xx1?T?&Xx1?;
?d@#e?
      ?0H????????|?X?@@(?ղ??VtPOC01
cceE??k@9??W傇??R?K?i2??d@#e???&Xx1&Xx??!?
blackberrynet?/??!

??!

So the question is this. How do I encode it in this form?

Why do I want to do this?

We don't have a lot of GTP (GPRS Tunnelling Protocol) messages on production. I thought if I reverse engineer this, I could effectively create a data generator and make my own data.

To sum things up

There may be sophisticated tools out there, but I don't want to spend too much time learning them. It's been around 2 months since I started working on the *nix platform, and I am just getting my hands around its power tools like sed and awk.

What I do want is some help and guidance to make this happen.

Thanks again for reading! 200 points await someone who can guide me in the right direction. :)

Sample Files

Here is a sample of Original Binary File

Here is a sample of Input Text File that would allow the User to punch in values

Here is a sample of File that my script creates after all the conversion from the Input Text File is complete.

How do I change the encoding of File 3 to File 1?

Comments (5)

赴月观长安 2024-12-21 12:47:28

You can use xxd to convert to and from binary files / hexdumps quite simply.

data to hex

echo  Hello | xxd -p 
48656c6c6f0a

hex to data

echo 48656c6c6f0a | xxd -r -p
Hello

or

echo 48 65 6c 6c 6f 0a | xxd -r -p
Hello

The -p option selects xxd's plain ("postscript") hexdump style; combined with -r, it accepts free-form input without offsets or a fixed column layout.
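Applied to a multi-line, space-separated hex file like the one in the question, the same command works unchanged, and dumping the result back to hex is a quick way to confirm nothing was lost. A small sketch, assuming `xxd` is installed (it usually ships with Vim) and using throwaway temp files with made-up names:

```shell
#!/bin/sh
tmp=$(mktemp -d)

# Two lines of the hex text from the question.
cat > "$tmp/hex.txt" <<'EOF'
3d 3d 01 fc 73 47 54 50 02 f1 d6 55 3c 9f 49 9c
00 01 01 00 01 80 00 dc ce 81 2f 00 00 00 00 00
EOF

# Hex text -> binary.
xxd -r -p "$tmp/hex.txt" > "$tmp/out.bin"

# Binary -> hex text again, reshaped to 16 space-separated bytes per line
# so it can be compared with the input.
xxd -p -c 16 "$tmp/out.bin" | sed 's/../& /g; s/ $//' > "$tmp/roundtrip.txt"

if cmp -s "$tmp/hex.txt" "$tmp/roundtrip.txt"; then
    echo "round trip OK"
fi
```

The comparison only succeeds if the input already uses lowercase hex and 16 bytes per line, as the sample does; for other layouts, normalise both sides first.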

This is the output from xxd -r -p text where text is the data you give above

==▒sGTP▒▒U<▒I▒▒▒΁/▒▒3▒▒▒▒▒▒▒▒▒bTY`84▒
                                     Xbp`▒▒▒▒▒▒▒|▒L▒@@(▒▒U8▒+#POC01
:▒ިv▒b▒▒▒▒TY`84Ud▒▒▒▒>▒▒▒▒▒▒▒!▒
blackberrynet▒/▒▒!
M
▒▒!
N
▒▒#Oripassword▒▒΁/▒▒΁/▒▒Xbp`▒@@(▒▒U8▒IvPOC01
:qU▒b▒▒▒▒▒▒TY`84U▒▒▒*:▒▒!
▒k▒▒▒#O Welcmme!
▒!
M

水晶透心 2024-12-21 12:47:28

Using cut and awk, you can do it fairly simply using a gawk (GNU Awk) extension function, strtonum():

cut -c11-60 inputfile |
awk '{ for (i = 1; i <= NF; i++)
       {
           c = strtonum("0x" $i)
           printf("%c", c);
       }
     }' > outputfile

Or, if you are using a non-GNU version of 'new awk', then you can use:

cut -c11-60 inputfile |
awk '{  for (i = 1; i <= NF; i++)
        {
            s = toupper($i)
            c0 = index("0123456789ABCDEF", substr(s, 1, 1)) - 1
            c1 = index("0123456789ABCDEF", substr(s, 2, 1)) - 1
            printf("%c", c0*16 + c1);
        }
     }' > outputfile
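Both scripts above expect `hexdump -C` layout, because `cut -c11-60` strips the offset column and the ASCII column. For plain hex text such as the `temp_file` in the question, a variant along these lines might be used instead. It is a sketch under stated assumptions: `plainhex_to_bin` is a made-up name, each input line is assumed to hold whole bytes, `LC_ALL=C` is set so that locale-aware awk implementations (gawk, for instance) emit one byte per `%c` rather than a multibyte character, and the awk in use is assumed to be able to print a NUL byte with `%c`, which may not hold for every implementation.

```shell
#!/bin/sh
# plainhex_to_bin: read hex digits on stdin (spaces allowed, whole bytes per
# line), write the corresponding bytes to stdout.
plainhex_to_bin() {
    LC_ALL=C awk '
    {
        gsub(/[^0-9A-Fa-f]/, "")          # keep hex digits only
        s = toupper($0)
        for (i = 1; i < length(s); i += 2) {
            c0 = index("0123456789ABCDEF", substr(s, i, 1)) - 1
            c1 = index("0123456789ABCDEF", substr(s, i + 1, 1)) - 1
            printf("%c", c0 * 16 + c1)
        }
    }'
}

echo '48 65 6c 6c 6f 0a' | plainhex_to_bin
```

Piping the result through `hexdump -C` or `od -An -v -tx1` is a simple way to check that bytes such as 00 and ff survived on a particular system.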

If you want to use other tools (Perl and Python spring to mind; Ruby would be another possibility), you can do it easily enough.

odx is a program similar to the hexdump program. The script above was modified to read 'hexdump.out' as the input file, and the output piped into odx instead of a file, and gives the following output:

$ cat hexdump.out
00000000  3d 3d 01 fc 73 47 54 50  02 f1 d6 55 3c 9f 49 9c  |==..sGTP...U<.I.|
00000010  00 01 01 00 01 80 00 dc  ce 81 2f 00 00 00 00 00  |........../.....|
00000020  00 00 00 00 00 00 00 00  ca 04 d2 33 00 00 00 00  |...........3....|
00000030  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 10  |................|
00000040  01 01 0f 00 00 07 04 ea  00 00 ff ff 00 00 14 b7  |................|
00000050  00 00 ff ff 00 00 83 ec  00 00 83 62 54 14 59 00  |...........bT.Y.|
00000060  60 38 34 f5 01 01 0b 58  62 70 11 60 f6 ff ff ff  |`84....Xbp.`....|
00000070  ff ff ff 02 00 7c 00 d0  01 4c 00 b0 40 40 28 02  |.....|...L..@@(.|
$ sh -x revdump.sh | odx
+ cut -c11-60 hexdump.out
+ awk '{  for (i = 1; i <= NF; i++)
        {
            #c = strtonum("0x" $i)
            #printf("%c", c);
            s = toupper($i)
            c0 = index("0123456789ABCDEF", substr(s, 1, 1)) - 1
            c1 = index("0123456789ABCDEF", substr(s, 2, 1)) - 1
            printf("%c", c0*16 + c1);
        }
     }'
0x0000: 3D 3D 01 FC 73 47 54 50 02 F1 D6 55 3C 9F 49 9C   ==..sGTP...U<.I.
0x0010: 00 01 01 00 01 80 00 DC CE 81 2F 00 00 00 00 00   ........../.....
0x0020: 00 00 00 00 00 00 00 00 CA 04 D2 33 00 00 00 00   ...........3....
0x0030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 10   ................
0x0040: 01 01 0F 00 00 07 04 EA 00 00 FF FF 00 00 14 B7   ................
0x0050: 00 00 FF FF 00 00 83 EC 00 00 83 62 54 14 59 00   ...........bT.Y.
0x0060: 60 38 34 F5 01 01 0B 58 62 70 11 60 F6 FF FF FF   `84....Xbp.`....
0x0070: FF FF FF 02 00 7C 00 D0 01 4C 00 B0 40 40 28 02   .....|...L..@@(.
0x0080:
$ 

Or, using hexdump -C in place of odx:

$ sh -x revdump.sh | hexdump -C
+ cut -c11-60 hexdump.out
+ awk '{  for (i = 1; i <= NF; i++)
        {
            #c = strtonum("0x" $i)
            #printf("%c", c);
            s = toupper($i)
            c0 = index("0123456789ABCDEF", substr(s, 1, 1)) - 1
            c1 = index("0123456789ABCDEF", substr(s, 2, 1)) - 1
            printf("%c", c0*16 + c1);
        }
     }'
00000000  3d 3d 01 fc 73 47 54 50  02 f1 d6 55 3c 9f 49 9c  |==..sGTP...U<.I.|
00000010  00 01 01 00 01 80 00 dc  ce 81 2f 00 00 00 00 00  |........../.....|
00000020  00 00 00 00 00 00 00 00  ca 04 d2 33 00 00 00 00  |...........3....|
00000030  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 10  |................|
00000040  01 01 0f 00 00 07 04 ea  00 00 ff ff 00 00 14 b7  |................|
00000050  00 00 ff ff 00 00 83 ec  00 00 83 62 54 14 59 00  |...........bT.Y.|
00000060  60 38 34 f5 01 01 0b 58  62 70 11 60 f6 ff ff ff  |`84....Xbp.`....|
00000070  ff ff ff 02 00 7c 00 d0  01 4c 00 b0 40 40 28 02  |.....|...L..@@(.|
00000080
$

弥繁 2024-12-21 12:47:28

To change the encoding from File3 to File1, you can use a script like this:

#!/bin/bash

# file name: tobin.sh

fileName="tobin.txt"   # todo: pass it as parameter
                       #       or prepare it to be used via the pipe...
while read line; do
  for hexValue in $line; do
    echo -n -e "\x$hexValue"
  done
done < $fileName

Or, if you just want to pipe it and use it like the xxd example in this thread:

#!/bin/bash

# file name: tobin.sh
# usage: cat file3.txt | ./tobin.sh > file1.bin

while read line; do
  for hexValue in $line; do
    echo -n -e "\x$hexValue"
  done
done
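`echo -n -e "\x..."` relies on bash's built-in `echo`, and the `for` loop relies on the hex values being separated by whitespace. A possible variant that avoids both, shown here only as a sketch (`hex_to_bin` is a made-up name), converts each pair of digits to an octal escape and lets `printf` emit the byte, which should also work in shells whose `printf` does not understand `\x`:

```shell
#!/bin/sh
# hex_to_bin: read hex text on stdin (with or without spaces and newlines),
# write the raw bytes to stdout.
hex_to_bin() {
    tr -d ' \t\r\n' | fold -w 2 |
    while read -r pair || [ -n "$pair" ]; do
        # e.g. 3d -> 075, then printf '\075' emits the byte itself.
        printf "\\$(printf '%03o' "0x$pair")"
    done
}

# Continuous hex, as in temp_file before it is reformatted with sed.
printf '3d3d01fc73475450' | hex_to_bin | od -An -v -tx1
```

It starts one `printf` per byte, so it is slow on large inputs, but for a package of a few hundred bytes that hardly matters.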

If you really want to use BASH for this, then I suggest you start using arrays to nicely build your packet. Here is some starting code:

#!/bin/bash
# (arrays and $'...' quoting are bash features, so plain sh will not do)

# We assume the script will run on a LSB architecture.

hexDump() {
  for idx in $(seq 0 ${#buffer[@]}); do
    printf "%02X", ${buffer[$idx]}
  done
} # hexDump() function

###
# dump() dumps the current content of the buffer[] array to the STDOUT.
#
dump() {
  # or, use $ptr here...
  for idx in $(seq 0 ${#buffer[@]}); do
    printf "%c" ${buffer[$idx]}
  done
} # dump() function

# Beginning of DB Package Identifier: ==
buffer[0]=$'\x3d' # =
buffer[1]=$'\x3d' # =

size=2
# Total Package Length: 2
# We start with 2, and later on we update it once we know the exact size...
# Assuming 32bit architecture, LSB, this is how we encode number 2 (that is our current size of the packet)
buffer[2]=$'\x02'
buffer[3]=$'\x00'
buffer[4]=$'\x00'
buffer[5]=$'\x00'

# Offset to Data Record Count field: 115
# I assume this is also a 32bit field of unsigned int type
ptr=5
buffer[++ptr]=$'\x73' # 115
buffer[++ptr]=$'\x00'
buffer[++ptr]=$'\x00'
buffer[++ptr]=$'\x00'

#hexDump
dump

Output:

$ ./tobin2.sh | hexdump -C
00000000  3d 3d 02 00 00 00 73 00  00 00 00                 |==....s....|
0000000b

Sure, this is not a solution to the original post... The solution will use something like this to generate binary output. The biggest problem is that we still do not know the types of the fields in the packet. We also do not know the architecture (is it big-endian or little-endian, 32-bit or 64-bit?). You must give us the specification. For instance, what type is the length of the package? We cannot tell that from the TXT file!

In order to help you do what you have to do, you must find us the specification of the sizes of those fields.

Note that it is a good start, though. You need to implement convenience functions to, for example, automatically fill the buffer[] with values from a string of hex values. So you can do something like write $offset "ff c0 d3 ba be".

半衬遮猫 2024-12-21 12:47:28

binmake is a tool that takes a text description of binary data and generates a binary file (or writes to stdout). It lets you switch endianness and number formats, and it accepts comments.

First get and compile binmake (the binary program will be in bin/):

$ git clone https://github.com/dadadel/binmake
$ cd binmake
$ make

Create your text file file.txt:

# an example description of binary data to generate
# set endianness to big-endian
big-endian

# default number is hexadecimal
00112233

# you can give an explicit number type: %b means a binary number
%b0100110111100000

# change endianness to little-endian
little-endian

# with no explicit type, the default is used
44556677

# single bytes are not affected by endianness
88 99 aa bb

# change default to decimal
decimal

# following number is now decimal
0123

# strings are delimited by " or '
"this is some raw string"

# an explicit hexadecimal number starts with %x
%xff

Generate your binary file file.bin:

$ ./binmake file.txt file.bin
$ hexdump -C file.bin
00000000  00 11 22 33 4d e0 77 66  55 44 88 99 aa bb 7b 74  |.."3M.wfUD....{t|
00000010  68 69 73 20 69 73 20 73  6f 6d 65 20 72 61 77 20  |his is some raw |
00000020  73 74 72 69 6e 67 ff                              |string.|
00000027

You can also pipe it using stdin and stdout:

$ echo '32 decimal 32 %x61 61' | ./binmake | hexdump -C
00000000  32 20 61 3d                                       |2 a=|
00000004
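
For comparison, the same 39 bytes from the file.txt example can be assembled with Python's struct module. This is an independent sketch, not a description of how binmake works internally. The width of each number (32-bit for 00112233 and 44556677, 16-bit for the binary literal, 8-bit for decimal 123) is inferred from the hexdump above.

```python
import struct

out = b"".join([
    struct.pack(">I", 0x00112233),          # big-endian 32-bit
    struct.pack(">H", 0b0100110111100000),  # big-endian 16-bit, i.e. 0x4de0
    struct.pack("<I", 0x44556677),          # little-endian 32-bit
    bytes.fromhex("88 99 aa bb"),           # single bytes, byte order irrelevant
    struct.pack("B", 123),                  # decimal 123 is 0x7b
    b"this is some raw string",             # raw ASCII text
    struct.pack("B", 0xFF),                 # explicit hexadecimal byte
])

print(len(out), out.hex())
```
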
风蛊 2024-12-21 12:47:28

awk is the wrong tool for this job, though there are a thousand ways to do it. The easiest is often a small C program, or a program in any other language that explicitly distinguishes a character from a string of decimal digits.

If you do want to use awk, use the "%c" printf format, which prints a numeric argument as the character with that code.
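
The distinction matters because formatting a number as text emits its digit characters, while emitting it as a character produces a single byte. A small Python sketch of the same idea as awk's "%c" follows; the values are the decimal equivalents of the first four bytes in the question's hexdump (3d 3d 01 f8).

```python
values = [61, 61, 1, 248]  # decimal for 0x3d 0x3d 0x01 0xf8

# Formatting each number as text yields digit characters, not the bytes.
as_text = "".join(str(v) for v in values).encode("ascii")

# Treating each number as a byte value yields the actual binary data.
as_bytes = bytes(values)

print(len(as_text), as_text)
print(len(as_bytes), as_bytes.hex())
```

To emit the raw bytes rather than print a representation of them, one would write `as_bytes` to a file opened in binary mode or to `sys.stdout.buffer`.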
