Confusion when using awk to extract the required part of a file
I have a script that makes use of awk, sed, grep and other shell features.
I am stuck at one point and need your help.
This is the input file for my problem:
udit@udit-Dabba ~/ah $ cat decrypt.txt
60 00 00 00 00 17 3a 20 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 02 *00 00 e0 f9 6a 61 61 6e
65 6b 61 68 61 6e 67 61 79 65 77 6f 64 69 6e* 00
00 00 03 29
My goal is to extract
00 00 e0 f9 6a 61 61 6e 65 6b 61 68 61 6e 67 61 79 65 77 6f 64 69 6e
from the file above, i.e. the part marked between the *'s.
To be clear, the *'s are shown only to mark the region; they are not actually present in the file.
The last five units of the file, as shown above, are:
00 00 00 03 29
The 00 units are simple pad bytes and 03 specifies the pad length.
Here is the part of the script that extracts the required part:
size=`wc -w decrypt.txt`
padlen=3 // calculated by some other mechanism
awk -v size=$size -v padlen=$padlen 'BEGIN {RS=" ";ORS=" ";} {if (NR > 40 && NR <=size-padlen-2) print $0}' decrypt.txt | sed '1,1s/ //'
Output:
00 00 e0 f9 6a 61 61 6e
65 6b 61 68 61 6e 67 61 79 65 77 6f 64 69
My problem: the last unit, 6e, is missing.
I also tried it directly in the terminal, with size=68 and padlen=3,
so the condition should select records from NR > 40 up to NR <= 63.
udit@udit-Dabba ~/ah $ awk 'BEGIN {RS=" ";ORS=" ";} {if (NR > 40 && NR <= 65) print $0}' decrypt.txt | sed '1,1s/ //'
00 00 e0 f9 6a 61 61 6e
65 6b 61 68 61 6e 67 61 79 65 77 6f 64 69 6e 00
00
This works fine when the loop goes up to 65, so it should also work up to 63.
udit@udit-Dabba ~/ah $ awk 'BEGIN {RS=" ";ORS=" ";} {if (NR > 40 && NR <= 64) print $0}' decrypt.txt | sed '1,1s/ //'
00 00 e0 f9 6a 61 61 6e
65 6b 61 68 61 6e 67 61 79 65 77 6f 64 69 6e
But what is this? When I decrease 65 to 64, two 00 units are lost. Why is this happening?
I also tried the following, but could not find the reason for its weird output:
udit@udit-Dabba ~/ah $ awk 'BEGIN {RS="[ \n]";ORS=" ";} {if (NR > 40 && NR <=65)print $0}' decrypt.txt | sed '1,1s/ //'
0002 00 00 e0 f9 6a 61 61 6e 65 6b 61 68 61 6e 67 61 79 65 77 6f 64
Please help me out.
I may have explained the problem in more detail than required, but I really need help with it.
I am new to shell and awk, so there may be a silly mistake that I could not spot.
Thanks in advance.
EDIT:
60 00 00 00 00 17 3a 20 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 02
These are the fixed 40 units of the IPv6 header and will always remain the same.
The portion between the *'s is of variable length, which is why I need to work this way; otherwise it would have been a simple task.
Comments (3)
If I understand the problem as being: discard the first 40 values and the last n values (where n is the padding + 2 i.e. in this case 3 + 2 = 5), this might work:
The trick is to unroll the data and then pick the bits you want.
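The code for this answer did not survive on the page, so here is only a minimal sketch of the unroll-then-pick idea, not the original commands. It assumes every unit is a whitespace-separated token; the names `header`, `trailer` and `payload` are made up for illustration, and the sample file is recreated in a scratch directory so the snippet runs on its own. One thing worth knowing: with RS=" " awk does not treat newlines as separators, so a newline stays inside a record and NR need not match the unit count. Putting one unit per line first sidesteps that.

```shell
#!/bin/sh
# Recreate the sample input from the question in a scratch directory.
dir=$(mktemp -d)
cat > "$dir/decrypt.txt" <<'EOF'
60 00 00 00 00 17 3a 20 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 02 00 00 e0 f9 6a 61 61 6e
65 6b 61 68 61 6e 67 61 79 65 77 6f 64 69 6e 00
00 00 03 29
EOF

header=40                  # fixed IPv6 header length, in units
padlen=3                   # assumed to be computed elsewhere, as in the question
trailer=$((padlen + 2))    # pad bytes + pad-length unit + final unit

# Unroll: turn spaces into newlines (squeezing repeats) so each unit is one line.
# awk then buffers the units and prints those between header and trailer.
payload=$(tr -s ' ' '\n' < "$dir/decrypt.txt" |
  awk -v skip="$header" -v drop="$trailer" '
    NF { unit[++n] = $1 }            # NF skips any empty lines
    END {
      sep = ""
      for (i = skip + 1; i <= n - drop; i++) {
        printf "%s%s", sep, unit[i]
        sep = " "
      }
      print ""
    }')

echo "$payload"
rm -rf "$dir"
```

Because the selection happens in END, the total unit count does not have to be passed in from wc at all; only the pad length is needed.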
I made some small changes to the code and was able to get output up to 6e.
I set size to 68 because wc prints both the count and the file name, and you have to remove the file name before passing the value to the awk script.
Note: I haven't fully understood your requirement.
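The modified code is not shown here either, so the following is just a sketch of the wc point, under the assumption that the script only needs the bare word count. The variable names `with_name` and `last` are illustrative and the sample file is recreated so the snippet is self-contained. It also uses `#` for the comment, since `//` is not a comment in the shell.

```shell
#!/bin/sh
# Recreate the sample input from the question in a scratch directory.
dir=$(mktemp -d)
file="$dir/decrypt.txt"
cat > "$file" <<'EOF'
60 00 00 00 00 17 3a 20 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 02 00 00 e0 f9 6a 61 61 6e
65 6b 61 68 61 6e 67 61 79 65 77 6f 64 69 6e 00
00 00 03 29
EOF

# Given a file name argument, wc prints the count followed by the name.
with_name=$(wc -w "$file")

# Reading from stdin leaves only the count; the arithmetic expansion also
# drops the leading blanks that some wc implementations print.
size=$(( $(wc -w < "$file") ))
padlen=3    # '#' starts a shell comment; '//' does not

last=$((size - padlen - 2))    # index of the last payload unit

echo "wc with a file name: $with_name"
echo "size=$size last=$last"
rm -rf "$dir"
```

With a clean numeric `size`, quoting it as `-v size="$size"` when calling awk keeps the shell from splitting the value into extra arguments.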