XOR file decryption
So I have to decrypt a .txt file that was encrypted with XOR using a repeating password that is unknown, and the goal is to recover the message.
Here is what I already know from the professor:
First I need to find the length of the unknown password.
The message has been altered so that it has no spaces (this may add a bit more difficulty, because the space character has the highest frequency in a message).
Any ideas on how to solve this?
Thanks in advance :)
Comments (4)
First you need to find out the length of the password. You do this with the Index of Coincidence, or Kappa test. XOR the ciphertext with itself shifted 1 step and count the number of characters that are the same (value 0). You get the Kappa value by dividing that count by the total number of characters minus 1. Shift one more time and calculate the Kappa value again. Keep shifting the ciphertext until the password length shows up.
If the length is 4, the Kappa value will be significantly higher at multiples of 4 (4, 8 and 12) than at the other shifts. This suggests that the length of the password is 4.
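A minimal sketch of the Kappa computation described above, in Python. The function names `kappa` and `kappa_table` are made up for illustration, and dividing by the number of overlapping pairs (length minus shift) is my generalisation of "total number of characters minus 1" to shifts larger than 1.

```python
def kappa(ciphertext: bytes, shift: int) -> float:
    """Fraction of positions where the ciphertext equals itself shifted by `shift`."""
    pairs = len(ciphertext) - shift  # number of overlapping positions
    if shift < 1 or pairs < 1:
        raise ValueError("shift must be between 1 and len(ciphertext) - 1")
    # Two bytes are identical exactly when their XOR is 0.
    matches = sum(
        1 for i in range(pairs) if ciphertext[i] ^ ciphertext[i + shift] == 0
    )
    return matches / pairs


def kappa_table(ciphertext: bytes, max_shift: int = 16) -> dict:
    """Kappa value for every shift from 1 up to max_shift (hypothetical helper)."""
    top = min(max_shift, len(ciphertext) - 1)
    return {shift: kappa(ciphertext, shift) for shift in range(1, top + 1)}
```

Print the table for your ciphertext and look for the shifts where the value jumps; the smallest shift at which the peaks repeat is the likely password length. Short ciphertexts give noisy values, so treat the result as a hint.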
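Now that you have the password length, XOR the ciphertext with itself again, but this time shift by multiples of that length. Why? Each ciphertext character is a plaintext character XORed with a password character, and the password repeats, so two ciphertext characters that are a multiple of the password length apart were encrypted with the same password character. When two identical values are XORed the result is 0, so XORing those two ciphertext characters cancels the password character and leaves only the two plaintext characters XORed with each other.
As you see, the password "disappears" and the plaintext is XORed with itself.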
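To convince yourself that the password really cancels out, here is a small Python sketch. `xor_repeating` and `xor_with_shift` are hypothetical helper names, and the key and message in the usage note are made-up test data, not anything from the assignment.

```python
def xor_repeating(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the repeating key (encrypts and decrypts)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def xor_with_shift(data: bytes, shift: int) -> bytes:
    """XOR data with a copy of itself shifted left by `shift` positions."""
    return bytes(a ^ b for a, b in zip(data, data[shift:]))
```

If you encrypt a message with a 4-byte key, `xor_with_shift(ciphertext, 4)` and `xor_with_shift(plaintext, 4)` give the same bytes, because every pair of characters compared shares the same key byte. At a shift that is not a multiple of the key length, the two results differ.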
So what can we do now? You wrote that the spaces have been removed. This makes it a bit harder to get the plaintext or the password, but it is not at all impossible.
Build a table of the XOR values for every pair of English letters.
What does this mean? If an A and a B are XORed, the resulting value is 3. E and P give 21, and so on. OK, but how will this help you?
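The pair table can be generated rather than written out by hand. A sketch in Python, assuming the message uses only uppercase A-Z (an all-lowercase message gives the same XOR values, but mixed case does not); `xor_pair_table` is a made-up name.

```python
from collections import defaultdict
from itertools import combinations
from string import ascii_uppercase


def xor_pair_table(alphabet: str = ascii_uppercase) -> dict:
    """Map each XOR value to the list of letter pairs that produce it."""
    table = defaultdict(list)
    for a, b in combinations(alphabet, 2):  # every unordered pair once
        table[ord(a) ^ ord(b)].append(a + b)
    return dict(table)
```

For example, `xor_pair_table()[3]` contains "AB" and `xor_pair_table()[21]` contains "EP", matching the values mentioned above.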
Remember that the plaintext is XORed with itself shifted by multiples of the password length. For each value you can check the table and determine which combinations that position could hold. Let's say the value is 25; then the two characters that produced it could be any of the following combinations: (I-P), (H-Q), (K-R), (J-S), (M-T), (L-U), (O-V), (N-W), (A-X) or (C-Z). But which one? Now you do more shifts and look up the corresponding values in the table again for each position. Next time the value might be 7, and since you already have a list of possible character combinations you only check against them. At the next two shifts the values are 3 and 1. A character such as W fits every shift, (N-W), (P-W), (T-W), (V-W), so it stays on the list, while any character that would need a non-letter partner at some shift is ruled out. Where more than one candidate survives, letter frequencies and the surrounding text help you choose. You can do this for most positions.
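The elimination across shifts can be sketched like this in Python, again assuming an uppercase A-Z plaintext. `surviving_candidates` is a hypothetical helper: it keeps every letter whose partner at each shift would also be a letter. On its own this check only removes letters that would need a non-letter partner, so expect several survivors at some positions.

```python
from string import ascii_uppercase


def surviving_candidates(xor_values, alphabet: str = ascii_uppercase) -> set:
    """Letters that stay possible at one position, given its XOR value at each shift."""
    letters = set(alphabet)
    return {
        c
        for c in letters
        # c XOR value is the partner character; it must be a letter too.
        if all(chr(ord(c) ^ value) in letters for value in xor_values)
    }
```

With the values from the example, `surviving_candidates([25, 7, 3, 1])` still contains W, along with the other letters that pass the same check, so you would narrow it down further using the neighbouring positions.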
You will not get all of the plaintext, but you will get enough characters to discover the password. Take the known characters and XOR them with the ciphertext at the same positions. This will yield the password. You need at least as many known characters as there are characters in the password, provided they fall at the "correct" positions, that is, one for each position of the password.
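Recovering the password from known characters is a one-liner per position. A Python sketch, where `recover_key` and the shape of its `known_plaintext` argument (a mapping from ciphertext position to the plaintext character you worked out) are my own assumptions for illustration:

```python
from typing import Dict, List, Optional


def recover_key(
    ciphertext: bytes, known_plaintext: Dict[int, str], key_length: int
) -> List[Optional[int]]:
    """Return the key bytes found so far; None marks positions still unknown."""
    key: List[Optional[int]] = [None] * key_length
    for position, char in known_plaintext.items():
        # ciphertext = plaintext XOR key, so key = ciphertext XOR plaintext.
        key[position % key_length] = ciphertext[position] ^ ord(char)
    return key
```

Any slot that is still `None` afterwards means you have no known character at that offset modulo the password length yet.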
Good luck!
You should look at how to crack a Vigenère cipher, especially at autocorrelation. The latter will help you find the length of the password, and the rest is usually just brute-forcing against the normal distribution of letters (where the most common one in English is the letter e).
Although spaces are the most common characters and make decryptions like this easy, the other characters also have different frequencies. For example, see this Wikipedia article. If you've got enough encrypted text and the password length isn't too large, it might just be enough to find the most common bytes in the encrypted text. They will most likely be the encrypted versions of "e", which has the highest frequency in English texts.
This alone won't give you the decrypted text, but it's very likely you can find out the password length and (part of) the password itself with it. For example, let's assume the most frequent encrypted bytes are "w x m z y" with almost the same frequency, and there's a significant drop in frequency after the last one. This will tell you two things:
- The password length is probably 5, because all encrypted versions of "e" will be equally likely. EDIT: OK, this isn't correct, it will be 5 or above because the password can contain the same character multiple times.
- The password will be some permutation of ("w x m z y" XOR "e e e e e"); you can use the byte offsets modulo the password length to get the correct permutation.
EDIT: The same character occurring in the password multiple times makes things a bit harder, but you'll most likely be able to identify those because, as I said, encrypted versions of "e" will cluster around a frequency f; now if the character occurs n times in the password, it will have a frequency near n*f.
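A rough Python sketch of this idea, assuming you already have a guess for the password length and that "e" is the most common plaintext character at every password offset (which needs a reasonably long, lowercase English text). `key_from_most_common` is a made-up name. Grouping the bytes by offset modulo the password length takes care of the permutation directly.

```python
from collections import Counter


def key_from_most_common(
    ciphertext: bytes, key_length: int, likely_plain: int = ord("e")
) -> bytes:
    """Guess each key byte from the most frequent ciphertext byte at its offset."""
    key = bytearray()
    for offset in range(key_length):
        column = ciphertext[offset::key_length]  # bytes encrypted with one key byte
        most_common_byte, _count = Counter(column).most_common(1)[0]
        # If that byte is an encrypted "e", XORing with "e" gives the key byte.
        key.append(most_common_byte ^ likely_plain)
    return bytes(key)
```

If the decrypted text looks wrong at one offset only, try the second or third most common byte for that column, or another likely letter such as "t".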
The most common trigram in English (assuming the language is probably English) is "the". Place "the" at every possible position on your ciphertext to derive a possible 3 characters of the key. Try each possible key fragment at all the other possible positions on the ciphertext and see what you get. For example, "qzg" is unlikely to be correct, but "fen" could be. Look at the spacing between the possible positions to derive the key length. With a key length and a key fragment you can place a lot more of the key.
As Lars said, look at ways of decrypting Vigenère, which is effectively what you have here.
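Placing "the" at every position can be sketched as follows in Python. `crib_drag` and `decrypt_with_fragment` are hypothetical names, and the lowercase crib is an assumption; pass b"THE" if the message turns out to be uppercase.

```python
def crib_drag(ciphertext: bytes, crib: bytes = b"the"):
    """Yield (position, key_fragment) for every placement of the crib."""
    for position in range(len(ciphertext) - len(crib) + 1):
        # If the crib really sits here, ciphertext XOR crib is the key fragment.
        fragment = bytes(c ^ p for c, p in zip(ciphertext[position:], crib))
        yield position, fragment


def decrypt_with_fragment(ciphertext: bytes, fragment: bytes, position: int) -> bytes:
    """Apply a candidate key fragment at another position to see what it decodes to."""
    return bytes(c ^ k for c, k in zip(ciphertext[position:], fragment))
```

Fragments that look like part of a plausible password and that decode readable text elsewhere are worth keeping. When the same fragment turns up at two positions, the distance between them is a multiple of the key length.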