Cryptanalysis: XOR of two plaintext files
I have a file which contains the result of two XORed plaintext files. How do I attack this file in order to decrypt either of the plaintext files? I have searched quite a bit, but could not find any answers. Thanks!
EDIT:
Well, I also have the two ciphertexts, which I XORed to get the XOR of the two plaintexts. The reason I ask is that, according to Bruce Schneier (Applied Cryptography, 1996, pg. 198): "...she can XOR them together and get two plaintext messages XORed with each other. This is easy to break, and then she can XOR one of the plaintexts with the ciphertext to get the keystream." (This is in relation to a simple stream cipher.) But beyond that he provides no explanation, which is why I asked here. Forgive my ignorance.
Also, the algorithm used is a simple one, with a symmetric key of length 3.
FURTHER EDIT:
I forgot to add: I'm assuming that a simple stream cipher was used for encryption.
Comments (5)
I'm no cryptanalyst, but if you know something about the characteristics of the files you might have a chance.
For example, let's assume that you know two things about both original plaintexts: they are plain English text, and they are about a known topic such as sports.
Given those 2 pieces of information, one approach you might take is to scan through the ciphertext 'decrypting' using words that you might expect to be in them, such as "football", "player", "score", etc. Perform the decryption using "football" at position 0 of the ciphertext, then at position 1, then 2 and so on.
If the result of decrypting a sequence of bytes appears to be a word or word fragment, then you have a good chance that you've found plaintext from both files. That may give you a clue as to some surrounding plaintext, and you can see if that results in a sensible decryption. And so on.
Repeat this process with other words/phrases/fragments that you might expect to be in the plaintexts.
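The scanning procedure described above is often called crib dragging. Here is a minimal sketch of it, not a definitive tool. The function names, the character filter used to decide what "looks like text", and the two sample plaintexts are all my own assumptions for illustration.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings, truncating to the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))


# Assumption: "looks like text" means letters, digits, spaces and basic punctuation.
ALLOWED = set(b"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789 .,'!?-")


def is_plausible_text(chunk: bytes) -> bool:
    """Crude readability filter; a real attack would score candidates more carefully."""
    return all(c in ALLOWED for c in chunk)


def crib_drag(xored: bytes, crib: bytes):
    """Try the guessed word at every offset of the XORed plaintexts.

    Returns (offset, revealed_bytes) pairs where the bytes revealed for the
    other plaintext pass the readability filter.
    """
    hits = []
    for offset in range(len(xored) - len(crib) + 1):
        revealed = xor_bytes(xored[offset:offset + len(crib)], crib)
        if is_plausible_text(revealed):
            hits.append((offset, revealed))
    return hits


# Made-up sample plaintexts of equal length, for illustration only.
p1 = b"the football match ended in a draw"
p2 = b"our best player scored twice today"
xored = xor_bytes(p1, p2)

for offset, revealed in crib_drag(xored, b"football"):
    print(offset, revealed)
```

Some hits will be false positives, so each one still has to be judged by eye or extended with further guesses, as described above.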
In response to your question's edit: what Schneier is talking about is that if someone has 2 ciphertexts that have been XOR encrypted using the same key, XORing those ciphertexts will 'cancel out' the keystream, since:
(A xor K) xor (B xor K) = A xor B xor K xor K = A xor B
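To see the cancellation concretely, here is a small sketch. The sample messages are made up, and os.urandom merely stands in for whatever keystream the stream cipher would produce; both are assumptions for illustration.

```python
import os


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings, truncating to the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))


# Made-up plaintexts of equal length.
a = b"attack at dawn"
b = b"retreat at six"

# Assumption: random bytes stand in for the cipher's keystream, reused for both messages.
keystream = os.urandom(len(a))

c1 = xor_bytes(a, keystream)  # ciphertext of A
c2 = xor_bytes(b, keystream)  # ciphertext of B

# The keystream drops out: c1 xor c2 depends only on the two plaintexts.
print(xor_bytes(c1, c2) == xor_bytes(a, b))
```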
So now, the attacker has a new ciphertext that's composed only of the two plaintexts. If the attacker knows one of the plaintexts (say the attacker has legitimate access to A, but not B), that can be used to recover the other plaintext:
A xor (A xor B) = B
Now the attacker has the plaintext for B.
It's actually worse than this - if the attacker has A and the ciphertext for A then he can recover the keystream already.
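A short sketch of both recoveries, assuming the attacker holds the two ciphertexts and knows plaintext A. The helper name recover_from_known_plaintext, the sample messages, and the fixed demo keystream are hypothetical, chosen only so the example is reproducible.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings, truncating to the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))


def recover_from_known_plaintext(c1: bytes, c2: bytes, known_a: bytes):
    """Given both ciphertexts and plaintext A, return (plaintext B, keystream)."""
    recovered_b = xor_bytes(xor_bytes(c1, c2), known_a)  # (A xor B) xor A = B
    recovered_keystream = xor_bytes(c1, known_a)         # (A xor K) xor A = K
    return recovered_b, recovered_keystream


# Made-up messages and a fixed demo keystream (assumptions for illustration).
a = b"attack at dawn"
b = b"retreat at six"
keystream = bytes(range(100, 114))

c1 = xor_bytes(a, keystream)
c2 = xor_bytes(b, keystream)

recovered_b, recovered_keystream = recover_from_known_plaintext(c1, c2, a)
print(recovered_b)
print(recovered_keystream == keystream)
```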
But, the guessing approach I gave above is a variant of the above with the attacker using (hopefully good) guesses instead of a known plaintext. Obviously it's not as easy, but it's the same concept, and it can be done without starting with known plaintext. Now the attacker has a ciphertext that 'tells' him when he's correctly guessed some plaintext (because it results in other plaintext from the decryption). So even if the key used in the original XOR operation is random gibberish, an attacker can use the file that has that random gibberish 'removed' to gain information when he's making educated guesses.
You need to take advantage of the fact that both files are plain text. There are a lot of implications that can be derived from that fact. Assuming that both texts are English, you can use the fact that some letters are much more frequent than others. See this article.
Another hint is to note the structure of correct English text. For example, every time one sentence ends and the next begins, there is a (dot, space, capital letter) sequence.
Note that in ASCII code, space is binary "0010 0000" and changing that bit in a letter will change the letter case (lower to upper and vice versa). There will be a lot of XORing using space, if both files are plain text, right?
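The space observation can be turned into a quick heuristic. In a text made only of ASCII letters and spaces, XORing two letters never yields a letter, while XORing a letter with a space yields that letter with its case flipped. The sketch below flags such positions; the function names and sample strings are my own assumptions, and punctuation or digits in real text can produce false positives.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings, truncating to the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))


def flip_case_with_space(ch: str) -> str:
    """XOR an ASCII letter with a space (0x20) to flip its case."""
    return chr(ord(ch) ^ 0x20)


def is_ascii_letter(value: int) -> bool:
    """True for byte values in A-Z or a-z."""
    return 65 <= value <= 90 or 97 <= value <= 122


def likely_space_positions(xored: bytes):
    """Offsets where the XOR byte is a letter, which suggests that one
    plaintext has a space there and the other has a letter."""
    return [i for i, value in enumerate(xored) if is_ascii_letter(value)]


# Made-up sample plaintexts of equal length.
p1 = b"hello world"
p2 = b"goodbye all"
xored = xor_bytes(p1, p2)

print(flip_case_with_space("a"))
print(likely_space_positions(xored))
```

At each flagged offset you still do not know which of the two files holds the space, only that one of them probably does and that the other holds the case-flipped letter.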
Analyse the printable characters table on this page.
Also, at the end you can use a spell checker.
I know I didn't provide a solution for your question.
I just gave you some hints. Have fun, and please share your findings.
It's really an interesting task.
That is interesting. The Schneier book does indeed say that it is easy to break this. And then he kind of leaves it hanging at that. I guess you have to leave some exercises up to the reader!
There is an article by Dawson and Nielsen that apparently describes an automated process for this task for text files. It's a bit on the $$ side to buy the single article. However, a second paper titled A Natural Language Approach to Automated Cryptanalysis of Two-time Pads references the Dawson and Nielsen work and describes some assumptions they made (primarily that the text was limited to a 27-character alphabet). This second paper appears to be freely available and describes its authors' own system. I don't know for sure that it is free, but it is openly available on a Johns Hopkins University server.
That paper is about 10 pages long and looks interesting. I don't have time to read it at the moment but may later. I find it interesting (and telling) that it takes a 10-page paper to describe a task that another cryptographer describes as "easy".
I don't think you can - not without knowing anything about the structure of the two files.
Unless you have one of the plaintext files, you can't get the original information of the other. Mathematically expressed:
A xor B = C, where C is known and A and B are not.
You have one equation with two unknowns; you can't possibly get something meaningful out of it.