使用 DOS 通配符暴力破解字符串的最快方法
这个问题与 SQL 盲注类似。 目标是确定字符串的确切值,您可以做的唯一测试是查看您指定的 DOS 样式通配符(? = 任何字符,* = 任意数量的任何字符)是否与该字符串匹配。 (因此实际上您只能访问 boolDoesWildcardMatch(string wildcard)
函数)。
直接的方法是测试 a*, b*, c*...
直到找到第一个字母,然后重复。 我能想到的一些优化:
- 搜索
*a*, *b*
等以确定字符集, - 当找到
*x*
上的匹配时,执行除法 - et-impera (*a*x*, *b*x*, ...
)
This problem is similar to blind SQL injections. The goal is to determine the exact value of a string, and the only test you can do is to see if a DOS-style wildcard (? = any character, * = any number of any characters) you specify is matched by the string. (So practically you only have access to a bool DoesWildcardMatch(string wildcard)
function).
The straight-forward way is to test against a*, b*, c*...
until you find the first letter, then repeat. Some optimizations I can think of:
- search for
*a*, *b*
etc. to determine the character set - when a match on
*x*
is found, perform divide-et-impera (*a*x*, *b*x*, ...
)
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
第一个想法。 您可以在
O(log2(n))
中确定字符串的长度n
。Z*
,其中Z
代表k
个问号,从 0 开始,然后是 1,然后每次检查时将问号数量加倍,直到没有匹配发生。n
必须介于k / 2
和k
之间k
来查找准确的长度就像二分查找一样。知道确切的长度可能有助于在空间域中执行一种分而治之的方法。
更新
如果您知道长度,则可以使用相同的模式来正确定位符号。
示例:
对于字符串长度
n
和字母大小m
,这将需要大约O(log2(n))
才能找到字符串的长度,大约O(n • log2(n))
正确放置n
个符号,O(m)
查找使用的符号 - 对所有符号求和一起产生O(n • log2(n) + m)
。我可以想象,可以通过合并几个步骤来加快速度 - 也许在确定字符串长度时测试使用的符号,或者同时在字符串的前半部分和后半部分中定位两个(甚至更多?)符号。 如果检查失败,则需要单独重新检查合并的步骤,以确定哪个检查失败。 但只要合并检查成功,您就可以获得两者的信息。
也许明天我会计算一下,看看它是否真的会加快速度。
A first thought. You can determin the length
n
of the string inO(log2(n))
.Z*
whereZ
representsk
question marks starting with 0, then 1, and then doubling the number of question marks with every check until no match occurs.n
must be betweenk / 2
andk
k
in the same way as binary search does.Knowing the exact length might help to perform a kind of divide-et-impera in the spatial domain.
UPDATE
If you know the length, you can use the same pattern to correctly locate a symbol.
Example:
For string length
n
and alphabet sizem
this will take aboutO(log2(n))
to find the length of the string, aboutO(n • log2(n))
to correctly placen
symbols, andO(m)
to find the used symbols - summing all together yieldsO(n • log2(n) + m)
.I could imagine that it is possible to speed this up by merging several steps - maybe test for used symbols while determining the string length or simultaneously locating two (or even more?) symbols in the first and second half of the string. This will require to recheck the merged steps in isolation if the check fails in order to determine which check faild. But as long as the merged check succeeds, you gain information on both.
Maybe I will calculate that tomorrow in order to see if it will really speed the thing up.
至于分而治之,请务必记录您已知不存在的值。 另外,我不会选择
a,b,c
,而是选择频率顺序。 某种马尔可夫链可能会让它变得更快。需要注意的一件事是,您不能假设给定的文字始终与输入中的相同位置匹配。 对于最后删除通配符,这将特别令人感兴趣。
As for the divide-et-impera, be sure to keep track of value that you known are not present. Also I'd not go with
a, b, c
, but with frequency order. Some sort of markov chain from that might make it even faster.One thing to watch out for is that you can't assume that a given literal will always match the same location in the input. This will be of particular interest regarding removing the wild cards at the end.
如果有具体数量? 有效,您还可以检查“?”,“??”,“???” 等来获取字符串的长度,但我怀疑这会有多大帮助,因为您还可以在每一轮之后仅进行一次额外检查而不使用任何通配符来检查是否获得了正确的长度。
我认为之前带有字符集检查的除法方法几乎是最佳的,还有一些额外的细节,例如如果您匹配
*a*b*
,则应该检查*ab**ab
和“ab”以了解其间是否有字母,当然如上所述,检查此后是否已完成右侧或完全完成。If a specific number of ? works, you can also check "?", "??", "???" etc. to get the length of the string, but I doubt this will help much as you can also check if you've got the right length with just one additional check without any wildcards after each round.
I think the divide method with a character set check before is almost optimal, there are some additional details, for example if you matched
*a*b*
, you should check*ab*
afterwards to know if there are letters in between and of course as stated above, check*ab
and "ab" after this to know if you've finished on the right side or completely.为什么不将 DOS 风格的通配符字符串转换为正则表达式? 例如:
?a*
变为:
.a.*
然后只需执行简单的正则表达式匹配即可将其与测试字符串进行比较。
Why not convert your DOS style wildcard string into a regular expression? e.g.:
?a*
becomes:
.a.*
Then just perform a simple regular expression match comparing that to your test string.