使用 DOS 通配符暴力破解字符串的最快方法

发布于 2024-07-20 09:13:04 字数 385 浏览 6 评论 0原文

这个问题与 SQL 盲注类似。 目标是确定字符串的确切值,您可以做的唯一测试是查看您指定的 DOS 样式通配符(? = 任何字符,* = 任意数量的任何字符)是否与该字符串匹配。 (因此实际上您只能访问 boolDoesWildcardMatch(string wildcard) 函数)。

直接的方法是测试 a*, b*, c*... 直到找到第一个字母,然后重复。 我能想到的一些优化:

  • 搜索 *a*, *b* 等以确定字符集,
  • 当找到 *x* 上的匹配时,执行除法 - et-impera (*a*x*, *b*x*, ...)

This problem is similar to blind SQL injections. The goal is to determine the exact value of a string, and the only test you can do is to see if a DOS-style wildcard (? = any character, * = any number of any characters) you specify is matched by the string. (So practically you only have access to a bool DoesWildcardMatch(string wildcard) function).

The straight-forward way is to test against a*, b*, c*... until you find the first letter, then repeat. Some optimizations I can think of:

  • search for *a*, *b* etc. to determine the character set
  • when a match on *x* is found, perform divide-et-impera (*a*x*, *b*x*, ...)

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

陈独秀 2024-07-27 09:13:04

第一个想法。 您可以在 O(log2(n)) 中确定字符串的长度 n

  • 检查 Z*,其中 Z 代表 k 个问号,从 0 开始,然后是 1,然后每次检查时将问号数量加倍,直到没有匹配发生。 n 必须介于 k / 2k 之间
  • 使用相同的模式更改相同的 k 来查找准确的长度就像二分查找一样。

知道确切的长度可能有助于在空间域中执行一种分而治之的方法。

更新

如果您知道长度,则可以使用相同的模式来正确定位符号。

示例:

    ..X. ..XX (spaces added for readability)

                              + symbol may be X
                              - symbol is not X
                              X symbol is X

    *X*         => MATCH      ++++ ++++
    *X*   ????  => MATCH      ++++ ++++
    *X*?? ????  => NO MATCH   --++ ++++
    ??X?  ????  => MATCH      --X+ ++++
    ??XX  ????  => NO MATCH   --X- ++++
    ??X?  *X*?? => NO MATCH   --X- --++
    ??X?  ??X?  => MATCH      --X- --X+
    ??X?  ??XX  => MATCH      --X- --XX

对于字符串长度 n 和字母大小 m,这将需要大约 O(log2(n)) 才能找到字符串的长度,大约 O(n • log2(n)) 正确放置 n 个符号,O(m) 查找使用的符号 - 对所有符号求和一起产生O(n • log2(n) + m)

我可以想象,可以通过合并几个步骤来加快速度 - 也许在确定字符串长度时测试使用的符号,或者同时在字符串的前半部分和后半部分中定位两个(甚至更多?)符号。 如果检查失败,则需要单独重新检查合并的步骤,以确定哪个检查失败。 但只要合并检查成功,您就可以获得两者的信息。

也许明天我会计算一下,看看它是否真的会加快速度。

A first thought. You can determin the length n of the string in O(log2(n)).

  • Check Z* where Z represents k question marks starting with 0, then 1, and then doubling the number of question marks with every check until no match occurs. n must be between k / 2 and k
  • Find the exact length using the same pattern changing k in the same way as binary search does.

Knowing the exact length might help to perform a kind of divide-et-impera in the spatial domain.

UPDATE

If you know the length, you can use the same pattern to correctly locate a symbol.

Example:

    ..X. ..XX (spaces added for readability)

                              + symbol may be X
                              - symbol is not X
                              X symbol is X

    *X*         => MATCH      ++++ ++++
    *X*   ????  => MATCH      ++++ ++++
    *X*?? ????  => NO MATCH   --++ ++++
    ??X?  ????  => MATCH      --X+ ++++
    ??XX  ????  => NO MATCH   --X- ++++
    ??X?  *X*?? => NO MATCH   --X- --++
    ??X?  ??X?  => MATCH      --X- --X+
    ??X?  ??XX  => MATCH      --X- --XX

For string length n and alphabet size m this will take about O(log2(n)) to find the length of the string, about O(n • log2(n)) to correctly place n symbols, and O(m) to find the used symbols - summing all together yields O(n • log2(n) + m).

I could imagine that it is possible to speed this up by merging several steps - maybe test for used symbols while determining the string length or simultaneously locating two (or even more?) symbols in the first and second half of the string. This will require to recheck the merged steps in isolation if the check fails in order to determine which check faild. But as long as the merged check succeeds, you gain information on both.

Maybe I will calculate that tomorrow in order to see if it will really speed the thing up.

南城旧梦 2024-07-27 09:13:04

至于分而治之,请务必记录您已知不存在的值。 另外,我不会选择a,b,c,而是选择频率顺序。 某种马尔可夫链可能会让它变得更快。

需要注意的一件事是,您不能假设给定的文字始终与输入中的相同位置匹配。 对于最后删除通配符,这将特别令人感兴趣。

c a b a
--------
* a *     match
  * b*a*  woops!

As for the divide-et-impera, be sure to keep track of value that you known are not present. Also I'd not go with a, b, c, but with frequency order. Some sort of markov chain from that might make it even faster.

One thing to watch out for is that you can't assume that a given literal will always match the same location in the input. This will be of particular interest regarding removing the wild cards at the end.

c a b a
--------
* a *     match
  * b*a*  woops!
城歌 2024-07-27 09:13:04

如果有具体数量? 有效,您还可以检查“?”,“??”,“???” 等来获取字符串的长度,但我怀疑这会有多大帮助,因为您还可以在每一轮之后仅进行一次额外检查而不使用任何通配符来检查是否获得了正确的长度。

我认为之前带有字符集检查的除法方法几乎是最佳的,还有一些额外的细节,例如如果您匹配*a*b*,则应该检查*ab**ab 和“ab”以了解其间是否有字母,当然如上所述,检查此后是否已完成右侧或完全完成。

If a specific number of ? works, you can also check "?", "??", "???" etc. to get the length of the string, but I doubt this will help much as you can also check if you've got the right length with just one additional check without any wildcards after each round.

I think the divide method with a character set check before is almost optimal, there are some additional details, for example if you matched *a*b*, you should check *ab* afterwards to know if there are letters in between and of course as stated above, check *ab and "ab" after this to know if you've finished on the right side or completely.

审判长 2024-07-27 09:13:04

为什么不将 DOS 风格的通配符字符串转换为正则表达式? 例如:

?a*

变为:

.a.*

然后只需执行简单的正则表达式匹配即可将其与测试字符串进行比较。

Why not convert your DOS style wildcard string into a regular expression? e.g.:

?a*

becomes:

.a.*

Then just perform a simple regular expression match comparing that to your test string.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文