Performing distributed CUDA/OpenCL-based password cracking

Posted 2024-12-18 11:30:28

Is there a way to perform a distributed CUDA/OpenCL-based dictionary attack, that is, one spread across a cluster of connected computers?

For example, could one computer with an NVIDIA card share the load of the dictionary attack with a second, connected computer, and so make use of the GPUs in that machine as well?

The idea is to keep a path open for future expansion without having to replace the whole set of hardware we are using. (Assume cloud is not an option.)

Comments (2)

維他命╮ 2024-12-25 11:30:28

This is a simple master/slave work delegation problem. The master work server hands a unit of work to any slave process that connects. Each slave works on one unit and keeps one more queued. When a slave completes a unit, it reports back to the server. The time taken by exhaustively checked work units is used to estimate each slave's operations per second. Depending on your setup, I would size work units to take somewhere in the 15-60 second range. Any unit that has not been reported back by the 10-minute mark is recycled into the queue.
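As a rough illustration of the hand-out, report-back, and recycle cycle described above, here is a minimal in-memory sketch of the master's queue. The class name `WorkQueue`, its method names, and the injectable `clock` are all hypothetical choices made for this example; a real service would add networking and persistence on top.

```python
import time
from collections import deque

LEASE_SECONDS = 600  # the 10-minute mark mentioned in the answer


class WorkQueue:
    """Master-side queue sketch; all names here are illustrative assumptions."""

    def __init__(self, units, lease_seconds=LEASE_SECONDS, clock=time.monotonic):
        self.pending = deque(units)   # units nobody is working on
        self.leased = {}              # unit -> (worker, deadline)
        self.done = set()             # units reported as fully checked
        self.lease_seconds = lease_seconds
        self.clock = clock            # injectable so the timeout can be tested

    def recycle_expired(self):
        """Put units whose deadline has passed back into the pending queue."""
        now = self.clock()
        expired = [u for u, (_, deadline) in self.leased.items() if deadline <= now]
        for unit in expired:
            del self.leased[unit]
            self.pending.append(unit)
        return expired

    def checkout(self, worker):
        """Hand the next unit to a worker, or None if nothing is pending."""
        self.recycle_expired()
        if not self.pending:
            return None
        unit = self.pending.popleft()
        self.leased[unit] = (worker, self.clock() + self.lease_seconds)
        return unit

    def report(self, worker, unit):
        """Record a finished unit; ignore reports for units no longer leased to this worker."""
        lease = self.leased.get(unit)
        if lease is None or lease[0] != worker:
            return False
        del self.leased[unit]
        self.done.add(unit)
        return True
```

A worker that wants one unit in progress and one queued would simply call `checkout` twice at startup and once more after each `report`.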

For each queued unit, supply the current list of uncracked hashes, the dictionary range to be checked, and the permutation rules to be applied. The master server should be able to adapt the queues per machine and per permutation rule set so that all machines finish their work within a minute or so of each other.
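One way to picture such a work unit, and how the master might size the next one from a measured rate, is sketched below. The field names, the 30-second default, and the assumption that every rule produces exactly one candidate per dictionary word are mine, not the answer's.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkUnit:
    hashes: tuple    # uncracked hashes still worth checking
    start: int       # first dictionary line, inclusive
    end: int         # last dictionary line, exclusive
    rule_set: str    # name of the permutation rule set to apply


def measured_rate(unit, rules_per_word, elapsed_seconds):
    """Candidates per second, estimated from one exhaustively checked unit."""
    return (unit.end - unit.start) * rules_per_word / elapsed_seconds


def next_range(start, dict_size, rate, rules_per_word, target_seconds=30.0):
    """Pick a dictionary range that should take about target_seconds on this machine."""
    words = max(1, int(rate * target_seconds / rules_per_word))
    return start, min(dict_size, start + words)
```

Because the rate is tracked per machine and per rule set, a slow machine or an expensive rule set simply receives a shorter dictionary range for the same target time.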

Alternatively, the coding could be made simpler if every unit of work were the same size. Even then, no machine would sit idle for longer than the slowest machine takes to complete one unit. Size the work units so that the fastest machine never starves for work: it should not complete a unit in less than five seconds and should always have a second unit queued. With that method, hopefully your fastest and slowest machines do not differ by a factor of more than 100.
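The arithmetic behind the fixed-size variant can be sketched as follows. The function names and the example rates are made up for illustration. One plausible reading of the 100x figure is that a unit taking 5 seconds on the fastest machine takes 500 seconds on a machine 100 times slower, which still fits inside the 10-minute recycle window; the answer does not state this explicitly.

```python
def fixed_unit_size(fastest_rate, min_seconds=5.0):
    """Candidates per unit so the fastest machine needs at least min_seconds."""
    return int(fastest_rate * min_seconds)


def unit_seconds(unit_size, rate):
    """How long one unit takes on a machine with the given candidates/second."""
    return unit_size / rate


# Hypothetical rates: fastest machine 1,000,000 c/s, slowest 10,000 c/s (100x apart).
size = fixed_unit_size(1_000_000)
slowest = unit_seconds(size, 10_000)
fits_in_recycle_window = slowest < 600  # 10-minute mark from the first scheme
```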

调妓 2024-12-25 11:30:28

It seems to me that it would be quite easy to write your own service to do just this.

Super Easy Setup

Let's say you have some GPU-enabled program X that takes a hash h and a list of dictionary words D as input, then uses the dictionary words to try to crack the password. With one machine, you simply run X(h, D).

If you have N machines, you split the dictionary into N parts (D_1, D_2, D_3, ..., D_N), then run X(h, D_i) on machine i.
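The split itself is only a few lines. This sketch assumes the dictionary fits in memory as a list of words and that all machines are roughly equally fast; the function name `split_dictionary` is hypothetical.

```python
def split_dictionary(words, n):
    """Split words into n contiguous parts whose sizes differ by at most one."""
    if n < 1:
        raise ValueError("n must be at least 1")
    base, extra = divmod(len(words), n)
    parts, start = [], 0
    for i in range(n):
        size = base + (1 if i < extra else 0)  # first `extra` parts get one more word
        parts.append(words[start:start + size])
        start += size
    return parts
```

If the machines differ a lot in speed, an even split leaves the fast ones idle while the slowest finishes, which is the problem the work-unit scheme in the other answer addresses.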

This could easily be done using SSH. The master machine splits the dictionary up, copies one part to each of the slave machines using SCP, then connects to the slaves and tells them to run the program.
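A minimal sketch of the copy-then-run step, driving the standard `scp` and `ssh` command-line clients from Python, might look like this. The program path `./X`, its argument order, the `/tmp` remote directory, and the host names are assumptions made for the example, and key-based SSH login is assumed to be set up already.

```python
import os
import shlex
import subprocess


def build_commands(host, local_part, target_hash, remote_dir="/tmp", program="./X"):
    """Return the scp and ssh argument lists for one slave (nothing is executed)."""
    remote_part = f"{remote_dir}/{os.path.basename(local_part)}"
    copy = ["scp", local_part, f"{host}:{remote_part}"]
    # ssh passes one command string to the remote shell, so quote it as a unit.
    run = ["ssh", host, shlex.join([program, target_hash, remote_part])]
    return copy, run


def dispatch(host, local_part, target_hash):
    """Copy one dictionary part to a slave, then start the cracker there."""
    copy, run = build_commands(host, local_part, target_hash)
    subprocess.run(copy, check=True)  # raises if the copy fails
    return subprocess.Popen(run, stdout=subprocess.PIPE, text=True)
```

Calling `dispatch` once per host returns one `Popen` handle per slave, which the master can then poll or read output from.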

Slightly Smarter Setup

When one machine cracks the password, it can easily notify the master that it has completed the task. The master then kills the programs running on the other slaves.
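One simple way to get this behaviour, assuming the master holds a process handle per slave and that the cracking program exits with status 0 only when it finds the password (both assumptions of this sketch), is to poll the handles and terminate the rest once one succeeds. `fake_worker` is a local stand-in used purely for demonstration.

```python
import subprocess
import sys
import time


def wait_for_first_success(procs, poll_interval=0.1):
    """procs maps name -> Popen. Return the first name to exit with 0, or None.

    All processes still running when the loop ends are terminated.
    """
    remaining = dict(procs)
    winner = None
    while remaining and winner is None:
        for name, proc in list(remaining.items()):
            code = proc.poll()
            if code is None:
                continue          # still running
            del remaining[name]
            if code == 0:
                winner = name     # assumed to mean "password found"
                break
        else:
            time.sleep(poll_interval)
    for proc in remaining.values():
        proc.terminate()
        proc.wait()
    return winner


def fake_worker(seconds, exit_code):
    """Local stand-in for an ssh-launched cracker (demo only)."""
    code = f"import time, sys; time.sleep({seconds}); sys.exit({exit_code})"
    return subprocess.Popen([sys.executable, "-c", code])
```

Note that terminating a local `ssh` client does not necessarily stop the program on the remote machine, so a real setup may also need to run an explicit kill command on each slave.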
