Implementing multi-core threading for this algorithm

Posted on 2024-11-26 04:48:33

I'm trying to find a way to run the following algorithm on multiple cores, but I haven't come up with a good approach. I don't think a locked iterator shared between multiple processes would be the most efficient way.

 def sortCharset(set):
   _set = ""
   for c in set:
     if c not in _set:
       _set += c
   set = _set
   del _set
   set = list(set)
   set.sort()
   return "".join(set)
 
 
 def stringForInt(num, set, length):
   setLen = len(set)
   string = ""
   string += set[num % setLen]
   for n in xrange(1,length):
     num //= setLen
     string += set[num % setLen]
   return string
 
 
 def bruteforce(set, length, raw = False):
   if raw is False:
     set = sortCharset(set)
 
   for n in xrange(len(set) ** length):
     yield stringForInt(n, set, length)

Short explanation:
The code generates every possible string of a given length
from a set of characters, e.g. to brute-force a password.
(Of course that's not my intention, this is just some Python practice. ;-)

What is a good way to run this algorithm on multiple cores?

Comments (1)

痴者 2024-12-03 04:48:33

The question isn't really about naming style or how to get a sorted set of characters out of a string.

You might want to look into the multiprocessing module. I'm pretty much a n00b w/r/t multi-core parallelism but got something working:

# Python 2 code (uses xrange); on Python 3, replace xrange with range.
import multiprocessing
import sys

def stringForInt(args):
    # The pool's map functions pass one argument per item, so the three
    # parameters arrive packed in a single tuple.
    num, charset, length = args ## hack hack hack
    setlen = len(charset)
    s = []
    s.append(charset[num % setlen])
    for n in xrange(1, length):
        num //= setlen
        s.append(charset[num % setlen])
    return ''.join(s)

def bruteforce(charset, length, mapper, raw=False):
    if not raw:
        # Deduplicate and sort the characters.
        charset = sorted(set(charset))
    return mapper(stringForInt, ((n,charset,length) for n in xrange(len(charset)**length)))

if __name__ == '__main__':
    # Pick the mapping strategy from the command line: map, pmap, imap or imapu.
    if len(sys.argv) == 1 or sys.argv[1] == 'map':
        mapper = map
    else:
        p = multiprocessing.Pool()
        pfunc = {'pmap':p.map,
                 'imap':p.imap,
                 'imapu':p.imap_unordered}[sys.argv[1]]
        mapper = lambda f, i: pfunc(f, i, chunksize=5)
    o = bruteforce('abcdefghijk',6,mapper)
    if not isinstance(o, list):
        # imap and imap_unordered return lazy iterators; consume them so the work runs.
        list(o)

The nature of the hack is that multiprocessing needs picklable objects for the functions it hands to worker processes, and only functions that are defined at the top level of a module can be pickled. (There would be other ways around this using multiprocessing.Value or multiprocessing.Manager, but they aren't really worth going into for present purposes.)
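
To make the pickling constraint concrete, here is a minimal sketch that checks whether an object survives `pickle.dumps`. It assumes Python 3, and `is_picklable` and `top_level_worker` are hypothetical names made up for this illustration, not part of the code above.

```python
import pickle


def is_picklable(obj):
    """Return True if pickle can serialize obj, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


def top_level_worker(args):
    """Module-level function: pickle stores it by reference to its name."""
    num, charset, length = args  # same tuple-packing hack as stringForInt
    return charset[num % len(charset)]


if __name__ == "__main__":
    # A function defined at module level can be sent to a worker process.
    print(is_picklable(top_level_worker))
    # A lambda has no importable name, so pickle refuses it.
    print(is_picklable(lambda args: args))
```

This is why the worker takes a single tuple rather than being wrapped in a lambda that unpacks the arguments: the wrapper itself would have to be pickled.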

Here's the output for the various runs, in the order map, pmap, imap, imapu:

$ for x in map pmap imap imapu ; do time python mp.py $x; done

real    0m9.351s
user    0m9.253s
sys     0m0.096s

real    0m10.523s
user    0m20.753s
sys     0m0.176s

real    0m4.081s
user    0m13.797s
sys     0m0.276s

real    0m4.215s
user    0m14.013s
sys     0m0.236s
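
One possible refinement, sketched here under assumptions rather than measured: instead of sending one index per task, each task could cover a whole block of indices, so a worker builds many strings per round trip. The sketch assumes Python 3; `string_for_int`, `generate_block`, `split_ranges`, `bruteforce_blocks` and the `block_size` default are hypothetical names and values chosen for illustration, and I have not timed this against the runs above.

```python
import multiprocessing


def string_for_int(num, charset, length):
    """Same digit-by-digit encoding as stringForInt, for length >= 1."""
    base = len(charset)
    chars = []
    for _ in range(length):
        chars.append(charset[num % base])
        num //= base
    return "".join(chars)


def generate_block(args):
    """Worker: build the strings for every index in [start, stop)."""
    start, stop, charset, length = args  # packed into one tuple for the pool
    return [string_for_int(n, charset, length) for n in range(start, stop)]


def split_ranges(total, block_size):
    """Yield (start, stop) pairs that together cover range(total)."""
    for start in range(0, total, block_size):
        yield start, min(start + block_size, total)


def bruteforce_blocks(charset, length, block_size=10000, processes=None):
    """Yield all strings in index order, computed block by block in a pool."""
    charset = "".join(sorted(set(charset)))
    total = len(charset) ** length
    tasks = ((start, stop, charset, length)
             for start, stop in split_ranges(total, block_size))
    with multiprocessing.Pool(processes) as pool:
        # imap keeps the blocks in submission order.
        for block in pool.imap(generate_block, tasks):
            for s in block:
                yield s


if __name__ == "__main__":
    # Small demo: 3 characters, length 3, blocks of 5 indices.
    for s in bruteforce_blocks("abc", 3, block_size=5):
        print(s)
```

Each worker only needs its start and stop indices, so nothing like a locked shared iterator is required. Whether this is actually faster would depend on the block size and on how costly it is to ship the result lists back to the parent process.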