Memory usage builds up when checking dictionary passwords with the unrar lib in Python

Posted on 2025-01-12 07:46:30

I wrote some crude Python code that checks passwords from a dictionary file against a password-protected RAR archive. I even added some multi-threading, and it runs great. Unfortunately, as the script goes through the password list, memory usage keeps growing. After more than 10k tries it exceeds 10 GB. I couldn't find any method in the unrar lib documentation for freeing resources, and calling gc.collect() didn't help. How can I free the buffer after every password check?
Here's the code:

import sys
import gc
import threading
import linecache
from unrar import rarfile

class App():
    @staticmethod
    def check(fraction, n):
        FILE = sys.argv[1]  # path to the RAR archive
        DICT = sys.argv[2]  # dictionary file, one password per line

        # Count the lines in the dictionary.
        with open(DICT, 'r') as passdict:
            k = sum(1 for _ in passdict)

        # Each thread walks its own slice of the dictionary.
        counter = k // n
        start = counter * fraction
        stop = counter * (fraction + 1)
        i = start
        print('fr: %s start: %s stop: %s' % (fraction, start, stop))
        while i < stop:
            # linecache line numbers are 1-based; drop the trailing newline.
            p = linecache.getline(DICT, i + 1).rstrip('\r\n')
            try:
                rf = rarfile.RarFile(FILE, pwd=p)
                if len(rf.namelist()) > 0:
                    print(p)
                    break
            except rarfile.BadRarFile:
                # This call has no visible effect on the memory growth.
                gc.collect(generation=0)
            i += 1


if __name__ == '__main__':
    for k in range(6):
        t = threading.Thread(target=App.check, args=(k, 6,))
        t.start()

Edit:
OK, so I switched to the rarfile lib (pypi.org/project/rarfile). Memory no longer builds up, but multi-threading stopped working and it also runs much slower. Judging by Task Manager, it all seems to run on one thread :/

Comments (1)

随梦而飞# 2025-01-19 07:46:30

I think I fixed it. Unfortunately, switching to the newer rarfile library didn't help. Well, it helped with the memory issue, but somehow it didn't want to do multi-threading, so it was really slow. I managed to fix the unrar library instead by adding a call to the _close function in the exception handler. It looks like the library simply didn't free its resources when exiting with an exception, such as on a bad password. Maybe it only does that with archives that have encrypted filenames (as in my case), but I didn't check. Here is the modified code in rarfile.py of the unrar library:

def _read_header(self, handle):
    """Read current member header into a RarInfo object."""
    header_data = unrarlib.RARHeaderDataEx()
    try:
        res = unrarlib.RARReadHeaderEx(handle, ctypes.byref(header_data))
        rarinfo = RarInfo(header=header_data)
    except unrarlib.ArchiveEnd:
        return None
    except unrarlib.MissingPassword:
        raise RuntimeError("Archive is encrypted, password required")
    except unrarlib.BadPassword:
        raise RuntimeError("Bad password for Archive")
    except unrarlib.UnrarException as e:
        self._close(handle) #This line fixes the memory issue
        raise BadRarFile(str(e))

    return rarinfo
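
The patch above is an instance of a general pattern: if a native handle is opened and a later step raises, the handle has to be closed on the error path as well. Below is a minimal, library-independent sketch of that pattern. `FakeHandle`, `read_header`, `list_names_leaky`, `list_names_safe`, and `count_leaks` are hypothetical stand-ins for illustration and are not part of the unrar library:

```python
class FakeHandle:
    """Hypothetical stand-in for a native archive handle."""
    open_count = 0  # number of handles currently open

    def __init__(self):
        FakeHandle.open_count += 1
        self.closed = False

    def close(self):
        if not self.closed:
            self.closed = True
            FakeHandle.open_count -= 1


def read_header(handle, password):
    """Hypothetical header read that fails for a wrong password."""
    if password != "secret":
        raise ValueError("bad password")
    return ["file.txt"]


def list_names_leaky(password):
    """Closes the handle only on success, so every failure leaks one."""
    handle = FakeHandle()
    names = read_header(handle, password)
    handle.close()
    return names


def list_names_safe(password):
    """Closes the handle on every path via try/finally."""
    handle = FakeHandle()
    try:
        return read_header(handle, password)
    finally:
        handle.close()


def count_leaks(func, passwords):
    """Run func over passwords and return how many handles stay open."""
    FakeHandle.open_count = 0
    for pw in passwords:
        try:
            func(pw)
        except ValueError:
            pass
    return FakeHandle.open_count
```

With a dictionary attack almost every attempt fails, so a leak on the failure path adds up quickly, which would match the growth described in the question.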

And here is my modified script:

import sys
import re
import threading
import linecache
from unrar import rarfile

class App():
    @staticmethod
    def check(fraction, n):
        FILE = sys.argv[1]  # path to the RAR archive
        DICT = sys.argv[2]  # dictionary file, one password per line

        # Count the lines in the dictionary.
        with open(DICT, 'r') as passdict:
            k = sum(1 for _ in passdict)

        # Each thread walks its own slice of the dictionary.
        counter = k // n
        start = counter * fraction
        stop = counter * (fraction + 1)
        i = start
        print('fr: %s start: %s stop: %s' % (fraction, start, stop))
        while i <= stop:
            # Take the leading run of non-whitespace characters, which drops the newline.
            # linecache is 1-based, so line 0 just yields an empty string.
            p = re.search(r'\S*', linecache.getline(DICT, i)).group()
            try:
                with rarfile.RarFile(FILE, pwd=p) as rf:
                    rf.extractall(path='D:\\', pwd=p)
                    if len(rf.namelist()) > 0:
                        print(p)
                        break
            except Exception:
                # A wrong password raises; move on to the next candidate.
                pass
            i += 1
        print('stop')


if __name__ == '__main__':
    for k in range(8):
        t = threading.Thread(target=App.check, args=(k, 8,))
        t.start()
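
As a possible refinement of the script above, the dictionary can be split so that no line is skipped or checked twice, and all threads can stop as soon as one of them finds the password. This is only a sketch under assumptions: `split_ranges`, `read_slice`, `search`, and the `check` callback are hypothetical names, and `check` is expected to wrap whatever archive test is used (for example the `rarfile.RarFile` attempt shown above):

```python
import itertools
import threading
from concurrent.futures import ThreadPoolExecutor


def split_ranges(total, workers):
    """Split range(total) into `workers` contiguous (start, stop) pairs.

    The first `total % workers` ranges get one extra item, so no line is
    skipped and none is checked twice.
    """
    base, extra = divmod(total, workers)
    ranges = []
    start = 0
    for w in range(workers):
        stop = start + base + (1 if w < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def read_slice(path, start, stop):
    """Yield the passwords on 0-based lines start..stop-1, without newlines."""
    with open(path, "r", encoding="utf-8", errors="replace") as fh:
        for line in itertools.islice(fh, start, stop):
            yield line.rstrip("\r\n")


def search(path, total, workers, check):
    """Return the first password for which check(password) is true, else None.

    `check` is a caller-supplied function (hypothetical here).
    """
    found = threading.Event()
    result = []

    def worker(bounds):
        for password in read_slice(path, *bounds):
            if found.is_set():
                return  # another worker already succeeded
            if check(password):
                result.append(password)
                found.set()
                return

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Consume the iterator so exceptions raised in workers surface here.
        list(pool.map(worker, split_ranges(total, workers)))
    return result[0] if result else None
```

A call could look like `search(DICT, k, 8, check)`, where `k` is the line count computed as in the script and `check` returns True only when the archive opens with the given password. Reading each slice sequentially also avoids `linecache`, which keeps the whole dictionary cached in memory.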