Memory usage buildup when checking dictionary passwords with the unrar lib in Python
I wrote some crude code in Python that checks passwords from a dictionary file against a password-protected RAR archive. I even added some multi-threading, and it runs great. Unfortunately, as the script works through the password list, memory usage keeps growing: after more than 10k attempts it exceeds 10 GB... I couldn't find any method in the unrar lib documentation for freeing resources, and calling gc.collect() didn't help. How can I free the buffer after every password check?
Here's the code:
import sys
from unrar import rarfile
import gc
import threading
import linecache

class App():
    def check(fraction, n):
        FILE = sys.argv[1]
        DICT = sys.argv[2]
        # count the dictionary lines so the list can be split into n chunks
        with open(DICT, 'r') as passdict:
            k = len([0 for l in passdict])
        counter = int(k / n)
        start = counter * fraction
        stop = counter * (fraction + 1)
        i = start
        print('fr: %s start: %s stop: %s' % (fraction, start, stop))
        while i < stop:
            # linecache numbers lines from 1; strip the newline so it
            # doesn't become part of the password
            p = linecache.getline(DICT, i + 1).rstrip('\n')
            try:
                rf = rarfile.RarFile(FILE, pwd=p)
                if len(rf.namelist()) > 0:
                    print(p)
                    break
                i += 1
            except rarfile.BadRarFile:
                # wrong password; collect garbage and try the next candidate
                gc.collect(generation=0)
                i += 1
        return

if __name__ == '__main__':
    # split the dictionary across 6 worker threads
    for k in range(6):
        t = threading.Thread(target=App.check, args=(k, 6))
        t.start()
Edit:
OK, so I switched to the rarfile lib (pypi.org/project/rarfile). The memory no longer builds up, but the multi-threading stopped working and the whole thing runs much slower... It looks like everything runs on one thread (task manager) :/
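One way around this would be to split the dictionary across processes instead of threads: a pure-Python rarfile implementation holds the GIL while it works, so threads take turns, whereas each process gets its own interpreter. An untested sketch of that approach, assuming the rarfile package's setpassword()/namelist() API:

import multiprocessing
import sys

import rarfile  # pypi.org/project/rarfile

def check_chunk(args):
    """Try every password in one slice of the dictionary; return a hit or None."""
    archive, dictfile, fraction, n = args
    with open(dictfile, 'r') as f:
        passwords = [line.rstrip('\n') for line in f]
    chunk = len(passwords) // n
    for p in passwords[chunk * fraction:chunk * (fraction + 1)]:
        try:
            rf = rarfile.RarFile(archive)
            rf.setpassword(p)      # re-parses the headers when they are encrypted
            if rf.namelist():      # parsed OK: password accepted
                return p
        except rarfile.Error:      # wrong password; try the next one
            pass
    return None

if __name__ == '__main__':
    # one process per slice; each has its own interpreter and GIL
    jobs = [(sys.argv[1], sys.argv[2], k, 6) for k in range(6)]
    with multiprocessing.Pool(6) as pool:
        for result in pool.imap_unordered(check_chunk, jobs):
            if result:
                print('password:', result)
                break  # leaving the with-block terminates the remaining workers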
Comments (1)
I think I fixed it. Unfortunately, trying the newer rarfile library didn't help. Well, it helped with the memory issue, but somehow it didn't want to do multi-threading, so it was really slow. I managed to fix the unrar library instead: I added a call to the _close function in the exception branch. It looks like the library simply didn't free its resources when it exited with an exception, such as on a bad password. Maybe it only behaves this way with archives that have encrypted filenames (like mine), but I didn't check. Here is the modified code in rarfile.py of the unrar library:
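Roughly, the change looks like the following; the surrounding __init__ code is paraphrased from the library, so the exact names in your copy of rarfile.py may differ:

# Paraphrased from unrar's rarfile.py; exact names may differ.
# The fix: close the native archive handle before the exception
# (e.g. bad password -> BadRarFile) propagates, instead of leaking it.
def __init__(self, filename, mode='r', pwd=None):
    self.filename = filename
    archive = unrarlib.RAROpenArchiveDataEx(
        filename, mode=constants.RAR_OM_LIST_INCSPLIT)
    handle = self._open(archive)
    self.pwd = pwd  # (password wiring omitted in this sketch)
    self.filelist = []
    self.NameToInfo = {}
    try:
        self._load_metadata(handle)
    except Exception:
        self._close(handle)  # the added call: release the handle on failure
        raise
    self._close(handle)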
And here is my modified script: