Unzip operation taking several hours
I am using the following shell script to loop over 90 zip files and extract them on a Linux box hosted with Hostinger (shared web hosting):
#!/bin/bash
SOURCE_DIR="<path_to_archives>"
cd "${SOURCE_DIR}" || exit 1
# Start one background extraction per archive, then wait for all of them.
for f in *.zip
do
    # unzip -oqq "$f" -d "${f%.zip}" &
    python3 scripts/extract_archives.py "${f}" &
done
wait
The Python script called by the shell script above is:
import os
import shutil
import sys

source_path = "<path to source dir>"

def extract_files(in_file):
    # Unpack into a folder named after the archive, minus its extension.
    archive = os.path.join(source_path, in_file)
    shutil.unpack_archive(archive, os.path.splitext(archive)[0])
    print('Extracted : ', in_file)

extract_files(sys.argv[1].strip())
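For comparison, here is a minimal sketch of the same extraction with a cap on how many archives are unpacked at once, instead of one background process per archive. It assumes that 90 simultaneous extractions may be competing for the same disk and CPU quota, which the question does not confirm. The names `extract_one`, `extract_all`, and the default of 4 workers are hypothetical choices for illustration. Each target folder is the archive path without its `.zip` suffix, mirroring `${f%.zip}` in the commented-out `unzip` line.

```python
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def extract_one(archive):
    """Unpack one zip into a sibling folder named after the archive."""
    archive = Path(archive)
    target = archive.with_suffix("")  # "dir/name.zip" -> "dir/name"
    shutil.unpack_archive(str(archive), str(target))
    return archive.name


def extract_all(source_dir, max_workers=4):
    """Unpack every *.zip in source_dir, at most max_workers at a time."""
    archives = sorted(Path(source_dir).glob("*.zip"))
    # Threads are used here for simplicity; the worker count is an
    # arbitrary assumption and would need tuning on the real host.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(extract_one, archives))
```

Calling `extract_all("<path_to_archives>")` returns the archive names in sorted order once every extraction has finished. Whether a lower worker count helps at all on a throttled host would have to be measured.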
Regardless of whether I use the built-in unzip command or Python, it takes about 2.5 hours to unzip all the files. Extracting all the zip files produces 90 folders containing 170,000 files overall. I would have thought anything between 15 and 20 minutes would be a reasonable timeframe.
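To put those numbers in perspective, the figures above can be turned into throughput. This is plain arithmetic on the values stated in the question (170,000 files, about 2.5 hours, a 15 to 20 minute target); the function name `files_per_second` is just an illustrative helper.

```python
TOTAL_FILES = 170_000           # file count from the question
OBSERVED_SECONDS = 2.5 * 3600   # about 2.5 hours


def files_per_second(total_files, seconds):
    """Average extraction throughput in files per second."""
    return total_files / seconds


observed = files_per_second(TOTAL_FILES, OBSERVED_SECONDS)
target_slow = files_per_second(TOTAL_FILES, 20 * 60)
target_fast = files_per_second(TOTAL_FILES, 15 * 60)

print(f"observed:      {observed:.1f} files/s")     # 18.9
print(f"20 min target: {target_slow:.1f} files/s")  # 141.7
print(f"15 min target: {target_fast:.1f} files/s")  # 188.9
print(f"speedup needed: {target_slow / observed:.1f}x to {target_fast / observed:.1f}x")
```

So the current run averages roughly 19 files per second, and the hoped-for timeframe needs about 7.5 to 10 times that rate.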
I've tried a few variations. I tried tarring the folders instead of zipping them, thinking that un-tarring might be faster than unzipping. I also used the tar command on the source server to stream the files over ssh and untar them directly from the stream, something like this:
time tar zcf - . | ssh -p <port> user@host "tar xzf - -C <dest dir>"
Nothing is helping. I am open to using any other programming language, such as Perl or Go, if necessary to speed things up.
Can someone please help me solve this performance problem?
Comments (1)
Thank you everyone for your answers. As you indicated, this was caused by throttling on the servers in the hosted environment.
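One rough way to check for this kind of limit is to measure how fast the host can create small files, independent of any archive tool. The sketch below is an assumption about how such a probe might look, not something taken from the thread; `small_file_write_rate`, the file count, and the file size are all hypothetical.

```python
import os
import tempfile
import time


def small_file_write_rate(directory, count=2000, size=4096):
    """Create `count` files of `size` bytes in a scratch folder under
    `directory` and return the rate in files per second."""
    payload = b"\0" * size
    # The scratch folder and its contents are removed on exit.
    with tempfile.TemporaryDirectory(dir=directory) as scratch:
        start = time.perf_counter()
        for i in range(count):
            with open(os.path.join(scratch, f"probe_{i}.bin"), "wb") as fh:
                fh.write(payload)
        elapsed = time.perf_counter() - start
    return count / elapsed
```

A single number from this probe does not prove throttling. It becomes informative when compared with the same probe on a machine known to be unrestricted, or when repeated runs on the shared host show the rate dropping over time.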