How can I make a Python script for moving files faster?

Posted on 2025-01-28 22:06:38 | 1420 characters | 4 views | 0 comments

Hi, I am new to Python and I made this simple program to sort out my Downloads folder. When I run it with a decent number of files in the Downloads folder, it takes roughly 10 to 15 seconds.

I have heard that Python is slower than C.

What is the best way to make this code more efficient without parallel threads?

Also, what is the best way to make it faster with threads running in parallel?

My next plan is to use Python to make a GUI for a desktop dashboard where I can have this code as a runnable icon. How can I achieve that? Any ideas would be greatly appreciated. Thanks for the help.

import shutil
import os, time

start = time.time()
print("Sorting Downloads Folder...")

download_folder = "C:/Users/muham/Downloads"
images = "C:/Users/muham/Pictures"
documents = "C:/Users/muham/Documents"

docs = ["docx", ".txt", ".doc"]
imgs = [".png", ".jpg", "jpeg"]
prgrms = [".exe", ".php", ".c", "java", ".msi"]
    
for file in os.listdir(download_folder):
    if file[-4:] in docs:
        shutil.move(download_folder + '/' + file, documents + '/Docs/' + file)
    elif file[-4:] == "xlsx" or file[-4:] == ".csv":
        shutil.move(download_folder + '/' + file, documents + '/Excel/' + file)
    elif file[-4:] == ".pdf" or file[-4:] == ".ppt" or file[-5:] == ".pptx":
        shutil.move(download_folder + '/' + file, documents + '/PDFs/' + file)
    elif file[-4:] in imgs:
        shutil.move(download_folder + '/' + file, images + '/' + file) 
    elif file[-4:] in prgrms or file[-2:] == ".c":
        shutil.move(download_folder + '/' + file, "D:/programs/" + file) 
    else:
        shutil.move(download_folder + '/' + file, "D:/zips/" + file)

print("Download folder sorted")
end = time.time()
print("Time taken:" , end - start)

Comments (1)

余厌 2025-02-04 22:06:38

The first thing to do is profile this code to find out where it spends its time. Optimize operations in this order:

  1. frequent operations that take a long time
  2. frequent operations that take a short time; infrequent operations that take a long time
  3. infrequent operations that take a short time

#2 is a balancing act between the two types of operations, but generally the frequent, short ones will be the better choice.

If the majority of the time is spent in shutil.move(), you can theoretically reduce the runtime by at most half; in practice, the gain will be smaller. In this case, multithreading is suitable because the operation is I/O-bound.
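As a rough illustration of the multithreading idea, here is a minimal sketch using concurrent.futures.ThreadPoolExecutor from the standard library. The helper name move_all, the worker count of 4, and the throwaway demo directories are assumptions made for this example, not part of the original script, and whether threads actually help depends on the disks involved.

```python
import shutil
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def move_all(jobs, max_workers=4):
    """Move each (source, destination) pair on a worker thread.

    Returns the destination paths in the same order as `jobs`.
    `max_workers=4` is an arbitrary starting point, not a tuned value.
    """
    def move_one(job):
        source, destination = job
        return shutil.move(str(source), str(destination))

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map keeps input order and re-raises any worker exception
        return list(pool.map(move_one, jobs))


# Self-contained demo on temporary directories (hypothetical layout)
demo_root = Path(tempfile.mkdtemp())
demo_src = demo_root / "Downloads"
demo_dst = demo_root / "Docs"
demo_src.mkdir()
demo_dst.mkdir()
for name in ("a.txt", "b.txt", "c.txt"):
    (demo_src / name).write_text("example")

demo_jobs = [(path, demo_dst / path.name) for path in sorted(demo_src.iterdir())]
moved = move_all(demo_jobs)
print(moved)
```

In the real script, the loop would first build the list of (source, destination) pairs and then hand it to move_all in one call.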


Absent profile information, the following is the best I can do.

The easiest way to improve speed is to change docs, imgs and prgrms to sets. Membership tests on a set are typically faster than on a list once the collection has more than two elements.

from timeit import timeit
stmt = '''
for word in wordlist:
    word in keys
'''
timeit(stmt, setup='import random; keys=["foo", "bar", "baz"]; wordlist = random.choices(keys, k=100)')
timeit(stmt, setup='import random; keys={"foo", "bar", "baz"}; wordlist = random.choices(list(keys), k=100)')
3.8205395030090585
2.583121053990908

Another opportunity lies in using str.rsplit() instead of slicing to get the file extension. Each slice creates a new string, and this code slices several times per iteration. Extract the extension once at the start of the loop, before the conditional block. This requires removing the leading dots from the extensions in docs, imgs and prgrms.

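docs = ["docx", "txt", "doc"]
imgs = ["png", "jpg", "jpeg"]
prgrms = ["exe", "php", "c", "java", "msi"]

for file in os.listdir(download_folder):
    # rsplit() returns a list, so split once from the right and keep the last part
    extension = file.rsplit(".", 1)[-1]
    if extension in docs:
        shutil.move(download_folder + '/' + file, documents + '/Docs/' + file)
    # ...remaining elif branches compare `extension` in the same way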
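A different way to get one extension test per file is a lookup table instead of an if/elif chain. This is a swapped-in technique rather than what the snippet above does: the sketch below uses os.path.splitext and a dict, and the names FOLDERS, RULES, DEFAULT_FOLDER and destination_for are hypothetical. The folder paths are copied from the question.

```python
import os

documents = "C:/Users/muham/Documents"
images = "C:/Users/muham/Pictures"

# Target folder -> extensions (lower-case, without the dot), as in the question
FOLDERS = {
    documents + "/Docs": ("docx", "txt", "doc"),
    documents + "/Excel": ("xlsx", "csv"),
    documents + "/PDFs": ("pdf", "ppt", "pptx"),
    images: ("png", "jpg", "jpeg"),
    "D:/programs": ("exe", "php", "c", "java", "msi"),
}
# Invert the table once so each file needs a single dict lookup
RULES = {ext: folder for folder, exts in FOLDERS.items() for ext in exts}
DEFAULT_FOLDER = "D:/zips"


def destination_for(filename, rules=RULES, default=DEFAULT_FOLDER):
    """Return the target folder for a file name, based on its extension."""
    # splitext gives ("name", ".ext"); drop the dot and normalise the case
    extension = os.path.splitext(filename)[1][1:].lower()
    return rules.get(extension, default)


print(destination_for("report.DOCX"))
print(destination_for("Makefile"))
```

Inside the loop, the whole conditional block would then reduce to one shutil.move() call whose target is destination_for(file) + '/' + file.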

Reorder your conditional based on experimental data so that the most frequently True comparison is first, the second most frequently True comparison is second and so forth.


Using time.time() to measure execution time is unreliable. It measures wall-clock time, so it includes time consumed by other processes: if you measure while other processes are causing heavy resource contention, the result will be larger than the execution time you are trying to measure. Another source of error is that the system clock may be adjusted during execution (most likely by an NTP synchronization). These are some of the reasons I recommend profiling with a tool like cProfile.
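A minimal sketch of both ideas, assuming the sorting loop has been wrapped in a function: sort_downloads below is a hypothetical stand-in that only does some dummy work. time.perf_counter() avoids the clock-adjustment problem (it still measures elapsed wall time, so contention from other processes is not removed), while cProfile gives a per-function breakdown.

```python
import cProfile
import io
import pstats
import time


def sort_downloads():
    """Hypothetical stand-in for the sorting loop from the question."""
    return sum(len(str(i)) for i in range(10_000))


# Monotonic timer: not affected by system clock adjustments
start = time.perf_counter()
sort_downloads()
elapsed = time.perf_counter() - start

# Per-function breakdown of where the time goes
profiler = cProfile.Profile()
profiler.enable()
sort_downloads()
profiler.disable()

report = io.StringIO()
stats = pstats.Stats(profiler, stream=report)
stats.sort_stats("cumulative").print_stats(10)  # ten most expensive entries

print(f"elapsed: {elapsed:.6f} s")
print(report.getvalue())
```

For a whole script, running `python -m cProfile -s cumulative script.py` from the command line produces a similar report without changing the code.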


Edit:
I noticed that some of the shutil.move() calls move files from the C drive to the D drive. This can trigger an actual copy rather than a rename, which can significantly increase the runtime, depending on how your filesystems are set up.
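One way to check in advance whether a move is likely to be a cheap rename is to compare the device IDs of the source and the destination folder. This is a sketch under assumptions: on_same_volume is a hypothetical helper, both paths must already exist, and it relies on os.stat() reporting a distinct st_dev per volume, which you may want to verify on your Windows setup.

```python
import os
import tempfile


def on_same_volume(path_a, path_b):
    """Return True when two existing paths report the same device ID."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


# Demo with a temporary directory and a file inside it
demo_dir = tempfile.mkdtemp()
demo_file = os.path.join(demo_dir, "example.txt")
with open(demo_file, "w") as handle:
    handle.write("example")

print(on_same_volume(demo_file, demo_dir))
```

If the check returns False for a source file and its target folder, the move involves copying the file contents, so those moves are the ones most likely to dominate the runtime.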
