Multithreading does not improve results in Python?

Posted 2025-01-28 03:47:20

I am applying multithreading to a Python script to improve its performance. I don't understand why the execution time does not improve.

This is the code snippet of my implementation:

from queue import Queue
from threading import Thread
import time


class WP_TITLE_DOWNLOADER(Thread):
    def __init__(self, queue, name):
        Thread.__init__(self)
        self.queue = queue
        self.name = name

    def download_link(self, linkss):
        # Test function. Later, some processing will be done on this list.
        # This will be processed on the CPU.
        for idx, link in enumerate(linkss):
            # time.sleep(0.01)
            test.append(idx)

        for idx, i in enumerate(testv):
            i.append(2)

    def run(self):
        while True:
            # Get the work from the queue
            linkss = self.queue.get()
            try:
                self.download_link(linkss)
            finally:
                self.queue.task_done()


       
# With threading

testv = [[i for i in range(5000)] for j in range(20)]
links_list = [[i for i in range(100000)] for j in range(20)]
test = []
start_time = time.time()
queue = Queue()
thread_count = 8
for x in range(thread_count):
    worker = WP_TITLE_DOWNLOADER(queue, str(x))
    # Setting daemon to True will let the main thread exit even though the workers are blocking
    worker.daemon = True
    worker.start()

# Fill up the queue for the threads
for links in links_list:
    queue.put(links)

# print("queuing time={}".format(time.time() - start_time))
# print(test)
# Wait for all work items to finish
queue.join()
t_time = time.time() - start_time
print("With threading time={}".format(t_time))
           
    



# Without threading
# The following function is the same as the one used with threading.
test = []


def download_link(links1):
    for idx, link in enumerate(links1):
        # time.sleep(0.01)
        test.append(idx)

    for idx, i in enumerate(testv):
        i.append(2)


start_time = time.time()
for links in links_list:
    download_link(links)

t_time = time.time() - start_time
print("without threading time={}".format(t_time))

With threading time=0.564049482345581
without threading time=0.13332700729370117

NOTE: When I uncomment time.sleep, the threaded version is faster than the non-threaded one.
My test case is a list of lists, each with more than 10,000 elements. The idea behind using multithreading is that several lists can be processed simultaneously instead of one list at a time, which should reduce the overall time. But the results are not as expected.

濫情▎り 2025-02-04 03:47:20

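Generally speaking (there will always be exceptions), multithreading is best suited to IO-bound processing, which includes networking. Multiprocessing is well suited to CPU-intensive work.

Your test is therefore flawed.

Your intention is clearly to do some kind of web crawling, but that is not what happens in your test code. The test is CPU-intensive and so not a good fit for multithreading. Once you add the networking code, you may well see an improvement, provided you use suitable techniques.

Take a look at ThreadPoolExecutor in concurrent.futures. You may find it particularly useful because you can switch to multiprocessing simply by replacing ThreadPoolExecutor with ProcessPoolExecutor, which makes your experiments easier to quantify.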
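As a rough sketch of that swap, the snippet below runs the same workload through both executor classes. The names `process_chunk` and `run_with` are made up for illustration, and the sum of squares is only a hypothetical stand-in for the real CPU-bound processing.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
import time


def process_chunk(chunk):
    # Hypothetical stand-in for CPU-bound work on one list.
    return sum(x * x for x in chunk)


def run_with(executor_cls, chunks, workers=4):
    # The same code path serves both pool types; only the class differs.
    start = time.perf_counter()
    with executor_cls(max_workers=workers) as pool:
        results = list(pool.map(process_chunk, chunks))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    chunks = [list(range(100_000)) for _ in range(20)]
    for cls in (ThreadPoolExecutor, ProcessPoolExecutor):
        results, elapsed = run_with(cls, chunks)
        print(f"{cls.__name__}: {elapsed:.3f}s")
```

The timings depend on the machine and on how much work each chunk carries, so treat them as something to measure rather than assume. The `if __name__ == "__main__":` guard matters for the process pool on platforms that start workers by re-importing the main module.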

幸福％小乖 2025-02-04 03:47:20

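Python has a concept called the GIL (Global Interpreter Lock). In standard CPython builds, this lock ensures that only one thread executes Python bytecode at any moment. So even though you spawn multiple threads to process multiple lists, only one of them is actually running at a time. You can try multiprocessing for parallel execution.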
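A small experiment can make this visible. The sketch below, with illustrative helper names `count_down`, `timed_sequential` and `timed_threaded`, times a pure-Python loop and a `time.sleep` call, first one after another and then in threads. It assumes a standard CPython build with the GIL enabled.

```python
import threading
import time


def count_down(n):
    # Pure-Python loop: the thread holds the GIL while it runs.
    while n > 0:
        n -= 1


def timed_sequential(target, arg, repeats):
    # Call target(arg) `repeats` times, one after another.
    start = time.perf_counter()
    for _ in range(repeats):
        target(arg)
    return time.perf_counter() - start


def timed_threaded(target, arg, repeats):
    # Call target(arg) once in each of `repeats` threads.
    threads = [threading.Thread(target=target, args=(arg,)) for _ in range(repeats)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    print(f"CPU sequential: {timed_sequential(count_down, 2_000_000, 4):.3f}s")
    print(f"CPU threaded:   {timed_threaded(count_down, 2_000_000, 4):.3f}s")
    print(f"sleep sequential: {timed_sequential(time.sleep, 0.1, 4):.3f}s")
    print(f"sleep threaded:   {timed_threaded(time.sleep, 0.1, 4):.3f}s")
```

`time.sleep` releases the GIL while waiting, so the sleeping threads overlap, which matches the observation in the question that threading wins once `time.sleep` is uncommented. The CPU-bound loop gets no such overlap.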

茶色山野 2025-02-04 03:47:20

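Python's threading is awkward because of the GIL (Global Interpreter Lock). Threads have to compete for the interpreter in order to compute. Threading in Python is only beneficial when the code inside the thread does not need the interpreter, i.e. when offloading computation to a hardware accelerator, when doing I/O-bound work, or when calling a non-Python library. For true parallelism in Python, use multiprocessing instead. It is a bit more cumbersome, since you have to explicitly share variables or copy them, and communication often has to be serialized.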
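To illustrate the point about sharing, here is a minimal sketch using `multiprocessing.Pool`. The names `process_chunk` and `process_all` are hypothetical. Instead of appending to a global list as in the question, which would only modify each worker's own copy, every worker returns its result and the parent merges them.

```python
from multiprocessing import Pool


def process_chunk(chunk):
    # Runs in a worker process. `chunk` arrives pickled from the parent,
    # and the returned list is pickled on the way back.
    return [idx for idx, _ in enumerate(chunk)]


def process_all(chunks, workers=4):
    # Distribute the chunks over the pool, then merge in the parent process.
    with Pool(processes=workers) as pool:
        per_chunk = pool.map(process_chunk, chunks)
    return [idx for part in per_chunk for idx in part]


if __name__ == "__main__":
    chunks = [list(range(100_000)) for _ in range(20)]
    merged = process_all(chunks)
    print(len(merged))
```

Sending large lists back and forth has its own serialization cost, so for very cheap per-item work the process pool can still lose to a plain loop.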
