Python: I need the following code to finish quicker

Posted 2024-12-26 09:33:25

I need the following code to finish faster, without using threads or multiprocessing. If anyone knows of any tricks, that would be greatly appreciated. Maybe for i in enumerate(), or changing the list to a string before calculating; I'm not sure.
For the example below, I have tried to recreate the variables using random sequences. This has rendered some of the conditions inside the loop useless, which is fine for this example; it just means the real application of the code will take slightly longer.
Currently on my i7, the example below (which mostly bypasses some of its conditions) completes in 1 second. I would like to get this down as much as possible.

import random
import time
import collections
import cProfile


def random_string(length=7):
    """Return a random string of given length"""
    return "".join([chr(random.randint(65, 90)) for i in range(length)])

LIST_LEN = 18400
original = [[random_string() for i in range(LIST_LEN)] for j in range(6)]
LIST_LEN = 5
SufxList = [random_string() for i in range(LIST_LEN)]
LIST_LEN = 28
TerminateHook = [random_string() for i in range(LIST_LEN)]
#^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Exclude above from benchmark


ListVar = original[:]
for b in range(len(ListVar)):
   for c in range(len(ListVar[b])):

       #If its an int ... remove
       try:
           int(ListVar[b][c].replace(' ', ''))
           ListVar[b][c] = ''
       except: pass

       #if any second sufxList delete
       for d in range(len(SufxList)):
           if ListVar[b][c].find(SufxList[d]) != -1: ListVar[b][c] = ''

       for d in range(len(TerminateHook)):
           if ListVar[b][c].find(TerminateHook[d]) != -1: ListVar[b][c] = ''
   #remove all '' from list
   while '' in ListVar[b]: ListVar[b].remove('')

print(ListVar[b])


Comments (1)

玩心态 2025-01-02 09:33:25
ListVar = original[:]

That makes a shallow copy of original, so your changes to the second-level lists are going to affect original as well. Are you sure that is what you want? It would be much better to build the new, modified list from scratch.
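
To make the aliasing concrete, here is a minimal sketch. The two-row original below is made-up sample data, not the asker's real input, and the names shallow and independent are only for illustration.

```python
# Made-up sample data standing in for the asker's `original`.
original = [["ABC", "123"], ["DEF"]]

shallow = original[:]              # new outer list, but the same inner list objects
shallow[0][1] = ""                 # mutates the row that `original` also holds
print(original[0])                 # ['ABC', '']
print(shallow[0] is original[0])   # True

# Copying each row as well (or building new rows from scratch) avoids this.
independent = [row[:] for row in original]
independent[1][0] = ""
print(original[1])                 # ['DEF']
```

Since the values are strings, copying one level of rows is enough here; copy.deepcopy would also work but does more than is needed.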

for b in range(len(ListVar)):
   for c in range(len(ListVar[b])):

Yuck: whenever possible iterate directly over lists.

       #If its an int ... remove
       try:
           int(ListVar[b][c].replace(' ', ''))
           ListVar[b][c] = ''
       except: pass

You want to ignore spaces in the middle of numbers? That doesn't sound right. If the numbers can be negative you may want to keep the try..except, but if they are only positive just use .isdigit().
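
The difference between the two checks can be seen on a few inputs. This is only a sketch: is_int_like is a hypothetical helper that mimics the question's try/except, and the sample strings are made up.

```python
def is_int_like(text):
    """Mimic the question's check: True if int() accepts the text once spaces are removed."""
    try:
        int(text.replace(" ", ""))
        return True
    except ValueError:
        return False

# Print each sample with the result of both checks side by side.
for sample in ["123", "1 2 3", "-5", "ABC", ""]:
    print(repr(sample), is_int_like(sample), sample.isdigit())
```

The two checks agree on "123", "ABC" and the empty string, but only the try/except version treats "1 2 3" and "-5" as numbers. Which behaviour is right depends on the real data.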

       #if any second sufxList delete
       for d in range(len(SufxList)):
           if ListVar[b][c].find(SufxList[d]) != -1: ListVar[b][c] = ''

Is that just bad naming? SufxList implies you are looking for suffixes; if so, just use .endswith() (and note that you can pass in a tuple to avoid the loop). If you really do want to find the suffix anywhere in the string, use the in operator.
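
A small sketch of both options, assuming made-up suffix values (the real SufxList comes from the asker's data):

```python
# Hypothetical sample values; the real SufxList is not shown in the question.
SufxList = ["LTD", "INC"]
suffixes = tuple(SufxList)   # endswith() accepts a str or a tuple of str, not a list

print("ACME LTD".endswith(suffixes))   # True: ends with one of the suffixes
print("LTD ACME".endswith(suffixes))   # False: "LTD" is not at the end
print("LTD" in "LTD ACME")             # True: substring match anywhere in the string
```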

       for d in range(len(TerminateHook)):
           if ListVar[b][c].find(TerminateHook[d]) != -1: ListVar[b][c] = ''

Again, use the in operator. any() is also useful here.
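
Combining in with any() might look like the sketch below. contains_hook is a hypothetical helper name and the hook values are invented for the example.

```python
TerminateHook = ["STOP", "END"]   # hypothetical sample values

def contains_hook(value, hooks=TerminateHook):
    """True if any hook occurs anywhere in value; any() stops at the first match."""
    return any(hook in value for hook in hooks)

print(contains_hook("PLEASE STOP NOW"))   # True
print(contains_hook("CONTINUE"))          # False
```

Because any() short-circuits, a value that matches an early hook is not tested against the remaining ones.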

   #remove all '' from list
   while '' in ListVar[b]: ListVar[b].remove('')

And that while loop is O(n^2), i.e. it will be slow: every '' in test and every remove('') scans the list again from the start. You could use a list comprehension instead to strip out the blanks, but it is better just to build clean lists to begin with.
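
As a sketch of the list-comprehension alternative, using a made-up row:

```python
row = ["ABC", "", "DEF", "", ""]   # made-up row containing blanks

# One pass over the row, building a new list without the empty strings.
cleaned = [value for value in row if value != ""]
print(cleaned)   # ['ABC', 'DEF']

# To keep the same list object (as remove() does), assign into a full slice.
row[:] = cleaned
print(row)       # ['ABC', 'DEF']
```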

print(ListVar[b])

I think maybe your indentation was wrong on that print.

Putting these suggestions together gives something like:

suffixes = tuple(SufxList)  # endswith() needs a tuple, not a list
newListVar = []
for row in original:
    # Build a fresh row instead of blanking and removing entries in place
    newRow = []
    newListVar.append(newRow)
    for value in row:
        if (not value.isdigit() and
            not value.endswith(suffixes) and
            not any(th in value for th in TerminateHook)):
            newRow.append(value)

    print(newRow)
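
The same idea can be wrapped in a function and written as a nested list comprehension. This is only a sketch under the answer's assumptions (positive integers only, SufxList holding true suffixes); clean_rows and the sample data are hypothetical, not taken from the question.

```python
def clean_rows(rows, suffix_list, terminate_hooks):
    """Return new rows without digit-only values, suffixed values, or values containing a hook."""
    suffixes = tuple(suffix_list)
    return [
        [
            value
            for value in row
            if not value.isdigit()
            and not value.endswith(suffixes)
            and not any(hook in value for hook in terminate_hooks)
        ]
        for row in rows
    ]

# Made-up sample data, only to show the call.
rows = [["ACME LTD", "12345", "WIDGETS", "FULL STOP HERE"], ["GADGETS", "FOO INC"]]
result = clean_rows(rows, ["LTD", "INC"], ["STOP"])
print(result)   # [['WIDGETS'], ['GADGETS']]
```

The input rows are left untouched, which avoids the shallow-copy problem. If the substring behaviour of the question's find() calls is what is really wanted for SufxList, the endswith() test would need to be replaced by another any(... in value ...) check.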