Fastest way to extract elements not matching conditions from a list in Python

Posted on 2024-12-10 14:50:51

I'm looking for the fastest way to extract all tuples from a list that satisfy one or more conditions.

Example:
From a list of tuples (e.g. [(0,0,4),(1,0,3),(1,2,1),(4,0,0)]), I need to extract all members that have a value greater than 3 in the first tuple position, then those with a value greater than 2 in the second position, and then those with a value greater than 1 in the last position.
In this example that should extract (4,0,0) (first condition), nothing (second condition), and (0,0,4),(1,0,3) (last condition). This example is very small; I need to do this on lists of thousands of tuples.

Based on the code I wrote from your answers, here are the timings in seconds:

my_naive1, like proposed by Emil Vikström? 13.0360000134

my_naive2 110.727999926

Tim Pietzcker 9.8329999446

Don 12.5640001297

import itertools, operator, time, copy
from operator import itemgetter


def combinations_with_replacement_counts(n, r):  #(A, N) in our example.N individuals/balls in A genotypes/boxes
   size = n + r - 1
   for indices in itertools.combinations(range(size), n-1):
       #print indices
       starts = [0] + [index+1 for index in indices]
       stops = indices + (size,)
       yield tuple(map(operator.sub, stops, starts))


xp = list(combinations_with_replacement_counts(3,20))  # a very small case

a1=time.time()
temp=[]
for n in xp:
    for n1 in xp:

        for i in xp:
            if i[0] <= min(n1[0],n[0]) or i[1] <= min(n1[1],n[1]) or i[2] <= min(n1[2],n[2]):
                temp.append(i)


a2=time.time()
for n in xp:
    for n1 in xp:
        xp_copy = copy.deepcopy(xp)
        for i in xp:
            if i[0] > min(n[0],n[0]) or i[1] > min(n[1],n[1]) or i[2] > min(n[2],n[2]):
                xp_copy.remove(i)

a3=time.time()
for n in xp:
    for n1 in xp:
        output = [t for t in xp if t[0]<=min(n[0],n[0]) or t[1]<=min(n[1],n[1]) or t[2]<=min(n[2],n[2])]
a4=time.time()

for n in xp:
    for n1 in xp:
        l1 = sorted(xp, key=itemgetter(0), reverse=True)
        l1_fitered = []
        for item in l1:
            if item[0] <= min(n[0],n[0]):
                break
            l1_fitered.append(item)

        l2 = sorted(l1_fitered, key=itemgetter(1), reverse=True)
        l2_fitered = []
        for item in l2:
            if item[1] <= min(n[1],n[1]):
                break
            l2_fitered.append(item)

        l3 = sorted(l2_fitered, key=itemgetter(2), reverse=True)
        l3_fitered = []
        for item in l3:
            if item[2] <= min(n[2],n[2]):
                break
            l3_fitered.append(item)
a5=time.time()            



print "soluce my_naive1, like proposed by Emil Vikström?",a2-a1
print "soluce my_naive2",a3-a2
print "soluce Tim Pietzcker",a4-a3
print "soluce Don",a5-a4
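
The timings above come from differences of time.time() around each loop. As a hedged alternative, here is a minimal sketch of the same kind of comparison using the standard timeit module. The function names, the fixed limits (3, 2, 1) and the sample data are assumptions made for illustration; the sample is all non-negative triples summing to 20, which is what the generator in the question appears to produce.

```python
import timeit


def filter_loop(items, limits):
    """Explicit loop: keep tuples with any field at or below its limit."""
    kept = []
    for t in items:
        if t[0] <= limits[0] or t[1] <= limits[1] or t[2] <= limits[2]:
            kept.append(t)
    return kept


def filter_comprehension(items, limits):
    """Same predicate written as a list comprehension."""
    a, b, c = limits
    return [t for t in items if t[0] <= a or t[1] <= b or t[2] <= c]


# Hypothetical sample data: every (x, y, z) with x + y + z == 20.
sample = [(x, y, 20 - x - y) for x in range(21) for y in range(21 - x)]
limits = (3, 2, 1)  # assumed thresholds, not taken from the benchmark

# timeit calls each function repeatedly and returns the total elapsed seconds.
loop_time = timeit.timeit(lambda: filter_loop(sample, limits), number=200)
comp_time = timeit.timeit(lambda: filter_comprehension(sample, limits), number=200)
print("loop: %.4f s" % loop_time)
print("comprehension: %.4f s" % comp_time)
```

Absolute numbers will differ from machine to machine, so only the relative ordering is meaningful.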

Comments (3)

独木成林 2024-12-17 14:50:51
>>> l = [(0,0,4), (1,0,3), (1,2,1), (4,0,0)]
>>> output = [t for t in l if t[0]>3 or t[1]>2 or t[2]>1]
>>> output
[(0, 0, 4), (1, 0, 3), (4, 0, 0)]

This is fast because t[1]>2 is only evaluated if t[0]>3 is False (and likewise for the third condition). So in your example list, only 10 of the 12 possible comparisons are evaluated.
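
To make the short-circuit behaviour concrete, here is a small sketch that applies the same three tests in order and counts how many comparisons are actually evaluated. The helper name count_comparisons and its limits parameter are hypothetical, introduced only for this illustration.

```python
def count_comparisons(items, limits=(3, 2, 1)):
    """Filter like `t[0]>3 or t[1]>2 or t[2]>1`, counting evaluated tests."""
    evaluated = 0
    matched = []
    for t in items:
        for value, limit in zip(t, limits):
            evaluated += 1
            if value > limit:
                # Like `or`, stop testing this tuple at the first True result.
                matched.append(t)
                break
    return matched, evaluated


l = [(0, 0, 4), (1, 0, 3), (1, 2, 1), (4, 0, 0)]
matched, evaluated = count_comparisons(l)
print(matched)
print(evaluated)
```

The saving grows when the first condition is the one most likely to be true, so ordering the tests by expected hit rate may help.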

You might save time and memory (depending on what you're doing with the filtered data) if you use a generator expression instead:

>>> l = [(0,0,4), (1,0,3), (1,2,1), (4,0,0)]
>>> for item in (t for t in l if t[0]>3 or t[1]>2 or t[2]>1):
...     pass  # do something with that item
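
As a rough illustration of why the generator form can save work, the sketch below pulls matches one at a time. Nothing is filtered until an item is requested, so stopping early avoids testing the rest of the list. The variable names are illustrative only.

```python
l = [(0, 0, 4), (1, 0, 3), (1, 2, 1), (4, 0, 0)]

# No tuple is tested yet: the generator expression is evaluated lazily.
matches = (t for t in l if t[0] > 3 or t[1] > 2 or t[2] > 1)

# Requesting one item tests tuples only until the first match is found.
first = next(matches)

# Consuming the generator runs the filter over the remaining tuples.
rest = list(matches)

# A generator can be consumed only once; it is now exhausted.
leftover = list(matches)

print(first)
print(rest)
print(leftover)
```

If the filtered data is needed more than once, the list comprehension is the simpler choice.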
恋竹姑娘 2024-12-17 14:50:51

Have three lists, one for each condition, and just loop through the input set with a for loop, sorting each tuple into the correct target list. This will perform in O(n) (linear) time, which is the fastest possible asymptotic runtime for this problem. It will also only loop over the list once.
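
A minimal sketch of that single-pass idea follows. It assumes the thresholds from the question (3, 2, 1) and assumes that a tuple satisfying several conditions should land in every matching list; the function name split_by_condition is hypothetical.

```python
def split_by_condition(items, limits=(3, 2, 1)):
    """One pass over items, filling one result list per tuple position."""
    first, second, third = [], [], []
    for t in items:
        # Independent tests (an assumption): a tuple may go into several lists.
        if t[0] > limits[0]:
            first.append(t)
        if t[1] > limits[1]:
            second.append(t)
        if t[2] > limits[2]:
            third.append(t)
    return first, second, third


l = [(0, 0, 4), (1, 0, 3), (1, 2, 1), (4, 0, 0)]
first, second, third = split_by_condition(l)
print(first)
print(second)
print(third)
```

Switching the later tests to elif would instead put each tuple only in the list for the first condition it meets.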

请止步禁区 2024-12-17 14:50:51

If you do not care about the order of the resulting items, I suggest iterating over a sorted list and breaking at the first non-matching item;
this skips the tail of the list.

from operator import itemgetter
l = [(..., ..., ...), (...)]
l1 = sorted(l, key=itemgetter(0), reverse=True)
l1_filtered = []
for item in l1:
    if item[0] <= 3:
        break
    l1_filtered.append(item)

l2 = sorted(l, key=itemgetter(1), reverse=True)
...
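
For reference, a runnable sketch of the sort-then-stop approach on the question's example list. It swaps the explicit break loop for itertools.takewhile, which stops at the first non-matching item in the same way; the helper name filter_sorted is an assumption for this example.

```python
from itertools import takewhile
from operator import itemgetter


def filter_sorted(items, index, threshold):
    """Sort descending on one position, then keep items until one fails."""
    ordered = sorted(items, key=itemgetter(index), reverse=True)
    # takewhile stops at the first item that is not above the threshold,
    # so the tail of the sorted list is never examined.
    return list(takewhile(lambda t: t[index] > threshold, ordered))


l = [(0, 0, 4), (1, 0, 3), (1, 2, 1), (4, 0, 0)]
l1_filtered = filter_sorted(l, 0, 3)
l2_filtered = filter_sorted(l, 1, 2)
l3_filtered = filter_sorted(l, 2, 1)
print(l1_filtered)
print(l2_filtered)
print(l3_filtered)
```

Sorting costs O(n log n), so this is likely to pay off only when each sorted list is built once and reused for many different thresholds.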