Fastest way to extract elements that do not match a condition from a list in Python
I'm looking for the fastest way to extract all tuples from a list that satisfy one or more conditions.
Example:
From a list of tuples (e.g. [(0,0,4),(1,0,3),(1,2,1),(4,0,0)]), I need to extract all members that have more than 3 in the first tuple position, then more than 2 in the second tuple position, and then more than 1 in the last tuple position.
In this example that should extract (4,0,0) (first condition), nothing (second condition), and (0,0,4),(1,0,3) (last condition). This example is very small; I need to do this on lists of thousands of tuples.
Based on the code I wrote from your answers, here are the timings in seconds:
my_naive1 (as proposed by Emil Vikström?): 13.0360000134
my_naive2: 110.727999926
Tim Pietzcker: 9.8329999446
Don: 12.5640001297
import itertools, operator, time, copy
from operator import itemgetter

def combinations_with_replacement_counts(n, r):  # (A, N) in our example: N individuals/balls in A genotypes/boxes
    size = n + r - 1
    for indices in itertools.combinations(range(size), n-1):
        #print indices
        starts = [0] + [index+1 for index in indices]
        stops = indices + (size,)
        yield tuple(map(operator.sub, stops, starts))

xp = list(combinations_with_replacement_counts(3,20)) # a very small case

# my_naive1: explicit loop with append
a1=time.time()
temp=[]
for n in xp:
    for n1 in xp:
        for i in xp:
            if i[0] <= min(n1[0],n[0]) or i[1] <= min(n1[1],n[1]) or i[2] <= min(n1[2],n[2]):
                temp.append(i)

# my_naive2: deep copy, then remove non-matching items
a2=time.time()
for n in xp:
    for n1 in xp:
        xp_copy = copy.deepcopy(xp)
        for i in xp:
            if i[0] > min(n[0],n[0]) or i[1] > min(n[1],n[1]) or i[2] > min(n[2],n[2]):
                xp_copy.remove(i)

# Tim Pietzcker: list comprehension
a3=time.time()
for n in xp:
    for n1 in xp:
        output = [t for t in xp if t[0]<=min(n[0],n[0]) or t[1]<=min(n[1],n[1]) or t[2]<=min(n[2],n[2])]

# Don: sort on each position, break at the first non-matching item
a4=time.time()
for n in xp:
    for n1 in xp:
        l1 = sorted(xp, key=itemgetter(0), reverse=True)
        l1_filtered = []
        for item in l1:
            if item[0] <= min(n[0],n[0]):
                break
            l1_filtered.append(item)
        l2 = sorted(l1_filtered, key=itemgetter(1), reverse=True)
        l2_filtered = []
        for item in l2:
            if item[1] <= min(n[1],n[1]):
                break
            l2_filtered.append(item)
        l3 = sorted(l2_filtered, key=itemgetter(2), reverse=True)
        l3_filtered = []
        for item in l3:
            if item[2] <= min(n[2],n[2]):
                break
            l3_filtered.append(item)
a5=time.time()

# Python 2 print statements, as in the original benchmark
print "soluce my_naive1, like proposed by Emil Vikström?",a2-a1
print "soluce my_naive2",a3-a2
print "soluce Tim Pietzcker",a4-a3
print "soluce Don",a5-a4
A list comprehension that joins the three conditions with or is fast because t[1]>2 is only evaluated if t[0]>3 is False (same for the third condition). So in your example list, only 8 comparisons are necessary. You might save time and memory (depending on what you're doing with the filtered data) if you use a generator expression instead.
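The answer's original snippets did not survive on this page, so here is a minimal sketch of both forms, assuming the thresholds from the question (more than 3, 2 and 1 in positions 0, 1 and 2). The names data, matches and lazy_matches are illustrative and not taken from the answer.

```python
data = [(0, 0, 4), (1, 0, 3), (1, 2, 1), (4, 0, 0)]

# List comprehension: `or` short-circuits, so the later comparisons are
# skipped as soon as one condition is true for a tuple.
matches = [t for t in data if t[0] > 3 or t[1] > 2 or t[2] > 1]

# Generator expression: same filter, but items are produced lazily and no
# intermediate list is built.
lazy_matches = (t for t in data if t[0] > 3 or t[1] > 2 or t[2] > 1)

# Items are only computed when requested.
first = next(lazy_matches)
```

The generator form tends to help when the filtered items are consumed once, for example in a loop; if the result is indexed or iterated several times, the list is probably the better fit.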
Have three lists, one for each condition, and loop through the input once with a for loop, putting each tuple into the correct target list. This runs in O(n) (linear) time, which is the fastest possible asymptotic runtime for this problem, and it only loops over the list once.
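The single-pass idea could be sketched as follows. The function name split_by_condition and the limits parameter are hypothetical, and the sketch assumes a tuple may land in more than one list if it satisfies several conditions; switch the later if statements to elif if each tuple should go to the first matching list only.

```python
def split_by_condition(tuples, limits=(3, 2, 1)):
    """Distribute tuples into three lists in a single pass.

    limits holds the (assumed) thresholds for positions 0, 1 and 2.
    """
    first, second, third = [], [], []
    for t in tuples:
        if t[0] > limits[0]:
            first.append(t)
        if t[1] > limits[1]:
            second.append(t)
        if t[2] > limits[2]:
            third.append(t)
    return first, second, third


data = [(0, 0, 4), (1, 0, 3), (1, 2, 1), (4, 0, 0)]
first, second, third = split_by_condition(data)
```

With the example list from the question, this gives one list per condition while touching each tuple exactly once.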
If you do not care about the order of the resulting items, I suggest scanning a sorted list and breaking at the first non-matching item. This skips the tail of the list.
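A minimal sketch of the sort-and-break approach, modelled on the "Don" section of the benchmark in the question. The helper name take_greater and its parameters are hypothetical.

```python
from operator import itemgetter


def take_greater(tuples, position, threshold):
    """Return the tuples whose value at `position` exceeds `threshold`.

    The list is sorted in descending order on that position, so the scan
    can stop at the first item that fails the test.
    """
    ordered = sorted(tuples, key=itemgetter(position), reverse=True)
    result = []
    for item in ordered:
        if item[position] <= threshold:
            break  # everything after this item is smaller or equal
        result.append(item)
    return result


data = [(0, 0, 4), (1, 0, 3), (1, 2, 1), (4, 0, 0)]
over_3_in_first = take_greater(data, 0, 3)
over_1_in_last = take_greater(data, 2, 1)
```

Note that the sort itself costs O(n log n), so this is most likely to pay off when the list is sorted once and then scanned against many different thresholds.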