Intermediate variable in a list comprehension for simultaneous filtering and transformation
I have a list of vectors (in Python) that I want to normalize, while at the same time removing the vectors that originally had small norms.
The input list is, e.g.
a = [(1,1),(1,2),(2,2),(3,4)]
And I need the output to be (x*n, y*n)
with n = (x**2+y**2)**-0.5
If I just needed the norms, for example, that would be easy with a list comprehension:
an = [ (x**2+y**2)**0.5 for x,y in a ]
It would also be easy to store just a normalized x, for example, but what I want is a temporary variable "n" that I can use in two calculations and then throw away.
I can't just use a lambda function either, because I also need n to filter the list. So what is the best way?
Right now I am using this nested list comprehension here (with an expression in the inner list):
a = [(1,1),(1,2),(2,2),(3,4)]
[(x*n,y*n) for (n,x,y) in (( (x**2.+y**2.)**-0.5 ,x,y) for x,y in a) if n < 0.4]
# Out[14]:
# [(0.70710678118654757, 0.70710678118654757),
# (0.60000000000000009, 0.80000000000000004)]
The inner generator expression produces tuples with an extra value (n), and then I use these values for the calculations and filtering. Is this really the best way? Are there any terrible inefficiencies I should be aware of?
Comments (4)
Well, it does work efficiently, and if you really, really want to write one-liners then it's the best you can do.
On the other hand, a simple 4-line function would do the same much more clearly:
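The function itself did not survive on this page. A minimal sketch of what such a function might look like is given here, assuming a generator named normfilter and a parameter max_inv_norm (both names are hypothetical) and mirroring the n < 0.4 test from the question's one-liner:

```python
def normfilter(vectors, max_inv_norm=0.4):
    """Yield (x*n, y*n) for each (x, y) whose inverse norm n is below max_inv_norm.

    Hypothetical helper: the name and the parameter are assumptions,
    not the answer's original code.
    """
    for x, y in vectors:
        # n is the inverse norm, computed once per vector.
        # A zero vector would raise ZeroDivisionError here.
        n = (x**2. + y**2.)**-0.5
        if n < max_inv_norm:  # same test as the question's one-liner
            yield (x*n, y*n)


a = [(1, 1), (1, 2), (2, 2), (3, 4)]
normalized = list(normfilter(a))
```

Because n is the inverse norm, the test n < 0.4 keeps the vectors whose norm is greater than 2.5.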
Btw, there is a bug in your code or description - you say you filter out short vectors but your code does the opposite :p
Starting with Python 3.8, and the introduction of assignment expressions (PEP 572) (the := operator), it's possible to use a local variable within a list comprehension in order to avoid evaluating the same expression multiple times.
In our case, we can name the evaluation of (x**2.+y**2.)**-.5 as a variable n while using the result of the expression to filter the list if n is less than 0.4, and thus reuse n to produce the mapped value:
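The snippet for this answer is missing from the page. The assignment-expression approach it describes can be sketched roughly as follows, assuming Python 3.8 or later and the input list a from the question (the name result is only illustrative):

```python
# Requires Python 3.8+ for the := operator (PEP 572).
a = [(1, 1), (1, 2), (2, 2), (3, 4)]

# n is bound inside the filter condition and reused in the output expression,
# so the inverse norm is computed only once per vector.
result = [(x*n, y*n) for x, y in a if (n := (x**2. + y**2.)**-.5) < .4]
```

One thing to keep in mind: a name bound with := inside a comprehension is assigned in the enclosing scope, so n remains defined after the comprehension finishes.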
Timing the candidate implementations with timeit suggests that a for loop might be the fastest way. Be sure to check the timeit results on your own machine, as they can vary depending on a number of factors (hardware, OS, Python version, length of a, etc.).
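The benchmark code and its numbers are not preserved here. A comparison along these lines could be set up as in the sketch below; the function names with_for_loop and with_nested_genexp and the repeat count are assumptions chosen for illustration, and no timings are quoted because they depend on the machine:

```python
import timeit

a = [(1, 1), (1, 2), (2, 2), (3, 4)]


def with_for_loop(vectors):
    """Plain for loop: compute n once, filter, then append."""
    result = []
    for x, y in vectors:
        n = (x**2. + y**2.)**-0.5
        if n < 0.4:
            result.append((x*n, y*n))
    return result


def with_nested_genexp(vectors):
    """The question's approach: an inner generator expression carries n along."""
    return [(x*n, y*n)
            for n, x, y in (((x**2. + y**2.)**-0.5, x, y) for x, y in vectors)
            if n < 0.4]


if __name__ == "__main__":
    # Time each variant on the same input; results vary by machine.
    for func in (with_for_loop, with_nested_genexp):
        elapsed = timeit.timeit(lambda: func(a), number=100_000)
        print(f"{func.__name__}: {elapsed:.3f} s for 100000 calls")
```

With a list as short as a, the measurement is dominated by call overhead, so it may be worth repeating the test on a much longer input as well.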
Stealing the code from unutbu, here is a larger test including a numpy version and the iterator version. Notice that converting the list to numpy can cost some time.
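The test code is not included on this page. As a rough idea of what a NumPy version might look like, here is a sketch that assumes NumPy is installed; the function name normalize_filter_numpy and its max_inv_norm parameter are hypothetical:

```python
import numpy as np


def normalize_filter_numpy(vectors, max_inv_norm=0.4):
    """Vectorized variant: returns an array of the normalized vectors that pass the filter."""
    # Converting the list of tuples to an array is the step that can cost time.
    arr = np.asarray(vectors, dtype=float)      # shape (N, 2)
    inv_norm = (arr ** 2).sum(axis=1) ** -0.5   # n for every row
    keep = inv_norm < max_inv_norm              # boolean mask, same test as the question
    # Scale each kept row by its own n (broadcast over the two columns).
    return arr[keep] * inv_norm[keep, None]


a = [(1, 1), (1, 2), (2, 2), (3, 4)]
out = normalize_filter_numpy(a)
```

Note that this returns a NumPy array rather than a list of tuples, so converting back adds further overhead if a list is required.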
and the result...