List comprehensions: references to the components

Posted on 2024-10-18 12:32:32

In sum: I need to write a list comprehension in which I refer to the list that is being created by that list comprehension.

This might not be something you need to do every day, but I don't think it's unusual either.

Maybe there's no answer here. Still, please don't tell me I ought to use a for loop. That might be correct, but it's not helpful. The reason is the problem domain: this line of code is part of an ETL module, so performance is relevant, and so is the need to avoid creating a temporary container, hence my wish to code this step in a list comprehension. If a for loop would work for me here, I would just code one.

In any event, I am unable to write this particular list comprehension. The reason: the expression I need to write has this form:

[ some_function(s) for s in raw_data if s not in this_list ]

In that pseudo-code, "this_list" refers to the list created by evaluating that list comprehension. And that's why I'm stuck: this_list isn't built until my list comprehension has been evaluated, so it doesn't exist yet at the point where I need to refer to it, and I don't know how to refer to it.
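The problem described above can be reproduced in a few lines. This is only a minimal sketch in Python 3: the body of some_function and the sample raw_data are hypothetical stand-ins, not the real ETL code. The name this_list is bound only after the whole comprehension has finished, so the filter fails on the very first item.

```python
raw_data = list("abcdaebfc")  # hypothetical sample input


def some_function(s):
    # Hypothetical stand-in for the real transformation.
    return s.upper()


raised = False
try:
    # this_list is assigned only after the comprehension completes,
    # so looking it up inside the filter fails on the first item.
    this_list = [some_function(s) for s in raw_data if s not in this_list]
except NameError as exc:
    raised = True
    print("NameError:", exc)
```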

What I have considered so far (which might be based on one or more false assumptions, though I don't know exactly where):

- Doesn't the Python interpreter have to give this list-under-construction a name? I think so.
- That temporary name is probably taken from some bound method used to build my list ('sum'?).
- But even if I went to the trouble of finding that bound method, and assuming it is indeed the temporary name the Python interpreter uses to refer to the list while it is under construction, I am pretty sure you can't refer to bound methods directly. I'm not aware of any such explicit rule, but those method names (at least the few that I've actually looked at) are not valid Python syntax. I'm guessing one reason is so that we don't write them into our code.

So that's the chain of my so-called reasoning, which has led me to conclude, or at least guess, that I have coded myself into a corner. Still, I thought I ought to verify this with the community before turning around and going in a different direction.

蛮可爱 2024-10-25 12:32:32

There used to be a way to do this using the undocumented fact that, while the list was being built, its value was stored in a local variable named _[1].__self__. However, that stopped working in Python 2.7 (maybe earlier, I wasn't paying close attention).

You can do what you want in a single list comprehension if you set up an external data structure first. Since all your pseudo-code seems to do with this_list is check whether each s is already in it (i.e. a membership test), I've changed it into a set named seen as an optimization (checking for membership in a list can be very slow if the list is large). Here's what I mean:

raw_data = list('abcdaebfc')

seen = set()  # values that have already been emitted

def some_function(s):
    seen.add(s)  # side effect: record s so the filter below skips repeats
    return s

print([some_function(s) for s in raw_data if s not in seen])
# ['a', 'b', 'c', 'd', 'e', 'f']

If you don't have access to some_function, you could call it from your own wrapper function that adds its return value to the seen set before returning it.
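The wrapper idea might look roughly like the sketch below (Python 3). The name tracked is made up for illustration, and some_function here is just an identity stand-in for a function you cannot modify.

```python
raw_data = list("abcdaebfc")  # hypothetical sample input
seen = set()                  # return values recorded so far


def some_function(s):
    # Stand-in for a function whose source you cannot change.
    return s


def tracked(s):
    # Hypothetical wrapper: call the real function, remember its
    # return value in `seen`, then pass the value through unchanged.
    value = some_function(s)
    seen.add(value)
    return value


unique = [tracked(s) for s in raw_data if s not in seen]
print(unique)
```

Note that the filter still tests the raw item s against the recorded return values, so this only removes duplicates as intended when the two can actually coincide, as they do with the identity stand-in used here.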

Even though it wouldn't be a list comprehension, I'd encapsulate the whole thing in a function to make reuse easier:

def some_function(s):
    # do something with or to 's'...
    return s

def add_unique(function, data):
    result = []
    seen = set()  # transformed values already added to result
    for s in data:
        # as in the question's pseudo-code, the raw item s is tested
        # against the values produced so far
        if s not in seen:
            t = function(s)
            result.append(t)
            seen.add(t)
    return result

print(add_unique(some_function, raw_data))
# ['a', 'b', 'c', 'd', 'e', 'f']

In either case, I find it odd that the list being built in your pseudo-code, the one you want to reference, doesn't consist of a subset of the raw_data values but of the results of calling some_function on each of them (i.e. transformed data). That naturally makes one wonder what some_function does such that its return value might match an existing raw_data item's value.

束缚ｍ 2024-10-25 12:32:32

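I don't see why you need to do this in one pass. Iterate over the initial data first to eliminate duplicates (or, even better, convert it to a set as KennyTM suggests), and then do your list comprehension.

Note that even if you could reference the "list under construction", your approach would still fail, because s is never in that list anyway: the result of some_function(s) is.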
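A two-step version of this suggestion could be sketched as follows. It assumes Python 3.7 or later, where dict preserves insertion order, and uses dict.fromkeys instead of a set so that the original order survives; the upper-casing some_function is a hypothetical stand-in.

```python
raw_data = list("abcdaebfc")  # hypothetical sample input


def some_function(s):
    # Hypothetical stand-in for the real transformation.
    return s.upper()


# Step 1: drop duplicate inputs, keeping first-seen order
# (dict keys are unique and, in Python 3.7+, insertion-ordered).
unique_inputs = list(dict.fromkeys(raw_data))

# Step 2: an ordinary list comprehension over the deduplicated data.
result = [some_function(s) for s in unique_inputs]
print(result)
```

This removes duplicates among the inputs, not among the outputs; if two different inputs can map to the same result, the outputs need their own check.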

浊酒尽余欢 2024-10-25 12:32:32

As far as I know, there is no way to access the list that a list comprehension is building while it's being built.

As KennyTM mentioned, if the order of the entries is not relevant, you can use a set instead. If you're on Python 2.7/3.1 or later, you even get set comprehensions:

{ some_function(s) for s in raw_data }

Otherwise, a for loop isn't that bad either (although it will scale terribly, because every "item not in l" test scans the whole list):

l = []
for s in raw_data:
    item = some_function(s)
    if item not in l:  # linear scan of l on every iteration
        l.append(item)
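If the scaling of the loop above is a concern, one possible variant keeps a set alongside the list so that each membership test takes roughly constant time while the output order is preserved. This is a sketch, not part of the original answer: unique_transformed is a made-up name, str.lower stands in for the real some_function, and the transformed values are assumed to be hashable.

```python
def unique_transformed(function, data):
    # Return function(s) for each s in data, keeping only the first
    # occurrence of each transformed value, in original order.
    seen = set()   # transformed values already emitted (must be hashable)
    result = []
    for s in data:
        item = function(s)
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


# Hypothetical usage: str.lower stands in for some_function.
lowered = unique_transformed(str.lower, list("aAbBcA"))
print(lowered)
```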

戈亓 2024-10-25 12:32:32

Why don't you simply do this: [ some_function(s) for s in set(raw_data) ]

That should do what you're asking for, unless you need to preserve the order of the original list.

About the author: 韬韬不绝
