Nested Python list comprehensions to construct a list of lists
I'm a Python newbie and am having trouble grokking nested list comprehensions. I'm trying to write some code that reads a file and builds, for each line, a list of that line's characters.
So if the file contains
xxxcd
cdcdjkhjasld
asdasdxasda
The resulting list would be:
[
    ['x','x','x','c','d'],
    ['c','d','c','d','j','k','h','j','a','s','l','d'],
    ['a','s','d','a','s','d','x','a','s','d','a']
]
I have written the following code, and it works, but I have a nagging feeling that I should be able to write a nested list comprehension to do this in fewer lines of code. Any suggestions would be appreciated.
data = []
f = open(file,'r')
for line in f:
    line = line.strip().upper()
    list = []
    for c in line:
        list.append(c)
    data.append(list)
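A nested list comprehension should help: an inner comprehension over the characters of each line, wrapped in an outer comprehension over the lines of the file. You'll probably have to play around with it to strip the newlines or format the result however you want, but the basic idea should work.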
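A minimal sketch of that idea, assuming the sample lines from the question. Here io.StringIO stands in for the real file so the snippet runs on its own, and the names sample and data are only illustrative:

```python
import io

# Stand-in for the real file (an assumption, so the example is self-contained).
sample = io.StringIO("xxxcd\ncdcdjkhjasld\nasdasdxasda\n")

# The outer comprehension walks the lines; the inner one walks the characters.
# rstrip("\n") drops the trailing newline that each line carries.
data = [[c for c in line.rstrip("\n")] for line in sample]

print(data[0])
```

Without the rstrip call, every inner list would end with a newline character.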
In your case, you can use the list constructor to handle the inner loop and a list comprehension for the outer loop. Given a string as input, the list constructor creates a list in which each character of the string is a separate element.
The list comprehension is functionally equivalent to an explicit for loop that appends list(...) of each processed line to the result.
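As a rough sketch of both forms side by side (io.StringIO again replaces the real file, and data_loop is a name made up for the comparison):

```python
import io

# Stand-in for open(file, 'r'); an assumption for a runnable example.
sample = io.StringIO("xxxcd\ncdcdjkhjasld\n")

# list() splits each cleaned-up line into single-character strings.
data = [list(line.strip().upper()) for line in sample]

# The same thing written as an explicit loop.
sample.seek(0)  # rewind so the lines can be read a second time
data_loop = []
for line in sample:
    data_loop.append(list(line.strip().upper()))

print(data == data_loop)
```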
You can start with one level of list comprehension: keep the outer for loop over the lines of the file and replace only the inner character loop with a comprehension.
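A sketch of that single-level version, assuming the same processing as the question's code (the in-memory file is a stand-in for open(file, 'r')):

```python
import io

f = io.StringIO("xxxcd\nasdasdxasda\n")  # stand-in for the real file

data = []
for line in f:
    line = line.strip().upper()
    # The inner for loop collapses into a single comprehension.
    data.append([c for c in line])

print(data)
```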
But we can do the whole thing in one go, using list() to convert each string to a list of single-character strings. We could also use nested list comprehensions and put the open() call inline. At this point, though, I think the list comprehensions detract from the readability of what is going on.
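Both variants might look roughly like this. The temporary file and the name path are assumptions made so the example can run anywhere; in real code path would be your own file name:

```python
import os
import tempfile

# Write the question's sample to a temporary file (hypothetical path).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("xxxcd\ncdcdjkhjasld\nasdasdxasda\n")
    path = tmp.name

# One comprehension, with list() doing the per-character split.
with open(path) as f:
    data = [list(line.strip().upper()) for line in f]

# Fully nested version with open() inline. Nothing closes the file
# explicitly here, which is one more reason to prefer the "with" form.
nested = [[c for c in line.strip().upper()] for line in open(path)]

os.remove(path)
print(data == nested)
```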
For complicated processing, such as lists inside lists, you might want to use a for loop for the outer layer and a list comprehension for the inner loop.
Also, as Chris Lutz said in a comment, in this case there really isn't a reason to explicitly split each line into a list of characters. You can iterate over, index, and slice a string much as you would a list, and you can use string methods on a string but not on a list. (Well, you could use ''.join() to rejoin the list into a string, but why not just leave it as a string?)
The only really significant difference between strings and lists of characters is that strings are immutable. You can iterate over and slice strings just as you would lists. And it's much more convenient to handle strings as strings, since they support string methods and lists don't.
So for most applications, I wouldn't bother converting the items in data to lists; I'd just store each stripped line as a string. When I needed to manipulate strings in data as mutable lists, I'd use list to convert them and join to put them back. Of course, if all you're going to do with these strings is modify them, then go ahead and store them as lists.
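A small sketch of that workflow, under the assumption that only one character needs changing. The in-memory file and the names first_three and chars are illustrative:

```python
import io

f = io.StringIO("xxxcd\ncdcdjkhjasld\nasdasdxasda\n")  # stand-in for the file

# Keep each line as a plain string.
data = [line.strip() for line in f]

# Strings still slice like lists and keep their string methods.
first_three = data[1][:3]
shouted = data[0].upper()

# When mutation is needed: convert, modify, then join back into a string.
chars = list(data[2])
chars[0] = "A"
data[2] = "".join(chars)

print(first_three, shouted, data[2])
```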
First off, you could combine the line.strip().upper() step with your outer for loop, so that the loop iterates directly over the stripped, upper-cased lines.
Then you could turn the iteration over the characters into a list comprehension, but it wouldn't be shorter or clearer. The neatest way to do what you do there is to pass the string to the list constructor, as in list(someString).
Thus you could build the whole result with a single comprehension that calls list() on each processed line.
I don't know whether that states your intentions well, though. Error handling is also an issue: the whole expression fails if anything goes wrong along the way.
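To make the last two points concrete, here is a hedged sketch. The helper parse and the deliberately bad input are inventions for illustration only; they are not part of the original answer:

```python
import io

# list() on a string yields its characters.
chars = list("xxxcd")

f = io.StringIO("xxxcd\ncdcdjkhjasld\n")  # stand-in for the real file
data = [list(l.strip().upper()) for l in f]

def parse(lines):
    """Return the list of character lists, or None if any line is bad."""
    try:
        return [list(l.strip().upper()) for l in lines]
    except AttributeError:
        # One bad item aborts the whole comprehension; no partial result.
        return None

broken = parse(["ok", None])  # None has no .strip(), so parsing fails
print(chars, data, broken)
```

If partial results matter, an explicit loop with the try inside the loop body gives finer control than a single comprehension.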
If you don't need to store the whole file and all its lines in memory, you could use a generator expression instead. This is very useful when processing huge files of which you only need one chunk at a time. Generator expressions use parentheses instead of square brackets.
data then becomes a generator that runs the expression for each line in the file, but only as you iterate over it; compare that with a list comprehension, which builds the entire list in memory. Note that data is not a list but a generator, more akin to an iterator in C++ or an IEnumerator in C#.
A generator can easily be fed into a list with list(someGenerator). That would defeat the purpose somewhat, but it is sometimes a necessity.
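A brief sketch of the generator version, with an in-memory file standing in for a large one (the names first and rest are illustrative):

```python
import io

f = io.StringIO("xxxcd\ncdcdjkhjasld\nasdasdxasda\n")  # stand-in for a big file

# Parentheses instead of brackets: no line has been read yet.
data = (list(l.strip().upper()) for l in f)

first = next(data)   # reads and processes only the first line
rest = list(data)    # drains the remaining lines into a real list

print(first, len(rest))
```

Once a generator has been consumed it is empty, so a second pass needs a fresh generator or a stored list.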