Using a regular expression to return a list of words from a list of lines
I'm running the following code on a list of strings to return a list of its words:
words = [re.split('\\s+', line) for line in lines]
However, I end up getting something like:
[['import', 're', ''], ['', ''], ['def', 'word_count(filename):', ''], ...]
As opposed to the desired:
['import', 're', '', '', '', 'def', 'word_count(filename):', '', ...]
How can I unpack the lists that re.split('\\s+', line) produces in the above list comprehension? Naïvely, I tried using * but that doesn't work.
(I'm looking for a simple and Pythonic way of doing this; I was tempted to write a function, but I'm sure the language already accommodates this.)
Comments (4)
This will give you an iterator that can be used for looping through all words:
Creating a list instead of an iterator is just a matter of wrapping the iterator in a list call:
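The snippets for the iterator-based answer did not survive on this page. A minimal sketch of one way to get such an iterator, assuming itertools.chain.from_iterable is an acceptable tool (the original answer may have used something else), with a hypothetical sample `lines` list:

```python
import re
from itertools import chain

# Hypothetical sample input standing in for the question's `lines`.
lines = ["import re\n", "\n", "def word_count(filename):\n"]

# Lazily chain the per-line lists into a single stream of words.
word_iter = chain.from_iterable(re.split(r"\s+", line) for line in lines)

# Wrapping the iterator in list() materialises it as a flat list.
words = list(word_iter)
print(words)
```

The iterator can only be consumed once, so call list() on it only if the whole flat list is actually needed.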
The reason you get a list of lists is that re.split() returns a list, which is then 'appended' to the list comprehension output.
It's unclear why you are using that (or probably just a bad example) but if you can get the full content (all lines) as a string you can just do
if lines is the product of:
use
instead.
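The code for this answer is missing here, so the following is only a sketch of the idea it describes: split the whole text in one call instead of line by line. The `content` string is a hypothetical stand-in for what reading the entire file at once (for example with the file object's read() instead of readlines()) would return.

```python
import re

# Hypothetical stand-in for the full file content read as one string.
content = "import re\n\ndef word_count(filename):\n"

# One split over the whole text already yields a flat list.
words = re.split(r"\s+", content)
print(words)

# str.split() with no arguments also splits on runs of whitespace
# and drops the empty strings at the ends.
clean_words = content.split()
print(clean_words)
```

Note that splitting the whole text gives fewer empty strings than splitting each line separately, because consecutive newlines collapse into a single separator.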
You can always do this:
It's not nearly as elegant as a one-liner list comprehension, but it gets the job done.
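The snippet this answer refers to is not preserved on the page. A plausible reading, offered as an assumption rather than the author's exact code, is an explicit loop that extends one list with each line's words (the sample `lines` is hypothetical):

```python
import re

# Hypothetical sample input standing in for the question's `lines`.
lines = ["import re\n", "\n", "def word_count(filename):\n"]

words = []
for line in lines:
    # extend() adds each word individually; append() would nest the list.
    words.extend(re.split(r"\s+", line))

print(words)
```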
Just stumbled across this old question, and I think I have a better solution. Normally if you want to nest a list comprehension ("append" each list), you think backwards (un-for-loop-like). This is not what you want:
However if you want to "extend" instead of "append" the lists you're generating, just leave out the extra set of square brackets and reverse your for-loops (putting them back in the "right" order).
This seems like a more Pythonic solution to me since it is based on list-processing logic rather than some random-ass built-in function. Every programmer should know how to do this (especially ones trying to learn Lisp!)
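Both comprehensions this answer contrasts were lost in extraction. The sketch below reconstructs the contrast as described, using a hypothetical sample `lines`; the variable names `nested` and `flat` are illustrative only.

```python
import re

# Hypothetical sample input standing in for the question's `lines`.
lines = ["import re\n", "\n", "def word_count(filename):\n"]

# Nested comprehension: one inner list per line ("append" behaviour).
nested = [[word for word in re.split(r"\s+", line)] for line in lines]

# Flat comprehension: the for clauses read in the same order as nested
# for-loops would, producing a single list ("extend" behaviour).
flat = [word for line in lines for word in re.split(r"\s+", line)]

print(nested)
print(flat)
```

The flat form reads left to right exactly like the equivalent nested loops: the outer loop over lines comes first, the inner loop over words second.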