Flatten an irregular (arbitrarily nested) list of lists

Posted on 2024-08-19 16:46:47

Yes, I know this subject has been covered before, but as far as I know, all solutions except one fail on a list like [[[1, 2, 3], [4, 5]], 6], where the desired output is [1, 2, 3, 4, 5, 6] (or perhaps even better, an iterator).

The only solution I saw that works for an arbitrary nesting is found in this question:

def flatten(x):
    result = []
    for el in x:
        if hasattr(el, "__iter__") and not isinstance(el, basestring):
            result.extend(flatten(el))
        else:
            result.append(el)
    return result

Is this the best approach? Did I overlook something? Any problems?

尸血腥色 2024-08-26 16:46:47

Using generator functions can make your example easier to read and improve performance.

Python 2

Using the Iterable ABC added in 2.6:

from collections import Iterable

def flatten(xs):
    for x in xs:
        if isinstance(x, Iterable) and not isinstance(x, basestring):
            for item in flatten(x):
                yield item
        else:
            yield x

Python 3

In Python 3, basestring no longer exists, but the tuple (str, bytes) gives the same effect. Also, yield from (added in Python 3.3) delegates to the sub-generator and yields its items one at a time.

from collections.abc import Iterable

def flatten(xs):
    for x in xs:
        if isinstance(x, Iterable) and not isinstance(x, (str, bytes)):
            yield from flatten(x)
        else:
            yield x
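As a quick check of the Python 3 version above, here is a small self-contained run on the list from the question. The second, mixed input is a made-up example, included only to show that str and bytes values are kept whole while tuples are flattened.

```python
from collections.abc import Iterable

def flatten(xs):
    for x in xs:
        # Recurse into any iterable except text-like values.
        if isinstance(x, Iterable) and not isinstance(x, (str, bytes)):
            yield from flatten(x)
        else:
            yield x

print(list(flatten([[[1, 2, 3], [4, 5]], 6])))     # [1, 2, 3, 4, 5, 6]
print(list(flatten(["ab", (1, [2.5]), b"cd"])))    # ['ab', 1, 2.5, b'cd']
```

Because flatten is a generator, nothing is computed until the result is consumed, here by list().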
玉环 2024-08-26 16:46:47

My solution:

from collections.abc import Iterable


def flatten(x):
    # Caution: a str is iterable and contains itself, so string input raises RecursionError.
    if isinstance(x, Iterable):
        return [a for i in x for a in flatten(i)]
    else:
        return [x]

A little more concise, but pretty much the same.

新雨望断虹 2024-08-26 16:46:47

Generator using recursion and duck typing (updated for Python 3):

def flatten(L):
    for item in L:
        try:
            yield from flatten(item)
        except TypeError:
            yield item

>>> list(flatten([[[1, 2, 3], [4, 5]], 6]))
[1, 2, 3, 4, 5, 6]

Here is my functional version of recursive flatten which handles both tuples and lists, and lets you throw in any mix of positional arguments. Returns a generator which produces the entire sequence in order, arg by arg:

flatten = lambda *n: (e for a in n
    for e in (flatten(*a) if isinstance(a, (tuple, list)) else (a,)))

Usage:

l1 = ['a', ['b', ('c', 'd')]]
l2 = [0, 1, (2, 3), [[4, 5, (6, 7, (8,), [9]), 10]], (11,)]
print(list(flatten(l1, -2, -1, l2)))
['a', 'b', 'c', 'd', -2, -1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
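For readers who find the lambda hard to follow, the same idea can be sketched as a named generator function. The name flatten_args and the sample arguments are hypothetical, chosen only for this illustration; the behavior is meant to mirror the lambda (only lists and tuples are descended into).

```python
def flatten_args(*args):
    """Yield the leaves of any mix of positional arguments, in order."""
    for arg in args:
        if isinstance(arg, (tuple, list)):
            # Recurse into lists and tuples only; everything else is a leaf.
            yield from flatten_args(*arg)
        else:
            yield arg

l1 = ['a', ['b', ('c', 'd')]]
print(list(flatten_args(l1, -2, -1, [0, (1, [2])])))
# ['a', 'b', 'c', 'd', -2, -1, 0, 1, 2]
```

Like the lambda, this version is recursive, so it is still bounded by Python's recursion limit for very deep inputs.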
小猫一只 2024-08-26 16:46:47

Generator version of @unutbu's non-recursive solution, as requested by @Andrew in a comment:

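import collections.abc

def genflat(l, ltypes=collections.abc.Sequence):
    # Note: str is a Sequence too; pass e.g. ltypes=list if l contains strings.
    l = list(l)
    i = 0
    while i < len(l):
        # Splice nested sequences in place; an empty one simply disappears.
        while i < len(l) and isinstance(l[i], ltypes):
            l[i:i + 1] = l[i]
        if i < len(l):
            yield l[i]
            i += 1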

Slightly simplified version of this generator:

def genflat(l, ltypes=collections.abc.Sequence):  # requires `import collections.abc`
    l = list(l)
    while l:
        while l and isinstance(l[0], ltypes):
            l[0:1] = l[0]
        if l: yield l.pop(0)
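One cost worth noting in the simplified version is that l.pop(0) on a Python list shifts every remaining element, so each pop is O(n). A minimal sketch of the same idea on a collections.deque, which pops from the left in constant time, is given below. The name genflat_deque and the default ltypes=(list, tuple) are assumptions for this illustration, not part of the original answer.

```python
from collections import deque

def genflat_deque(items, ltypes=(list, tuple)):
    pending = deque(items)
    while pending:
        head = pending.popleft()
        if isinstance(head, ltypes):
            # extendleft() reverses its input, so feed it the reversed
            # children to keep them in their original order.
            pending.extendleft(reversed(head))
        else:
            yield head

print(list(genflat_deque([[[1, 2, 3], [4, 5]], 6])))  # [1, 2, 3, 4, 5, 6]
print(list(genflat_deque([1, [], [[]], 2])))          # [1, 2]
```

Restricting ltypes to list and tuple also sidesteps the string problem, since a str is a Sequence whose items are again strings.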
情何以堪。 2024-08-26 16:46:47

This version of flatten avoids Python's recursion limit (and thus works with arbitrarily deep, nested iterables). It is a generator that can handle strings and arbitrary iterables (even infinite ones).

import itertools
import collections.abc

def flatten(iterable, ltypes=collections.abc.Iterable):
    remainder = iter(iterable)
    while True:
        try:
            first = next(remainder)
        except StopIteration:
            break
        if isinstance(first, ltypes) and not isinstance(first, (str, bytes)):
            remainder = itertools.chain(first, remainder)
        else:
            yield first

Here are some examples demonstrating its use:

print(list(itertools.islice(flatten(itertools.repeat(1)),10)))
# [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]

print(list(itertools.islice(flatten(itertools.chain(itertools.repeat(2,3),
                                       {10,20,30},
                                       'foo bar'.split(),
                                       itertools.repeat(1),)),10)))
# [2, 2, 2, 10, 20, 30, 'foo', 'bar', 1, 1]

print(list(flatten([[1,2,[3,4]]])))
# [1, 2, 3, 4]

seq = ([[chr(i),chr(i-32)] for i in range(ord('a'), ord('z')+1)] + list(range(0,9)))
print(list(flatten(seq)))
# ['a', 'A', 'b', 'B', 'c', 'C', 'd', 'D', 'e', 'E', 'f', 'F', 'g', 'G', 'h', 'H',
# 'i', 'I', 'j', 'J', 'k', 'K', 'l', 'L', 'm', 'M', 'n', 'N', 'o', 'O', 'p', 'P',
# 'q', 'Q', 'r', 'R', 's', 'S', 't', 'T', 'u', 'U', 'v', 'V', 'w', 'W', 'x', 'X',
# 'y', 'Y', 'z', 'Z', 0, 1, 2, 3, 4, 5, 6, 7, 8]

Although flatten can handle infinite generators, it cannot handle infinite nesting:

def infinitely_nested():
    while True:
        yield itertools.chain(infinitely_nested(), itertools.repeat(1))

print(list(itertools.islice(flatten(infinitely_nested()), 10)))
# hangs
情域 2024-08-26 16:46:47

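Pandas has a function that does this. It returns an iterator, as you mentioned. Note that it lives in pandas.core, which pandas treats as private, so it may change between versions.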

In [1]: import pandas
In [2]: pandas.core.common.flatten([[[1, 2, 3], [4, 5]], 6])
Out[2]: <generator object flatten at 0x7f12ade66200>
In [3]: list(pandas.core.common.flatten([[[1, 2, 3], [4, 5]], 6]))
Out[3]: [1, 2, 3, 4, 5, 6]
月隐月明月朦胧 2024-08-26 16:46:47
def flatten(xs):
    res = []
    def loop(ys):
        for i in ys:
            if isinstance(i, list):
                loop(i)
            else:
                res.append(i)
    loop(xs)
    return res
不念旧人 2024-08-26 16:46:47

Here's another answer that is even more interesting...

import re

def Flatten(TheList):
    a = str(TheList)
    b,_Anon = re.subn(r'[\[,\]]', ' ', a)
    c = b.split()
    d = [int(x) for x in c]

    return(d)

Basically, it converts the nested list to a string, uses a regex to strip out the nesting syntax, and then converts the result back to a (flattened) list. Because each token is passed to int(), it only works when every element is an integer.

深海少女心 2024-08-26 16:46:47

You could use deepflatten from the 3rd party package iteration_utilities:

>>> from iteration_utilities import deepflatten
>>> L = [[[1, 2, 3], [4, 5]], 6]
>>> list(deepflatten(L))
[1, 2, 3, 4, 5, 6]

>>> list(deepflatten(L, types=list))  # only flatten "inner" lists
[1, 2, 3, 4, 5, 6]

It's an iterator, so you need to iterate over it (for example by wrapping it with list or using it in a loop). Internally it uses an iterative approach instead of a recursive one, and it's written as a C extension, so it can be faster than pure Python approaches:

>>> %timeit list(deepflatten(L))
12.6 µs ± 298 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
>>> %timeit list(deepflatten(L, types=list))
8.7 µs ± 139 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

>>> %timeit list(flatten(L))   # Cristian - Python 3.x approach from https://stackoverflow.com/a/2158532/5393381
86.4 µs ± 4.42 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

>>> %timeit list(flatten(L))   # Josh Lee - https://stackoverflow.com/a/2158522/5393381
107 µs ± 2.99 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

>>> %timeit list(genflat(L, list))  # Alex Martelli - https://stackoverflow.com/a/2159079/5393381
23.1 µs ± 710 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

I'm the author of the iteration_utilities library.

梦与时光遇 2024-08-26 16:46:47

It was fun trying to create a function that could flatten an irregular list in Python, but of course that is what Python is for (to make programming fun). The following generator works fairly well, with some caveats:

def flatten(iterable):
    try:
        for item in iterable:
            yield from flatten(item)
    except TypeError:
        yield iterable

It will flatten datatypes that you might want left alone (like bytearray, bytes, and str objects). Also, the code relies on the fact that requesting an iterator from a non-iterable raises a TypeError.

>>> L = [[[1, 2, 3], [4, 5]], 6]
>>> def flatten(iterable):
    try:
        for item in iterable:
            yield from flatten(item)
    except TypeError:
        yield iterable


>>> list(flatten(L))
[1, 2, 3, 4, 5, 6]
>>>

Edit:

I disagree with the previous implementation. The problem is that you should not be able to flatten something that is not an iterable. It is confusing and gives the wrong impression of the argument.

>>> list(flatten(123))
[123]
>>>

The following generator is almost the same as the first but does not have the problem of trying to flatten a non-iterable object. It fails as one would expect when an inappropriate argument is given to it.

def flatten(iterable):
    for item in iterable:
        try:
            yield from flatten(item)
        except TypeError:
            yield item

Testing the generator with the provided list works fine. However, the new code raises a TypeError when a non-iterable object is given to it. An example of the new behavior is shown below.

>>> L = [[[1, 2, 3], [4, 5]], 6]
>>> list(flatten(L))
[1, 2, 3, 4, 5, 6]
>>> list(flatten(123))
Traceback (most recent call last):
  File "<pyshell#32>", line 1, in <module>
    list(flatten(123))
  File "<pyshell#27>", line 2, in flatten
    for item in iterable:
TypeError: 'int' object is not iterable
>>>
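Both generators above treat text as just another iterable. In Python 3 a str element never bottoms out (every one-character string contains itself), so it ends in RecursionError, while bytes are split into integers. A minimal sketch of a guarded variant follows. The name flatten_keep_text and the atomic parameter are hypothetical, introduced only for this illustration.

```python
def flatten_keep_text(iterable, atomic=(str, bytes, bytearray)):
    for item in iterable:
        if isinstance(item, atomic):
            # Text-like values are yielded whole instead of being iterated.
            yield item
            continue
        try:
            yield from flatten_keep_text(item, atomic)
        except TypeError:
            # item is not iterable, so it is a leaf.
            yield item

print(list(flatten_keep_text([["ab", b"cd"], [1, [2]]])))
# ['ab', b'cd', 1, 2]
```

As with the second generator, passing a non-iterable at the top level still raises TypeError.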
再浓的妆也掩不了殇 2024-08-26 16:46:47

Here's a simple function that flattens lists of arbitrary depth. No recursion, to avoid stack overflow.

from copy import deepcopy

def flatten_list(nested_list):
    """Flatten an arbitrarily nested list, without recursion (to avoid
    stack overflows). Yields the items one by one; the original list is unchanged.

    >>> list(flatten_list([1, 2, 3, [4], [], [[[[[[[[[5]]]]]]]]]]))
    [1, 2, 3, 4, 5]
    >>> list(flatten_list([[1, 2], 3]))
    [1, 2, 3]

    """
    nested_list = deepcopy(nested_list)

    while nested_list:
        sublist = nested_list.pop(0)

        if isinstance(sublist, list):
            nested_list = sublist + nested_list
        else:
            yield sublist
草莓味的萝莉 2024-08-26 16:46:47

When trying to answer such a question, you really need to state the limitations of the code you propose as a solution. If it were only about performance I wouldn't mind too much, but most of the code proposed as a solution (including the accepted answer) fails to flatten any list with a depth greater than 1000.

When I say most of the code, I mean all code that uses any form of recursion (or calls a standard library function that is recursive). All of it fails because every recursive call grows the (call) stack by one unit, and Python's default recursion limit is 1000.

If you're not too familiar with the call stack, then maybe the following will help (otherwise you can just scroll to the Implementation).

Call stack size and recursive programming (dungeon analogy)

Finding the treasure and exit

Imagine you enter a huge dungeon with numbered rooms, looking for a treasure. You don't know the place, but you have some indications on how to find the treasure. Each indication is a riddle (difficulty varies, and you can't predict how hard they will be). You decide to think a little about a strategy to save time, and you make two observations:

  1. It's hard (and takes long) to find the treasure, as you'll have to solve (potentially hard) riddles to get there.
  2. Once the treasure is found, returning to the entrance may be easy: you just have to follow the same path in the other direction (though this needs a bit of memory to recall your path).

When entering the dungeon, you notice a small notebook. You decide to use it to write down every room you exit after solving a riddle (when entering a new room), so that you'll be able to return to the entrance. That's a genius idea; you won't even spend a cent implementing your strategy.

You enter the dungeon and solve the first 1001 riddles with great success, but then something you hadn't planned for happens: there is no space left in the notebook you borrowed. You decide to abandon your quest, as you would rather go without the treasure than be lost forever inside the dungeon (which looks smart indeed).

Executing a recursive program

Basically, it's the exact same thing as finding the treasure. The dungeon is the computer's memory, and your goal now is not to find a treasure but to compute some function (find f(x) for a given x). The indications are simply sub-routines that will help you solve f(x). Your strategy is the same as the call stack strategy: the notebook is the stack, and the rooms are the functions' return addresses:

x = ["over here", "am", "I"]
y = sorted(x) # You're about to enter a room named `sorted`, note down the current room address here so you can return back: 0x4004f4 (that room address looks weird)
# Seems like you went back from your quest using the return address 0x4004f4
# Let's see what you've collected 
print(' '.join(y))

The problem you encountered in the dungeon is the same here: the call stack has a finite size (here 1000), so if you enter too many functions without returning, you fill the call stack and get an error that looks like "Dear adventurer, I'm very sorry but your notebook is full": RecursionError: maximum recursion depth exceeded. Note that you don't need recursion to fill the call stack, but it's very unlikely that a non-recursive program calls 1000 functions without ever returning.

It's also important to understand that once you return from a function, its return address is removed from the call stack (hence the name "stack": return addresses are pushed before entering a function and popped when returning). In the special case of simple recursion (a function f that calls itself once, over and over), you enter f over and over until the computation is finished (until the treasure is found), and then return from f again and again until you are back where f was first called. Nothing is removed from the call stack until the end, when all return addresses are popped one after the other.
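To see the full notebook in practice, here is a small sketch that builds a list nested slightly deeper than the current recursion limit and feeds it to a plain recursive flatten. The helper names flatten_recursive and nest are hypothetical, and the flatten is a simplified, list-only version of the code in the question.

```python
import sys

def flatten_recursive(x):
    result = []
    for el in x:
        if isinstance(el, list):
            result.extend(flatten_recursive(el))  # one more stack frame per level
        else:
            result.append(el)
    return result

def nest(depth):
    """Build [[[...[0]...]]] with `depth` levels, without recursion."""
    nested = [0]
    for _ in range(depth - 1):
        nested = [nested]
    return nested

limit = sys.getrecursionlimit()          # 1000 by default in CPython
print(flatten_recursive(nest(10)))       # [0]
try:
    flatten_recursive(nest(limit + 100))
except RecursionError as exc:
    print("failed:", exc)
```

Raising the limit with sys.setrecursionlimit only moves the wall further away; it does not remove it.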

How to avoid this issue?

That's actually pretty simple: "don't use recursion if you don't know how deep it can go". That's not always true, since in some cases tail-call recursion can be optimized (TCO). But Python does not do this, and even a "well written" recursive function will not have its stack use optimized. There is an interesting post from Guido about this question: Tail Recursion Elimination.

There is a technique you can use to make any recursive function iterative; we could call it "bring your own notebook". In our particular case we are simply exploring a list, and entering a room is equivalent to entering a sublist. The question to ask yourself is: how can I get back from a list to its parent list? The answer is not that complex. Repeat the following until the stack is empty:

  1. push the current list address and index in a stack when entering a new sublist (note that a list address+index is also an address, therefore we just use the exact same technique used by the call stack);
  2. every time an item is found, yield it (or add them in a list);
  3. once a list is fully explored, go back to the parent list using the stack return address (and index).
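The three steps above can be sketched literally, with a stack of (list, index) pairs. This is only an illustration under assumptions: the name flatten_with_index_stack is hypothetical, only list instances count as sublists, and the index stored on the stack is the position of the next item to visit, so the exact bookkeeping may differ from what the author had in mind.

```python
def flatten_with_index_stack(nested):
    flat_list = []
    stack = [(nested, 0)]                    # (list, index of the next item to visit)
    while stack:
        current, index = stack.pop()
        if index == len(current):
            continue                         # list fully explored: back to the parent
        stack.append((current, index + 1))   # note down where to resume
        item = current[index]
        if isinstance(item, list):
            stack.append((item, 0))          # enter the sublist
        else:
            flat_list.append(item)           # a simple item was found
    return flat_list

print(flatten_with_index_stack([0, [1, 2], 3, 4]))  # [0, 1, 2, 3, 4]
```

The stack never holds more entries than the nesting depth of the input, and no Python-level recursion is involved.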

Also note that this is equivalent to a DFS in a tree where some nodes are sublists A = [1, 2] and some are simple items: 0, 1, 2, 3, 4 (for L = [0, [1,2], 3, 4]). The tree looks like this:

                    L
                    |
           -------------------
           |     |     |     |
           0   --A--   3     4
               |   |
               1   2

The DFS traversal pre-order is: L, 0, A, 1, 2, 3, 4. Remember, in order to implement an iterative DFS you also "need" a stack. The implementation I proposed before results in the following states (for the stack and the flat_list):

init.:  stack=[(L, 0)]
**0**:  stack=[(L, 0)],         flat_list=[0]
**A**:  stack=[(L, 1), (A, 0)], flat_list=[0]
**1**:  stack=[(L, 1), (A, 0)], flat_list=[0, 1]
**2**:  stack=[(L, 1), (A, 1)], flat_list=[0, 1, 2]
**3**:  stack=[(L, 2)],         flat_list=[0, 1, 2, 3]
**4**:  stack=[(L, 3)],         flat_list=[0, 1, 2, 3, 4]
return: stack=[],               flat_list=[0, 1, 2, 3, 4]

In this example, the maximum stack size is 2, because the input list (and therefore the tree) has depth 2.

Implementation

For the implementation, in Python you can simplify a little by using iterators instead of simple lists. References to the (sub)iterators are used to store the sublists' return addresses (instead of having both the list address and the index). This is not a big difference, but I feel it is more readable (and also a bit faster):

def flatten(iterable):
    return list(items_from(iterable))

def items_from(iterable):
    cursor_stack = [iter(iterable)]
    while cursor_stack:
        sub_iterable = cursor_stack[-1]
        try:
            item = next(sub_iterable)
        except StopIteration:   # post-order
            cursor_stack.pop()
            continue
        if is_list_like(item):  # pre-order
            cursor_stack.append(iter(item))
        elif item is not None:
            yield item          # in-order

def is_list_like(item):
    return isinstance(item, list)

Also, notice that is_list_like uses isinstance(item, list), which could be changed to handle more input types; here I just wanted the simplest version, where (iterable) is just a list. But you could also do this:

def is_list_like(item):
    try:
        iter(item)
        return not isinstance(item, str)  # strings are not lists (hmm...) 
    except TypeError:
        return False

This considers strings as "simple items", and therefore flatten([["test", "a"], "b"]) will return ["test", "a", "b"] and not ["t", "e", "s", "t", "a", "b"]. Note that in that case iter(item) is called twice on each item; let's pretend it's an exercise for the reader to make this cleaner.
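One possible take on that exercise is to fold the check into the traversal, so that iter() is called at most once per item. The name items_from_once is hypothetical, and unlike the original items_from this sketch does not drop None items.

```python
def items_from_once(iterable):
    cursor_stack = [iter(iterable)]
    while cursor_stack:
        try:
            item = next(cursor_stack[-1])
        except StopIteration:
            cursor_stack.pop()               # sub-iterable exhausted: back to the parent
            continue
        if isinstance(item, str):
            yield item                       # strings stay whole
            continue
        try:
            cursor_stack.append(iter(item))  # the only iter() call for this item
        except TypeError:
            yield item                       # not iterable: a simple item

print(list(items_from_once([["test", "a"], "b", [1, (2, [3])]])))
# ['test', 'a', 'b', 1, 2, 3]
```

The iterator obtained while testing the item is the same one pushed on the stack, so no work is repeated.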

Testing and remarks on other implementations

In the end, remember that you can't print a very deeply nested list L using print(L), because internally it makes recursive calls to __repr__ (RecursionError: maximum recursion depth exceeded while getting the repr of an object). For the same reason, solutions to flatten involving str will fail with the same error message.

If you need to test your solution, you can use this function to generate a simple nested list:

def build_deep_list(depth):
    """Returns a list of the form $l_{depth} = [depth-1, l_{depth-1}]$
    for $depth > 1$, with $l_1 = [0]$.
    """
    sub_list = [0]
    for d in range(1, depth):
        sub_list = [d, sub_list]
    return sub_list

Which gives: build_deep_list(5) >>> [4, [3, [2, [1, [0]]]]].
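As a sketch of such a test, the snippet below runs a list-only copy of the iterative traversal on a list nested 10,000 levels deep, well past the default recursion limit. The depth value is an arbitrary choice for illustration, and the two functions are restated (without docstring and without the None filter) only so the snippet runs on its own.

```python
def build_deep_list(depth):
    sub_list = [0]
    for d in range(1, depth):
        sub_list = [d, sub_list]
    return sub_list

def items_from(iterable):
    cursor_stack = [iter(iterable)]
    while cursor_stack:
        try:
            item = next(cursor_stack[-1])
        except StopIteration:
            cursor_stack.pop()
            continue
        if isinstance(item, list):
            cursor_stack.append(iter(item))
        else:
            yield item

depth = 10_000                                   # far beyond the default limit of 1000
flat = list(items_from(build_deep_list(depth)))
assert flat == list(range(depth - 1, -1, -1))    # [9999, 9998, ..., 0]
print(len(flat))                                 # 10000
```

A recursive flatten given the same input would be expected to stop with RecursionError instead.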

清浅ˋ旧时光 2024-08-26 16:46:47

Although an elegant and very pythonic answer has been selected I would present my solution just for the review:

def flat(l):
    ret = []
    for i in l:
        if isinstance(i, list) or isinstance(i, tuple):
            ret.extend(flat(i))
        else:
            ret.append(i)
    return ret

Please tell me how good or bad this code is.

酒废 2024-08-26 16:46:47

I prefer simple answers. No generators. No recursion or recursion limits. Just iteration:

def flatten(TheList):
    listIsNested = True

    while listIsNested:                 #outer loop
        keepChecking = False
        Temp = []

        for element in TheList:         #inner loop
            if isinstance(element,list):
                Temp.extend(element)
                keepChecking = True
            else:
                Temp.append(element)

        listIsNested = keepChecking     #determine if outer loop exits
        TheList = Temp[:]

    return TheList

This works with two loops: an inner for loop and an outer while loop.

The inner for loop iterates through the list. If it finds a list element, it (1) uses list.extend() to flatten that element by one level of nesting and (2) sets keepChecking to True. keepChecking controls the outer while loop: if it was set to True during a pass, the outer loop runs the inner loop for another pass.

Those passes keep happening until no more nested lists are found. When a pass finally occurs where none are found, keepChecking is never set to True, which means listIsNested becomes False and the outer while loop exits.

The flattened list is then returned.

Test-run

flatten([1,2,3,4,[100,200,300,[1000,2000,3000]]])

[1, 2, 3, 4, 100, 200, 300, 1000, 2000, 3000]
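To make the pass-by-pass behaviour visible, here is a small sketch that yields the intermediate list after every pass. `flatten_passes` is a hypothetical helper, not part of the answer, and unlike the function above it checks for nesting before each pass instead of running one final confirming pass.

```python
def flatten_passes(nested):
    """Yield the list after each one-level flattening pass."""
    current = list(nested)
    while any(isinstance(e, list) for e in current):
        flattened_once = []
        for e in current:
            if isinstance(e, list):
                flattened_once.extend(e)   # unwrap one level only
            else:
                flattened_once.append(e)
        current = flattened_once
        yield current

passes = list(flatten_passes([1, 2, 3, 4, [100, 200, 300, [1000, 2000, 3000]]]))
for n, state in enumerate(passes, start=1):
    print(n, state)
```

The number of passes grows with the nesting depth, and every pass copies the whole list, so this style trades recursion for repeated work on deep inputs.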

无需解释 2024-08-26 16:46:47

I didn't go through all the answers already available here, but here is a one-liner I came up with, borrowing from Lisp's first-and-rest way of processing lists:

# An empty list flattens to [] (the "if l else []" guard); anything that is not a list becomes [l].
def flatten(l): return (flatten(l[0]) + flatten(l[1:]) if l else []) if type(l) is list else [l]

Here is one simple and one not-so-simple case:

>>> flatten([1,[2,3],4])
[1, 2, 3, 4]

>>> flatten([1, [2, 3], 4, [5, [6, {'name': 'some_name', 'age':30}, 7]], [8, 9, [10, [11, [12, [13, {'some', 'set'}, 14, [15, 'some_string'], 16], 17, 18], 19], 20], 21, 22, [23, 24], 25], 26, 27, 28, 29, 30])
[1, 2, 3, 4, 5, 6, {'age': 30, 'name': 'some_name'}, 7, 8, 9, 10, 11, 12, 13, set(['set', 'some']), 14, 15, 'some_string', 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30]
>>> 
荒岛晴空 2024-08-26 16:46:47

I often use more_itertools.collapse:

# pip install more-itertools
from more_itertools import collapse

out = list(collapse([[[1, 2, 3], [4, 5]], 6]))
# [1, 2, 3, 4, 5, 6]

It handles strings/bytes out-of-the-box and doesn't unpack them into single characters:

list(collapse([[[1, 2, 3, 'abc'], [4, 5]], 6, 'def']))
# [1, 2, 3, 'abc', 4, 5, 6, 'def']

It can also stop at a given nesting level:

list(collapse([[[1, 2, 3], [4, 5]], 6], levels=1))
# [[1, 2, 3], [4, 5], 6]

The full code is relatively straightforward, and it also uses a recursive approach:

def collapse(iterable, base_type=None, levels=None):
    def walk(node, level):
        if (
            ((levels is not None) and (level > levels))
            or isinstance(node, (str, bytes))
            or ((base_type is not None) and isinstance(node, base_type))
        ):
            yield node
            return

        try:
            tree = iter(node)
        except TypeError:
            yield node
            return
        else:
            for child in tree:
                yield from walk(child, level + 1)

    yield from walk(iterable, 0)
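The `base_type` parameter in the listing above is not exercised by the earlier examples. As a hedged sketch, the snippet below repeats the same function body so it runs without installing anything, and shows tuples being kept intact; the sample data `points` is made up for illustration.

```python
def collapse(iterable, base_type=None, levels=None):
    # Same logic as the listing above, repeated so the snippet is self-contained
    def walk(node, level):
        if (
            (levels is not None and level > levels)
            or isinstance(node, (str, bytes))
            or (base_type is not None and isinstance(node, base_type))
        ):
            yield node
            return
        try:
            tree = iter(node)
        except TypeError:
            yield node
            return
        for child in tree:
            yield from walk(child, level + 1)

    yield from walk(iterable, 0)

points = [[(1, 2), (3, 4)], [(5, 6)]]           # hypothetical sample data
fully_flat = list(collapse(points))              # tuples are unpacked too
keep_tuples = list(collapse(points, base_type=tuple))  # tuples treated as leaves
print(fully_flat, keep_tuples)
```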
原来分手还会想你 2024-08-26 16:46:47

I'm not sure if this is necessarily quicker or more effective, but this is what I do:

def flatten(lst):
    return eval('[' + str(lst).replace('[', '').replace(']', '') + ']')

L = [[[1, 2, 3], [4, 5]], 6]
print(flatten(L))

The flatten function here turns the list into a string, strips out all of the square brackets, wraps the result in a single pair of square brackets, and evaluates it back into a list.

However, if strings in your list may themselves contain square brackets, as in [[1, 2], "[3, 4] and [5]"], you would have to do something else.
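A minimal sketch of that limitation, assuming the same string trick but with `ast.literal_eval` swapped in for `eval` (it only accepts literals, whereas `eval` will execute any expression it is given). `flatten_via_str` is a hypothetical name used only here.

```python
import ast

def flatten_via_str(lst):
    # Remove every bracket from the repr, then parse the result as one flat list
    stripped = str(lst).replace('[', '').replace(']', '')
    return ast.literal_eval('[' + stripped + ']')

ok = flatten_via_str([[[1, 2, 3], [4, 5]], 6])

# Brackets inside a string element are stripped as well,
# so the string comes back silently altered
corrupted = flatten_via_str([[1, 2], "[3, 4] and [5]"])
print(ok, corrupted)
```

No error is raised in the second case, which makes this failure mode easy to miss.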

压抑⊿情绪 2024-08-26 16:46:47

Just use the funcy library:
pip install funcy

import funcy


funcy.flatten([[[[1, 1], 1], 2], 3]) # returns generator
funcy.lflatten([[[[1, 1], 1], 2], 3]) # returns list
本宫微胖 2024-08-26 16:46:47

No recursion or nested loops. A few lines. Well formatted and easy to read:

def flatten_deep(arr: list):
    """ Flattens arbitrarily-nested list `arr` into single-dimensional. """

    while arr:
        if isinstance(arr[0], list):  # Checks whether first element is a list
            arr = arr[0] + arr[1:]  # If so, flattens that first element one level
        else:
            yield arr.pop(0)  # Otherwise yield as part of the flat array

list(flatten_deep(L))  # flatten_deep is a generator; list() collects its items

From my own code at https://github.com/jorgeorpinel/flatten_nested_lists/blob/master/flatten.py
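The loop above rebuilds the list with `arr[0] + arr[1:]` on every unwrapping step, and `arr.pop(0)` can consume the caller's list. As a hedged alternative, not the author's code, here is a sketch that also avoids recursion but keeps an explicit stack of iterators and leaves the input untouched. `flatten_iter` is a hypothetical name.

```python
def flatten_iter(nested):
    """Flatten nested lists lazily using an explicit stack of iterators."""
    stack = [iter(nested)]
    while stack:
        for item in stack[-1]:
            if isinstance(item, list):
                stack.append(iter(item))  # descend into the sublist
                break
            yield item
        else:
            stack.pop()  # current iterator is exhausted, go back up one level

L = [[[1, 2, 3], [4, 5]], 6]
flat = list(flatten_iter(L))

# Build [4999, [4998, [... [0]]]], deeper than the default recursion limit
deep = [0]
for d in range(1, 5000):
    deep = [d, deep]
deep_flat = list(flatten_iter(deep))
print(flat, len(deep_flat))
```

Because the pending work lives in an ordinary Python list, the nesting depth is limited by memory rather than by the interpreter's recursion limit.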

递刀给你 2024-08-26 16:46:47

Here's the compiler.ast.flatten implementation in 2.7.5:

def flatten(seq):
    l = []
    for elt in seq:
        t = type(elt)
        if t is tuple or t is list:
            for elt2 in flatten(elt):
                l.append(elt2)
        else:
            l.append(elt)
    return l

There are better, faster methods (If you've reached here, you have seen them already)

Also note:

Deprecated since version 2.6: The compiler package has been removed in Python 3.

ま昔日黯然 2024-08-26 16:46:47

I'm surprised no one has thought of this. Damn recursion: I don't follow the recursive answers that the advanced people here wrote. Anyway, here is my attempt. The caveat is that it's very specific to the OP's use case:

import re

L = [[[1, 2, 3], [4, 5]], 6]
flattened_list = re.sub(r"[\[\]]", "", str(L)).replace(" ", "").split(",")  # raw string avoids invalid-escape warnings
new_list = list(map(int, flattened_list))
print(new_list)

output:

[1, 2, 3, 4, 5, 6]
·深蓝 2024-08-26 16:46:47

I am aware that there are already many awesome answers, but I wanted to add one that takes a functional-programming approach to the problem. In this answer I use double recursion:

def flatten_list(seq):
    if not seq:
        return []
    elif isinstance(seq[0],list):
        return (flatten_list(seq[0])+flatten_list(seq[1:]))
    else:
        return [seq[0]]+flatten_list(seq[1:])

print(flatten_list([1,2,[3,[4],5],[6,7]]))

output:

[1, 2, 3, 4, 5, 6, 7]
秉烛思 2024-08-26 16:46:47

I don't see anything like this posted here, and I just got here from a closed question on the same subject, but why not just do something like this (if you know the element type of the list you want to flatten):

>>> a = [1, 2, 3, 5, 10, [1, 25, 11, [1, 0]]]    
>>> g = str(a).replace('[', '').replace(']', '')    
>>> b = [int(x) for x in g.split(',') if x.strip()]

You would need to know the type of the elements, but I think this can be generalised, and in terms of speed I think it would be faster.
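The speed claim is not measured in the answer. A hedged sketch for checking it on your own machine follows; `flatten_str` wraps the snippet above, `flatten_rec` is a hypothetical recursive baseline, and the repetition count is arbitrary. No timings are quoted here because they depend on the interpreter and the input.

```python
import timeit

def flatten_str(a):
    # String-based approach from the answer (integers only)
    g = str(a).replace('[', '').replace(']', '')
    return [int(x) for x in g.split(',') if x.strip()]

def flatten_rec(a):
    # Hypothetical recursive baseline for comparison
    out = []
    for x in a:
        if isinstance(x, list):
            out.extend(flatten_rec(x))
        else:
            out.append(x)
    return out

a = [1, 2, 3, 5, 10, [1, 25, 11, [1, 0]]]
same = flatten_str(a) == flatten_rec(a)

t_str = timeit.timeit(lambda: flatten_str(a), number=10_000)
t_rec = timeit.timeit(lambda: flatten_rec(a), number=10_000)
print(f"str-based: {t_str:.4f}s  recursive: {t_rec:.4f}s  same result: {same}")
```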

七禾 2024-08-26 16:46:47

Totally hacky, but I think it would work (depending on your data types):

import ast
import re

flat_list = ast.literal_eval("[%s]" % re.sub(r"[\[\]]", "", str(the_list)))
枕梦 2024-08-26 16:46:47

Here is another py2 approach. I'm not sure whether it's the fastest, the most elegant, or the safest ...

from collections import Iterable
from itertools import imap, repeat, chain


def flat(seqs, ignore=(int, long, float, basestring)):
    return repeat(seqs, 1) if any(imap(isinstance, repeat(seqs), ignore)) or not isinstance(seqs, Iterable) else chain.from_iterable(imap(flat, seqs))

It can ignore any specific (or derived) type you like. It returns an iterator, so you can convert it to any specific container such as list, tuple or dict, or simply consume it lazily to reduce the memory footprint. For better or worse, it can also handle an initial non-iterable object such as an int ...

Note that most of the heavy lifting is done in C, since as far as I know that's how itertools is implemented. So while it is recursive, AFAIK it isn't bounded by Python's recursion depth, since the function calls happen in C. That doesn't mean you aren't bounded by memory, though, especially on OS X, where the stack size has a hard limit as of today (OS X Mavericks) ...

There is a slightly faster but less portable approach. Only use it if you can assume that the base elements of the input can be explicitly determined; otherwise you'll get infinite recursion, and OS X, with its limited stack size, will throw a segmentation fault fairly quickly ...

def flat(seqs, ignore={int, long, float, str, unicode}):
    return repeat(seqs, 1) if type(seqs) in ignore or not isinstance(seqs, Iterable) else chain.from_iterable(imap(flat, seqs))

Here we use a set to check the type, so it takes O(1) rather than O(number of types) to decide whether an element should be ignored. Of course, any value whose type is derived from one of the ignored types will not be matched, which is why it lists str and unicode explicitly, so use it with caution ...

tests:

import random

def test_flat(test_size=2000):
    def increase_depth(value, depth=1):
        for func in xrange(depth):
            value = repeat(value, 1)
        return value

    def random_sub_chaining(nested_values):
        for values in nested_values:
            yield chain((values,), chain.from_iterable(imap(next, repeat(nested_values, random.randint(1, 10)))))

    expected_values = zip(xrange(test_size), imap(str, xrange(test_size)))
    nested_values = random_sub_chaining((increase_depth(value, depth) for depth, value in enumerate(expected_values)))
    assert not any(imap(cmp, chain.from_iterable(expected_values), flat(chain(((),), nested_values, ((),)))))

>>> test_flat()
>>> list(flat([[[1, 2, 3], [4, 5]], 6]))
[1, 2, 3, 4, 5, 6]
>>>  

$ uname -a
Darwin Samys-MacBook-Pro.local 13.3.0 Darwin Kernel Version 13.3.0: Tue Jun  3 21:27:35 PDT 2014; root:xnu-2422.110.17~1/RELEASE_X86_64 x86_64
$ python --version
Python 2.7.5
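The code above is Python 2 only (`imap`, `long`, `basestring`, `xrange`, `cmp`). A minimal Python 3 sketch of the first variant, under the assumption that `(int, float, str, bytes)` is a reasonable replacement for the original ignore tuple, might look like this:

```python
from collections.abc import Iterable
from itertools import chain, repeat

def flat(seqs, ignore=(int, float, str, bytes)):
    # Leaves (ignored types or anything non-iterable) become one-item iterators
    if isinstance(seqs, ignore) or not isinstance(seqs, Iterable):
        return repeat(seqs, 1)
    # map() is already lazy in Python 3, so it replaces itertools.imap
    return chain.from_iterable(map(flat, seqs))

numbers = list(flat([[[1, 2, 3], [4, 5]], 6]))
mixed = list(flat(["ab", [b"cd", 1.5]]))
single = list(flat(7))
print(numbers, mixed, single)
```

The remarks above about deep nesting and stack limits were made for the Python 2 version and are not re-verified by this sketch.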
痴梦一场 2024-08-26 16:46:47

Without using any library:

def flat(l):
    def _flat(l, r):    
        if type(l) is not list:
            r.append(l)
        else:
            for i in l:
                _flat(i, r)  # recurse into the shared accumulator instead of building new lists
        return r
    return _flat(l, [])



# example
test = [[1], [[2]], [3], [['a','b','c'] , [['z','x','y']], ['d','f','g']], 4]    
print(flat(test))  # prints [1, 2, 3, 'a', 'b', 'c', 'z', 'x', 'y', 'd', 'f', 'g', 4]
情痴 2024-08-26 16:46:47

Using itertools.chain:

import itertools
from collections.abc import Iterable  # the plain `collections` alias was removed in Python 3.10

def list_flatten(lst):
    flat_lst = []
    for item in itertools.chain(lst):
        if isinstance(item, Iterable) and not isinstance(item, (str, bytes)):  # strings would recurse forever
            item = list_flatten(item)
            flat_lst.extend(item)
        else:
            flat_lst.append(item)
    return flat_lst

Or without chaining:

def flatten(q, final):
    if not q:
        return
    if isinstance(q, list):
        if not isinstance(q[0], list):
            final.append(q[0])
        else:
            flatten(q[0], final)
        flatten(q[1:], final)
    else:
        final.append(q)
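The second function returns nothing and writes into the list passed as `final`, so the calling convention is worth spelling out. A short usage sketch, repeating the function so the snippet runs on its own:

```python
def flatten(q, final):
    # Same accumulator-style function as above
    if not q:
        return
    if isinstance(q, list):
        if not isinstance(q[0], list):
            final.append(q[0])
        else:
            flatten(q[0], final)
        flatten(q[1:], final)
    else:
        final.append(q)

final = []                                  # the caller owns the result list
returned = flatten([[[1, 2, 3], [4, 5]], 6], final)
print(final, returned)
```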
独自←快乐 2024-08-26 16:46:47

I used recursion to handle nested lists of any depth:

def combine_nlist(nlist,init=0,combiner=lambda x,y: x+y):
    '''
    Apply the function `combiner` to a nested list element by element (treating it as a flattened list).
    '''
    current_value=init
    for each_item in nlist:
        if isinstance(each_item,list):
            current_value =combine_nlist(each_item,current_value,combiner)
        else:
            current_value = combiner(current_value,each_item)
    return current_value

So once combine_nlist is defined, it is easy to use it for flattening. Or you can merge the two into one function. I like my solution because it can be applied to any nested list.

def flatten_nlist(nlist):
    return combine_nlist(nlist,[],lambda x,y:x+[y])

Result:

In [379]: flatten_nlist([1,2,3,[4,5],[6],[[[7],8],9],10])
Out[379]: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
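Since `combine_nlist` is a fold over the leaves, flattening is just one choice of combiner. A brief sketch of a few others, with the function repeated so it runs standalone; the combiners passed in are illustrative choices, not part of the answer.

```python
def combine_nlist(nlist, init=0, combiner=lambda x, y: x + y):
    # Same fold as above: visit every leaf in order, threading an accumulator
    current_value = init
    for each_item in nlist:
        if isinstance(each_item, list):
            current_value = combine_nlist(each_item, current_value, combiner)
        else:
            current_value = combiner(current_value, each_item)
    return current_value

nested = [1, 2, 3, [4, 5], [6], [[[7], 8], 9], 10]
total = combine_nlist(nested)                             # default combiner adds the leaves
count = combine_nlist(nested, 0, lambda acc, _: acc + 1)  # count the leaves
largest = combine_nlist(nested, float("-inf"), max)       # largest leaf
print(total, count, largest)
```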
拥醉 2024-08-26 16:46:47

The easiest way is to use the morph library, installed with pip install morph.

The code is:

import morph

nested_list = [[[1, 2, 3], [4, 5]], 6]  # avoid naming a variable `list`, which shadows the built-in
flattened_list = morph.flatten(nested_list)  # returns [1, 2, 3, 4, 5, 6]