展平嵌套字典,压缩键

发布于 2024-11-07 20:34:57 字数 241 浏览 0 评论 0原文

假设您有一本类似的字典:

{'a': 1,
 'c': {'a': 2,
       'b': {'x': 5,
             'y' : 10}},
 'd': [1, 2, 3]}

您如何将其扁平化为:

{'a': 1,
 'c_a': 2,
 'c_b_x': 5,
 'c_b_y': 10,
 'd': [1, 2, 3]}

Suppose you have a dictionary like:

{'a': 1,
 'c': {'a': 2,
       'b': {'x': 5,
             'y' : 10}},
 'd': [1, 2, 3]}

How would you go about flattening that into something like:

{'a': 1,
 'c_a': 2,
 'c_b_x': 5,
 'c_b_y': 10,
 'd': [1, 2, 3]}

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(30

请爱~陌生人 2024-11-14 20:34:57

基本上与展平嵌套列表的方式相同,您只需要做额外的工作即可按键/值迭代字典,为新字典创建新键并在最后一步创建字典。

from collections.abc import MutableMapping

def flatten(dictionary, parent_key='', separator='_'):
    items = []
    for key, value in dictionary.items():
        new_key = parent_key + separator + key if parent_key else key
        if isinstance(value, MutableMapping):
            items.extend(flatten(value, new_key, separator=separator).items())
        else:
            items.append((new_key, value))
    return dict(items)

>>> flatten({'a': 1, 'c': {'a': 2, 'b': {'x': 5, 'y' : 10}}, 'd': [1, 2, 3]})
{'a': 1, 'c_a': 2, 'c_b_x': 5, 'd': [1, 2, 3], 'c_b_y': 10}

Basically the same way you would flatten a nested list, you just have to do the extra work for iterating the dict by key/value, creating new keys for your new dictionary and creating the dictionary at final step.

from collections.abc import MutableMapping

def flatten(dictionary, parent_key='', separator='_'):
    items = []
    for key, value in dictionary.items():
        new_key = parent_key + separator + key if parent_key else key
        if isinstance(value, MutableMapping):
            items.extend(flatten(value, new_key, separator=separator).items())
        else:
            items.append((new_key, value))
    return dict(items)

>>> flatten({'a': 1, 'c': {'a': 2, 'b': {'x': 5, 'y' : 10}}, 'd': [1, 2, 3]})
{'a': 1, 'c_a': 2, 'c_b_x': 5, 'd': [1, 2, 3], 'c_b_y': 10}
落叶缤纷 2024-11-14 20:34:57

如果您已经在使用 pandas,则可以使用 json_normalize 来完成() 像这样:

import pandas as pd

d = {'a': 1,
     'c': {'a': 2, 'b': {'x': 5, 'y' : 10}},
     'd': [1, 2, 3]}

df = pd.json_normalize(d, sep='_')
d_flat = df.to_dict(orient='records')[0]

print(d_flat)

输出:

{'a': 1, 'c_a': 2, 'c_b_x': 5, 'c_b_y': 10, 'd': [1, 2, 3]}

If you are already using pandas, you can do it with json_normalize() like so:

import pandas as pd

d = {'a': 1,
     'c': {'a': 2, 'b': {'x': 5, 'y' : 10}},
     'd': [1, 2, 3]}

df = pd.json_normalize(d, sep='_')
d_flat = df.to_dict(orient='records')[0]

print(d_flat)

Output:

{'a': 1, 'c_a': 2, 'c_b_x': 5, 'c_b_y': 10, 'd': [1, 2, 3]}
云醉月微眠 2024-11-14 20:34:57

原始发布者需要考虑两个重要因素:

  1. 是否存在密钥空间破坏问题?例如, {'a_b':{'c':1}, 'a':{'b_c':2}} 将导致 {'a_b_c':???}。下面的解决方案通过返回可迭代的对来避免这个问题。
  2. 如果性能是一个问题,那么 key-reducer 函数(我在此将其称为“join”)是否需要访问整个 key-path,或者它是否可以在树中的每个节点上执行 O(1) 工作?如果您希望能够说 joinedKey = '_'.join(*keys),这将花费您 O(N^2) 的运行时间。但是,如果您愿意说 nextKey = previousKey+'_'+thisKey,那么您将获得 O(N) 时间。下面的解决方案允许您同时执行这两项操作(因为您只需连接所有键,然后对它们进行后处理)。

(性能不太可能是一个问题,但我将详细说明第二点,以防其他人关心:在实现这一点时,有许多危险的选择。如果您递归地执行此操作并产生并重新产生,或者任何东西 相当于多次接触节点(这很容易意外发生),您正在做潜在的 O(N^2) 工作,而不是 O(N) 这是因为您可能正在计算一个键。 >a 然后a_1 然后 a_1_i...,然后计算 a 然后 a_1 然后 a_1_ii ...,但实际上您不必再次计算 a_1 即使您没有重新计算它,重新生成它(“逐级”方法)也一样。一个很好的例子就是思考一下。 {1:{1:{1:{1:...(N times)...{1:SOME_LARGE_DICTIONARY_OF_SIZE_N}...}}}}

上的性能编写了 flattenDict(d, join=..., lift=...) ,它可以适应多种用途,并且可以做您想做的事情。遗憾的是,要在不产生上述性能损失的情况下制作这个函数的惰性版本是相当困难的(许多像 chain.from_iterable 这样的 python 内置函数实际上并不高效,我只是在对这个代码的三个不同版本进行了广泛测试之后才意识到这一点,然后才决定使用这个)。

from collections import Mapping
from itertools import chain
from operator import add

_FLAG_FIRST = object()

def flattenDict(d, join=add, lift=lambda x:(x,)):
    results = []
    def visit(subdict, results, partialKey):
        for k,v in subdict.items():
            newKey = lift(k) if partialKey==_FLAG_FIRST else join(partialKey,lift(k))
            if isinstance(v,Mapping):
                visit(v, results, newKey)
            else:
                results.append((newKey,v))
    visit(d, results, _FLAG_FIRST)
    return results

为了更好地理解发生了什么,下面为那些不熟悉 reduce(左)(也称为“向左折叠”)的人提供了一个图表。有时它会用初始值代替 k0(不是列表的一部分,传递到函数中)来绘制。这里,J 是我们的 join 函数。我们用 lift(k) 预处理每个 kn

               [k0,k1,...,kN].foldleft(J)
                           /    \
                         ...    kN
                         /
       J(k0,J(k1,J(k2,k3)))
                       /  \
                      /    \
           J(J(k0,k1),k2)   k3
                    /   \
                   /     \
             J(k0,k1)    k2
                 /  \
                /    \
               k0     k1

这实际上与 functools.reduce 相同,但我们的函数对树的所有关键路径执行此操作。

>>> reduce(lambda a,b:(a,b), range(5))
((((0, 1), 2), 3), 4)

演示(否则我会放入文档字符串中):

>>> testData = {
        'a':1,
        'b':2,
        'c':{
            'aa':11,
            'bb':22,
            'cc':{
                'aaa':111
            }
        }
    }
from pprint import pprint as pp

>>> pp(dict( flattenDict(testData) ))
{('a',): 1,
 ('b',): 2,
 ('c', 'aa'): 11,
 ('c', 'bb'): 22,
 ('c', 'cc', 'aaa'): 111}

>>> pp(dict( flattenDict(testData, join=lambda a,b:a+'_'+b, lift=lambda x:x) ))
{'a': 1, 'b': 2, 'c_aa': 11, 'c_bb': 22, 'c_cc_aaa': 111}    

>>> pp(dict( (v,k) for k,v in flattenDict(testData, lift=hash, join=lambda a,b:hash((a,b))) ))
{1: 12416037344,
 2: 12544037731,
 11: 5470935132935744593,
 22: 4885734186131977315,
 111: 3461911260025554326}

性能:

from functools import reduce
def makeEvilDict(n):
    return reduce(lambda acc,x:{x:acc}, [{i:0 for i in range(n)}]+range(n))

import timeit
def time(runnable):
    t0 = timeit.default_timer()
    _ = runnable()
    t1 = timeit.default_timer()
    print('took {:.2f} seconds'.format(t1-t0))

>>> pp(makeEvilDict(8))
{7: {6: {5: {4: {3: {2: {1: {0: {0: 0,
                                 1: 0,
                                 2: 0,
                                 3: 0,
                                 4: 0,
                                 5: 0,
                                 6: 0,
                                 7: 0}}}}}}}}}

import sys
sys.setrecursionlimit(1000000)

forget = lambda a,b:''

>>> time(lambda: dict(flattenDict(makeEvilDict(10000), join=forget)) )
took 0.10 seconds
>>> time(lambda: dict(flattenDict(makeEvilDict(100000), join=forget)) )
[1]    12569 segmentation fault  python

...叹息,不要认为这是我的错...


[由于审核问题而产生的不重要的历史注释]

关于所谓的重复 压平字典的字典(2层深)列表

该问题的解决方案可以通过执行 sorted( sum(flatten(...),[]) ) 来实现。相反的情况是不可能的:虽然确实可以通过映射高阶累加器从所谓的重复项中恢复 flatten(...),但无法恢复密钥。 (编辑:此外,事实证明,所谓的重复所有者的问题完全不同,因为它只处理恰好 2 层深度的字典,尽管该页面上的答案之一给出了通用解决方案。)

There are two big considerations that the original poster needs to consider:

  1. Are there keyspace clobbering issues? For example, {'a_b':{'c':1}, 'a':{'b_c':2}} would result in {'a_b_c':???}. The below solution evades the problem by returning an iterable of pairs.
  2. If performance is an issue, does the key-reducer function (which I hereby refer to as 'join') require access to the entire key-path, or can it just do O(1) work at every node in the tree? If you want to be able to say joinedKey = '_'.join(*keys), that will cost you O(N^2) running time. However if you're willing to say nextKey = previousKey+'_'+thisKey, that gets you O(N) time. The solution below lets you do both (since you could merely concatenate all the keys, then postprocess them).

(Performance is not likely an issue, but I'll elaborate on the second point in case anyone else cares: In implementing this, there are numerous dangerous choices. If you do this recursively and yield and re-yield, or anything equivalent which touches nodes more than once (which is quite easy to accidentally do), you are doing potentially O(N^2) work rather than O(N). This is because maybe you are calculating a key a then a_1 then a_1_i..., and then calculating a then a_1 then a_1_ii..., but really you shouldn't have to calculate a_1 again. Even if you aren't recalculating it, re-yielding it (a 'level-by-level' approach) is just as bad. A good example is to think about the performance on {1:{1:{1:{1:...(N times)...{1:SOME_LARGE_DICTIONARY_OF_SIZE_N}...}}}})

Below is a function I wrote flattenDict(d, join=..., lift=...) which can be adapted to many purposes and can do what you want. Sadly it is fairly hard to make a lazy version of this function without incurring the above performance penalties (many python builtins like chain.from_iterable aren't actually efficient, which I only realized after extensive testing of three different versions of this code before settling on this one).

from collections import Mapping
from itertools import chain
from operator import add

_FLAG_FIRST = object()

def flattenDict(d, join=add, lift=lambda x:(x,)):
    results = []
    def visit(subdict, results, partialKey):
        for k,v in subdict.items():
            newKey = lift(k) if partialKey==_FLAG_FIRST else join(partialKey,lift(k))
            if isinstance(v,Mapping):
                visit(v, results, newKey)
            else:
                results.append((newKey,v))
    visit(d, results, _FLAG_FIRST)
    return results

To better understand what's going on, below is a diagram for those unfamiliar with reduce(left), otherwise known as "fold left". Sometimes it is drawn with an initial value in place of k0 (not part of the list, passed into the function). Here, J is our join function. We preprocess each kn with lift(k).

               [k0,k1,...,kN].foldleft(J)
                           /    \
                         ...    kN
                         /
       J(k0,J(k1,J(k2,k3)))
                       /  \
                      /    \
           J(J(k0,k1),k2)   k3
                    /   \
                   /     \
             J(k0,k1)    k2
                 /  \
                /    \
               k0     k1

This is in fact the same as functools.reduce, but where our function does this to all key-paths of the tree.

>>> reduce(lambda a,b:(a,b), range(5))
((((0, 1), 2), 3), 4)

Demonstration (which I'd otherwise put in docstring):

>>> testData = {
        'a':1,
        'b':2,
        'c':{
            'aa':11,
            'bb':22,
            'cc':{
                'aaa':111
            }
        }
    }
from pprint import pprint as pp

>>> pp(dict( flattenDict(testData) ))
{('a',): 1,
 ('b',): 2,
 ('c', 'aa'): 11,
 ('c', 'bb'): 22,
 ('c', 'cc', 'aaa'): 111}

>>> pp(dict( flattenDict(testData, join=lambda a,b:a+'_'+b, lift=lambda x:x) ))
{'a': 1, 'b': 2, 'c_aa': 11, 'c_bb': 22, 'c_cc_aaa': 111}    

>>> pp(dict( (v,k) for k,v in flattenDict(testData, lift=hash, join=lambda a,b:hash((a,b))) ))
{1: 12416037344,
 2: 12544037731,
 11: 5470935132935744593,
 22: 4885734186131977315,
 111: 3461911260025554326}

Performance:

from functools import reduce
def makeEvilDict(n):
    return reduce(lambda acc,x:{x:acc}, [{i:0 for i in range(n)}]+range(n))

import timeit
def time(runnable):
    t0 = timeit.default_timer()
    _ = runnable()
    t1 = timeit.default_timer()
    print('took {:.2f} seconds'.format(t1-t0))

>>> pp(makeEvilDict(8))
{7: {6: {5: {4: {3: {2: {1: {0: {0: 0,
                                 1: 0,
                                 2: 0,
                                 3: 0,
                                 4: 0,
                                 5: 0,
                                 6: 0,
                                 7: 0}}}}}}}}}

import sys
sys.setrecursionlimit(1000000)

forget = lambda a,b:''

>>> time(lambda: dict(flattenDict(makeEvilDict(10000), join=forget)) )
took 0.10 seconds
>>> time(lambda: dict(flattenDict(makeEvilDict(100000), join=forget)) )
[1]    12569 segmentation fault  python

... sigh, don't think that one is my fault...


[unimportant historical note due to moderation issues]

Regarding the alleged duplicate of Flatten a dictionary of dictionaries (2 levels deep) of lists

That question's solution can be implemented in terms of this one by doing sorted( sum(flatten(...),[]) ). The reverse is not possible: while it is true that the values of flatten(...) can be recovered from the alleged duplicate by mapping a higher-order accumulator, one cannot recover the keys. (edit: Also it turns out that the alleged duplicate owner's question is completely different, in that it only deals with dictionaries exactly 2-level deep, though one of the answers on that page gives a general solution.)

叶落知秋 2024-11-14 20:34:57

如果您使用的是 pandaspandas.io.json._normalize1 中隐藏着一个名为 nested_to_record 的函数这正是这样做的。

from pandas.io.json._normalize import nested_to_record    

flat = nested_to_record(my_dict, sep='_')

1 在 pandas 版本 0.24.x 及更早版本中使用 pandas.io.json.normalize (不带 _

If you're using pandas there is a function hidden in pandas.io.json._normalize1 called nested_to_record which does this exactly.

from pandas.io.json._normalize import nested_to_record    

flat = nested_to_record(my_dict, sep='_')

1 In pandas versions 0.24.x and older use pandas.io.json.normalize (without the _)

凉宸 2024-11-14 20:34:57

不完全是OP所要求的,但很多人来这里寻找方法来展平现实世界的嵌套JSON数据,这些数据可以嵌套键值json对象和数组以及数组内的json对象等等。 JSON 不包含元组,因此我们不必担心这些。

我找到了列表包含 @roneo 对 < href="https://stackoverflow.com/a/6027615/4355695">发布者的答案@Imran:

https://github.com/ ScriptSmith/socialreaper/blob/master/socialreaper/tools.py#L8

import collections
def flatten(dictionary, parent_key=False, separator='.'):
    """
    Turn a nested dictionary into a flattened dictionary
    :param dictionary: The dictionary to flatten
    :param parent_key: The string to prepend to dictionary's keys
    :param separator: The string used to separate flattened keys
    :return: A flattened dictionary
    """

    items = []
    for key, value in dictionary.items():
        new_key = str(parent_key) + separator + key if parent_key else key
        if isinstance(value, collections.abc.MutableMapping):
            items.extend(flatten(value, new_key, separator).items())
        elif isinstance(value, list):
            for k, v in enumerate(value):
                items.extend(flatten({str(k): v}, new_key).items())
        else:
            items.append((new_key, value))
    return dict(items)

测试一下:

flatten({'a': 1, 'c': {'a': 2, 'b': {'x': 5, 'y' : 10}}, 'd': [1, 2, 3] })

>> {'a': 1, 'c.a': 2, 'c.b.x': 5, 'c.b.y': 10, 'd.0': 1, 'd.1': 2, 'd.2': 3}

这完成了我需要完成的工作:我将任何复杂的 json 扔到这里,它会为我将其展平。

所有功劳均归功于 https://github.com/ScriptSmith

2023-06-14 python 更新 >= 3.10

自 Python 3.10 起,collections.MutableMapping 已更改为 collections.abc.MutableMapping。因此,上面的代码经过编辑以反映相同的情况。如果您的Python版本是3.10之前,请将其改回您身边的collections.MutableMapping
参考:https://stackoverflow.com/a/71902541/4355695

Not exactly what the OP asked, but lots of folks are coming here looking for ways to flatten real-world nested JSON data which can have nested key-value json objects and arrays and json objects inside the arrays and so on. JSON doesn't include tuples, so we don't have to fret over those.

I found an implementation of the list-inclusion comment by @roneo to the answer posted by @Imran :

https://github.com/ScriptSmith/socialreaper/blob/master/socialreaper/tools.py#L8

import collections
def flatten(dictionary, parent_key=False, separator='.'):
    """
    Turn a nested dictionary into a flattened dictionary
    :param dictionary: The dictionary to flatten
    :param parent_key: The string to prepend to dictionary's keys
    :param separator: The string used to separate flattened keys
    :return: A flattened dictionary
    """

    items = []
    for key, value in dictionary.items():
        new_key = str(parent_key) + separator + key if parent_key else key
        if isinstance(value, collections.abc.MutableMapping):
            items.extend(flatten(value, new_key, separator).items())
        elif isinstance(value, list):
            for k, v in enumerate(value):
                items.extend(flatten({str(k): v}, new_key).items())
        else:
            items.append((new_key, value))
    return dict(items)

Test it:

flatten({'a': 1, 'c': {'a': 2, 'b': {'x': 5, 'y' : 10}}, 'd': [1, 2, 3] })

>> {'a': 1, 'c.a': 2, 'c.b.x': 5, 'c.b.y': 10, 'd.0': 1, 'd.1': 2, 'd.2': 3}

Annd that does the job I need done: I throw any complicated json at this and it flattens it out for me.

All credits to https://github.com/ScriptSmith .

2023-06-14 Update for python >= 3.10

Since Python 3.10, collections.MutableMapping has changed to collections.abc.MutableMapping. Hence code above is edited to reflect the same. If your python version is before 3.10, please change it back to collections.MutableMapping at your side.
Ref: https://stackoverflow.com/a/71902541/4355695

野の 2024-11-14 20:34:57

这是一种“功能性”、“单行”实现。它是递归的,基于条件表达式和字典理解。

def flatten_dict(dd, separator='_', prefix=''):
    return { prefix + separator + k if prefix else k : v
             for kk, vv in dd.items()
             for k, v in flatten_dict(vv, separator, kk).items()
             } if isinstance(dd, dict) else { prefix : dd }

测试:

In [2]: flatten_dict({'abc':123, 'hgf':{'gh':432, 'yu':433}, 'gfd':902, 'xzxzxz':{"432":{'0b0b0b':231}, "43234":1321}}, '.')
Out[2]: 
{'abc': 123,
 'gfd': 902,
 'hgf.gh': 432,
 'hgf.yu': 433,
 'xzxzxz.432.0b0b0b': 231,
 'xzxzxz.43234': 1321}

Here is a kind of a "functional", "one-liner" implementation. It is recursive, and based on a conditional expression and a dict comprehension.

def flatten_dict(dd, separator='_', prefix=''):
    return { prefix + separator + k if prefix else k : v
             for kk, vv in dd.items()
             for k, v in flatten_dict(vv, separator, kk).items()
             } if isinstance(dd, dict) else { prefix : dd }

Test:

In [2]: flatten_dict({'abc':123, 'hgf':{'gh':432, 'yu':433}, 'gfd':902, 'xzxzxz':{"432":{'0b0b0b':231}, "43234":1321}}, '.')
Out[2]: 
{'abc': 123,
 'gfd': 902,
 'hgf.gh': 432,
 'hgf.yu': 433,
 'xzxzxz.432.0b0b0b': 231,
 'xzxzxz.43234': 1321}
凉城已无爱 2024-11-14 20:34:57

代码:

test = {'a': 1, 'c': {'a': 2, 'b': {'x': 5, 'y' : 10}}, 'd': [1, 2, 3]}

def parse_dict(init, lkey=''):
    ret = {}
    for rkey,val in init.items():
        key = lkey+rkey
        if isinstance(val, dict):
            ret.update(parse_dict(val, key+'_'))
        else:
            ret[key] = val
    return ret

print(parse_dict(test,''))

结果:

$ python test.py
{'a': 1, 'c_a': 2, 'c_b_x': 5, 'd': [1, 2, 3], 'c_b_y': 10}

我使用的是python3.2,更新为你的python版本。

Code:

test = {'a': 1, 'c': {'a': 2, 'b': {'x': 5, 'y' : 10}}, 'd': [1, 2, 3]}

def parse_dict(init, lkey=''):
    ret = {}
    for rkey,val in init.items():
        key = lkey+rkey
        if isinstance(val, dict):
            ret.update(parse_dict(val, key+'_'))
        else:
            ret[key] = val
    return ret

print(parse_dict(test,''))

Results:

$ python test.py
{'a': 1, 'c_a': 2, 'c_b_x': 5, 'd': [1, 2, 3], 'c_b_y': 10}

I am using python3.2, update for your version of python.

乄_柒ぐ汐 2024-11-14 20:34:57

这不仅限于字典,还包括实现 .items() 的每个映射类型。进一步更快,因为它避免了 if 条件。尽管如此,功劳归于伊姆兰:

def flatten(d, parent_key=''):
    items = []
    for k, v in d.items():
        try:
            items.extend(flatten(v, '%s%s_' % (parent_key, k)).items())
        except AttributeError:
            items.append(('%s%s' % (parent_key, k), v))
    return dict(items)

This is not restricted to dictionaries, but every mapping type that implements .items(). Further ist faster as it avoides an if condition. Nevertheless credits go to Imran:

def flatten(d, parent_key=''):
    items = []
    for k, v in d.items():
        try:
            items.extend(flatten(v, '%s%s_' % (parent_key, k)).items())
        except AttributeError:
            items.append(('%s%s' % (parent_key, k), v))
    return dict(items)
黑凤梨 2024-11-14 20:34:57

Python3.5 中的功能性和高性能解决方案怎么样?

from functools import reduce


def _reducer(items, key, val, pref):
    if isinstance(val, dict):
        return {**items, **flatten(val, pref + key)}
    else:
        return {**items, pref + key: val}

def flatten(d, pref=''):
    return(reduce(
        lambda new_d, kv: _reducer(new_d, *kv, pref), 
        d.items(), 
        {}
    ))

这甚至更加高效:

def flatten(d, pref=''):
    return(reduce(
        lambda new_d, kv: \
            isinstance(kv[1], dict) and \
            {**new_d, **flatten(kv[1], pref + kv[0])} or \
            {**new_d, pref + kv[0]: kv[1]}, 
        d.items(), 
        {}
    ))

使用中:

my_obj = {'a': 1, 'c': {'a': 2, 'b': {'x': 5, 'y': 10}}, 'd': [1, 2, 3]}

print(flatten(my_obj)) 
# {'d': [1, 2, 3], 'cby': 10, 'cbx': 5, 'ca': 2, 'a': 1}

How about a functional and performant solution in Python3.5?

from functools import reduce


def _reducer(items, key, val, pref):
    if isinstance(val, dict):
        return {**items, **flatten(val, pref + key)}
    else:
        return {**items, pref + key: val}

def flatten(d, pref=''):
    return(reduce(
        lambda new_d, kv: _reducer(new_d, *kv, pref), 
        d.items(), 
        {}
    ))

This is even more performant:

def flatten(d, pref=''):
    return(reduce(
        lambda new_d, kv: \
            isinstance(kv[1], dict) and \
            {**new_d, **flatten(kv[1], pref + kv[0])} or \
            {**new_d, pref + kv[0]: kv[1]}, 
        d.items(), 
        {}
    ))

In use:

my_obj = {'a': 1, 'c': {'a': 2, 'b': {'x': 5, 'y': 10}}, 'd': [1, 2, 3]}

print(flatten(my_obj)) 
# {'d': [1, 2, 3], 'cby': 10, 'cbx': 5, 'ca': 2, 'a': 1}
忘你却要生生世世 2024-11-14 20:34:57

如果您是pythonic oneliners的粉丝:

my_dict={'a': 1,'c': {'a': 2,'b': {'x': 5,'y' : 10}},'d': [1, 2, 3]}

list(pd.json_normalize(my_dict).T.to_dict().values())[0]

返回:

{'a': 1, 'c.a': 2, 'c.b.x': 5, 'c.b.y': 10, 'd': [1, 2, 3]}

如果您有一个字典列表而不仅仅是一个字典,则可以从末尾保留[0]

If you are a fan of pythonic oneliners:

my_dict={'a': 1,'c': {'a': 2,'b': {'x': 5,'y' : 10}},'d': [1, 2, 3]}

list(pd.json_normalize(my_dict).T.to_dict().values())[0]

returns:

{'a': 1, 'c.a': 2, 'c.b.x': 5, 'c.b.y': 10, 'd': [1, 2, 3]}

You can leave the [0] from the end, if you have a list of dictionaries and not just a single dictionary.

山有枢 2024-11-14 20:34:57

我使用生成器的 Python 3.3 解决方案:

def flattenit(pyobj, keystring=''):
   if type(pyobj) is dict:
     if (type(pyobj) is dict):
         keystring = keystring + "_" if keystring else keystring
         for k in pyobj:
             yield from flattenit(pyobj[k], keystring + k)
     elif (type(pyobj) is list):
         for lelm in pyobj:
             yield from flatten(lelm, keystring)
   else:
      yield keystring, pyobj

my_obj = {'a': 1, 'c': {'a': 2, 'b': {'x': 5, 'y': 10}}, 'd': [1, 2, 3]}

#your flattened dictionary object
flattened={k:v for k,v in flattenit(my_obj)}
print(flattened)

# result: {'c_b_y': 10, 'd': [1, 2, 3], 'c_a': 2, 'a': 1, 'c_b_x': 5}

My Python 3.3 Solution using generators:

def flattenit(pyobj, keystring=''):
   if type(pyobj) is dict:
     if (type(pyobj) is dict):
         keystring = keystring + "_" if keystring else keystring
         for k in pyobj:
             yield from flattenit(pyobj[k], keystring + k)
     elif (type(pyobj) is list):
         for lelm in pyobj:
             yield from flatten(lelm, keystring)
   else:
      yield keystring, pyobj

my_obj = {'a': 1, 'c': {'a': 2, 'b': {'x': 5, 'y': 10}}, 'd': [1, 2, 3]}

#your flattened dictionary object
flattened={k:v for k,v in flattenit(my_obj)}
print(flattened)

# result: {'c_b_y': 10, 'd': [1, 2, 3], 'c_a': 2, 'a': 1, 'c_b_x': 5}
塔塔猫 2024-11-14 20:34:57

利用递归,保持简单和人类可读:

def flatten_dict(dictionary, accumulator=None, parent_key=None, separator="."):
    if accumulator is None:
        accumulator = {}

    for k, v in dictionary.items():
        k = f"{parent_key}{separator}{k}" if parent_key else k
        if isinstance(v, dict):
            flatten_dict(dictionary=v, accumulator=accumulator, parent_key=k)
            continue

        accumulator[k] = v

    return accumulator

调用很简单:

new_dict = flatten_dict(dictionary)

或者

new_dict = flatten_dict(dictionary, separator="_")

如果我们想更改默认分隔符。

一点小细节:

首次调用该函数时,仅传递我们想要展平的字典。 Accumulator 参数在这里支持递归,我们稍后会看到。因此,我们将累加器实例化为一个空字典,将原始字典中的所有嵌套值放入其中。

if accumulator is None:
    accumulator = {}

当我们迭代字典的值时,我们为每个值构造一个键。对于第一次调用,parent_key 参数将为 None,而对于每个嵌套字典,它将包含指向它的键,因此我们在前面添加该键。

k = f"{parent_key}{separator}{k}" if parent_key else k

如果值 v 的键 k 指向的是一个字典,则该函数调用自身,传递嵌套字典、累加器(其中通过引用传递,因此对它所做的所有更改都是在同一个实例上完成的)和键 k ,以便我们可以构造串联键。请注意 continue 语句。我们希望跳过 if 块之外的下一行,以便嵌套字典不会最终出现在键 kaccumulator 中>。

if isinstance(v, dict):
    flatten_dict(dict=v, accumulator=accumulator, parent_key=k)
    continue

那么,如果值 v 不是字典,我们该怎么办?只需将其原封不动地放入累加器中即可。

accumulator[k] = v

完成后,我们只需返回累加器,而原始字典参数保持不变。

注意

这仅适用于以字符串作为键的字典。它将与实现 __repr__ 方法的可哈希对象一起使用,但会产生不需要的结果。

Utilizing recursion, keeping it simple and human readable:

def flatten_dict(dictionary, accumulator=None, parent_key=None, separator="."):
    if accumulator is None:
        accumulator = {}

    for k, v in dictionary.items():
        k = f"{parent_key}{separator}{k}" if parent_key else k
        if isinstance(v, dict):
            flatten_dict(dictionary=v, accumulator=accumulator, parent_key=k)
            continue

        accumulator[k] = v

    return accumulator

Call is simple:

new_dict = flatten_dict(dictionary)

or

new_dict = flatten_dict(dictionary, separator="_")

if we want to change the default separator.

A little breakdown:

When the function is first called, it is called only passing the dictionary we want to flatten. The accumulator parameter is here to support recursion, which we see later. So, we instantiate accumulator to an empty dictionary where we will put all of the nested values from the original dictionary.

if accumulator is None:
    accumulator = {}

As we iterate over the dictionary's values, we construct a key for every value. The parent_key argument will be None for the first call, while for every nested dictionary, it will contain the key pointing to it, so we prepend that key.

k = f"{parent_key}{separator}{k}" if parent_key else k

In case the value v the key k is pointing to is a dictionary, the function calls itself, passing the nested dictionary, the accumulator (which is passed by reference, so all changes done to it are done on the same instance) and the key k so that we can construct the concatenated key. Notice the continue statement. We want to skip the next line, outside of the if block, so that the nested dictionary doesn't end up in the accumulator under key k.

if isinstance(v, dict):
    flatten_dict(dict=v, accumulator=accumulator, parent_key=k)
    continue

So, what do we do in case the value v is not a dictionary? Just put it unchanged inside the accumulator.

accumulator[k] = v

Once we're done we just return the accumulator, leaving the original dictionary argument untouched.

NOTE

This will work only with dictionaries that have strings as keys. It will work with hashable objects implementing the __repr__ method, but will yield unwanted results.

桃气十足 2024-11-14 20:34:57

这是使用堆栈的解决方案。没有递归。

def flatten_nested_dict(nested):
    stack = list(nested.items())
    ans = {}
    while stack:
        key, val = stack.pop()
        if isinstance(val, dict):
            for sub_key, sub_val in val.items():
                stack.append((f"{key}_{sub_key}", sub_val))
        else:
            ans[key] = val
    return ans

here's a solution using a stack. No recursion.

def flatten_nested_dict(nested):
    stack = list(nested.items())
    ans = {}
    while stack:
        key, val = stack.pop()
        if isinstance(val, dict):
            for sub_key, sub_val in val.items():
                stack.append((f"{key}_{sub_key}", sub_val))
        else:
            ans[key] = val
    return ans
玻璃人 2024-11-14 20:34:57

用于展平嵌套字典的简单函数。对于 Python 3,将 .iteritems() 替换为 .items()

def flatten_dict(init_dict):
    res_dict = {}
    if type(init_dict) is not dict:
        return res_dict

    for k, v in init_dict.iteritems():
        if type(v) == dict:
            res_dict.update(flatten_dict(v))
        else:
            res_dict[k] = v

    return res_dict

想法/要求是:
获取扁平字典,无需保留父密钥。

使用示例:

dd = {'a': 3, 
      'b': {'c': 4, 'd': 5}, 
      'e': {'f': 
                 {'g': 1, 'h': 2}
           }, 
      'i': 9,
     }

flatten_dict(dd)

>> {'a': 3, 'c': 4, 'd': 5, 'g': 1, 'h': 2, 'i': 9}

保留父键也很简单。

Simple function to flatten nested dictionaries. For Python 3, replace .iteritems() with .items()

def flatten_dict(init_dict):
    res_dict = {}
    if type(init_dict) is not dict:
        return res_dict

    for k, v in init_dict.iteritems():
        if type(v) == dict:
            res_dict.update(flatten_dict(v))
        else:
            res_dict[k] = v

    return res_dict

The idea/requirement was:
Get flat dictionaries with no keeping parent keys.

Example of usage:

dd = {'a': 3, 
      'b': {'c': 4, 'd': 5}, 
      'e': {'f': 
                 {'g': 1, 'h': 2}
           }, 
      'i': 9,
     }

flatten_dict(dd)

>> {'a': 3, 'c': 4, 'd': 5, 'g': 1, 'h': 2, 'i': 9}

Keeping parent keys is simple as well.

淡紫姑娘! 2024-11-14 20:34:57

我正在考虑 UserDict 的子类来自动平放按键。

class FlatDict(UserDict):
    def __init__(self, *args, separator='.', **kwargs):
        self.separator = separator
        super().__init__(*args, **kwargs)

    def __setitem__(self, key, value):
        if isinstance(value, dict):
            for k1, v1 in FlatDict(value, separator=self.separator).items():
                super().__setitem__(f"{key}{self.separator}{k1}", v1)
        else:
            super().__setitem__(key, value)

‌‌
可以即时添加键或使用标准字典实例化的优点,毫不奇怪:

>>> fd = FlatDict(
...    {
...        'person': {
...            'sexe': 'male', 
...            'name': {
...                'first': 'jacques',
...                'last': 'dupond'
...            }
...        }
...    }
... )
>>> fd
{'person.sexe': 'male', 'person.name.first': 'jacques', 'person.name.last': 'dupond'}
>>> fd['person'] = {'name': {'nickname': 'Bob'}}
>>> fd
{'person.sexe': 'male', 'person.name.first': 'jacques', 'person.name.last': 'dupond', 'person.name.nickname': 'Bob'}
>>> fd['person.name'] = {'civility': 'Dr'}
>>> fd
{'person.sexe': 'male', 'person.name.first': 'jacques', 'person.name.last': 'dupond', 'person.name.nickname': 'Bob', 'person.name.civility': 'Dr'}

I was thinking of a subclass of UserDict to automagically flat the keys.

class FlatDict(UserDict):
    def __init__(self, *args, separator='.', **kwargs):
        self.separator = separator
        super().__init__(*args, **kwargs)

    def __setitem__(self, key, value):
        if isinstance(value, dict):
            for k1, v1 in FlatDict(value, separator=self.separator).items():
                super().__setitem__(f"{key}{self.separator}{k1}", v1)
        else:
            super().__setitem__(key, value)


The advantages it that keys can be added on the fly, or using standard dict instanciation, without surprise:

>>> fd = FlatDict(
...    {
...        'person': {
...            'sexe': 'male', 
...            'name': {
...                'first': 'jacques',
...                'last': 'dupond'
...            }
...        }
...    }
... )
>>> fd
{'person.sexe': 'male', 'person.name.first': 'jacques', 'person.name.last': 'dupond'}
>>> fd['person'] = {'name': {'nickname': 'Bob'}}
>>> fd
{'person.sexe': 'male', 'person.name.first': 'jacques', 'person.name.last': 'dupond', 'person.name.nickname': 'Bob'}
>>> fd['person.name'] = {'civility': 'Dr'}
>>> fd
{'person.sexe': 'male', 'person.name.first': 'jacques', 'person.name.last': 'dupond', 'person.name.nickname': 'Bob', 'person.name.civility': 'Dr'}
鼻尖触碰 2024-11-14 20:34:57

这与伊姆兰和拉鲁的回答相似。它不使用生成器,而是使用带有闭包的递归:

def flatten_dict(d, separator='_'):
  final = {}
  def _flatten_dict(obj, parent_keys=[]):
    for k, v in obj.iteritems():
      if isinstance(v, dict):
        _flatten_dict(v, parent_keys + [k])
      else:
        key = separator.join(parent_keys + [k])
        final[key] = v
  _flatten_dict(d)
  return final

>>> print flatten_dict({'a': 1, 'c': {'a': 2, 'b': {'x': 5, 'y' : 10}}, 'd': [1, 2, 3]})
{'a': 1, 'c_a': 2, 'c_b_x': 5, 'd': [1, 2, 3], 'c_b_y': 10}

This is similar to both imran's and ralu's answer. It does not use a generator, but instead employs recursion with a closure:

def flatten_dict(d, separator='_'):
  final = {}
  def _flatten_dict(obj, parent_keys=[]):
    for k, v in obj.iteritems():
      if isinstance(v, dict):
        _flatten_dict(v, parent_keys + [k])
      else:
        key = separator.join(parent_keys + [k])
        final[key] = v
  _flatten_dict(d)
  return final

>>> print flatten_dict({'a': 1, 'c': {'a': 2, 'b': {'x': 5, 'y' : 10}}, 'd': [1, 2, 3]})
{'a': 1, 'c_a': 2, 'c_b_x': 5, 'd': [1, 2, 3], 'c_b_y': 10}
梦一生花开无言 2024-11-14 20:34:57

上面的答案确实很好用。只是想我会添加我编写的 unflatten 函数:

def unflatten(d):
    ud = {}
    for k, v in d.items():
        context = ud
        for sub_key in k.split('_')[:-1]:
            if sub_key not in context:
                context[sub_key] = {}
            context = context[sub_key]
        context[k.split('_')[-1]] = v
    return ud

注意:这并不考虑键中已经存在的“_”,就像展平对应项一样。

The answers above work really well. Just thought I'd add the unflatten function that I wrote:

def unflatten(d):
    ud = {}
    for k, v in d.items():
        context = ud
        for sub_key in k.split('_')[:-1]:
            if sub_key not in context:
                context[sub_key] = {}
            context = context[sub_key]
        context[k.split('_')[-1]] = v
    return ud

Note: This doesn't account for '_' already present in keys, much like the flatten counterparts.

迷雾森÷林ヴ 2024-11-14 20:34:57

Davoud 的解决方案非常好,但当嵌套字典还包含字典列表时,不会给出令人满意的结果,但他的代码适合这种情况:

def flatten_dict(d):
    items = []
    for k, v in d.items():
        try:
            if (type(v)==type([])): 
                for l in v: items.extend(flatten_dict(l).items())
            else: 
                items.extend(flatten_dict(v).items())
        except AttributeError:
            items.append((k, v))
    return dict(items)

Davoud's solution is very nice but doesn't give satisfactory results when the nested dict also contains lists of dicts, but his code be adapted for that case:

def flatten_dict(d):
    items = []
    for k, v in d.items():
        try:
            if (type(v)==type([])): 
                for l in v: items.extend(flatten_dict(l).items())
            else: 
                items.extend(flatten_dict(v).items())
        except AttributeError:
            items.append((k, v))
    return dict(items)
东京女 2024-11-14 20:34:57
def flatten(unflattened_dict, separator='_'):
    flattened_dict = {}

    for k, v in unflattened_dict.items():
        if isinstance(v, dict):
            sub_flattened_dict = flatten(v, separator)
            for k2, v2 in sub_flattened_dict.items():
                flattened_dict[k + separator + k2] = v2
        else:
            flattened_dict[k] = v

    return flattened_dict
def flatten(unflattened_dict, separator='_'):
    flattened_dict = {}

    for k, v in unflattened_dict.items():
        if isinstance(v, dict):
            sub_flattened_dict = flatten(v, separator)
            for k2, v2 in sub_flattened_dict.items():
                flattened_dict[k + separator + k2] = v2
        else:
            flattened_dict[k] = v

    return flattened_dict
抚你发端 2024-11-14 20:34:57

事实上,我最近写了一个名为cherrypicker的包来处理这类事情,因为我不得不经常这样做!

我认为以下代码将为您提供所需的信息:

from cherrypicker import CherryPicker

dct = {
    'a': 1,
    'c': {
        'a': 2,
        'b': {
            'x': 5,
            'y' : 10
        }
    },
    'd': [1, 2, 3]
}

picker = CherryPicker(dct)
picker.flatten().get()

您可以使用以下命令安装软件包:

pip install cherrypicker

...并且在 https://cherrypicker.readthedocs.io

其他方法可能会更快,但此包的首要任务是使此类任务变得简单。如果您确实有大量对象需要展平,您也可以告诉 CherryPicker 使用并行处理来加快速度。

I actually wrote a package called cherrypicker recently to deal with this exact sort of thing since I had to do it so often!

I think the following code would give you exactly what you're after:

from cherrypicker import CherryPicker

dct = {
    'a': 1,
    'c': {
        'a': 2,
        'b': {
            'x': 5,
            'y' : 10
        }
    },
    'd': [1, 2, 3]
}

picker = CherryPicker(dct)
picker.flatten().get()

You can install the package with:

pip install cherrypicker

...and there's more docs and guidance at https://cherrypicker.readthedocs.io.

Other methods may be faster, but the priority of this package is to make such tasks easy. If you do have a large list of objects to flatten though, you can also tell CherryPicker to use parallel processing to speed things up.

逆流 2024-11-14 20:34:57

使用生成器:

def flat_dic_helper(prepand,d):
    if len(prepand) > 0:
        prepand = prepand + "_"
    for k in d:
        i = d[k]
        if isinstance(i, dict):
            r = flat_dic_helper(prepand + k,i)
            for j in r:
                yield j
        else:
            yield (prepand + k,i)

def flat_dic(d):
    return dict(flat_dic_helper("",d))

d = {'a': 1, 'c': {'a': 2, 'b': {'x': 5, 'y' : 10}}, 'd': [1, 2, 3]}
print(flat_dic(d))


>> {'a': 1, 'c_a': 2, 'c_b_x': 5, 'd': [1, 2, 3], 'c_b_y': 10}

Using generators:

def flat_dic_helper(prepand,d):
    if len(prepand) > 0:
        prepand = prepand + "_"
    for k in d:
        i = d[k]
        if isinstance(i, dict):
            r = flat_dic_helper(prepand + k,i)
            for j in r:
                yield j
        else:
            yield (prepand + k,i)

def flat_dic(d):
    return dict(flat_dic_helper("",d))

d = {'a': 1, 'c': {'a': 2, 'b': {'x': 5, 'y' : 10}}, 'd': [1, 2, 3]}
print(flat_dic(d))


>> {'a': 1, 'c_a': 2, 'c_b_x': 5, 'd': [1, 2, 3], 'c_b_y': 10}
不再见 2024-11-14 20:34:57

使用 Flatdict 库:

dic={'a': 1,
 'c': {'a': 2,
       'b': {'x': 5,
             'y' : 10}},
 'd': [1, 2, 3]}

import flatdict
f =  flatdict.FlatDict(dic,delimiter='_')
print(f)
#output
{'a': 1, 'c_a': 2, 'c_b_x': 5, 'c_b_y': 10, 'd': [1, 2, 3]}

Using flatdict library:

dic={'a': 1,
 'c': {'a': 2,
       'b': {'x': 5,
             'y' : 10}},
 'd': [1, 2, 3]}

import flatdict
f =  flatdict.FlatDict(dic,delimiter='_')
print(f)
#output
{'a': 1, 'c_a': 2, 'c_b_x': 5, 'c_b_y': 10, 'd': [1, 2, 3]}
静谧 2024-11-14 20:34:57

这是一种优雅的就地替换算法。使用 Python 2.7 和 Python 3.5 进行测试。使用点字符作为分隔符。

def flatten_json(json):
    if type(json) == dict:
        for k, v in list(json.items()):
            if type(v) == dict:
                flatten_json(v)
                json.pop(k)
                for k2, v2 in v.items():
                    json[k+"."+k2] = v2

示例:

d = {'a': {'b': 'c'}}                   
flatten_json(d)
print(d)
unflatten_json(d)
print(d)

输出:

{'a.b': 'c'}
{'a': {'b': 'c'}}

我在此处发布了此代码以及匹配的unflatten_json 功能。

Here's an algorithm for elegant, in-place replacement. Tested with Python 2.7 and Python 3.5. Using the dot character as a separator.

def flatten_json(json):
    if type(json) == dict:
        for k, v in list(json.items()):
            if type(v) == dict:
                flatten_json(v)
                json.pop(k)
                for k2, v2 in v.items():
                    json[k+"."+k2] = v2

Example:

d = {'a': {'b': 'c'}}                   
flatten_json(d)
print(d)
unflatten_json(d)
print(d)

Output:

{'a.b': 'c'}
{'a': {'b': 'c'}}

I published this code here along with the matching unflatten_json function.

禾厶谷欠 2024-11-14 20:34:57

如果您想要平面嵌套字典并想要所有唯一键列表,那么这里是解决方案:

def flat_dict_return_unique_key(data, unique_keys=set()):
    if isinstance(data, dict):
        [unique_keys.add(i) for i in data.keys()]
        for each_v in data.values():
            if isinstance(each_v, dict):
                flat_dict_return_unique_key(each_v, unique_keys)
    return list(set(unique_keys))

If you want to flat nested dictionary and want all unique keys list then here is the solution:

def flat_dict_return_unique_key(data, unique_keys=set()):
    if isinstance(data, dict):
        [unique_keys.add(i) for i in data.keys()]
        for each_v in data.values():
            if isinstance(each_v, dict):
                flat_dict_return_unique_key(each_v, unique_keys)
    return list(set(unique_keys))
乜一 2024-11-14 20:34:57

我总是更喜欢通过 .items() 访问 dict 对象,因此为了扁平化 dict,我使用以下递归生成器 flat_items(d)。如果您想再次使用 dict,只需将其包装如下:flat = dict(flat_items(d))

def flat_items(d, key_separator='.'):
    """
    Flattens the dictionary containing other dictionaries like here: https://stackoverflow.com/questions/6027558/flatten-nested-python-dictionaries-compressing-keys

    >>> example = {'a': 1, 'c': {'a': 2, 'b': {'x': 5, 'y' : 10}}, 'd': [1, 2, 3]}
    >>> flat = dict(flat_items(example, key_separator='_'))
    >>> assert flat['c_b_y'] == 10
    """
    for k, v in d.items():
        if type(v) is dict:
            for k1, v1 in flat_items(v, key_separator=key_separator):
                yield key_separator.join((k, k1)), v1
        else:
            yield k, v

I always prefer access dict objects via .items(), so for flattening dicts I use the following recursive generator flat_items(d). If you like to have dict again, simply wrap it like this: flat = dict(flat_items(d))

def flat_items(d, key_separator='.'):
    """
    Flattens the dictionary containing other dictionaries like here: https://stackoverflow.com/questions/6027558/flatten-nested-python-dictionaries-compressing-keys

    >>> example = {'a': 1, 'c': {'a': 2, 'b': {'x': 5, 'y' : 10}}, 'd': [1, 2, 3]}
    >>> flat = dict(flat_items(example, key_separator='_'))
    >>> assert flat['c_b_y'] == 10
    """
    for k, v in d.items():
        if type(v) is dict:
            for k1, v1 in flat_items(v, key_separator=key_separator):
                yield key_separator.join((k, k1)), v1
        else:
            yield k, v
浮华 2024-11-14 20:34:57
def flatten_nested_dict(_dict, _str=''):
    '''
    recursive function to flatten a nested dictionary json
    '''
    ret_dict = {}
    for k, v in _dict.items():
        if isinstance(v, dict):
            ret_dict.update(flatten_nested_dict(v, _str = '_'.join([_str, k]).strip('_')))
        elif isinstance(v, list):
            for index, item in enumerate(v):
                if isinstance(item, dict):
                    ret_dict.update(flatten_nested_dict(item,  _str= '_'.join([_str, k, str(index)]).strip('_')))
                else:
                    ret_dict['_'.join([_str, k, str(index)]).strip('_')] = item
        else:
            ret_dict['_'.join([_str, k]).strip('_')] = v
    return ret_dict
def flatten_nested_dict(_dict, _str=''):
    '''
    recursive function to flatten a nested dictionary json
    '''
    ret_dict = {}
    for k, v in _dict.items():
        if isinstance(v, dict):
            ret_dict.update(flatten_nested_dict(v, _str = '_'.join([_str, k]).strip('_')))
        elif isinstance(v, list):
            for index, item in enumerate(v):
                if isinstance(item, dict):
                    ret_dict.update(flatten_nested_dict(item,  _str= '_'.join([_str, k, str(index)]).strip('_')))
                else:
                    ret_dict['_'.join([_str, k, str(index)]).strip('_')] = item
        else:
            ret_dict['_'.join([_str, k]).strip('_')] = v
    return ret_dict
洋洋洒洒 2024-11-14 20:34:57

在简单的类似嵌套列表的递归中使用 dict.popitem() :

def flatten(d):
    if d == {}:
        return d
    else:
        k,v = d.popitem()
        if (dict != type(v)):
            return {k:v, **flatten(d)}
        else:
            flat_kv = flatten(v)
            for k1 in list(flat_kv.keys()):
                flat_kv[k + '_' + k1] = flat_kv[k1]
                del flat_kv[k1]
            return {**flat_kv, **flatten(d)}

Using dict.popitem() in straightforward nested-list-like recursion:

def flatten(d):
    if d == {}:
        return d
    else:
        k,v = d.popitem()
        if (dict != type(v)):
            return {k:v, **flatten(d)}
        else:
            flat_kv = flatten(v)
            for k1 in list(flat_kv.keys()):
                flat_kv[k + '_' + k1] = flat_kv[k1]
                del flat_kv[k1]
            return {**flat_kv, **flatten(d)}
月野兔 2024-11-14 20:34:57

如果您不介意递归函数,这里有一个解决方案。我还冒昧地添加了一个排除参数,以防您希望保留一个或多个值。

代码:

def flatten_dict(dictionary, exclude = [], delimiter ='_'):
    flat_dict = dict()
    for key, value in dictionary.items():
        if isinstance(value, dict) and key not in exclude:
            flatten_value_dict = flatten_dict(value, exclude, delimiter)
            for k, v in flatten_value_dict.items():
                flat_dict[f"{key}{delimiter}{k}"] = v
        else:
            flat_dict[key] = value
    return flat_dict

用法:

d = {'a':1, 'b':[1, 2], 'c':3, 'd':{'a':4, 'b':{'a':7, 'b':8}, 'c':6}, 'e':{'a':1,'b':2}}
flat_d = flatten_dict(dictionary=d, exclude=['e'], delimiter='.')
print(flat_d)

输出:

{'a': 1, 'b': [1, 2], 'c': 3, 'd.a': 4, 'd.b.a': 7, 'd.b.b': 8, 'd.c': 6, 'e': {'a': 1, 'b': 2}}

If you do not mind recursive functions, here is a solution. I have also taken the liberty to include an exclusion-parameter in case there are one or more values you wish to maintain.

Code:

def flatten_dict(dictionary, exclude = [], delimiter ='_'):
    flat_dict = dict()
    for key, value in dictionary.items():
        if isinstance(value, dict) and key not in exclude:
            flatten_value_dict = flatten_dict(value, exclude, delimiter)
            for k, v in flatten_value_dict.items():
                flat_dict[f"{key}{delimiter}{k}"] = v
        else:
            flat_dict[key] = value
    return flat_dict

Usage:

d = {'a':1, 'b':[1, 2], 'c':3, 'd':{'a':4, 'b':{'a':7, 'b':8}, 'c':6}, 'e':{'a':1,'b':2}}
flat_d = flatten_dict(dictionary=d, exclude=['e'], delimiter='.')
print(flat_d)

Output:

{'a': 1, 'b': [1, 2], 'c': 3, 'd.a': 4, 'd.b.a': 7, 'd.b.b': 8, 'd.c': 6, 'e': {'a': 1, 'b': 2}}
獨角戲 2024-11-14 20:34:57

我尝试了此页面上的一些解决方案 - 虽然不是全部 - 但我尝试的解决方案未能处理字典的嵌套列表。

考虑这样的字典:

d = {
        'owner': {
            'name': {'first_name': 'Steven', 'last_name': 'Smith'},
            'lottery_nums': [1, 2, 3, 'four', '11', None],
            'address': {},
            'tuple': (1, 2, 'three'),
            'tuple_with_dict': (1, 2, 'three', {'is_valid': False}),
            'set': {1, 2, 3, 4, 'five'},
            'children': [
                {'name': {'first_name': 'Jessica',
                          'last_name': 'Smith', },
                 'children': []
                 },
                {'name': {'first_name': 'George',
                          'last_name': 'Smith'},
                 'children': []
                 }
            ]
        }
    }

这是我的临时解决方案:

def flatten_dict(input_node: dict, key_: str = '', output_dict: dict = {}):
    if isinstance(input_node, dict):
        for key, val in input_node.items():
            new_key = f"{key_}.{key}" if key_ else f"{key}"
            flatten_dict(val, new_key, output_dict)
    elif isinstance(input_node, list):
        for idx, item in enumerate(input_node):
            flatten_dict(item, f"{key_}.{idx}", output_dict)
    else:
        output_dict[key_] = input_node
    return output_dict

它产生:

{
  owner.name.first_name: Steven,
  owner.name.last_name: Smith,
  owner.lottery_nums.0: 1,
  owner.lottery_nums.1: 2,
  owner.lottery_nums.2: 3,
  owner.lottery_nums.3: four,
  owner.lottery_nums.4: 11,
  owner.lottery_nums.5: None,
  owner.tuple: (1, 2, 'three'),
  owner.tuple_with_dict: (1, 2, 'three', {'is_valid': False}),
  owner.set: {1, 2, 3, 4, 'five'},
  owner.children.0.name.first_name: Jessica,
  owner.children.0.name.last_name: Smith,
  owner.children.1.name.first_name: George,
  owner.children.1.name.last_name: Smith,
}

一个临时解决方案,它并不完美。
注意:

  • 它不会保留空字典,例如 address: {} k/v 对。

  • 它不会展平嵌套元组中的字典 - 尽管使用Python元组的行为类似于列表这一事实很容易添加。

I tried some of the solutions on this page - though not all - but those I tried failed to handle the nested list of dict.

Consider a dict like this:

d = {
        'owner': {
            'name': {'first_name': 'Steven', 'last_name': 'Smith'},
            'lottery_nums': [1, 2, 3, 'four', '11', None],
            'address': {},
            'tuple': (1, 2, 'three'),
            'tuple_with_dict': (1, 2, 'three', {'is_valid': False}),
            'set': {1, 2, 3, 4, 'five'},
            'children': [
                {'name': {'first_name': 'Jessica',
                          'last_name': 'Smith', },
                 'children': []
                 },
                {'name': {'first_name': 'George',
                          'last_name': 'Smith'},
                 'children': []
                 }
            ]
        }
    }

Here's my makeshift solution:

def flatten_dict(input_node: dict, key_: str = '', output_dict: dict = {}):
    if isinstance(input_node, dict):
        for key, val in input_node.items():
            new_key = f"{key_}.{key}" if key_ else f"{key}"
            flatten_dict(val, new_key, output_dict)
    elif isinstance(input_node, list):
        for idx, item in enumerate(input_node):
            flatten_dict(item, f"{key_}.{idx}", output_dict)
    else:
        output_dict[key_] = input_node
    return output_dict

which produces:

{
  owner.name.first_name: Steven,
  owner.name.last_name: Smith,
  owner.lottery_nums.0: 1,
  owner.lottery_nums.1: 2,
  owner.lottery_nums.2: 3,
  owner.lottery_nums.3: four,
  owner.lottery_nums.4: 11,
  owner.lottery_nums.5: None,
  owner.tuple: (1, 2, 'three'),
  owner.tuple_with_dict: (1, 2, 'three', {'is_valid': False}),
  owner.set: {1, 2, 3, 4, 'five'},
  owner.children.0.name.first_name: Jessica,
  owner.children.0.name.last_name: Smith,
  owner.children.1.name.first_name: George,
  owner.children.1.name.last_name: Smith,
}

A makeshift solution and it's not perfect.
NOTE:

  • it doesn't keep empty dicts such as the address: {} k/v pair.

  • it won't flatten dicts in nested tuples - though it would be easy to add using the fact that python tuples act similar to lists.

许你一世情深 2024-11-14 20:34:57

使用 max_level 和自定义化简器压平嵌套字典、压缩键的变体。

  def flatten(d, max_level=None, reducer='tuple'):
      if reducer == 'tuple':
          reducer_seed = tuple()
          reducer_func = lambda x, y: (*x, y)
      else:
          raise ValueError(f'Unknown reducer: {reducer}')

      def impl(d, pref, level):
        return reduce(
            lambda new_d, kv:
                (max_level is None or level < max_level)
                and isinstance(kv[1], dict)
                and {**new_d, **impl(kv[1], reducer_func(pref, kv[0]), level + 1)}
                or {**new_d, reducer_func(pref, kv[0]): kv[1]},
                d.items(),
            {}
        )

      return impl(d, reducer_seed, 0)

Variation of this Flatten nested dictionaries, compressing keys with max_level and custom reducer.

  def flatten(d, max_level=None, reducer='tuple'):
      if reducer == 'tuple':
          reducer_seed = tuple()
          reducer_func = lambda x, y: (*x, y)
      else:
          raise ValueError(f'Unknown reducer: {reducer}')

      def impl(d, pref, level):
        return reduce(
            lambda new_d, kv:
                (max_level is None or level < max_level)
                and isinstance(kv[1], dict)
                and {**new_d, **impl(kv[1], reducer_func(pref, kv[0]), level + 1)}
                or {**new_d, reducer_func(pref, kv[0]): kv[1]},
                d.items(),
            {}
        )

      return impl(d, reducer_seed, 0)
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文