Dictionary merge by updating but not overwriting if value exists

Posted on 2024-11-15 12:30:40 · 652 words · 1 view · 0 comments

If I have 2 dicts as follows:

d1 = {'a': 2, 'b': 4}
d2 = {'a': 2, 'b': ''}

In order to 'merge' them:

dict(d1.items() + d2.items())

results in

{'a': 2, 'b': ''}

But what should I do if I would like to compare each value of the two dictionaries and only update d2 into d1 if values in d1 are empty/None/''?

When the same key exists, I would like to only maintain the numerical value (either from d1 or d2) instead of the empty value. If both values are empty, then no problems maintaining the empty value. If both have values, then d1-value should stay.

i.e.

d1 = {'a': 2, 'b': 8, 'c': ''}
d2 = {'a': 2, 'b': '', 'c': ''}

should result in

{'a': 2, 'b': 8, 'c': ''}

where 8 is not overwritten by ''.

千紇 2024-11-22 12:30:40

Just switch the order:

z = dict(d2.items() + d1.items())

By the way, you may also be interested in the potentially faster update method.

In Python 3, you have to cast the view objects to lists first:

z = dict(list(d2.items()) + list(d1.items())) 

If you want to special-case empty strings, you can do the following:

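def mergeDictsOverwriteEmpty(d1, d2):
    res = d1.copy()  # start from d1 so its non-empty values are kept
    for k, v in d2.items():
        # take d2's value only if d1 lacks the key or holds an empty string
        if k not in d1 or d1[k] == '':
            res[k] = v
    return res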
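
As a rough sketch of the same idea, the variant below treats both None and '' as empty, as the question asks. The name merge_prefer_non_empty and the EMPTY tuple are made up for illustration and are not part of the answer above.

```python
EMPTY = (None, '')  # values treated as "empty" (an assumption for this sketch)


def merge_prefer_non_empty(d1, d2):
    """Return a new dict in which d1's values win unless they are empty."""
    res = dict(d1)
    for k, v in d2.items():
        # Take d2's value only when d1 has no usable value for this key.
        if k not in res or res[k] in EMPTY:
            res[k] = v
    return res


d1 = {'a': 2, 'b': 8, 'c': ''}
d2 = {'a': 2, 'b': '', 'c': ''}
result = merge_prefer_non_empty(d1, d2)
print(result)  # {'a': 2, 'b': 8, 'c': ''}
```

Because the check is a membership test rather than a truthiness test, a legitimate 0 in d1 is kept and not replaced.
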
千笙结 2024-11-22 12:30:40

Updates d2 with d1 key/value pairs, but only if the d1 value is truthy (so None and '' are skipped, and so are 0 and other falsy values):

>>> d1 = dict(a=1, b=None, c=2)
>>> d2 = dict(a=None, b=2, c=1)
>>> d2.update({k: v for k, v in d1.items() if v})
>>> d2
{'a': 1, 'c': 2, 'b': 2}

(Use iteritems() instead of items() in Python 2.)
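
One caveat worth seeing in code: the `if v` filter also drops a legitimate 0. The sketch below compares it with an explicit check for None and ''. The variable names by_truthiness and by_explicit are illustrative only.

```python
d1 = dict(a=1, b=None, c=0)
d2 = dict(a=None, b=2, c=5)

# Truthiness filter: c=0 is dropped along with None.
by_truthiness = dict(d2)
by_truthiness.update({k: v for k, v in d1.items() if v})

# Explicit filter: only None and '' count as empty, so 0 survives.
by_explicit = dict(d2)
by_explicit.update({k: v for k, v in d1.items() if v is not None and v != ''})

print(by_truthiness)  # {'a': 1, 'b': 2, 'c': 5}
print(by_explicit)    # {'a': 1, 'b': 2, 'c': 0}
```

Which filter is right depends on whether 0 is meaningful data in your dictionaries.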

古镇旧梦 2024-11-22 12:30:40

Python 3.5+ Literal Dict

Unless you are stuck on an obsolete version of Python, you are better off using this.

A Pythonic and faster way, using dict unpacking:

d1 = {'a':1, 'b':1}
d2 = {'a':2, 'c':2}
merged = {**d1, **d2}  # priority from right to left
print(merged)

{'a': 2, 'b': 1, 'c': 2}

It's simpler and also faster than the dict(list(d2.items()) + list(d1.items())) alternative:

d1 = {i: 1 for i in range(1000000)}
d2 = {i: 2 for i in range(2000000)}

%timeit dict(list(d1.items()) + list(d2.items())) 
402 ms ± 33.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%timeit {**d1, **d2}
144 ms ± 1.12 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

More on this from PEP 448:

The keys in a dictionary remain in a right-to-left priority order, so {**{'a': 1}, 'a': 2, **{'a': 3}} evaluates to {'a': 3}. There is no restriction on the number or position of unpackings.
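
To make the ordering concrete, here is a small sketch showing that swapping the unpacking order is enough to give d1 priority. The names d2_wins and d1_wins are illustrative.

```python
d1 = {'a': 1, 'b': 1}
d2 = {'a': 2, 'c': 2}

d2_wins = {**d1, **d2}  # later unpacking overrides earlier
d1_wins = {**d2, **d1}  # swap the order so d1 has priority

print(d2_wins)  # {'a': 2, 'b': 1, 'c': 2}
print(d1_wins)  # {'a': 1, 'c': 2, 'b': 1}
```

Note that a key keeps the position of its first appearance even when its value is overridden later, which is why 'a' stays first in d1_wins.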

Merging Only Non-zero values

To do this, we can pick, for each key, the first non-empty value from d1 or d2:

d1 = {'a':1, 'b':1, 'c': '', 'd': ''}
d2 = {'a':2, 'c':2, 'd': ''}
merged_non_zero = {
    k: (d1.get(k) or d2.get(k))
    for k in set(d1) | set(d2)
}
print(merged_non_zero)

Output (key order may vary, because the keys are iterated from a set):

{'a': 1, 'b': 1, 'c': 2, 'd': ''}
  • a -> prefer first value from d1 as 'a' exists on both d1 and d2
  • b -> only exists on d1
  • c -> non-zero on d2
  • d -> empty string on both

Explanation

The above code creates a dictionary using a dict comprehension.

If d1 has the key and its value is truthy (i.e. bool(val) is True), the d1[k] value is used; otherwise d2.get(k) is taken, which is None when d2 lacks the key.

Notice that we also merge the keys of the two dicts with a set union, set(d1) | set(d2), since they may not have exactly the same keys.
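
The `or` idiom has two edge cases: a 0 in d1 is treated as empty, and an empty value whose key is missing from d2 turns into None. The sketch below contrasts it with an explicit check. The helper merge_non_empty, the _MISSING sentinel and the empty parameter are assumptions made for this illustration.

```python
_MISSING = object()  # sentinel to tell "key absent" apart from None


def merge_non_empty(d1, d2, empty=(None, '')):
    """Prefer d1's value unless it is missing or listed in `empty`."""
    merged = {}
    for k in {**d1, **d2}:  # all keys, in first-seen order
        v1 = d1.get(k, _MISSING)
        v2 = d2.get(k, _MISSING)
        if v1 is _MISSING or (v1 in empty and v2 is not _MISSING):
            merged[k] = v2
        else:
            merged[k] = v1
    return merged


d1 = {'a': 0, 'b': '', 'c': ''}
d2 = {'a': 5, 'b': 7}

with_or = {k: (d1.get(k) or d2.get(k)) for k in {**d1, **d2}}
explicit = merge_non_empty(d1, d2)

print(with_or)   # {'a': 5, 'b': 7, 'c': None}
print(explicit)  # {'a': 0, 'b': 7, 'c': ''}
```

Iterating over {**d1, **d2} rather than a set union also keeps the key order deterministic.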

ˉ厌 2024-11-22 12:30:40

To add to d2 keys/values from d1 which do not exist in d2 without overwriting any existing keys/values in d2:

temp = d2.copy()
d2.update(d1)
d2.update(temp)
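
A possible alternative that avoids the temporary copy is dict.setdefault, which inserts a key only when it is absent. This is a sketch with made-up sample data, and like the snippet above it looks only at key existence, not at whether a value is empty.

```python
d1 = {'a': 1, 'b': 2}
d2 = {'b': 20, 'c': 30}

for key, value in d1.items():
    d2.setdefault(key, value)  # only inserts when key is absent from d2

print(d2)  # {'b': 20, 'c': 30, 'a': 1}
```
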
萌酱 2024-11-22 12:30:40

Here's an in-place solution (it modifies d2):

# assumptions: d2 is a temporary dict that can be discarded
# d1 is a dict that must be modified in place
# the modification is adding keys from d2 into d1 that do not exist in d1.

def update_non_existing_inplace(original_dict, to_add):
    to_add.update(original_dict) # to_add now holds the "final result" (O(n))
    original_dict.clear() # erase original_dict in-place (O(1))
    original_dict.update(to_add) # original_dict now holds the "final result" (O(n))
    return

Here's another in-place solution, which is less elegant but potentially more efficient, as well as leaving d2 unmodified:

# assumptions: d2 cannot be modified
# d1 is a dict that must be modified in place
# the modification is adding keys from d2 into d1 that do not exist in d1.

def update_non_existing_inplace(original_dict, to_add):
    for key in to_add:  # iterating a dict yields its keys (Python 2 and 3)
        if key not in original_dict:
            original_dict[key] = to_add[key]
苍暮颜 2024-11-22 12:30:40

d2.update(d1) instead of dict(d2.items() + d1.items())
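
One difference to keep in mind, shown in the short sketch below with made-up data: update() changes d2 in place and returns None, whereas the dict(...) expression builds a new dictionary.

```python
d1 = {'a': 2, 'b': 8}
d2 = {'a': 2, 'b': '', 'c': 5}

returned = d2.update(d1)  # mutates d2 in place
print(returned)           # None
print(d2)                 # {'a': 2, 'b': 8, 'c': 5}
```

If d2 must stay untouched, copy it first, for example with merged = dict(d2) followed by merged.update(d1).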

早茶月光 2024-11-22 12:30:40

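If your dictionaries have the same size and keys, you can use the following code (in Python 2, use iteritems() instead of items()). Note that it keeps d2's value unless that value is None or '', and it raises KeyError for a key that is missing from d2:

dict((k, v if k in d2 and d2[k] in [None, ''] else d2[k]) for k, v in d1.items())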
幸福不弃 2024-11-22 12:30:40

If you want to ignore empty values, so that, for example, merging:

a = {"a": 1, "b": 2, "c": ""}
b = {"a": "", "b": 4, "c": 5}
c = {"a": "aaa", "b": ""}
d = {"a": "", "w": ""}

results in: {'a': 'aaa', 'b': 4, 'c': 5, 'w': ''}

You can use these 2 functions:

def merge_two_dicts(a, b, path=None):
    "merges b into a"
    if path is None:
        path = []
    for key in b:
        if key in a:
            if isinstance(a[key], dict) and isinstance(b[key], dict):
                merge_two_dicts(a[key], b[key], path + [str(key)])
            elif a[key] == b[key]:
                pass  # same leaf value
            else:
                if a[key] and not b[key]:
                    a[key] = a[key]
                else:
                    a[key] = b[key]
        else:
            a[key] = b[key]
    return a


def merge_multiple_dicts(*a):
    output = a[0]
    if len(a) >= 2:
        for n in range(len(a) - 1):
            output = merge_two_dicts(output, a[n + 1])

    return output

So you can just use merge_multiple_dicts(a, b, c, d). Note that this modifies the first dictionary in place and returns it.
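
If the inputs must stay unchanged, one option is to fold deep copies into a fresh dictionary. The sketch below is a compact rewrite under that assumption, not the answer's own code. The names merge_keep_non_empty and merge_all are hypothetical, and the rule is the same as above: a later value wins unless it is empty and the existing one is not.

```python
from copy import deepcopy
from functools import reduce


def merge_keep_non_empty(acc, new):
    """Merge `new` into `acc`; an empty value never overwrites a non-empty one."""
    for key, value in new.items():
        if isinstance(acc.get(key), dict) and isinstance(value, dict):
            merge_keep_non_empty(acc[key], value)  # recurse into nested dicts
        elif key not in acc or value or not acc[key]:
            acc[key] = value
    return acc


def merge_all(*dicts):
    # Fold deep copies into a new dict so that no input is mutated.
    return reduce(merge_keep_non_empty, deepcopy(dicts), {})


a = {"a": 1, "b": 2, "c": ""}
b = {"a": "", "b": 4, "c": 5}
c = {"a": "aaa", "b": ""}
d = {"a": "", "w": ""}

merged = merge_all(a, b, c, d)
print(merged)  # {'a': 'aaa', 'b': 4, 'c': 5, 'w': ''}
print(a)       # {'a': 1, 'b': 2, 'c': ''}
```
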

花开柳相依 2024-11-22 12:30:40

I have a solution if you want to have more freedom to choose when a value should be overwritten in the merged dictionary. Maybe it's a verbose script, but it's not hard to understand its logic.

Thanks to fabiocaccamo and senderle for sharing the benedict package and the nested iteration logic, respectively. This knowledge was fundamental to the development of the script.

Python Requirements

pip install python-benedict==0.24.3

Python Script

Definition of the Dict class.

from __future__ import annotations

from collections.abc import Mapping
from benedict import benedict
from typing import Iterator
from copy import deepcopy


class Dict:
    def __init__(self, data: dict = None):
        """
        Instantiates a dictionary object with nested keys-based indexing.

        Parameters
        ----------
        data: dict
            Dictionary.

        References
        ----------
        [1] 'Dict' class: https://stackoverflow.com/a/70908985/16109419
        [2] 'Benedict' package: https://github.com/fabiocaccamo/python-benedict
        [3] Dictionary nested iteration: https://stackoverflow.com/a/10756615/16109419
        """
        self.data = deepcopy(data) if data is not None else {}

    def get(self, keys: [object], **kwargs) -> (object, bool):
        """
        Get dictionary item value based on nested keys.

        Parameters
        ----------
        keys: [object]
            Nested keys to get item value based on.

        Returns
        -------
        value, found: (object, bool)
            Item value, and whether the target item was found.
        """
        data = kwargs.get('data', self.data)
        path = kwargs.get('path', [])
        value, found = None, False

        # Looking for item location on dictionary:
        for outer_key, outer_value in data.items():
            trace = path + [outer_key]

            # Getting item value from dictionary:
            if trace == keys:
                value, found = outer_value, True
                break

            if trace == keys[:len(trace)] and isinstance(outer_value, Mapping):  # Recursion cutoff.
                value, found = self.get(
                    data=outer_value,
                    keys=keys,
                    path=trace
                )

        return value, found

    def set(self, keys: [object], value: object, **kwargs) -> bool:
        """
        Set dictionary item value based on nested keys.

        Parameters
        ----------
        keys: [object]
            Nested keys to set item value based on.
        value: object
            Item value.

        Returns
        -------
        updated: bool
            Whether the target item was updated.
        """
        data = kwargs.get('data', self.data)
        path = kwargs.get('path', [])
        updated = False

        # Looking for item location on dictionary:
        for outer_key, outer_value in data.items():
            trace = path + [outer_key]

            # Setting item value on dictionary:
            if trace == keys:
                data[outer_key] = value
                updated = True
                break

            if trace == keys[:len(trace)] and isinstance(outer_value, Mapping):  # Recursion cutoff.
                updated = self.set(
                    data=outer_value,
                    keys=keys,
                    value=value,
                    path=trace
                )

        return updated

    def add(self, keys: [object], value: object, **kwargs) -> bool:
        """
        Add dictionary item value based on nested keys.

        Parameters
        ----------
        keys: [object]
            Nested keys to add item based on.
        value: object
            Item value.

        Returns
        -------
        added: bool
            Whether the target item was added.
        """
        data = kwargs.get('data', self.data)
        added = False

        # Adding item on dictionary:
        if keys[0] not in data:
            if len(keys) == 1:
                data[keys[0]] = value
                added = True
            else:
                data[keys[0]] = {}

        # Looking for item location on dictionary:
        for outer_key, outer_value in data.items():
            if outer_key == keys[0]:  # Recursion cutoff.
                if len(keys) > 1 and isinstance(outer_value, Mapping):
                    added = self.add(
                        data=outer_value,
                        keys=keys[1:],
                        value=value
                    )

        return added

    def remove(self, keys: [object], **kwargs) -> bool:
        """
        Remove dictionary item based on nested keys.

        Parameters
        ----------
        keys: [object]
            Nested keys to remove item based on.

        Returns
        -------
        removed: bool
            Whether the target item was removed.
        """
        data = kwargs.get('data', self.data)
        path = kwargs.get('path', [])
        removed = False

        # Looking for item location on dictionary:
        for outer_key, outer_value in data.items():
            trace = path + [outer_key]

            # Removing item from dictionary:
            if trace == keys:
                del data[outer_key]
                removed = True
                break

            if trace == keys[:len(trace)] and isinstance(outer_value, Mapping):  # Recursion cutoff.
                removed = self.remove(
                    data=outer_value,
                    keys=keys,
                    path=trace
                )

        return removed

    def items(self, **kwargs) -> Iterator[object, object]:
        """
        Get dictionary items based on nested keys.

        Returns
        -------
        keys, value: Iterator[object, object]
            List of nested keys and list of values.
        """
        data = kwargs.get('data', self.data)
        path = kwargs.get('path', [])

        for outer_key, outer_value in data.items():
            if isinstance(outer_value, Mapping):
                for inner_key, inner_value in self.items(data=outer_value, path=path + [outer_key]):
                    yield inner_key, inner_value
            else:
                yield path + [outer_key], outer_value

    @staticmethod
    def merge(dict_list: [dict], overwrite: bool = False, concat: bool = False, default_value: object = None) -> dict:
        """
        Merges dictionaries, with value assignment based on order of occurrence. Overwrites values if and only if:
            - The key does not yet exist on merged dictionary;
            - The current value of the key on merged dictionary is the default value.

        Parameters
        ----------
        dict_list: [dict]
            List of dictionaries.
        overwrite: bool
            Overwrites occurrences of values. If false, keep the first occurrence of each value found.
        concat: bool
            Concatenates occurrences of values for the same key.
        default_value: object
            Default value used as a reference to override dictionary attributes.

        Returns
        -------
        md: dict
            Merged dictionary.
        """
        dict_list = [d for d in dict_list if d is not None and isinstance(d, dict)] if dict_list is not None else []
        assert len(dict_list), f"no dictionaries given."

        # Keeping the first occurrence of each value:
        if not overwrite:
            dict_list = [Dict(d) for d in dict_list]

            for i, d in enumerate(dict_list[:-1]):
                for keys, value in d.items():
                    if value != default_value:
                        for j, next_d in enumerate(dict_list[i+1:], start=i+1):
                            next_d.remove(keys=keys)

            dict_list = [d.data for d in dict_list]

        md = benedict()
        md.merge(*dict_list, overwrite=True, concat=concat)

        return md

Definition of the main method to show examples.

import json


def main() -> None:
    dict_list = [
        {1: 'a', 2: None, 3: {4: None, 5: {6: None}}},
        {1: None, 2: None, 3: {4: 'c', 5: {6: {7: None}}}},
        {1: None, 2: 'b', 3: {4: None, 5: {6: {7: 'd'}}}},
        {1: None, 2: 'b', 3: {4: None, 5: {6: {8: {9: {10: ['e', 'f']}}}}}},
        {1: None, 2: 'b', 3: {4: None, 5: {6: {8: {9: {10: ['g', 'h']}}}}}},
    ]

    d = Dict(data=dict_list[-1])

    print("Dictionary operations test:\n")
    print(f"data = {json.dumps(d.data, indent=4)}\n")
    print(f"d = Dict(data=data)")

    keys = [11]
    value = {12: {13: 14}}
    print(f"d.get(keys={keys}) --> {d.get(keys=keys)}")
    print(f"d.set(keys={keys}, value={value}) --> {d.set(keys=keys, value=value)}")
    print(f"d.add(keys={keys}, value={value}) --> {d.add(keys=keys, value=value)}")
    keys = [11, 12, 13]
    value = 14
    print(f"d.add(keys={keys}, value={value}) --> {d.add(keys=keys, value=value)}")
    value = 15
    print(f"d.set(keys={keys}, value={value}) --> {d.set(keys=keys, value=value)}")
    keys = [11]
    print(f"d.get(keys={keys}) --> {d.get(keys=keys)}")
    keys = [11, 12]
    print(f"d.get(keys={keys}) --> {d.get(keys=keys)}")
    keys = [11, 12, 13]
    print(f"d.get(keys={keys}) --> {d.get(keys=keys)}")
    keys = [11, 12, 13, 15]
    print(f"d.get(keys={keys}) --> {d.get(keys=keys)}")
    keys = [2]
    print(f"d.remove(keys={keys}) --> {d.remove(keys=keys)}")
    print(f"d.remove(keys={keys}) --> {d.remove(keys=keys)}")
    print(f"d.get(keys={keys}) --> {d.get(keys=keys)}")

    print("\n-----------------------------\n")
    print("Dictionary values match test:\n")
    print(f"data = {json.dumps(d.data, indent=4)}\n")
    print(f"d = Dict(data=data)")

    for keys, value in d.items():
        real_value, found = d.get(keys=keys)
        status = "found" if found else "not found"
        print(f"d{keys} = {value} == {real_value} ({status}) --> {value == real_value}")

    print("\n-----------------------------\n")
    print("Dictionaries merge test:\n")

    for i, d in enumerate(dict_list, start=1):
        print(f"d{i} = {d}")

    dict_list_ = [f"d{i}" for i, d in enumerate(dict_list, start=1)]
    print(f"dict_list = [{', '.join(dict_list_)}]")

    md = Dict.merge(dict_list=dict_list)
    print("\nmd = Dict.merge(dict_list=dict_list)")
    print("print(md)")
    print(f"{json.dumps(md, indent=4)}")


if __name__ == '__main__':
    main()

Output

Dictionary operations test:

data = {
    "1": null,
    "2": "b",
    "3": {
        "4": null,
        "5": {
            "6": {
                "8": {
                    "9": {
                        "10": [
                            "g",
                            "h"
                        ]
                    }
                }
            }
        }
    }
}

d = Dict(data=data)
d.get(keys=[11]) --> (None, False)
d.set(keys=[11], value={12: {13: 14}}) --> False
d.add(keys=[11], value={12: {13: 14}}) --> True
d.add(keys=[11, 12, 13], value=14) --> False
d.set(keys=[11, 12, 13], value=15) --> True
d.get(keys=[11]) --> ({12: {13: 15}}, True)
d.get(keys=[11, 12]) --> ({13: 15}, True)
d.get(keys=[11, 12, 13]) --> (15, True)
d.get(keys=[11, 12, 13, 15]) --> (None, False)
d.remove(keys=[2]) --> True
d.remove(keys=[2]) --> False
d.get(keys=[2]) --> (None, False)

-----------------------------

Dictionary values match test:

data = {
    "1": null,
    "3": {
        "4": null,
        "5": {
            "6": {
                "8": {
                    "9": {
                        "10": [
                            "g",
                            "h"
                        ]
                    }
                }
            }
        }
    },
    "11": {
        "12": {
            "13": 15
        }
    }
}

d = Dict(data=data)
d[1] = None == None (found) --> True
d[3, 4] = None == None (found) --> True
d[3, 5, 6, 8, 9, 10] = ['g', 'h'] == ['g', 'h'] (found) --> True
d[11, 12, 13] = 15 == 15 (found) --> True

-----------------------------

Dictionaries merge test:

d1 = {1: 'a', 2: None, 3: {4: None, 5: {6: None}}}
d2 = {1: None, 2: None, 3: {4: 'c', 5: {6: {7: None}}}}
d3 = {1: None, 2: 'b', 3: {4: None, 5: {6: {7: 'd'}}}}
d4 = {1: None, 2: 'b', 3: {4: None, 5: {6: {8: {9: {10: ['e', 'f']}}}}}}
d5 = {1: None, 2: 'b', 3: {4: None, 5: {6: {8: {9: {10: ['g', 'h']}}}}}}
dict_list = [d1, d2, d3, d4, d5]

md = Dict.merge(dict_list=dict_list)
print(md)
{
    "1": "a",
    "2": "b",
    "3": {
        "4": "c",
        "5": {
            "6": {
                "7": "d",
                "8": {
                    "9": {
                        "10": [
                            "e",
                            "f"
                        ]
                    }
                }
            }
        }
    }
}