Recursively remove null/empty values from a list of JSON objects in Python

Posted on 2025-01-19 05:30:33 · 1858 characters · 3 views · 0 comments

I have a JSON string with values like this:

[
   {
      "id": 1,
      "object_k_id": "",
      "object_type": "report",
      "object_meta": {
         "source_id": 0,
         "report": "Customers"
      },
      "description": "Daily metrics for all customers",
      "business_name": "",
      "business_logic": "",
      "owners": [
         "[email protected]",
          null
      ],
      "stewards": [
         "[email protected]",
         ''
      ],
      "verified_use_cases": [
         null,
         null,
         "c4a48296-fd92-3606-bf84-99aacdf22a20",
         null
      ],
      "classifications": [
         null
      ],
      "domains": []
   }
]

But the final format I want has the nulls and empty strings removed from the lists, something like this:

[
   {
      "id": 1,
      "object_k_id": "",
      "object_type": "report",
      "object_meta": {
         "source_id": 0,
         "report": "Customers"
      },
      "description": "Daily metrics for all customers",
      "business_name": "",
      "business_logic": "",
      "owners": [
         "[email protected]"
      ],
      "stewards": [
         "[email protected]"
      ],
      "verified_use_cases": [
         "c4a48296-fd92-3606-bf84-99aacdf22a20"
      ],
      "classifications": [],
      "domains": []
   }
]

I want the output to exclude nulls and empty strings so that it looks cleaner.
I need to do this recursively for all the lists in all the JSON documents I have.

Better than a recursive solution would be one that does this in a single pass rather than looping through each element.

I only need to clean the lists, though.

Can anyone please help me with this? Thanks in advance.

Comments (3)

岛歌少女 2025-01-26 05:30:33

import json


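def recursive_dict_clean(d):
    # Clean list values in place and recurse into nested dicts.
    for k, v in d.items():
        if isinstance(v, list):
            # Keep truthy items only; this drops None and "" (and also 0 or False).
            v[:] = [i for i in v if i]
        if isinstance(v, dict):
            recursive_dict_clean(v)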


data = json.loads("""[{
    "id": 1,
    "object_k_id": "",
    "object_type": "report",
    "object_meta": {
        "source_id": 0,
        "report": "Customers"
    },
    "description": "Daily metrics for all customers",
    "business_name": "",
    "business_logic": "",
    "owners": [
        "[email protected]",
        null
    ],
    "stewards": [
        "[email protected]"
    ],
    "verified_use_cases": [
        null,
        null,
        "c4a48296-fd92-3606-bf84-99aacdf22a20",
        null
    ],
    "classifications": [
        null
    ],
    "domains": []
}]""")


for d in data:
    recursive_dict_clean(d)

print(data)

Result (wrapped for readability):

[{'id': 1,
  'object_k_id': '',
  'object_type': 'report',
  'object_meta': {'source_id': 0, 'report': 'Customers'},
  'description': 'Daily metrics for all customers',
  'business_name': '',
  'business_logic': '',
  'owners': ['[email protected]'],
  'stewards': ['[email protected]'],
  'verified_use_cases': ['c4a48296-fd92-3606-bf84-99aacdf22a20'],
  'classifications': [],
  'domains': []}]

P.S.: Your JSON string is not valid (the '' in stewards is not a legal JSON string).
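
To see where parsing fails, one could feed a small fragment to json.loads. The sketch below is only an illustration: the address a@example.com is a placeholder (the real addresses are hidden in the post) and try_parse is a hypothetical helper, not part of the answer above.

```python
import json

# Fragment of the stewards list as posted, with the single-quoted empty string.
# "a@example.com" is a placeholder address used only for this illustration.
invalid_fragment = """{"stewards": ["a@example.com", '']}"""

# Same fragment with a JSON-compliant, double-quoted empty string.
valid_fragment = """{"stewards": ["a@example.com", ""]}"""


def try_parse(text):
    """Return the parsed object, or None if the text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


print(try_parse(invalid_fragment))
print(try_parse(valid_fragment))
```

JSON only allows double-quoted strings, so the first fragment is rejected while the second parses normally.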

Spring初心 2025-01-26 05:30:33

You can use the built-in object_pairs_hook to clean the data as you decode it from your string.

https://docs.python.org/3/library/json.html#json.load

This function runs every time the decoder would otherwise build a dict() and removes all None objects from lists as it goes, using a simple list comprehension. Everything else is left alone, and the decoder does its thing.

#!/usr/bin/env python3
import json
data_string = """[
   {
      "id": 1,
      "object_k_id": "",
      "object_type": "report",
      "object_meta": {
         "source_id": 0,
         "report": "Customers"
      },
      "description": "Daily metrics for all customers",
      "business_name": "",
      "business_logic": "",
      "owners": [
         "[email protected]",
          null
      ],
      "stewards": [
         "[email protected]",
         ""
      ],
      "verified_use_cases": [
         null,
         null,
         "c4a48296-fd92-3606-bf84-99aacdf22a20",
         null
      ],
      "classifications": [
         null
      ],
      "domains": []
   }
]"""

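def json_hook(obj):
    # obj is the list of (key, value) pairs of one decoded JSON object.
    return_obj = {}
    for k, v in obj:
        if isinstance(v, list):
            # Drop None items from lists that are direct values of a key.
            v = [x for x in v if x is not None]

        return_obj[k] = v

    return return_obj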

data = json.loads(data_string, object_pairs_hook=json_hook)

print(json.dumps(data, indent=4))

Result:

[
    {
        "id": 1,
        "object_k_id": "",
        "object_type": "report",
        "object_meta": {
            "source_id": 0,
            "report": "Customers"
        },
        "description": "Daily metrics for all customers",
        "business_name": "",
        "business_logic": "",
        "owners": [
            "[email protected]"
        ],
        "stewards": [
            "[email protected]",
            ""
        ],
        "verified_use_cases": [
            "c4a48296-fd92-3606-bf84-99aacdf22a20"
        ],
        "classifications": [],
        "domains": []
    }
]

In your example you remove the "" value from stewards. If you want that behaviour, you can replace the condition x is not None with x not in (None, ""), but it seemed like that might have been a mistake, since you left empty strings in other places.
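
As a rough sketch of that variant, the hook below drops both None and "" from lists. The names EMPTY and drop_empty_hook and the sample string are made up for this illustration; only json.loads and its object_pairs_hook parameter come from the standard library.

```python
import json

# Values to drop from lists; an assumption based on the question's desired output.
EMPTY = (None, "")


def drop_empty_hook(pairs):
    """object_pairs_hook: rebuild one decoded object, filtering its list values."""
    cleaned = {}
    for key, value in pairs:
        if isinstance(value, list):
            # Only list items are filtered; plain "" values on keys are kept.
            value = [item for item in value if item not in EMPTY]
        cleaned[key] = value
    return cleaned


# Hypothetical sample, shortened from the structure in the question.
sample = '{"owners": ["x", null], "stewards": ["y", ""], "meta": {"tags": [null, "t"]}, "name": ""}'

data = json.loads(sample, object_pairs_hook=drop_empty_hook)
print(data)
```

Note that the hook only sees lists that are direct values of an object key, so a list nested inside another list would not be filtered by this approach.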

桃酥萝莉 2025-01-26 05:30:33

You can parse your JSON into Python objects, run the function below on each dict, and convert the result back to JSON:

def clean_dict(input_dict):
    # Build a cleaned copy instead of modifying the input.
    output = {}
    for key, value in input_dict.items():
        if isinstance(value, dict):
            output[key] = clean_dict(value)
        elif isinstance(value, list):
            output[key] = []
            for item in value:
                # Test each item (not the enclosing list) so that filtering takes effect.
                if isinstance(item, dict):
                    output[key].append(clean_dict(item))
                elif item not in [None, '']:
                    output[key].append(item)
        else:
            output[key] = value
    return output

Thanks to N.O
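
For the full round trip (parse, clean, serialize), a more general sketch could also cover a top-level list and lists nested inside lists. The function clean_lists, its empty parameter, and the sample data are hypothetical names chosen for this example; it is one possible approach, not the method from the answers above.

```python
import json


def clean_lists(node, empty=(None, "")):
    """Return a copy of node with `empty` values removed from every list."""
    if isinstance(node, dict):
        # Dict values are kept as they are (even ""), but are searched for lists.
        return {key: clean_lists(value, empty) for key, value in node.items()}
    if isinstance(node, list):
        # Filter the list first, then recurse into the items that remain.
        return [clean_lists(item, empty) for item in node if item not in empty]
    return node


# Hypothetical sample with a top-level list and nested lists.
raw = '[{"id": 1, "name": "", "owners": ["x", null], "nested": [[null, "y"], {"tags": ["", "z"]}]}]'

cleaned = clean_lists(json.loads(raw))
print(json.dumps(cleaned, indent=4))
```

Because only list items are filtered, a value such as "name": "" stays in place, which matches the requirement to clean only the lists.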
