Parsing a nested list

Posted 2025-01-19 23:53:51 · 6040 words · 0 views · 0 comments

I have a nested list whose contents I need to parse and reformat. The data comes from the Merriam-Webster dictionary API.

Here is what that JSON data looks like:

[
  {
    "meta": {
      "id": "trellis:2",
      "uuid": "3a22ee1d-f552-4836-bd2a-3e6b2fe11884",
      "sort": "200364400",
      "src": "collegiate",
      "section": "alpha",
      "stems": [
        "trellis",
        "trellised",
        "trellises",
        "trellising"
      ],
      "offensive": false
    },
    "hom": 2,
    "hwi": {
      "hw": "trellis"
    },
    "fl": "verb",
    "ins": [
      {
        "if": "trel*lised"
      },
      {
        "if": "trel*lis*ing"
      },
      {
        "if": "trel*lis*es"
      }
    ],
    "def": [
      {
        "vd": "transitive verb",
        "sseq": [
          [
            [
              "sense",
              {
                "sn": "1",
                "dt": [
                  [
                    "text",
                    "{bc}to provide with a trellis"
                  ]
                ],
                "sdsense": {
                  "sd": "especially",
                  "dt": [
                    [
                      "text",
                      "{bc}to train (a plant, such as a vine) on a trellis"
                    ]
                  ]
                }
              }
            ]
          ],
          [
            [
              "sense",
              {
                "sn": "2",
                "dt": [
                  [
                    "text",
                    "{bc}to cross or interlace on or through {bc}{sx|interweave||}"
                  ]
                ]
              }
            ]
          ]
        ]
      }
    ],
    "date": "15th century{ds||1||}",
    "shortdef": [
      "to provide with a trellis; especially : to train (a plant, such as a vine) on a trellis",
      "to cross or interlace on or through : interweave"
    ]
  },
  {
    "meta": {
      "id": "trellis:1",
      "uuid": "db2e1190-dd93-45f3-831c-9e9536aab601",
      "sort": "200364300",
      "src": "collegiate",
      "section": "alpha",
      "stems": [
        "trellis",
        "trellised",
        "trellises"
      ],
      "offensive": false
    },
    "hom": 1,
    "hwi": {
      "hw": "trel*lis",
      "prs": [
        {
          "mw": "ˈtre-ləs",
          "sound": {
            "audio": "trelli01",
            "ref": "c",
            "stat": "1"
          }
        }
      ]
    },
    "fl": "noun",
    "def": [
      {
        "sseq": [
          [
            [
              "sense",
              {
                "sn": "1",
                "dt": [
                  [
                    "text",
                    "{bc}a frame of latticework used as a screen or as a support for climbing plants"
                  ]
                ]
              }
            ]
          ],
          [
            [
              "sense",
              {
                "sn": "2",
                "dt": [
                  [
                    "text",
                    "{bc}a construction (such as a summerhouse) chiefly of latticework"
                  ]
                ]
              }
            ]
          ],
          [
            [
              "sense",
              {
                "sn": "3",
                "dt": [
                  [
                    "text",
                    "{bc}an arrangement that forms or gives the effect of a lattice "
                  ],
                  [
                    "vis",
                    [
                      {
                        "t": "a {wi}trellis{/wi} of interlacing streams"
                      }
                    ]
                  ]
                ]
              }
            ]
          ]
        ]
      }
    ],
    "uros": [
      {
        "ure": "trel*lised",
        "prs": [
          {
            "mw": "ˈtre-ləst",
            "sound": {
              "audio": "trelli02",
              "ref": "c",
              "stat": "1"
            }
          }
        ],
        "fl": "adjective"
      }
    ],
    "art": {
      "artid": "trellis",
      "capt": "trellis 1"
    },
    "et": [
      [
        "text",
        "Middle English {it}trelis{/it}, from Anglo-French {it}treleis{/it}, from Old French {it}treille{/it} arbor, from Latin {it}trichila{/it} summerhouse"
      ]
    ],
    "date": "14th century{ds||1||}",
    "shortdef": [
      "a frame of latticework used as a screen or as a support for climbing plants",
      "a construction (such as a summerhouse) chiefly of latticework",
      "an arrangement that forms or gives the effect of a lattice"
    ]
  }
]

And here is the output I want to produce from it. The definitions come from each entry's shortdef list:

Definition (Entry 1/2):
1: to provide with a trellis; especially : to train (a plant, such as a vine) on a trellis

2: to cross or interlace on or through : interweave

Definition (Entry 2/2):
1: a frame of latticework used as a screen or as a support for climbing plants

2: a construction (such as a summerhouse) chiefly of latticework

3: an arrangement that forms or gives the effect of a lattice

My current code, below, only handles a flat (non-nested) list, and I am struggling to work out how to modify it to handle nested lists:

defin_formatted = ""
word_defin = []
for i in data:
    word_defin.append(i['shortdef'])

for group in word_defin:
    if len(group) > 1:
        result = []
        for i,v in enumerate(group):
            result.append("**{}:** {}".format(i+1, v))
        group = '\n\n'.join(result)
        
    else:
        group = group[0]
    defin_formatted = defin_formatted + group

This code produces this output:

1: to provide with a trellis; especially : to train (a plant, such as a vine) on a trellis

2: to cross or interlace on or through : interweave1: a frame of latticework used as a screen or as a support for climbing plants

2: a construction (such as a summerhouse) chiefly of latticework

3: an arrangement that forms or gives the effect of a lattice

This is very close to what I intend, but flawed: the two entries run together, with no separator or entry header between them.


Comments (1)

绝情姑娘 2025-01-26 23:53:51

Here is how I would approach this: first, extract the data into a structure better suited to the transformations we want; second, apply the transformations (decorations) themselves. Building the actual string for one or more definitions is the last step.

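import warnings
from collections import defaultdict

# Group the short definitions by headword, keyed by homograph number.
data2 = defaultdict(dict)
for i, d in enumerate(data):
    try:
        # 'trellis:2' -> ('trellis', ':', '2'). partition() also copes with an id
        # that has no ':' (assumed here to be the word's only entry, number 1).
        key, _, num = d['meta']['id'].partition(':')
        # Store the number as an int so that entry 10 sorts after entry 2.
        data2[key][int(num or 1)] = d['shortdef']
    except (KeyError, ValueError) as e:
        warnings.warn(f'caught in data[{i}]: {e!r}')

# Order each headword's entries by homograph number, then drop the numbers.
data2 = {k: [defs for _, defs in sorted(v.items())] for k, v in data2.items()}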

At this point, we have a structure with the information we want:

>>> data2
{'trellis': [['a frame of latticework used as a screen or as a support for climbing plants',
   'a construction (such as a summerhouse) chiefly of latticework',
   'an arrangement that forms or gives the effect of a lattice'],
  ['to provide with a trellis; especially : to train (a plant, such as a vine) on a trellis',
   'to cross or interlace on or through : interweave']]}
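For experimenting without the full API response, the extraction step can be wrapped in a function and run on a trimmed sample. This is only a sketch: the name group_shortdefs and the sample entries are made up for illustration, and treating an id without a ":N" suffix as entry 1 is an assumption, not documented API behaviour.

```python
from collections import defaultdict

def group_shortdefs(entries):
    """Map each headword to its shortdef lists, ordered by homograph number."""
    grouped = defaultdict(dict)
    for entry in entries:
        # 'trellis:2' -> ('trellis', ':', '2'); 'lattice' -> ('lattice', '', '')
        word, _, num = entry['meta']['id'].partition(':')
        # Assumption: an id with no ':N' suffix is the word's only entry.
        grouped[word][int(num or 1)] = entry['shortdef']
    return {word: [defs for _, defs in sorted(by_num.items())]
            for word, by_num in grouped.items()}

# Trimmed, hypothetical sample: only the two keys the function reads.
sample = [
    {'meta': {'id': 'trellis:2'}, 'shortdef': ['to provide with a trellis']},
    {'meta': {'id': 'trellis:1'}, 'shortdef': ['a frame of latticework']},
    {'meta': {'id': 'lattice'}, 'shortdef': ['a framework of crossed strips']},
]
print(group_shortdefs(sample))
```

Keying on an int rather than the raw string matters only once a word has ten or more entries, since '10' sorts before '2' as text.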

Then we can apply some transformations (all comprehensions) to decorate each entry with an "(Entry i of n)" header and to number its definitions.

nested_lines = [
    (f'Definition of {k} (Entry {i} of {len(v)})',
     [f'{j}: {txt}' for j, txt in enumerate(defs, 1)])
    for k, v in data2.items()
    for i, defs in enumerate(v, 1)
]

We now have:

>>> nested_lines
[('Definition of trellis (Entry 1 of 2)',
  ['1: a frame of latticework used as a screen or as a support for climbing plants',
   '2: a construction (such as a summerhouse) chiefly of latticework',
   '3: an arrangement that forms or gives the effect of a lattice']),
 ('Definition of trellis (Entry 2 of 2)',
  ['1: to provide with a trellis; especially : to train (a plant, such as a vine) on a trellis',
   '2: to cross or interlace on or through : interweave'])]
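If the nested comprehension is hard to follow, the same decoration can be written as explicit loops. The sketch below should be equivalent in result; the function name decorate and the shortened sample dictionary are illustrative only.

```python
def decorate(grouped):
    """Return (header, numbered_lines) pairs, one pair per dictionary entry."""
    result = []
    for word, entries in grouped.items():
        total = len(entries)
        # Outer counter: which entry of the word this is (1-based).
        for i, defs in enumerate(entries, 1):
            header = f'Definition of {word} (Entry {i} of {total})'
            # Inner counter: number the definitions inside this entry.
            lines = [f'{j}: {txt}' for j, txt in enumerate(defs, 1)]
            result.append((header, lines))
    return result

# Shortened, hypothetical stand-in for the grouped structure.
grouped = {
    'trellis': [
        ['a frame of latticework', 'a construction of latticework'],
        ['to provide with a trellis'],
    ]
}
for header, lines in decorate(grouped):
    print(header, lines)
```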

And now, if we want to print these definitions, say, we would do:

text = '\n\n'.join(['\n'.join([k, '', '\n'.join(v)]) for k, v in nested_lines])

>>> print(text)
Definition of trellis (Entry 1 of 2)

1: a frame of latticework used as a screen or as a support for climbing plants
2: a construction (such as a summerhouse) chiefly of latticework
3: an arrangement that forms or gives the effect of a lattice

Definition of trellis (Entry 2 of 2)

1: to provide with a trellis; especially : to train (a plant, such as a vine) on a trellis
2: to cross or interlace on or through : interweave
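
If the exact layout from the question is wanted instead (a "Definition (Entry i/n):" header, entries in the order the API returns them, and a blank line between definitions), a minimal sketch could look like the following. The function name format_definitions is hypothetical, and the code assumes every item in the response is a dict that may carry a shortdef list.

```python
def format_definitions(entries):
    """Render each entry's shortdef list under a 'Definition (Entry i/n):' header."""
    total = len(entries)
    blocks = []
    for i, entry in enumerate(entries, 1):
        # Number the definitions of this entry, starting again from 1.
        numbered = [f'{j}: {txt}'
                    for j, txt in enumerate(entry.get('shortdef', []), 1)]
        # Header on its own line, definitions separated by blank lines.
        blocks.append(f'Definition (Entry {i}/{total}):\n' + '\n\n'.join(numbered))
    # A blank line between entries keeps them from running together.
    return '\n\n'.join(blocks)

# Shortened, hypothetical sample in API order (verb entry first).
sample = [
    {'shortdef': ['to provide with a trellis', 'to cross or interlace']},
    {'shortdef': ['a frame of latticework']},
]
print(format_definitions(sample))
```

Collecting the per-entry blocks in a list and joining them once at the end is what avoids the "interweave1:" run-on seen in the question, where each group was appended to the string with no separator.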