Parsing a string with nested brackets

Posted on 2025-01-28 20:12:54

I have a string in the following format:

[KEY=VALUE, KEY2=VALUE, KEY3=VALUE, KEY4=VALUE_COMPLEX_CHARS, KEY={KEY=VALUE, COLLECTION[KEY][KEY]=VLAUE, COLLECTION[KEY][KEY][KEY]=VALUE}, KEY=VALUE] [TEXT] [CAN BE TEXT OR KEY=VAL]

I want to parse it with Python and map it into a dictionary or a list.

I'm able to parse it with the following code:

import re

# text holds the input string (renamed from str, which shadows the built-in)
lst1 = [x[0] or x[1] for x in re.findall(r'\[(.*?)\]|\((.*?)\)', text)]
print(lst1[0])

The problem is that this code breaks when the string contains nested brackets, as in the example above. It works as expected when the input is simple:

[KEY1=VALUE, KEY2=VALUE, KEY=VALUE, KEY=VALUE_COMPLEX_CHARS KEY=VALUE] [TEXT] [CAN BE TEXT OR KEY=VAL]

The output is a list containing everything between each pair of brackets:

list[0] = [...]
list[1] = [...]
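
To make the failure on nested input concrete, here is a minimal sketch that runs the same pattern on a shortened, made-up input (the string `text` below is an assumption for illustration, not the real data). The non-greedy `.*?` stops at the first `]` it meets, which belongs to the inner `C[X]`, so the first group is cut off in the middle:

```python
import re

# Hypothetical shortened input with one level of nesting.
text = "[A=1, B={C[X]=2}, D=3] [TEXT]"

pattern = r'\[(.*?)\]|\((.*?)\)'
found = [m[0] or m[1] for m in re.findall(pattern, text)]

# The first match ends at the ']' of C[X], not at the outer ']'.
print(found)  # ['A=1, B={C[X', 'TEXT']
```

A single regular expression of this kind has no notion of nesting depth, which is why counting brackets explicitly tends to be the simpler route.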

How can I change the code above so that it parses complex strings with nested brackets?

Thank you very much for your help.

Comments (1)

雨轻弹 2025-02-04 20:12:55

I don't think this is exactly what you're looking for, but this shows how to handle nested structures like this.

s = "[KEY=VALUE, KEY2=VALUE, KEY3=VALUE, KEY4=VALUE_COMPLEX_CHARS, KEY={KEY=VALUE, COLLECTION[KEY][KEY]=VLAUE, COLLECTION[KEY][KEY][KEY]=VALUE}, KEY=VALUE] [TEXT] [CAN BE TEXT OR KEY=VAL]"

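def parse(s):
    # s is a list of characters; it is consumed in place with pop(0).
    accum = []      # parsed items for the current bracket level
    last = ''       # text collected since the last delimiter
    key = ''        # pending key, set when '=' is seen
    nested = 0      # depth of brackets that are part of a key, e.g. COLLECTION[KEY]
    while s:
        c = s.pop(0)
        if c in '[{':
            if last:
                # A bracket right after text belongs to that text.
                last += c
                nested += 1
            else:
                # A bracket at the start of an item opens a nested group.
                accum.append( parse(s) )
        elif c == ' ' and not last:
            # Skip leading spaces before an item.
            continue
        elif c == '=':
            key = last
            last = ''
        elif c == ',':
            if key:
                accum.append( (key, last) )
            else:
                accum.append( last )
            key = ''
            last = ''
        elif c in ']}':
            if nested:
                last += c
                nested -= 1
            elif key:
                accum.append( (key, last) )
            elif last:
                accum.append( last )
            # Note: this return also runs in the nested case above, which is
            # why the output below splits COLLECTION[KEY][KEY] apart.
            return accum
        else:
            last += c
    return accum

# Convert the string to a list of characters so parse() can pop from it.
s = list(s)
accum = []
while s:
    accum.extend(parse(s))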

print(accum)

Output:

[[('KEY', 'VALUE'), ('KEY2', 'VALUE'), ('KEY3', 'VALUE'), ('KEY4', 'VALUE_COMPLEX_CHARS'), [('KEY', 'VALUE')], ['KEY'], 'VLAUE'], ['KEY'], ['KEY'], 'VALUE', '', ('KEY', 'VALUE'), ['TEXT'], [('CAN BE TEXT OR KEY', 'VAL')]]
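
As a possible refinement, here is a sketch (not part of the code above) that keeps keys such as `COLLECTION[KEY][KEY]` intact by tracking bracket depth. The function names `top_level_groups`, `split_top_level` and `parse_group` are made up for this example. It assumes brackets are balanced and that values never contain quoted brackets, and it returns lists of pairs rather than a dict because `KEY` appears more than once in the input:

```python
def top_level_groups(text):
    """Return the contents of each outermost [...] or {...} group."""
    groups, depth, start = [], 0, None
    for i, ch in enumerate(text):
        if ch in "[{":
            if depth == 0:
                start = i + 1          # first character inside the group
            depth += 1
        elif ch in "]}":
            depth -= 1
            if depth == 0 and start is not None:
                groups.append(text[start:i])
                start = None
    return groups


def split_top_level(text, sep=","):
    """Split text on sep, ignoring separators nested inside [] or {}."""
    parts, depth, current = [], 0, []
    for ch in text:
        if ch in "[{":
            depth += 1
        elif ch in "]}":
            depth -= 1
        if ch == sep and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        parts.append(tail)
    return parts


def parse_group(body):
    """Turn 'K=V, K2={...}, TEXT' into (key, value) pairs and bare strings."""
    items = []
    for part in split_top_level(body):
        key, eq, value = part.partition("=")   # split on the first '=' only
        key, value = key.strip(), value.strip()
        if not eq:
            items.append(part)                 # plain text, no key
        elif value.startswith("{") and value.endswith("}"):
            items.append((key, parse_group(value[1:-1])))  # recurse into {...}
        else:
            items.append((key, value))
    return items


# Hypothetical shortened input, used only for illustration.
sample = "[A=1, B={C[X]=2, E=5}, D=3] [TEXT]"
result = [parse_group(g) for g in top_level_groups(sample)]
print(result)
# [[('A', '1'), ('B', [('C[X]', '2'), ('E', '5')]), ('D', '3')], ['TEXT']]
```

If the keys inside one group are known to be unique, `dict(parse_group(...))` would turn that group into a dictionary; with repeated keys, as in the string above, later values would overwrite earlier ones.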