How do I parse a comma-separated string into a list (with a caveat)?

Posted on 2024-07-05 16:59:28

I need to be able to turn a string like:

'''foo, bar, "one, two", three four'''

into:

['foo', 'bar', 'one, two', 'three four']

I have a feeling (with hints from #python) that the solution is going to involve the shlex module.

Comments (6)

梦行七里 2024-07-12 16:59:28

It depends on how complicated you want to get. Do you want to allow more than one type of quoting? What about escaped quotes?

Your syntax looks very much like the common CSV file format, which is supported by the Python standard library:

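import csv

# csv.reader takes an iterable of lines, so wrap the single string in a list.
# skipinitialspace=True drops the space after each comma, which also lets
# the quoted field be recognized as quoted.
reader = csv.reader(['''foo, bar, "one, two", three four'''], skipinitialspace=True)
for r in reader:
    print(r)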

Outputs:

['foo', 'bar', 'one, two', 'three four']
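
On the escaped-quotes question, here is a minimal sketch, assuming Python 3 and the csv module's default dialect, in which a doubled quote inside a quoted field stands for one literal quote. The helper name `parse_line` is made up for illustration and is not part of the csv module.

```python
import csv

def parse_line(line):
    """Split one comma-separated line into a list of fields (illustrative helper)."""
    # csv.reader expects an iterable of lines, so wrap the string in a list
    # and take the first (only) row it produces.
    reader = csv.reader([line], skipinitialspace=True)
    return next(reader)

# The string from the question.
print(parse_line('foo, bar, "one, two", three four'))

# A doubled quote ("") inside a quoted field is read as one literal quote.
print(parse_line('a, "say ""hi"", please", b'))
```

If your data escapes quotes with a backslash instead, csv.reader also accepts `escapechar` and `doublequote` formatting parameters, though I have not tried those on this input.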

HTH!

邮友 2024-07-12 16:59:28

A shlex-based solution handles escaped quotes, one kind of quote protecting another, and all the other fancy things shell syntax supports.

>>> import shlex
>>> my_splitter = shlex.shlex('''foo, bar, "one, two", three four''', posix=True)
>>> my_splitter.whitespace += ','  # treat commas as separators as well
>>> my_splitter.whitespace_split = True
>>> print(list(my_splitter))
['foo', 'bar', 'one, two', 'three', 'four']

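Example with one kind of quote protecting the other (here whitespace is set to ',' alone, so spaces stay inside the tokens):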

>>> my_splitter = shlex.shlex('''"test, a",'foo,bar",baz',bar ä baz''',
...                           posix=True)
>>> my_splitter.whitespace = ','; my_splitter.whitespace_split = True
>>> print(list(my_splitter))
['test, a', 'foo,bar",baz', 'bar ä baz']
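
To get exactly the list asked for in the question, the two sessions can be combined. This is only a sketch, assuming Python 3: commas are the sole separator, so 'three four' stays in one piece, and each token is stripped afterwards. The function name `split_shell_style` is hypothetical. Note that stripping also trims spaces that were deliberately quoted at the edges of a field.

```python
import shlex

def split_shell_style(line):
    """Split on commas using shell quoting rules (illustrative helper)."""
    lexer = shlex.shlex(line, posix=True)
    lexer.whitespace = ','         # commas are the only separators
    lexer.whitespace_split = True  # split on separators only
    lexer.commenters = ''          # otherwise '#' would start a comment
    # Spaces next to the commas end up inside the tokens, so trim them.
    return [token.strip() for token in lexer]

print(split_shell_style('foo, bar, "one, two", three four'))

# A backslash-escaped quote inside double quotes is kept as a literal quote.
print(split_shell_style('a, "say \\"hi\\"", b'))
```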

风追烟花雨 2024-07-12 16:59:28

You may also want to consider the csv module. I haven't tried it, but it looks like your input data is closer to CSV than to shell syntax (which is what shlex parses).
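
A quick way to see that difference is to run both parsers with their stock settings on the string from the question. This is a small sketch assuming Python 3; the variable names are only for illustration.

```python
import csv
import shlex

line = 'foo, bar, "one, two", three four'

# CSV rules: commas separate fields, double quotes protect embedded commas.
csv_fields = next(csv.reader([line], skipinitialspace=True))

# Shell rules: whitespace separates words, so the commas stay attached
# to the words and 'three four' becomes two words.
shell_words = shlex.split(line)

print(csv_fields)   # ['foo', 'bar', 'one, two', 'three four']
print(shell_words)  # ['foo,', 'bar,', 'one, two,', 'three', 'four']
```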

做个ˇ局外人 2024-07-12 16:59:28

You could do something like this:

>>> import re
>>> pattern = re.compile(r'\s*("[^"]*"|.*?)\s*,')
>>> def split(line):
...  return [x[1:-1] if x[:1] == x[-1:] == '"' else x
...          for x in pattern.findall(line.rstrip(',') + ',')]
... 
>>> split("foo, bar, baz")
['foo', 'bar', 'baz']
>>> split('foo, bar, baz, "blub blah"')
['foo', 'bar', 'baz', 'blub blah']

终难愈 2024-07-12 16:59:28

If it doesn't need to be pretty, this might get you on your way:

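def f(s, splitifeven):
    # Odd-indexed pieces sat between double quotes: keep them whole.
    if splitifeven & 1:
        return [s]
    # Even-indexed pieces are unquoted: split on commas and drop empties.
    return [x.strip() for x in s.split(",") if x.strip() != '']

ss = 'foo, bar, "one, two", three four'

# Splitting on '"' alternates unquoted and quoted pieces; sum(..., []) flattens.
print(sum([f(s, sie) for sie, s in enumerate(ss.split('"'))], []))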

情话墙 2024-07-12 16:59:28

I'd say a regular expression is what you're looking for here, though I'm not terribly familiar with Python's regex engine.

Assuming you use lazy matches, you can collect the matches on the string and put them into your list.
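
For what it's worth, the lazy-match idea could look roughly like this in Python 3. It is only a sketch: the pattern and the names `FIELD` and `split_fields` are my own assumptions, it drops empty unquoted fields, and it knows nothing about escaped quotes.

```python
import re

# Alternative 1: a double-quoted run, matched lazily so it stops at the
# first closing quote. Alternative 2: a run with no commas or quotes.
FIELD = re.compile(r'"(.*?)"|([^,"]+)')

def split_fields(line):
    """Collect quoted and bare fields from one line (illustrative helper)."""
    fields = []
    for match in FIELD.finditer(line):
        quoted, bare = match.groups()
        if quoted is not None:
            fields.append(quoted)        # keep quoted text verbatim
        elif bare.strip():
            fields.append(bare.strip())  # trim spaces around bare fields
    return fields

print(split_fields('foo, bar, "one, two", three four'))
```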
