Cleaning up a nested list

Posted 2024-11-25 12:00:37 · 524 words · 1 view · 0 comments

I have a huge mess of a nested list that looks something like this, just longer:

fruit_mess = [['watermelon,0,1.0\n'], ['apple,0,1.0\n'], ['"pineapple",0,1.0\n'], ['"strawberry, banana",0,1.0\n'], ['peach plum pear,0,1.0\n'], ['"orange, grape",0,1.0\n']]

Ultimately I want something that looks like this:

neat_fruit = [['watermelon',0,1.0], ['apple',0,1.0], ['pineapple',0,1.0], ['strawberry, banana',0,1.0], ['peach plum pear',0,1.0], ['orange, grape',0,1.0]]

but I'm not sure how to deal with the double quotes inside the strings, or how to split the fruit names from the numbers, especially since some fruit names contain commas themselves. I've tried a bunch of things, but everything just seems to make it even more of a mess. Any suggestions would be greatly appreciated.

梦断已成空 2024-12-02 12:00:37

Use the csv module (in the standard library) to handle the double-quoted fruits with commas in their names:

import csv
import io

fruit_mess = [['watermelon,0,1.0\n'], ['apple,0,1.0\n'], ['"pineapple",0,1.0\n'], ['"strawberry, banana",0,1.0\n'], ['peach plum pear,0,1.0\n'], ['"orange, grape",0,1.0\n']]

# flatten the list of lists into a string:
data = '\n'.join(item[0].strip() for item in fruit_mess)
# csv.reader needs text input in Python 3, so use io.StringIO rather than io.BytesIO
reader = csv.reader(io.StringIO(data))
neat_fruit = [[fruit, int(num1), float(num2)] for fruit, num1, num2 in reader]

print(neat_fruit)    
# [['watermelon', 0, 1.0], ['apple', 0, 1.0], ['pineapple', 0, 1.0], ['strawberry, banana', 0, 1.0], ['peach plum pear', 0, 1.0], ['orange, grape', 0, 1.0]]
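
As a possible simplification (a sketch, not part of the original answer): csv.reader accepts any iterable that yields lines of text, so the join and the in-memory file object can be skipped by handing it the inner strings directly. The variable names follow the answer above.

```python
import csv

fruit_mess = [['watermelon,0,1.0\n'], ['apple,0,1.0\n'], ['"pineapple",0,1.0\n'],
              ['"strawberry, banana",0,1.0\n'], ['peach plum pear,0,1.0\n'],
              ['"orange, grape",0,1.0\n']]

# Feed each inner string to csv.reader as one line of CSV text.
reader = csv.reader(item[0] for item in fruit_mess)

# Each parsed row is [name, count, weight] as strings; convert the numbers.
neat_fruit = [[fruit, int(num1), float(num2)] for fruit, num1, num2 in reader]

print(neat_fruit)
```

This assumes every inner list holds exactly one string with exactly three CSV fields; a row with a different number of fields would raise a ValueError during unpacking.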

妳是的陽光 2024-12-02 12:00:37

One more simple solution:

fruit_mess = [['watermelon,0,1.0\n'], ['apple,0,1.0\n'], ['"pineapple",0,1.0\n'], ['"strawberry, banana",0,1.0\n'], ['peach plum pear,0,1.0\n'], ['"orange, grape",0,1.0\n']]
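for i, x in enumerate(fruit_mess):
    # split from the right so commas inside the fruit name are left alone
    data = x[0].rstrip('\n').rsplit(',', 2)
    # strip the surrounding double quotes that rsplit leaves on the name
    fruit_mess[i] = [data[0].strip('"'), int(data[1]), float(data[2])]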
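
The right-hand split works because only the fruit name can contain commas; the last two fields are plain numbers. Here is a rough sketch of the same idea as a small helper that builds a new list rather than overwriting fruit_mess. The name parse_line is made up for illustration, and the sketch assumes the two trailing fields never contain commas.

```python
def parse_line(line):
    """Hypothetical helper: parse 'name,int,float' by splitting from the right."""
    # maxsplit=2 peels off only the last two comma-separated fields.
    name, count, weight = line.rstrip('\n').rsplit(',', 2)
    # rsplit does not interpret quoting, so remove surrounding double quotes by hand.
    return [name.strip('"'), int(count), float(weight)]


fruit_mess = [['watermelon,0,1.0\n'], ['"strawberry, banana",0,1.0\n'],
              ['peach plum pear,0,1.0\n']]

neat_fruit = [parse_line(item[0]) for item in fruit_mess]
print(neat_fruit)
```

Unlike the csv module, this does not understand escaped quotes inside a name, so it is only suitable when the data is as regular as in the question.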

老娘不死你永远是小三 2024-12-02 12:00:37

A regex-based solution:

>>> import re
>>> regex = re.compile(r'("[^"]*"|[^,]*),(\d+),([\d.]+)')
>>> neat_fruit = []
>>> for item in fruit_mess:
...     match = regex.match(item[0])
...     result = [match.group(1).strip('"'), int(match.group(2)), float(match.group(3))]
...     neat_fruit.append(result)
...
>>> neat_fruit
[['watermelon', 0, 1.0], ['apple', 0, 1.0], ['pineapple', 0, 1.0], ['strawberry, banana', 0, 1.0], ['peach plum pear', 0, 1.0], ['orange, grape', 0, 1.0]]
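
One thing to watch with the regex approach: regex.match returns None for a line that does not fit the pattern, and calling .group on None raises an AttributeError. A minimal sketch of a guarded version follows. The names LINE_RE and parse_fruit are invented for this example, and raising ValueError is just one possible policy; skipping bad lines would work as well.

```python
import re

# Same pattern as in the answer above: a quoted or unquoted name, an int, a float.
LINE_RE = re.compile(r'("[^"]*"|[^,]*),(\d+),([\d.]+)')


def parse_fruit(line):
    """Hypothetical helper: parse one line, failing loudly on bad input."""
    match = LINE_RE.match(line)
    if match is None:
        raise ValueError(f"unrecognized line: {line!r}")
    name, count, weight = match.groups()
    return [name.strip('"'), int(count), float(weight)]


fruit_mess = [['apple,0,1.0\n'], ['"orange, grape",0,1.0\n']]
neat_fruit = [parse_fruit(item[0]) for item in fruit_mess]
print(neat_fruit)
```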
