Reading a text file with multiple lists, with spaces and commas between the list elements, into a pandas DataFrame

Posted 2025-02-11 08:03:10

I have a text file called tropical.txt that contains multiple lists, one per line. Notice that each comma is surrounded by spaces.

 space here and space here
         | |
['papaya' , 'mangosteen' , 'banana']
[]
['coconut' , 'mango']
['mangosteen' , 'papaya']

I tried the following code

import pandas as pd

df = pd.read_csv('tropical.txt', sep='\n', header=None, engine = 'python')
df

which gives me

ValueError: Specified \n as separator or delimiter. This forces the python engine which does not accept a line terminator. Hence it is not allowed to use the line terminator as separator.

If I were to just do

import pandas as pd

df = pd.read_csv('tropical.txt', header= None, engine = 'python')
df

The output isn't what I wanted

         0           1             2
0   ['papaya'   'mangosteen'    'banana']
1   []               None        None
2   ['coconut'      'mango']     None
3   ['mangosteen'   'papaya']    None

I am expecting


                        0   
0   [papaya,mangosteen,banana]
1   []  
2   [coconut,mango] 
3   [mangosteen,papaya]

Any suggestions?

归属感 2025-02-18 08:03:10

You can use read_csv with a separator that will not occur in the lines (e.g. \0), so that each line is read as a whole into a single column, and pass ast.literal_eval as a converter to turn each line into a list:

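import ast
import pandas as pd

# sep='\0' never appears in the data, so each line lands in one column;
# the converter parses that text into a Python list.
pd.read_csv('tropical.txt', header=None, sep='\0', names=['fruits'], converters={'fruits': ast.literal_eval})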

Output:

                         fruits
0  [papaya, mangosteen, banana]
1                            []
2              [coconut, mango]
3          [mangosteen, papaya]
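
An alternative that does not depend on choosing a separator is to read the lines yourself and parse each one with ast.literal_eval. The following is only a minimal sketch: the helper name parse_list_lines is made up for illustration, and an in-memory io.StringIO stands in for the open tropical.txt file, using the sample lines from the question.

```python
import ast
import io


def parse_list_lines(lines):
    """Parse each non-blank line as a Python list literal (hypothetical helper)."""
    rows = []
    for line in lines:
        text = line.strip()
        if not text:
            continue  # ignore blank lines
        value = ast.literal_eval(text)  # safe parsing of literals only
        if not isinstance(value, list):
            raise ValueError(f"expected a list literal, got: {text!r}")
        rows.append(value)
    return rows


# Stand-in for open('tropical.txt'); content copied from the question.
sample = io.StringIO(
    "['papaya' , 'mangosteen' , 'banana']\n"
    "[]\n"
    "['coconut' , 'mango']\n"
    "['mangosteen' , 'papaya']\n"
)
rows = parse_list_lines(sample)
print(rows)
```

The resulting list of lists can then be wrapped in a single column, for example with pd.DataFrame({'fruits': rows}), which should give the same shape as the output above. The spaces around the commas are harmless here because ast.literal_eval ignores whitespace between list elements.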
