Reading a text file of multiple lists, with spaces around the commas between elements, into a pandas DataFrame

Posted 2025-02-11 08:03:10

I have a text file called tropical.txt that contains multiple lists, one per line. Notice that each comma is surrounded by spaces.

 space here and space here
         | |
['papaya' , 'mangosteen' , 'banana']
[]
['coconut' , 'mango']
['mangosteen' , 'papaya']

I tried the following code:

import pandas as pd

df = pd.read_csv('tropical.txt', sep='\n', header=None, engine = 'python')
df

which gives me:

ValueError: Specified \n as separator or delimiter. This forces the python engine which does not accept a line terminator. Hence it is not allowed to use the line terminator as separator.

If I drop the sep argument and just do

import pandas as pd

df = pd.read_csv('tropical.txt', header= None, engine = 'python')
df

the output isn't what I want, because each line is split on its commas into separate columns:

         0           1             2
0   ['papaya'   'mangosteen'    'banana']
1   []               None        None
2   ['coconut'      'mango']     None
3   ['mangosteen'   'papaya']    None


I am expecting a single column with one list per row:


                        0   
0   [papaya,mangosteen,banana]
1   []  
2   [coconut,mango] 
3   [mangosteen,papaya]


Any suggestions?

Comments (1)

归属感 2025-02-18 08:03:10

You can use read_csv by specifying a separator that will not occur in any line (e.g. \0), so that each line is read as a whole into a single column, and passing ast.literal_eval as a converter to turn each line's text into a Python list:

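import ast
import pandas as pd

# '\0' never appears in the file, so every line stays in one column;
# ast.literal_eval then parses that column's text into a list.
pd.read_csv('tropical.txt', header=None, sep='\0', names=['fruits'], converters={'fruits': ast.literal_eval})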

Output:

                         fruits
0  [papaya, mangosteen, banana]
1                            []
2              [coconut, mango]
3          [mangosteen, papaya]
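
If the '\0' separator trick does not behave the same way in your pandas version, the same idea can be sketched without read_csv at all: read the file line by line, parse each line with ast.literal_eval, and build the DataFrame from the result. This is only a minimal sketch, not part of the answer above. The helper name read_list_lines is hypothetical, and an in-memory io.StringIO stands in for open('tropical.txt') so the example runs on its own.

```python
import ast
import io

import pandas as pd


def read_list_lines(handle):
    """Parse each non-blank line of `handle` as a Python list literal.

    Hypothetical helper for illustration; it is not a pandas function.
    """
    rows = []
    for line in handle:
        line = line.strip()
        if line:  # skip blank lines instead of failing on them
            rows.append(ast.literal_eval(line))
    return rows


# Stand-in for open('tropical.txt'); the contents mirror the question's file.
sample = io.StringIO(
    "['papaya' , 'mangosteen' , 'banana']\n"
    "[]\n"
    "['coconut' , 'mango']\n"
    "['mangosteen' , 'papaya']\n"
)

# One column named 'fruits' holding one Python list per row.
df = pd.DataFrame({'fruits': read_list_lines(sample)})
print(df)
```

With the real file, `with open('tropical.txt') as f: rows = read_list_lines(f)` would replace the StringIO stand-in. The spaces around the commas need no special handling, because ast.literal_eval ignores whitespace between list elements.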