Reading a text file of multiple lists, with spaces around the commas between list elements, into a pandas DataFrame
I have a text file called tropical.txt that contains multiple lists, one per line. Notice that each comma is surrounded by spaces.
space here and space here
         | |
['papaya' , 'mangosteen' , 'banana']
[]
['coconut' , 'mango']
['mangosteen' , 'papaya']
I tried the following code
import pandas as pd
df = pd.read_csv('tropical.txt', sep='\n', header=None, engine = 'python')
df
which gives me
ValueError: Specified \n as separator or delimiter. This forces the python engine which does not accept a line terminator. Hence it is not allowed to use the line terminator as separator.
If I were to just do
import pandas as pd
df = pd.read_csv('tropical.txt', header= None, engine = 'python')
df
The output isn't what I wanted
               0              1           2
0      ['papaya'   'mangosteen'    'banana']
1             []           None        None
2     ['coconut'        'mango']       None
3  ['mangosteen'       'papaya']       None
I am expecting
                            0
0  [papaya,mangosteen,banana]
1                          []
2             [coconut,mango]
3         [mangosteen,papaya]
Any suggestions?
Comments (1)
You can use read_csv by specifying a separator that will not occur in the lines (e.g. \0), so that each line is read as a whole, and ast.literal_eval as a converter for the values.