Parsing data from a non-CSV file into multiple lists in Python
I have a text file containing a very large amount of data: ratings given by users to specific movies. The structure of my file (.txt) is as follows:
1:
1711859,4,2005-05-08
1245640,3,2005-12-19
2:
808731,4,2005-10-31
337541,5,2005-03-23
1 and 2 are movie IDs, each followed by a colon. The lines under each movie ID give the user ID, then the rating that user gave the movie, then the date of the rating.
Since this is clearly not a CSV file, can someone please guide me on how to write a parser that reads this file and creates two lists: one for the movie IDs and one containing the ratings?
Comments (1)
Right, it is not a CSV file, but it can be converted into one: read `file.txt` line by line and write `file.csv`, which can then be fed into a CSV parser. Explanation: if the current line does not contain `,`, take what is before `:` as the movie ID; otherwise print the movie ID followed by the line, separated by `,`. Note that `end` is set to an empty string for `print`, because `line` already has its own newline. Disclaimer: I assume your file is UTF-8 encoded.
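The approach described above can be sketched roughly as follows. This is a minimal illustration, not the answerer's original code: the function names `convert_to_csv` and `load_lists` are hypothetical, and it assumes that "two lists" means two parallel lists with one entry per rating line, that header lines such as `1:` never contain a comma, and that the file is UTF-8 encoded.

```python
import csv


def convert_to_csv(src_path, dst_path):
    """Write one 'movie_id,user_id,rating,date' row per rating line."""
    movie_id = None
    with open(src_path, encoding="utf-8") as fin, \
            open(dst_path, "w", encoding="utf-8") as fout:
        for line in fin:
            if not line.strip():
                continue  # skip blank lines
            if "," not in line:
                # Header line such as "1:"; keep what precedes the colon.
                movie_id = line.split(":")[0].strip()
            else:
                # end="" because line already carries its own newline.
                print(movie_id, line, sep=",", end="", file=fout)


def load_lists(csv_path):
    """Return two parallel lists: movie IDs and ratings (assumed layout)."""
    movie_ids, ratings = [], []
    with open(csv_path, encoding="utf-8", newline="") as f:
        for movie_id, _user_id, rating, _date in csv.reader(f):
            # int() tolerates surrounding whitespace in a field.
            movie_ids.append(int(movie_id))
            ratings.append(int(rating))
    return movie_ids, ratings
```

Calling `convert_to_csv("file.txt", "file.csv")` and then `load_lists("file.csv")` on the six-line sample from the question would give the movie IDs `[1, 1, 2, 2]` and the ratings `[4, 3, 4, 5]`. For a very large file, the intermediate CSV could be skipped by appending to the lists directly inside the first loop.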