Reading line by line and saving rows with a matching column to a new document (Python, beginner)
I should say up front that I am a beginner at coding. This may be easy for someone more experienced, but I couldn't find a question similar to mine, so I am posting it here.
I have a huge document (~40 GB) containing a table. Its columns are separated by runs of blank spaces. I also have a list in another, very small document, with names that also appear in one column of the huge document. Whenever a line's value in that column matches one of the names on my list, I want to append that line to another list, so that I can work with only those lines:
So I wrote the following code in Spyder:
    import pandas as pd

    station_list = pd.read_fwf(r"D:\Não apagar arquivos importantes\Desktop\Para dados grandes\Lisn.txt", header=None, skiprows=1)
    saved = []
    with open(r"D:\Não apagar arquivos importantes\Downloads\Out 2016\los_20161015.001.h5.txt") as f:
        for line in f:
            lst = line.split()
            sline = " ".join(lst)
            tag = lst[12]
            if tag in station_list:
                saved.append(sline)
When I run it, it reads through the file line by line very quickly. But even though most of the names in station_list appear in that column of f, it does not save the slines to my new saved list.
What could be the cause? Does anybody have a suggestion? Is there a more efficient way to work with big files?
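One likely cause, sketched here on a toy frame rather than the real Lisn.txt: the `in` operator on a pandas DataFrame tests the column labels, not the cell values. With `header = None` those labels are the integers 0, 1, ..., so a string tag never matches. The station names below are invented for illustration, and the sketch assumes the names sit in the first column (label 0).

```python
import pandas as pd

# Toy stand-in for the station file; these names are invented for illustration.
station_list = pd.DataFrame([["abcd"], ["efgh"], ["ijkl"]])

# `in` on a DataFrame looks at the column labels (here only the integer 0).
label_match = "abcd" in station_list
column_match = 0 in station_list

# Assumption: the names are in column 0. Collect them into a set of strings.
station_names = set(station_list[0].astype(str).str.strip())
value_match = "abcd" in station_names

print(label_match, column_match, value_match)  # False True True
```

If this is what is happening, testing `tag in station_names` instead of `tag in station_list` should start matching, provided the names in both files are spelled and cased identically.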
[Image: the variables I get]
Thank you
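For a 40 GB input it may also help to avoid keeping every match in a Python list and to write the matches to disk as they are found. The following is a minimal sketch using only the standard library, not a tested solution for the actual files. `load_station_names` and `filter_by_station` are hypothetical helper names. It assumes the station name is the first whitespace-separated token on each line of the small file, that the first line of that file is a header (mirroring `skiprows = 1`), and that the tag is field 12 as in the code above.

```python
def load_station_names(path, skip_rows=1):
    """Read the small station file into a set of names.

    Assumption: the name is the first whitespace-separated token per line.
    """
    names = set()
    with open(path) as f:
        for i, line in enumerate(f):
            if i < skip_rows:
                continue  # skip header rows
            fields = line.split()
            if fields:
                names.add(fields[0])
    return names


def filter_by_station(data_path, out_path, names, column=12):
    """Stream data_path and write lines whose `column` field is in `names`.

    Returns the number of lines written. Only one line is held in memory
    at a time, so the size of the input file does not matter.
    """
    kept = 0
    with open(data_path) as src, open(out_path, "w") as dst:
        for line in src:
            fields = line.split()
            # Skip blank or short lines instead of raising IndexError.
            if len(fields) > column and fields[column] in names:
                dst.write(" ".join(fields) + "\n")
                kept += 1
    return kept
```

It could be called as `filter_by_station(big_file, "matches.txt", load_station_names(list_file))`, where `big_file`, `list_file`, and `matches.txt` are placeholders for the real paths. Membership tests against a set take roughly constant time per line, so the run time should be dominated by reading the file.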