Download a file into memory
I am writing a Python script and I only need the second line of each of a series of very small text files. I would like to extract that line without saving each file to my hard drive, as I currently do.
I have found a few threads that reference the tempfile and StringIO modules, but I was unable to make much sense of them.
Currently I download all of the files and name them sequentially (1.txt, 2.txt, etc.), then go through all of them and extract the second line. I would like to open a file, grab the line, then move on to finding, opening and reading the next file.
Here is what I currently do, writing the files to my HDD first:
while (count4 <= num_files):
    file_p = [directory,str(count4),'.txt']
    file_path = ''.join(file_p)
    cand_summary = string.strip(linecache.getline(file_path, 2))
    linkFile = open('Summary.txt', 'a')
    linkFile.write(cand_summary)
    linkFile.write("\n")
    count4 = count4 + 1
    linkFile.close()
Comments (2)
Just replace the file writing with a call to append() on a list.
As an aside, you would normally write count += 1. Also, it looks like count4 uses 1-based indexing, which is pretty unusual for Python.
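The answer's original example did not survive on this page, so here is a minimal sketch of the append() idea rather than the answerer's own code. The function name collect_second_lines and the sample files are hypothetical, and it assumes the files are named 1.txt, 2.txt, ... inside directory as in the question. It uses os.path.join instead of the question's ''.join, and Python 3's str.strip() instead of string.strip().

```python
import linecache
import os
import tempfile


def collect_second_lines(directory, num_files):
    """Return the second line of 1.txt .. <num_files>.txt as a list.

    Hypothetical helper name; the file naming follows the question.
    """
    summaries = []
    count = 1  # the question numbers its files starting at 1
    while count <= num_files:
        file_path = os.path.join(directory, str(count) + ".txt")
        # linecache.getline returns '' when the line does not exist
        summaries.append(linecache.getline(file_path, 2).strip())
        count += 1
    return summaries


if __name__ == "__main__":
    # Made-up sample files so the sketch can run on its own.
    with tempfile.TemporaryDirectory() as directory:
        for i in (1, 2):
            with open(os.path.join(directory, f"{i}.txt"), "w") as f:
                f.write(f"header {i}\nsummary {i}\nrest\n")
        print(collect_second_lines(directory, 2))  # ['summary 1', 'summary 2']
```

The list can then be used directly, or written out in one go at the end if a file is still wanted.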
You open and close the output file in every iteration.
Why not simply open it once before the loop and close it once after the loop?
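As a rough sketch of opening the output file only once (not the answerer's original code, which is missing here): the name write_summary and the sample data are hypothetical, and the file naming is taken from the question. Note that this sketch opens the output in "w" mode, which overwrites, whereas the question used "a" (append).

```python
import linecache
import os
import tempfile


def write_summary(directory, num_files, out_path):
    """Write the second line of each numbered file to out_path.

    Hypothetical helper; files are assumed to be 1.txt .. <num_files>.txt.
    """
    # Open the output once; the with-block closes it after the loop ends.
    with open(out_path, "w") as link_file:
        count = 1
        while count <= num_files:
            file_path = os.path.join(directory, f"{count}.txt")
            cand_summary = linecache.getline(file_path, 2).strip()
            link_file.write(cand_summary + "\n")
            count += 1


if __name__ == "__main__":
    # Made-up sample files so the sketch can run on its own.
    with tempfile.TemporaryDirectory() as directory:
        for i in (1, 2):
            with open(os.path.join(directory, f"{i}.txt"), "w") as f:
                f.write(f"header {i}\nsummary {i}\n")
        out_path = os.path.join(directory, "Summary.txt")
        write_summary(directory, 2, out_path)
        with open(out_path) as f:
            print(f.read(), end="")
```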
Also, linecache is probably not the right tool here, since it's optimized for reading multiple lines from the same file, not the same line from multiple files.
Instead, it is better to open each file yourself and read only its first two lines.
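One way to read the second line without linecache might look like the sketch below. The helper name read_second_line and the sample file are hypothetical; the idea is simply to call readline() twice and keep the second result.

```python
import os
import tempfile


def read_second_line(file_path):
    """Return the second line of file_path, stripped of surrounding whitespace.

    Hypothetical helper; returns '' if the file has fewer than two lines.
    """
    with open(file_path, "r") as infile:
        infile.readline()                 # skip the first line
        return infile.readline().strip()  # keep the second line


if __name__ == "__main__":
    # Made-up sample file so the sketch can run on its own.
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "1.txt")
        with open(path, "w") as f:
            f.write("header\nsummary line\nmore text\n")
        print(read_second_line(path))  # summary line
```

Since only two lines are read, the rest of each file is never processed, which suits the "very small files, one line each" case in the question.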
Also, if you drop the strip() call, you don't have to re-add the \n, but who knows why you have that in there. Perhaps .lstrip() would be better?
Finally, what's with the manual while loop? Why not use a for loop?
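To make the two points above concrete, here is a small illustration with made-up data: strip() removes the trailing newline, lstrip() keeps it, and a for loop over range() replaces the manual counter. The sample string and num_files value are assumptions for the demo only.

```python
raw = "  second line \n"  # made-up line, as it would come back from readline()

stripped = raw.strip()        # leading and trailing whitespace removed
left_stripped = raw.lstrip()  # only leading whitespace removed, '\n' kept

print(repr(stripped))       # 'second line'
print(repr(left_stripped))  # 'second line \n'

num_files = 3  # hypothetical count
# range(1, num_files + 1) yields 1, 2, 3, matching files named 1.txt, 2.txt, 3.txt
file_names = [f"{count}.txt" for count in range(1, num_files + 1)]
for name in file_names:
    print(name)
```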
Lastly, after your comment, I understand you want to put the result in a list instead of a file. OK.
All in all: use a for loop, read the second line of each file directly, and append it to a list.
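Putting the suggestions together, a sketch could look like this. It is a reconstruction under assumptions, not the answerer's original code: the name collect_summaries and the sample files are hypothetical, and the files are assumed to be numbered from 1 as in the question.

```python
import os
import tempfile


def collect_summaries(directory, num_files):
    """Collect the second line of 1.txt .. <num_files>.txt into a list.

    Hypothetical helper combining the suggestions from this answer.
    """
    summary = []
    for count in range(1, num_files + 1):  # files are numbered from 1
        file_path = os.path.join(directory, f"{count}.txt")
        with open(file_path, "r") as infile:
            infile.readline()                          # skip line 1
            summary.append(infile.readline().strip())  # keep line 2
    return summary


if __name__ == "__main__":
    # Made-up sample files so the sketch can run on its own.
    with tempfile.TemporaryDirectory() as directory:
        for i in (1, 2, 3):
            with open(os.path.join(directory, f"{i}.txt"), "w") as f:
                f.write(f"header {i}\nsummary {i}\nrest\n")
        print(collect_summaries(directory, 3))
```

If the files were numbered from 0 instead, range(num_files) would be the more usual form.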