How do I read a large log/txt file (several GB), taking the first N lines into memory, then the next N lines?
I have tried this program, which reads my file in chunks of characters, and that is the behaviour I want:
def read_in_chunks(file_object, chunk_size=1024):
    """Lazy function (generator) to read a file piece by piece.
    Default chunk size: 1k."""
    while True:
        data = file_object.read(chunk_size)
        if not data:
            break
        yield data

with open('really_big_file.dat') as f:
    for piece in read_in_chunks(f):
        print(piece)
But when I try to apply the same method using readlines(), it doesn't work for me. Here is the code I am trying:
def read_in_chunks(file_object, chunk_size=5):
    """Lazy function (generator) to read a file piece by piece.
    Default chunk size: 1k."""
    while True:
        data = file_object.readlines()[0:chunk_size]
        if not data:
            break
        yield data

with open('Traefik.log') as f:
    for piece in read_in_chunks(f):
        print(piece)
Can somebody help me achieve the same chunked behaviour for N lines at a time?
Comments (1)
By default, .readlines() reads the whole content of the stream into a list. But you can give it a byte size to produce lines in chunks:
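The io docs describe readlines(hint=-1) roughly like this: "hint can be specified to control the number of lines read: no more lines will be read if the total size (in bytes/characters) of all lines so far exceeds hint."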
So, you could adjust your function to something like:
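Here is a sketch of that adjustment; note that chunk_size now acts as a byte hint rather than a line count:

def read_in_chunks(file_object, chunk_size=1024):
    """Lazy function (generator) yielding batches of whole lines,
    reading roughly chunk_size bytes per batch via the readlines() hint."""
    while True:
        # Stops at a line boundary once the batch exceeds ~chunk_size bytes
        data = file_object.readlines(chunk_size)
        if not data:
            break
        yield data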
But that doesn't guarantee a fixed number of lines per chunk. If you look a bit further in the docs you'll find the following advice:
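"Note that it's already possible to iterate on file objects using for line in file: ... without calling file.readlines()."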
That's a hint that something like the following might be better suited.
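A sketch using itertools.islice to pull exactly chunk_size lines from the file iterator per batch (function and file names are reused from the question):

from itertools import islice

def read_in_chunks(file_object, chunk_size=5):
    """Lazy function (generator) yielding lists of chunk_size lines;
    the final chunk may be shorter."""
    while True:
        # Take the next chunk_size lines from the file iterator
        data = list(islice(file_object, chunk_size))
        if not data:  # empty list means the file is exhausted
            break
        yield data

with open('Traefik.log') as f:
    for piece in read_in_chunks(f):
        print(piece)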