Reading a text file between user-given start and end positions in Python
I have a huge text file from which I want to selectively read a few lines.
Using tell(), I know the positions I want to read between.
Is there a way I can read all the text in the file between the two positions?
Something like file.read(beginPos, endPos)?
Or maybe read all the text between the line containing beginPos and the line containing endPos?
Comments (3)
If you know the start point (from tell()) and the end point, you can simply call file.read(end - start), which reads end - start bytes. If you are not already at the correct offset, call file.seek(start) first.

You will want to open the file, then call fileobj.seek(beginPos), and then fileobj.read(endPos - beginPos).
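
The seek-then-read approach described above could be sketched roughly as follows. The helper name read_between and the demo file are made up for illustration. The sketch assumes the positions are byte offsets and opens the file in binary mode; in Python 3 text mode, tell() returns an opaque number and read(n) counts characters rather than bytes, so end - start is only a reliable length for binary files.

```python
import os
import tempfile


def read_between(path, begin_pos, end_pos):
    """Return the bytes of `path` from begin_pos up to (not including) end_pos.

    Hypothetical helper; assumes both positions are byte offsets.
    """
    # Binary mode, so the seek offset and the read size are plain byte counts.
    with open(path, "rb") as f:
        f.seek(begin_pos)
        return f.read(end_pos - begin_pos)


# Small demonstration on a throwaway file.
with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
    tmp.write(b"line one\nline two\nline three\n")
    demo_path = tmp.name

chunk = read_between(demo_path, 9, 17)
print(chunk.decode("utf-8"))  # line two
os.remove(demo_path)
```

Only the requested range is read, so the size of the file does not matter. Decode the result yourself if you need a str.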
Have you looked at using memory mapping? (http://docs.python.org/library/mmap.html)
Once you have a memory map of the file, you can slice it like you would a string (or list) without having to read the entire file into memory.
It might be unnecessary complexity if you're only going to read a single section of the file once, but if you're going to do a lot of IO, it can make it much easier to manage.
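
As a rough illustration of the memory-mapping idea, here is a minimal sketch. It is not taken from the Python documentation; the helper name mmap_slice and the demo file are hypothetical. It assumes a non-empty file and byte offsets, and uses a read-only mapping.

```python
import mmap
import os
import tempfile


def mmap_slice(path, begin_pos, end_pos):
    """Return bytes [begin_pos, end_pos) of `path` via a read-only memory map.

    Hypothetical helper; assumes the file is not empty.
    """
    with open(path, "rb") as f:
        # Length 0 maps the whole file; mapping an empty file raises an error.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Slicing copies only the requested range into a bytes object.
            return mm[begin_pos:end_pos]


# Small demonstration on a throwaway file.
with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
    tmp.write(b"alpha\nbeta\ngamma\n")
    demo_path = tmp.name

section = mmap_slice(demo_path, 6, 10)
print(section)  # b'beta'
os.remove(demo_path)
```

If you need many slices from the same file, it would make more sense to keep one mapping open and slice it repeatedly than to remap on every call as this helper does.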
From the Python docs: