Python: rewinding one line in a file when iterating with f.next()

Posted on 2024-09-15 14:21:53

Python's f.tell() doesn't behave as I expected when I iterate over a file with f.next():

>>> f=open(".bash_profile", "r")
>>> f.tell()
0
>>> f.next()
"alias rm='rm -i'\n"
>>> f.tell()
397
>>> f.next()
"alias cp='cp -i'\n"
>>> f.tell()
397
>>> f.next()
"alias mv='mv -i'\n"
>>> f.tell()
397

It looks like tell() reports the position of the read-ahead buffer rather than the position just past the line I got from next().

I've previously used the seek/tell trick to rewind one line when iterating over a file with readline(). Is there a way to rewind one line when using next()?
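
For context, the seek/tell trick mentioned above presumably looks something like the sketch below: remember the offset before calling readline(), then seek back to it. The helper name peek_line and the temporary file are assumptions made for illustration, and the code is written for Python 3.

```python
import os
import tempfile


def peek_line(f):
    """Return the next line without consuming it (hypothetical helper)."""
    pos = f.tell()       # offset where the upcoming line starts
    line = f.readline()  # read it...
    f.seek(pos)          # ...and rewind so the next read sees it again
    return line


# Throwaway file standing in for .bash_profile.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("alias rm='rm -i'\nalias cp='cp -i'\n")
    path = tmp.name

with open(path, "r") as f:
    first = f.readline()
    peeked = peek_line(f)  # looks at line 2 without consuming it
    again = f.readline()   # reads line 2 for real

os.remove(path)
```

This only works because readline() leaves tell() usable; the question is whether something equivalent exists for next().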

Comments (3)

因为看清所以看轻 2024-09-22 14:21:53

No. I would write an adapter that forwards most calls but keeps a copy of the last line returned by next, and offers a separate method that makes that line come out again on the following call.

I would actually make the adapter wrap any iterable rather than only files, because it sounds like something that would often be useful in other contexts.

Alex's suggestion of using the itertools.tee adapter also works, but I think writing your own iterator adapter to handle this case in general is cleaner.

Here is an example:

class rewindable_iterator(object):
    # Sentinel meaning "nothing has been read yet".
    not_started = object()

    def __init__(self, iterator):
        self._iter = iter(iterator)
        self._use_save = False         # True when the saved item must be replayed
        self._save = self.not_started  # last item handed out

    def __iter__(self):
        return self

    def next(self):
        if self._use_save:
            # Replay the saved item instead of advancing.
            self._use_save = False
        else:
            self._save = next(self._iter)
        return self._save

    __next__ = next  # Python 3 spelling of the iterator protocol

    def backup(self):
        if self._use_save:
            raise RuntimeError("Tried to back up more than one step.")
        elif self._save is self.not_started:
            raise RuntimeError("Can't back up past the beginning.")
        self._use_save = True


# process_line and DoOver are placeholders for your own logic.
fiter = rewindable_iterator(open('file.txt', 'r'))
for line in fiter:
    result = process_line(line)
    if result is DoOver:
        fiter.backup()

It wouldn't be hard to extend this into something that lets you back up by more than one value.
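
One possible shape for that extension, as a rough Python 3 sketch rather than a definitive implementation: keep a bounded history in a collections.deque and replay from a stack of backed-up items. The class name RewindableIterator and the history parameter are assumptions introduced here.

```python
from collections import deque


class RewindableIterator:
    """Iterator wrapper that can step back up to `history` items."""

    def __init__(self, iterable, history=3):
        self._iter = iter(iterable)
        self._past = deque(maxlen=history)  # items already handed out, oldest first
        self._pending = []                  # backed-up items waiting to be replayed

    def __iter__(self):
        return self

    def __next__(self):
        if self._pending:
            item = self._pending.pop()  # replay before touching the source
        else:
            item = next(self._iter)     # StopIteration propagates naturally
        self._past.append(item)
        return item

    def backup(self, steps=1):
        if steps > len(self._past):
            raise RuntimeError("Can't back up that far.")
        for _ in range(steps):
            # Move the most recent item from the history onto the replay stack.
            self._pending.append(self._past.pop())


it = RewindableIterator("abcde", history=2)
seen = [next(it), next(it), next(it)]  # consume 'a', 'b', 'c'
it.backup(2)                           # step back over 'c' and 'b'
rest = list(it)                        # replays 'b', 'c', then continues
```

The bounded deque keeps memory use fixed no matter how long the underlying file is, at the cost of refusing to back up further than history items.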

迷你仙 2024-09-22 14:21:53

itertools.tee is probably the least-bad approach -- you can't "defeat" the buffering done by iterating on the file (nor would you want to: the performance effects would be terrible), so keeping two iterators, one "one step behind" the other, seems the soundest solution to me.

import itertools as it

with open('a.txt') as f:
  f1, f2 = it.tee(f)
  f2 = it.chain([None], f2)
  for thisline, prevline in it.izip(f1, f2):
    ...
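
The snippet above targets Python 2 (itertools.izip). A Python 3 rendering of the same idea might look like the sketch below, where the built-in zip is already lazy; the helper name with_previous and the io.StringIO stand-in for a real file are assumptions made for illustration.

```python
import io
import itertools as it


def with_previous(iterable):
    """Yield (current, previous) pairs; previous is None for the first item."""
    current, previous = it.tee(iterable)
    previous = it.chain([None], previous)  # shift the second copy one step behind
    return zip(current, previous)


f = io.StringIO("one\ntwo\nthree\n")  # stand-in for an open text file
pairs = list(with_previous(f))
```

Note that this gives access to the previous line next to the current one; it does not make the loop visit a line twice, so it suits "look back" logic better than "redo this line" logic.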

苦行僧 2024-09-22 14:21:53

Python's file iterator does a lot of buffering, thereby advancing the position in the file far ahead of your iteration. If you want to use file.tell() you must do it "the old way":

with open(filename) as fileob:
  line = fileob.readline()
  while line:
    print fileob.tell()
    line = fileob.readline()
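
If you still want a for loop, one option is the two-argument form of iter(), which keeps calling readline() until it returns the sentinel. The sketch below is written for Python 3 and opens the file in binary mode, an assumption made here so that the offsets are plain byte counts; the temporary file is only for demonstration.

```python
import os
import tempfile

with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
    tmp.write(b"first\nsecond\nthird\n")
    path = tmp.name

offsets = []
with open(path, "rb") as fileob:
    # iter(callable, sentinel) calls readline() until it returns b"".
    for line in iter(fileob.readline, b""):
        offsets.append(fileob.tell())  # offset just past the line we read

    fileob.seek(offsets[0])            # rewind to the start of the second line
    reread = fileob.readline()

os.remove(path)
```

Because every line goes through readline(), tell() stays meaningful, and a recorded offset can be passed to seek() to rewind.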