Why can't I call read() twice on an open file?

Posted on 2025-02-01 20:58:53

For an exercise I'm doing, I'm trying to read the contents of a given file twice using the read() method. Strangely, when I call it the second time, it doesn't seem to return the file content as a string?

Here's the code:

import re

f = f.open()

# get the year
match = re.search(r'Popularity in (\d+)', f.read())

if match:
  print(match.group(1))

# get all the names
matches = re.findall(r'<td>(\d+)</td><td>(\w+)</td><td>(\w+)</td>', f.read())

if matches:
  pass  # never reached: matches is always empty

Of course, I know this is not the most efficient or best way; that is not the point here. The point is: why can't I call read() twice? Do I have to reset the file handle, or close and reopen the file, in order to do that?

Comments (7)

夢归不見 2025-02-08 20:58:53

Calling read() reads through the entire file and leaves the read cursor at the end of the file (with nothing more to read). If you are looking to read a certain number of lines at a time you could use readline(), readlines() or iterate through lines with for line in handle:.

To answer your question directly: once a file has been read with read(), you can use seek(0) to return the read cursor to the start of the file (see the Python documentation on file objects). If you know the file isn't going to be too large, you can also save the read() output to a variable and use it in both your search and findall expressions.

P.S. Don't forget to close the file after you are done with it.
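
A minimal sketch of the behaviour described above, assuming a small throwaway text file (the file name and its contents are made up for illustration). It shows that a second read() returns an empty string until seek(0) rewinds the cursor, and that iterating over the handle yields one line at a time:

```python
import os
import tempfile

# Create a small sample file; its name and contents are illustrative only.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w") as handle:
    handle.write("first line\nsecond line\n")

with open(path) as handle:
    first = handle.read()    # reads everything; the cursor is now at the end
    second = handle.read()   # nothing left to read, so this is ''
    handle.seek(0)           # move the cursor back to the start
    third = handle.read()    # the whole content again
    handle.seek(0)
    lines = [line.rstrip("\n") for line in handle]  # iterate line by line

print(repr(second))    # ''
print(first == third)  # True
print(lines)           # ['first line', 'second line']
```

Using a with block also takes care of closing the file once the block ends.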

绅刃 2025-02-08 20:58:53

As other answers suggested, you should use seek().

I'll just write an example:

>>> a = open('file.txt')
>>> a.read()
#output
>>> a.seek(0)
>>> a.read()
#same output

寻找一个思念的角度 2025-02-08 20:58:53

Everyone who has answered this question so far is absolutely right: read() moves through the file, so once you've called it, calling it again returns nothing more (an empty string).

What I'll add is that in your particular case, you don't need to seek back to the start or reopen the file. You can just store the text that you've read in a local variable and use it twice, or as many times as you like, in your program:

import re

f = f.open()
text = f.read()  # read the file into a local variable
# get the year
match = re.search(r'Popularity in (\d+)', text)
if match:
  print(match.group(1))
# get all the names
matches = re.findall(r'<td>(\d+)</td><td>(\w+)</td><td>(\w+)</td>', text)
if matches:
  print(matches)  # matches is no longer always empty

几度春秋 2025-02-08 20:58:53

The read pointer moves to after the last read byte/character. Use the seek() method to rewind the read pointer to the beginning.

二智少女猫性小仙女 2025-02-08 20:58:53

Every open file has an associated position.
When you read() you read from that position.
For example read(10) reads the first 10 bytes from a newly opened file, then another read(10) reads the next 10 bytes.
read() without arguments reads all of the contents of the file, leaving the file position at the end of the file. Next time you call read() there is nothing to read.

You can use seek to move the file position. Or probably better in your case would be to do one read() and keep the result for both searches.
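
To make the idea of a file position concrete, here is a small sketch, assuming a scratch file opened in binary mode so that read(10) counts bytes (in text mode the count is in characters). The file name and contents are made up for illustration, and tell() is used only to report the current position:

```python
import os
import tempfile

# Scratch file with 23 bytes of known content (illustrative only).
path = os.path.join(tempfile.mkdtemp(), "sample.bin")
with open(path, "wb") as handle:
    handle.write(b"0123456789abcdefghijXYZ")

with open(path, "rb") as handle:
    chunk1 = handle.read(10)  # first 10 bytes
    pos1 = handle.tell()      # position is now 10
    chunk2 = handle.read(10)  # next 10 bytes
    rest = handle.read()      # everything that is left
    at_end = handle.read()    # position is at the end: nothing to read

print(chunk1, pos1)  # b'0123456789' 10
print(chunk2)        # b'abcdefghij'
print(rest, at_end)  # b'XYZ' b''
```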

太阳哥哥 2025-02-08 20:58:53

read() consumes. So, you could reset the file, or seek to the start before re-reading. Or, if it suits your task, you can use read(n) to consume only n bytes.

空城仅有旧梦在 2025-02-08 20:58:53

I always find the read method something of a walk down a dark alley. You go down a bit and stop, but if you are not counting your steps you are not sure how far along you are. seek() gives the solution by repositioning; the other option is tell(), which returns the current position in the file. Maybe the Python file API could combine read and seek into a read_from(position, bytes) to make it simpler. Until that happens, you should read this page.
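
The read_from(position, bytes) idea is easy to approximate yourself. The sketch below is a hypothetical helper, not part of the Python standard library. It assumes a seekable binary file object and uses an in-memory io.BytesIO buffer so the example is self-contained:

```python
import io


def read_from(handle, position, size):
    """Hypothetical helper: seek to `position`, then read up to `size` bytes."""
    handle.seek(position)
    return handle.read(size)


# An in-memory binary "file" standing in for a real one (illustrative data).
buffer = io.BytesIO(b"Popularity in 2006")

year = read_from(buffer, 14, 4)   # bytes 14..17
where = buffer.tell()             # position after that read
again = read_from(buffer, 0, 10)  # earlier reads do not matter

print(year, where)  # b'2006' 18
print(again)        # b'Popularity'
```

Because the helper always seeks first, the result does not depend on where previous reads left the position.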
