How does urllib.urlopen() work?

Published 2024-12-01 21:28:25

Let's consider a big file (~100 MB). Let's assume the file is line-based (a text file with relatively short lines, ~80 characters). If I use the built-in open()/file(), the file will be loaded lazily, i.e. if I do aFile.readline(), only a chunk of the file will reside in memory. Does urllib.urlopen() do something similar (using a cache on disk)?

How big is the performance difference between urllib.urlopen().readline() and file().readline()? Let's assume the file is located on localhost. Suppose I open it once with urllib.urlopen() and once with file(). How big will the difference in performance/memory consumption be when I loop over the file with readline()?

What is the best way to process a file opened via urllib.urlopen()? Is it faster to process it line by line? Or should I load a bunch of lines (~50) into a list and then process the list?



Comments (2)

半枫 2024-12-08 21:28:25

open (or file) and urllib.urlopen look like they're more or less doing the same thing here. urllib.urlopen (basically) creates a socket._socketobject and then invokes its makefile method (the contents of that method are included below):

def makefile(self, mode='r', bufsize=-1):
    """makefile([mode[, bufsize]]) -> file object

    Return a regular file object corresponding to the socket.  The mode
    and bufsize arguments are as for the built-in open() function."""
    return _fileobject(self._sock, mode, bufsize)
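To see that behaviour directly, here is a minimal sketch (using Python 3, where makefile() still exists on socket objects and returns a buffered file-like wrapper, much like the _fileobject above; socket.socketpair() stands in for a real network connection so the example needs no server):

```python
import socket

# A connected pair of sockets stands in for a real network connection.
a, b = socket.socketpair()
b.sendall(b"hello\nworld\n")

# makefile() wraps the socket in a buffered file-like object,
# so the usual lazy readline() protocol works on it.
f = a.makefile("rb")
line1 = f.readline()
line2 = f.readline()
print(line1, line2)

f.close()
a.close()
b.close()
```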

阪姬 2024-12-08 21:28:25

Does urllib.urlopen() do something similar (using a cache on disk)?

The operating system does. When you use a networking API such as urllib, the operating system and the network card do the low-level work of splitting data into small packets that are sent over the network, and of receiving incoming packets. Those are stored in a buffer, so that the application can abstract away the packet concept and pretend it is sending and receiving continuous streams of data.
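The file-like object you get back therefore reads from that buffer lazily. A small sketch (with Python 3's urllib.request, where urllib.urlopen moved, and a file:// URL to a temporary file standing in for a real server):

```python
import os
import tempfile
import urllib.request

# Create a small line-based sample file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    for i in range(1000):
        f.write("line %04d\n" % i)
    path = f.name

# urlopen() returns a file-like object; readline() pulls data
# incrementally rather than loading the whole resource into memory.
resp = urllib.request.urlopen("file://" + path)
first = resp.readline()
second = resp.readline()
resp.close()
os.remove(path)

print(first, second)
```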

How big is the performance difference between urllib.urlopen().readline() and file().readline()?

It is hard to compare the two. For urllib, it depends on the speed of the network as well as the speed of the server. Even for local servers there is some abstraction overhead, so reading from the networking API is usually slower than reading from a file directly.

For an actual performance comparison, you will have to write a test script and measure. However, why bother? You cannot replace one with the other, since they serve different purposes.
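Such a test script could look roughly like this (a sketch under the assumption that a file:// URL to a local file approximates the localhost case; a real comparison would put an HTTP server on localhost, and the file here is scaled down from the 100 MB example):

```python
import os
import tempfile
import time
import urllib.request

# A scaled-down stand-in for the 100 MB file: 100,000 short lines.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    for _ in range(100_000):
        f.write("x" * 78 + "\n")
    path = f.name

def count_lines(fobj):
    """Loop over the object with readline() and count the lines."""
    n = 0
    while fobj.readline():
        n += 1
    return n

start = time.perf_counter()
with open(path, "rb") as local:
    n_file = count_lines(local)
t_file = time.perf_counter() - start

start = time.perf_counter()
with urllib.request.urlopen("file://" + path) as remote:
    n_url = count_lines(remote)
t_url = time.perf_counter() - start

os.remove(path)
print("open():    %d lines in %.3fs" % (n_file, t_file))
print("urlopen(): %d lines in %.3fs" % (n_url, t_url))
```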

What is the best way to process a file opened via urllib.urlopen()? Is it faster to process it line by line? Or should I load a bunch of lines (~50) into a list and then process the list?

Since the bottleneck is the network speed, it might be a good idea to process the data as soon as you get it. That way, the operating system can buffer more incoming data "in the background".

It makes no sense to cache lines in a list before processing them. Your program would just sit there waiting for enough data to arrive, while it could already be doing something useful.
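A sketch of that streaming style (again with Python 3's urllib.request and a file:// URL; summing integers is a hypothetical stand-in for real per-line work):

```python
import os
import tempfile
import urllib.request

# Sample data standing in for the remote resource.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write("10\n20\n30\n")
    path = f.name

total = 0
with urllib.request.urlopen("file://" + path) as resp:
    # readline() returns b"" at EOF; each line is processed as soon
    # as it arrives instead of being collected into a list first.
    for raw in iter(resp.readline, b""):
        total += int(raw)

os.remove(path)
print(total)
```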

