What are the pros and cons of fs.createReadStream vs fs.readFile in Node.js?

Posted 2024-10-10 08:22:58

I'm mucking about with node.js and have discovered two ways of reading a file and sending it down the wire, once I've established that it exists and have sent the proper MIME type with writeHead:

// read the entire file into memory and then spit it out

fs.readFile(filename, function(err, data){
  if (err) throw err;
  response.write(data, 'utf8');
  response.end();
});

// read and pass the file as a stream of chunks

fs.createReadStream(filename, {
  'flags': 'r',
  'encoding': 'binary',
  'mode': 0666,
  'bufferSize': 4 * 1024
}).addListener( "data", function(chunk) {
  response.write(chunk, 'binary');
}).addListener( "close",function() {
  response.end();
});

Am I correct in assuming that fs.createReadStream might provide a better user experience if the file in question was something large, like a video? It feels like it might be less block-ish; is this true? Are there other pros, cons, caveats, or gotchas I need to know?

Comments (4)

少女情怀诗 2024-10-17 08:22:58

A better approach, if you are just going to hook up "data" to "write()" and "close" to "end()":

// 0.3.x style
fs.createReadStream(filename, {
  'bufferSize': 4 * 1024
}).pipe(response)

// 0.2.x style
sys.pump(fs.createReadStream(filename, {
  'bufferSize': 4 * 1024
}), response)

The read.pipe(write) or sys.pump(read, write) approach has the benefit of also adding flow control. So, if the write stream cannot accept data as quickly, it'll tell the read stream to back off, so as to minimize the amount of data getting buffered in memory.
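On newer Node versions the same flow-controlled copy can also be written with stream.pipeline, which additionally tears down both streams if either one fails (plain pipe() does not forward errors). A minimal sketch, assuming a Node version that ships stream.pipeline; sendFile and done are hypothetical names, not part of the original answer:

```javascript
const fs = require('fs');
const { pipeline } = require('stream');

// Hypothetical helper: stream `filename` into `response` (any writable
// stream, for example an http.ServerResponse) and call `done(err)` once
// the copy has finished or failed.
function sendFile(filename, response, done) {
  // pipeline() applies the same backpressure as pipe(): the file stream is
  // paused while `response` is full and resumed when it drains. On error it
  // destroys both streams and passes the error to `done`.
  pipeline(fs.createReadStream(filename), response, done);
}
```

Inside a request handler this might be called as sendFile(filename, response, function (err) { if (err) console.error(err); }).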

The flags:"r" and mode:0666 are implied by the fact that it is a FileReadStream. The binary encoding is deprecated -- if an encoding is not specified, it'll just work with the raw data buffers.

Also, you could add some other goodies that will make your file serving a whole lot slicker:

  1. Sniff for req.headers.range and see if it matches a string like /bytes=([0-9]*)-([0-9]*)/. If so, you want to just stream from that start to end location. (A missing end means "the end of the file"; a missing start, as in bytes=-500, means the last 500 bytes.)
  2. Hash the inode and creation time from the stat() call into an ETag header. If you get a request header with "if-none-match" matching that header, send back a 304 Not Modified.
  3. Check the if-modified-since header against the mtime date on the stat object. 304 if it wasn't modified since the date provided.
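The Range check in item 1 could be sketched roughly as follows. parseRange is a hypothetical helper, and it only handles a single range; it follows the HTTP convention that a missing end means "to the end of the file" and that a missing start, as in bytes=-500, asks for the last 500 bytes:

```javascript
// Hypothetical helper: parse "bytes=start-end" for a file of `size` bytes.
// Returns { start, end } (both inclusive) or null if the header is absent,
// malformed, or cannot be satisfied.
function parseRange(header, size) {
  const m = /^bytes=(\d*)-(\d*)$/.exec(header || '');
  if (!m || (m[1] === '' && m[2] === '')) return null;

  let start, end;
  if (m[1] === '') {
    // Suffix form "bytes=-N": the last N bytes of the file.
    const n = Number(m[2]);
    if (n === 0) return null;
    start = Math.max(size - n, 0);
    end = size - 1;
  } else {
    start = Number(m[1]);
    // A missing end, or one past the end of the file, is clamped to the last byte.
    end = m[2] === '' ? size - 1 : Math.min(Number(m[2]), size - 1);
  }
  if (start > end || start >= size) return null;
  return { start, end };
}
```

The result can be handed straight to fs.createReadStream(filename, { start: range.start, end: range.end }), whose start and end options are also inclusive, together with a 206 status and a Content-Range header of the form bytes start-end/size.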

Also, in general, if you can, send a Content-Length header. (You're stat-ing the file, so you should have this.)
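The conditional-request checks and the Content-Length header could be derived from the stat object along these lines. This is only a sketch: makeETag, fileHeaders and isNotModified are hypothetical names, the ETag hashes inode, size and mtime (a variation on the "inode and creation time" suggested above, chosen so the tag changes when the content does), and If-None-Match is compared as a single exact tag:

```javascript
const crypto = require('crypto');

// Hypothetical helper: derive a quoted ETag from fs.Stats-like fields.
function makeETag(stat) {
  const hash = crypto.createHash('sha1')
    .update(`${stat.ino}-${stat.size}-${stat.mtimeMs}`)
    .digest('hex');
  return `"${hash}"`;
}

// Headers worth sending with every successful file response.
function fileHeaders(stat) {
  return {
    'Content-Length': stat.size,
    'ETag': makeETag(stat),
    'Last-Modified': new Date(stat.mtimeMs).toUTCString(),
  };
}

// True when the request's validators say the client copy is still current,
// i.e. the server should answer 304 Not Modified with no body.
function isNotModified(requestHeaders, stat) {
  const ifNoneMatch = requestHeaders['if-none-match'];
  if (ifNoneMatch !== undefined) {
    // When present, If-None-Match takes precedence over If-Modified-Since.
    return ifNoneMatch === makeETag(stat);
  }
  const ifModifiedSince = requestHeaders['if-modified-since'];
  if (ifModifiedSince !== undefined) {
    const since = Date.parse(ifModifiedSince);
    // HTTP dates have one-second resolution, so drop the milliseconds.
    const mtime = Math.floor(stat.mtimeMs / 1000) * 1000;
    return !Number.isNaN(since) && mtime <= since;
  }
  return false;
}
```

In a handler, the stat result would be passed to isNotModified(req.headers, stat) first; if it returns true, reply with response.writeHead(304) and end the response without streaming the file.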

冷血 2024-10-17 08:22:58

As you pointed out, fs.readFile loads the entire file into memory, whereas fs.createReadStream reads the file in chunks of the size you specify.

The client will also start receiving data sooner with fs.createReadStream, because each chunk is sent out as soon as it is read, whereas fs.readFile reads the entire file and only then starts sending it to the client. The difference might be negligible, but it matters if the file is very big and the disks are slow.

Think about this, though: if you run these two functions on a 100MB file, the first one uses 100MB of memory to load the file, while the second only needs about one 4KB chunk at a time, as long as the response can keep up with the reads.

Edit: I really don't see any reason why you'd use fs.readFile, especially since you said you will be opening large files.
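To see the chunking for yourself, something like the following can be run against any file. It is a sketch only: summarizeChunks is a hypothetical helper, and it passes the chunk size as highWaterMark, which is the option name current Node versions use where the question's code has bufferSize:

```javascript
const fs = require('fs');

// Hypothetical helper: read `filename` as a stream and summarize the chunks.
// Resolves with the number of chunks, the total bytes, and the largest chunk.
function summarizeChunks(filename, chunkSize) {
  return new Promise((resolve, reject) => {
    const summary = { chunks: 0, bytes: 0, largest: 0 };
    fs.createReadStream(filename, { highWaterMark: chunkSize })
      .on('data', (chunk) => {
        // Only this one chunk is held here; nothing is accumulated.
        summary.chunks += 1;
        summary.bytes += chunk.length;
        summary.largest = Math.max(summary.largest, chunk.length);
      })
      .on('error', reject)
      .on('end', () => resolve(summary));
  });
}
```

Calling summarizeChunks(filename, 4 * 1024).then(console.log) should report a largest chunk no bigger than the requested size, however large the file is.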

2024-10-17 08:22:58

If it's a big file, "readFile" will hog memory, because it buffers the entire file content in memory, and it may hang your system.
ReadStream, by contrast, reads the file in chunks.

Run this code and observe the memory usage in the Performance tab of Task Manager.

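const fs = require('fs');

const file = fs.createWriteStream('./big_file');
const line = 'Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.\n');

// Write the lines, pausing whenever the stream's internal buffer is full,
// so that generating the file does not itself queue everything in memory.
// Note: this produces a very large file; lower the count if disk space is tight.
let i = 0;
function writeLines() {
  while (i <= 1000000000) {
    i++;
    if (!file.write(line)) {
      file.once('drain', writeLines); // resume once the buffer has emptied
      return;
    }
  }
  file.end();
}

// Start reading only after the file has been fully written.
file.on('finish', () => {
  fs.readFile('./big_file', (err, data) => {
    if (err) throw err;
    console.log("done !!");
  });
});

writeLines();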

In fact, you won't see the "done !!" message.
"readFile" cannot read the file, because its content is too large to fit in a single buffer.

Now use a read stream instead of "readFile" and monitor the memory usage.

Note: the code is taken from Samer Buna's Node course on Pluralsight.
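A streaming counterpart to the readFile call might look like the sketch below. streamThrough and logEvery are hypothetical names, and the async iteration over the stream assumes a reasonably recent Node version:

```javascript
const fs = require('fs');

// Hypothetical helper: read `path` chunk by chunk and return the total
// number of bytes seen, logging resident memory every `logEvery` chunks.
async function streamThrough(path, logEvery = 10000) {
  let bytes = 0;
  let chunks = 0;
  // Each iteration receives one chunk; earlier chunks are not retained,
  // so memory use should stay roughly flat regardless of file size.
  for await (const chunk of fs.createReadStream(path)) {
    bytes += chunk.length;
    if (++chunks % logEvery === 0) {
      const rssMb = Math.round(process.memoryUsage().rss / 1048576);
      console.log(`${bytes} bytes read, rss ${rssMb} MB`);
    }
  }
  return bytes;
}
```

Running streamThrough('./big_file').then(() => console.log("done !!")) in place of the readFile call lets you compare the logged memory figures with what Task Manager shows.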

玉环 2024-10-17 08:22:58

Another, perhaps less well-known, point: I believe Node is better at cleaning up unused memory after fs.readFile than after fs.createReadStream. You should test this to verify what works best for you. I also know that this has improved with every new version of Node (i.e. the garbage collector has become smarter in these situations).
