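Ruby on Rails 3: Streaming data through Rails to the client
I am working on a Ruby on Rails app that communicates with RackSpace cloudfiles (similar to Amazon S3, but lacking some features).
Because cloudfiles has no per-object access permissions and no query string authentication, downloads to users have to be mediated through the application.
In Rails 2.3, it looks like you can build a response dynamically as follows:
# Streams about 180 MB of generated data to the browser.
render :text => proc { |response, output|
  10_000_000.times do |i|
    output.write("This is line #{i}\n")
  end
}
(from http://api.rubyonrails.org/classes/ActionController/Base.html#M000464)
Instead of 10_000_000.times... I could drop my cloudfiles stream generation code in there.
The trouble is, this is the output I get when I try to use this technique in Rails 3:
#<Proc:0x000000010989a6e8@/Users/jderiksen/lt/lt-uber/site/app/controllers/prospect_uploads_controller.rb:75>
It looks like the proc object's call method may not be getting called. Any other ideas?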
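For what it's worth, that output is what a Proc looks like when it is converted to a string, which is consistent with the guess that the proc is being rendered as text rather than called. A minimal sketch in plain Ruby (no Rails involved; `streamer`, `fake_output` and `sink` are illustrative names, not part of any API):

```ruby
# A proc shaped like the one passed to render :text.
streamer = proc { |response, output| output.write("This is line 0\n") }

# Converting the proc to a string does not run its body; it only
# produces an inspect-style description of the object.
as_text = streamer.to_s
puts as_text

# Calling the proc is what actually produces the data.
sink = []
fake_output = Object.new
fake_output.define_singleton_method(:write) { |chunk| sink << chunk }
streamer.call(nil, fake_output)
puts sink.inspect
```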
Assign to response_body an object that responds to #each.
If you are using Ruby 1.9.x or the Backports gem, you can write this more compactly using Enumerator.new.
Note that when, and whether, the data is flushed depends on the Rack handler and the underlying server being used. I have confirmed that Mongrel, for instance, will stream the data, but other users have reported that WEBrick buffers it until the response is closed. There is no way to force the response to flush.
In Rails 3.0.x, there are several additional gotchas:
A bug in the interaction between Rack and Rails causes #each to be called twice for each request. This is another open bug. You can work around it with a monkey patch.
These problems are fixed in Rails 3.1, where HTTP streaming is a marquee feature.
Note that the other common suggestion, self.response_body = proc {|response, output| ...}, does work in Rails 3.0.x, but has been deprecated (and will no longer actually stream the data) in 3.1. Assigning an object that responds to #each works in all Rails 3 versions.
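The #each approach from the first answer can be sketched in plain Ruby as follows. This is only an illustration under assumptions: `LineStreamer` and `line_enumerator` are hypothetical names, and the loop at the bottom stands in for the server, which consumes the body by calling #each.

```ruby
# Any object with an #each method that yields strings can serve as a body.
class LineStreamer
  def initialize(count)
    @count = count
  end

  # Yields one chunk at a time instead of building the whole string.
  def each
    @count.times { |i| yield "This is line #{i}\n" }
  end
end

# The more compact Enumerator form (Ruby 1.9+ or the Backports gem).
def line_enumerator(count)
  Enumerator.new do |y|
    count.times { |i| y << "This is line #{i}\n" }
  end
end

# In a controller action this would be assigned as, for example:
#   self.response_body = LineStreamer.new(10_000_000)
# Here the bodies are simply consumed via #each, as a server would do.
from_class = []
LineStreamer.new(3).each { |chunk| from_class << chunk }

from_enum = []
line_enumerator(3).each { |chunk| from_enum << chunk }

puts from_class.inspect
```

Both forms produce the same chunks; the Enumerator version just avoids defining a class.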
Thanks to all the posts above, here is fully working code to stream large CSVs. This code:
ruby 1.9.3 and heroku using unicorn, with single dyno.
allowed memory.
Controller Method:
config/unicorn.rb
Model.rb
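The code for this answer is not shown here. As a rough sketch of the general idea only, assuming records are loaded in batches and each row is emitted as soon as it is formatted, a lazy CSV body could look like this. `Row`, `ROWS` and `csv_body` are hypothetical stand-ins, and `each_slice` merely imitates batched loading from a model.

```ruby
require "csv"

# Stand-in for model records (hypothetical data).
Row = Struct.new(:id, :name)
ROWS = (1..5).map { |i| Row.new(i, "user #{i}") }

# Builds a lazy body: the header line first, then one CSV line per record.
def csv_body(rows, batch_size)
  Enumerator.new do |y|
    y << CSV.generate_line(%w[id name])
    rows.each_slice(batch_size) do |batch| # imitates batched loading
      batch.each { |r| y << CSV.generate_line([r.id, r.name]) }
    end
  end
end

# A controller would assign csv_body(...) to self.response_body;
# here the enumerator is drained to show what would be sent.
chunks = csv_body(ROWS, 2).to_a
puts chunks.join
```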
It looks like this isn't available in Rails 3:
https://rails.lighthouseapp.com/projects/8994/tickets/2546-render-text-proc
This appeared to work for me in my controller:
If you are assigning to response_body an object that responds to #each and it is buffering until the response is closed, try this in the controller action:
self.response.headers['Last-Modified'] = Time.now.to_s
Just for the record, Rails >= 3.1 has an easy way to stream data: assign an object that responds to #each to the controller's response.
Everything is explained here: http://blog.sparqcode.com/2012/02/04/streaming-data-with-rails-3-1-or-3-2/
Yes, response_body is the Rails 3 way of doing this for the moment: https://rails.lighthouseapp.com/projects/8994/tickets/4554-render-text-proc-regression
This solved my problem as well. I have gzip'd CSV files that I want to send to the user as unzipped CSV, so I read them a line at a time using a GzipReader.
These lines are also helpful if you're trying to deliver a big file as a download:
self.response.headers["Content-Type"] = "application/octet-stream"
self.response.headers["Content-Disposition"] = "attachment; filename=#{filename}"
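A minimal sketch of the GzipReader idea, assuming the body object is given an already-open IO. `GunzipLines` is a hypothetical name, and the in-memory buffer below stands in for a real gzip'd file.

```ruby
require "zlib"
require "stringio"

# Hypothetical body object: yields one decompressed line at a time.
class GunzipLines
  def initialize(io)
    @io = io
  end

  def each
    gz = Zlib::GzipReader.new(@io)
    gz.each_line { |line| yield line }
  ensure
    gz.close if gz # also closes the underlying IO
  end
end

# Build a small gzip'd CSV in memory to stand in for a real file.
buffer = StringIO.new
writer = Zlib::GzipWriter.new(buffer)
writer.write("id,name\n1,alice\n2,bob\n")
writer.finish # flushes the gzip trailer without closing the StringIO
buffer.rewind

# In a controller: self.response_body = GunzipLines.new(File.open(path, "rb"))
lines = []
GunzipLines.new(buffer).each { |line| lines << line }
puts lines.inspect
```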
In addition, you will have to set the 'Content-Length' header yourself.
If you don't, Rack will have to wait (buffering the body data in memory) to determine the length, which undoes the streaming achieved with the methods described above.
In my case, I could determine the length.
If you can't, you need to make Rack start sending the body without a 'Content-Length' header.
Try adding "use Rack::Chunked" to config.ru, after the 'require' and before the 'run'. (Thanks arkadiy)
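For illustration, a config.ru with that line in place might look like the fragment below. This assumes a Rack version that ships Rack::Chunked; `MyApp::Application` is a placeholder for the real application constant.

```ruby
# config.ru (fragment; MyApp is a hypothetical application name)
require ::File.expand_path('../config/environment', __FILE__)

# Send the body with chunked transfer encoding when no Content-Length is set.
use Rack::Chunked

run MyApp::Application
```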
I commented on the Lighthouse ticket; I just wanted to say that the self.response_body = proc approach worked for me, though I needed to use Mongrel instead of WEBrick to succeed.
Martin
Applying John's solution along with Exequiel's suggestion worked for me.
The statement
self.response.headers['Last-Modified'] = Time.now.to_s
marks the response as non-cacheable in Rack.
After investigating further, I figured one could also use this:
This, to me, is just slightly more intuitive. It conveys the message to anyone else who may be reading my code. Also, if a future version of Rack stops checking for Last-Modified, a lot of code may break, and it may take folks a while to figure out why.
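The alternative this answer refers to is not shown. One header that would fit the description, because it states the intent explicitly, is Cache-Control. This is an assumption, not something the text above confirms. Inside a controller action it might look like:

```ruby
# Assumption: an explicit cache directive in place of the Last-Modified trick.
# Runs inside a Rails controller action, not standalone.
self.response.headers['Cache-Control'] = 'no-cache'
```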