EOFError with ActiveResource when API is "slow"

Posted on 2024-11-15 01:58:16

I'm seriously struggling to solve this one; any help would be appreciated!

I have two Rails apps, let's call them Client and Service. Both are very simple and expose a normal REST interface. Here's the basic scenario:

Client makes a POST /resources.json request to the Service
The Service runs a process which creates the resource and returns an ID to the Client

Again, all very simple, except that Service processing is very time-intensive and can take several minutes. When that happens, an EOFError is raised on the Client exactly 60s after the request was made (no matter what ActiveResource::Base.timeout is set to), while the Service processes the request correctly and responds with 200/201. This is what we see in the logs (chronologically):

C 00:00:00: POST /resources.json
S 00:00:00: Received POST /resources.json => resources#create
C 00:01:00: EOFError: end of file reached
  /usr/ruby1.8.7/lib/ruby/1.8/net/protocol.rb:135:in `sysread'
  /usr/ruby1.8.7/lib/ruby/1.8/net/protocol.rb:135:in `rbuf_fill'
  /usr/ruby1.8.7/lib/ruby/1.8/timeout.rb:62:in `timeout'
  ...
S 00:02:23: Response POST /resources.json, 201, after 143s

Obviously, the Service's response never reached the Client. I traced the error down to the socket level and recreated the scenario in a script, where I open a TCPSocket and try to read data. Since I don't send a request, I shouldn't get anything back, and my read should time out after 70 seconds (see the full script at the bottom):

Timeout::timeout(70) { TCPSocket.open(domain, 80).sysread(16384) }

These were the results for a few domains:

www.amazon.com     => Timeout after 70s
github.com         => EOFError after 60s
www.nytimes.com    => Timeout after 70s
www.mozilla.org    => EOFError after 13s
www.googlelabs.com => Timeout after 70s
maps.google.com    => Timeout after 70s

As you can see, some servers allowed us to "wait" for the full 70 seconds, while others terminated our connection, raising EOFErrors. When we did this test against our service, we (expectedly) got an EOFError after 60 seconds.
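
The difference between the two outcomes can be reproduced locally without any external host. The following is only a minimal sketch under stated assumptions: `start_idle_closing_server` and `probe` are hypothetical helper names, and the local server merely imitates something that drops idle connections; it says nothing about what actually sits in front of our Service. Whichever side has the shorter limit decides whether the client sees an EOFError or its own timeout.

```ruby
require 'socket'
require 'timeout'

# Hypothetical helper: accepts one connection on a free local port,
# stays silent for idle_seconds, then closes the connection.
def start_idle_closing_server(idle_seconds)
  server = TCPServer.new('127.0.0.1', 0) # port 0 lets the OS pick a free port
  thread = Thread.new do
    client = server.accept
    sleep idle_seconds
    client.close
    server.close
  end
  [server.addr[1], thread]
end

# Hypothetical helper: waits for data with a client-side limit and
# reports which side gave up first.
def probe(port, client_timeout)
  socket = TCPSocket.open('127.0.0.1', port)
  Timeout.timeout(client_timeout) { socket.sysread(16_384) }
  :data_received
rescue Timeout::Error
  :client_timeout
rescue EOFError
  :server_closed
ensure
  socket.close if socket
end

port, = start_idle_closing_server(0.2)
puts probe(port, 2)   # server limit is shorter than the client limit

port, = start_idle_closing_server(2)
puts probe(port, 0.2) # client limit is shorter than the server limit
```

In this sketch, raising the client-side limit makes no difference once the other end closes the socket first, which matches what we see when changing ActiveResource::Base.timeout.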

Does anyone know why this happens? Is there any way to prevent these disconnects or to extend the server-side timeout? Since our Service continues "working" even after the socket was closed, I assume the connection must be terminated at the proxy level?

Every hint would be greatly appreciated!

PS: The full script:

require 'socket'
require 'benchmark'
require 'timeout'

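def test_socket(domain)
  puts "Connecting to #{domain}"
  message = nil
  time    = Benchmark.realtime do
    socket = nil
    begin
      socket = TCPSocket.open(domain, 80)
      # Send nothing and just wait for the server to act
      Timeout::timeout(70) { socket.sysread(16384) }
      message = "Successfully received data" # Should never happen
    rescue Timeout::Error
      # Rescued first: on Ruby 1.9+ Timeout::Error is a StandardError,
      # so a bare rescue placed above it would swallow it
      message = "Controlled client-side timeout"
    rescue => e
      message = "Server terminated connection: #{e.class} #{e.message}"
    ensure
      socket.close if socket
    end
  end
  puts "  #{message} after #{time.round}s"
end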

test_socket 'www.amazon.com'
test_socket 'github.com'
test_socket 'www.nytimes.com'
test_socket 'www.mozilla.org'
test_socket 'www.googlelabs.com'
test_socket 'maps.google.com'
