Experiencing Mongo::OperationTimeout every 20 minutes to 2 hours

Posted on 2024-12-27 16:54:26

I seem to be hitting a Mongo::OperationTimeout roughly every 20 minutes to 1 hour.
My stack:

  • Rails 3.1.3
  • Mongoid 3 (git edge)
  • Unicorn 4.1.1
  • 2 x MongoDB 2.0.2 (which should ship with the correct KeepAlive default), configured as a replica set
  • Ubuntu m1.large EC2

I have tried setting KeepAlive on EC2 to 300, as described in http://www.mongodb.org/display/DOCS/Amazon+EC2, but it still did not help.
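
One thing worth ruling out is that the KeepAlive change never took effect on the box. A minimal sketch for checking it from Ruby, assuming the standard Linux procfs path; the helper names `tcp_keepalive_time` and `keepalive_at_most?` are made up for illustration:

```ruby
# Hypothetical helpers: read the kernel's TCP keepalive idle time (in seconds).
# The default path is the usual Linux procfs location; it will not exist on
# other operating systems.
def tcp_keepalive_time(path = "/proc/sys/net/ipv4/tcp_keepalive_time")
  Integer(File.read(path).strip)
end

# True when the configured idle time is no larger than the given limit.
def keepalive_at_most?(limit, path = "/proc/sys/net/ipv4/tcp_keepalive_time")
  tcp_keepalive_time(path) <= limit
end
```

Calling `keepalive_at_most?(300)` on each app server and each MongoDB host would confirm whether the value of 300 is actually in place.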

I have also tried running with a single primary instead of the replica set, but that did not help either.

Below is mongoid.conf:

production:
  database: my-app-name
  op_timeout: 10
  read_secondary: true
  max_retries_on_connection_failure: 3
  identity_map_enabled: true
  allow_dynamic_fields: false
  hosts:
    - - ip-XXX.ec2.internal
      - 27017
    - - ip-XXX.ec2.internal
      - 27017

Comments (1)

半城柳色半声笛 2025-01-03 16:54:26

After some group thinking, here are the points we came up with regarding our situation:

  • We are using Mongoid 3.0 with op_timeout: 30 (Mongoid 2.3 and earlier did not enable op_timeout), which is what actually surfaces the OperationTimeout. Many other users may be hitting the same problem without seeing it in their logs, and instead just end up with stuck Unicorn workers.
  • We are using Unicorn, which spawns worker processes ahead of time and keeps them waiting, unlike Passenger, which scales dynamically. Since we are currently only in test mode with no real traffic, many of the workers probably sit idle and their mongo connections go stale. Most people probably never reach this state, but might run into it every now and then.
  • The Linux KeepAlive setting described at www.mongodb.org/display/DOCS/Troubleshooting#Troubleshooting-Socketerrorsinshardedclustersandreplicasets does not seem to help.
  • For now, I have created a dummy Rack middleware that runs an initial mongo query and handles the exception if needed. Here's the code: https://gist.github.com/1647879
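
For readers who cannot open the gist, the middleware idea can be sketched roughly as follows. This is not the code from the gist, only a minimal illustration under stated assumptions: the class name `MongoWarmup`, the injected `ping` callable, and the `:retry_on` / `:max_tries` options are all hypothetical.

```ruby
# Hypothetical sketch: run a cheap query before each request so that a stale
# connection fails (and is retried) here instead of in the middle of the
# real request.
class MongoWarmup
  # app     - the downstream Rack application
  # ping    - any callable that performs a cheap database query (assumption:
  #           supplied by the caller; nothing here talks to MongoDB directly)
  # options - :retry_on  => exception classes treated as a stale connection
  #           :max_tries => total number of ping attempts before giving up
  def initialize(app, ping, options = {})
    @app       = app
    @ping      = ping
    @retry_on  = options.fetch(:retry_on, [StandardError])
    @max_tries = options.fetch(:max_tries, 3)
  end

  def call(env)
    warm_up
    @app.call(env)
  end

  private

  # Calls the ping, retrying on the configured exceptions. Once max_tries
  # attempts have failed, the last exception is re-raised.
  def warm_up
    tries = 0
    begin
      tries += 1
      @ping.call
    rescue *@retry_on
      retry if tries < @max_tries
      raise
    end
  end
end
```

In config.ru this would be wired in with something like `use MongoWarmup, ping, :retry_on => [Mongo::OperationTimeout]`, where `ping` is a lambda running whatever inexpensive query suits the application. The trade-off is one extra round trip to MongoDB per request, which is probably acceptable as a stopgap while the root cause is unknown.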