bluepill does not detect that processes have actually started, so it keeps creating new ones
I have one (EC2) Ubuntu server where bluepill is working just fine to start and monitor resque processes (and it has done so on other nodes in the past).
I'm setting up a new node, and for some reason bluepill on this node does not recognize that the processes have started and are running, so it keeps creating new ones. I'm a little baffled by what's causing this. The two nodes are almost identical; they're both EC2 servers provisioned by the same chef scripts. It is true that the one not working is 'production' and the other is 'staging', but that accounts for almost no difference.
Any thoughts or suggestions before I fork the GitHub project and start inserting more monitoring to try to figure out what's going on? There's been discussion on this list in the past about troubles with bluepill and resque, but as I said, this is working fine on my staging server and has worked fine on earlier production servers (although I will note that this new production server is on Ruby 1.9.3 (vs. 1.9.2) and Rails 3.2 (vs. 3.1)).
Here's my .pill file (or more specifically, my chef cookbook's template file):
ENV["RAILS_ENV"] = "<%= node.chef_environment %>"
ENV["QUEUE"] = "*"
Bluepill.application("zmx_app") do |app|
  app.working_dir = "/srv/zmx/current"
  app.uid = "root"
  app.gid = "root"
  2.times do |i|
    app.process("resque-#{i}") do |process|
      process.group = "resque"
      process.start_command = "rake resque:work"
      process.pid_file = "/srv/zmx/current/tmp/pids/resque_workers-#{i}.pid"
      process.stop_command = "kill -QUIT {{PID}}"
      process.daemonize = true
    end
  end
end
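Before forking the project, one cheap check is whether the pid files named in the config actually point at live processes on the failing node. The following is only a diagnostic sketch, not bluepill's own detection logic; the helper names (`read_pid`, `pid_alive?`, `pid_file_status`) are made up for illustration, and the paths are the ones from the .pill file above.

```ruby
# Diagnostic sketch (NOT bluepill's own code): report whether the PID
# recorded in each pid file refers to a live process.

# Returns the integer PID stored in path, or nil if the file is missing
# or does not contain a positive integer.
def read_pid(path)
  return nil unless File.file?(path)
  pid = File.read(path).strip.to_i
  pid > 0 ? pid : nil
end

# Returns true if a process with this PID exists. Signal 0 performs the
# existence and permission check without delivering a signal.
def pid_alive?(pid)
  return false if pid.nil?
  Process.kill(0, pid)
  true
rescue Errno::ESRCH
  false # no such process
rescue Errno::EPERM
  true  # process exists but belongs to another user
end

# Summarizes one pid file as :missing, :stale, or :running.
def pid_file_status(path)
  pid = read_pid(path)
  return :missing if pid.nil?
  pid_alive?(pid) ? :running : :stale
end

if __FILE__ == $PROGRAM_NAME
  # Paths taken from the .pill file; adjust if the template differs.
  2.times do |i|
    path = "/srv/zmx/current/tmp/pids/resque_workers-#{i}.pid"
    puts "#{path}: #{pid_file_status(path)}"
  end
end
```

If a worker is visibly running but its pid file is reported as missing or stale, that suggests the pid recorded for the process does not match the real one, which would be consistent with the monitor repeatedly starting new workers.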
Comments (1)
This turned out to be a bug in bluepill. I forked the project, fixed the bug, and submitted a pull request.
I'm not sure why I didn't realize sooner that there was, in fact, a difference between my two environments: staging and the old production servers were on bluepill 0.0.55, while my new production environment is on 0.0.58.
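For anyone comparing "identical" nodes the same way, a quick way to catch this kind of drift is to print the installed gem versions on each machine and diff the output. This is a minimal sketch using the standard RubyGems API; `installed_versions` is a hypothetical helper name, and it assumes the script runs under the same Ruby that bluepill is installed for.

```ruby
require "rubygems"

# Returns the installed versions of the named gem as strings, sorted from
# oldest to newest, or an empty array if the gem is not installed.
def installed_versions(gem_name)
  Gem::Specification.find_all_by_name(gem_name)
                    .map { |spec| spec.version }
                    .sort
                    .map(&:to_s)
end

if __FILE__ == $PROGRAM_NAME
  # Run on each node and compare the output.
  puts "bluepill: #{installed_versions('bluepill').inspect}"
end
```

Pinning the gem version in whatever provisions the nodes avoids the mismatch in the first place.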