Unicorn eating up memory
I have an m1.small instance on Amazon with 8GB of hard disk space, on which my Rails application runs. It runs smoothly for two weeks and then crashes, saying the memory is full. The app is running on Rails 3.1.1, unicorn and nginx.
I simply don't understand what is taking up 13G.
I killed unicorn, and the 'free' command then showed some free space, while df still said 100%.
I rebooted the instance and everything started working fine.
free (before killing unicorn)
total used free shared buffers cached
Mem: 1705192 1671580 33612 0 321816 405288
-/+ buffers/cache: 944476 760716
Swap: 917500 50812 866688
df -l (before killing unicorn)
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/xvda1 8256952 7837520 4 100% /
none 847464 120 847344 1% /dev
none 852596 0 852596 0% /dev/shm
none 852596 56 852540 1% /var/run
none 852596 0 852596 0% /var/lock
/dev/xvda2 153899044 192068 145889352 1% /mnt
/dev/xvdf 51606140 10276704 38707996 21% /data
sudo du -hc --max-depth=1 (before killing unicorn)
28K ./root
6.6M ./etc
4.0K ./opt
9.7G ./data
1.7G ./usr
4.0K ./media
du: cannot access `./proc/27220/task/27220/fd/4': No such file or directory
du: cannot access `./proc/27220/task/27220/fdinfo/4': No such file or directory
du: cannot access `./proc/27220/fd/4': No such file or directory
du: cannot access `./proc/27220/fdinfo/4': No such file or directory
0 ./proc
14M ./boot
120K ./dev
1.1G ./home
66M ./lib
4.0K ./selinux
6.5M ./sbin
6.5M ./bin
4.0K ./srv
148K ./tmp
16K ./lost+found
20K ./mnt
0 ./sys
253M ./var
13G .
13G total
free (after killing unicorn)
total used free shared buffers cached
Mem: 1705192 985876 719316 0 365536 228576
-/+ buffers/cache: 391764 1313428
Swap: 917500 46176 871324
df -l (after killing unicorn)
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/xvda1 8256952 7837516 8 100% /
none 847464 120 847344 1% /dev
none 852596 0 852596 0% /dev/shm
none 852596 56 852540 1% /var/run
none 852596 0 852596 0% /var/lock
/dev/xvda2 153899044 192068 145889352 1% /mnt
/dev/xvdf 51606140 10276704 38707996 21% /data
unicorn.rb
rails_env = 'production'
working_directory "/home/user/app_name"
worker_processes 5
preload_app true
timeout 60
rails_root = "/home/user/app_name"
listen "#{rails_root}/tmp/sockets/unicorn.sock", :backlog => 2048
# listen 3000, :tcp_nopush => false
pid "#{rails_root}/tmp/pids/unicorn.pid"
stderr_path "#{rails_root}/log/unicorn/unicorn.err.log"
stdout_path "#{rails_root}/log/unicorn/unicorn.out.log"
GC.copy_on_write_friendly = true if GC.respond_to?(:copy_on_write_friendly=)
before_fork do |server, worker|
  ActiveRecord::Base.connection.disconnect!

  ##
  # When sent a USR2, Unicorn will suffix its pidfile with .oldbin and
  # immediately start loading up a new version of itself (loaded with a new
  # version of our app). When this new Unicorn is completely loaded
  # it will begin spawning workers. The first worker spawned will check to
  # see if an .oldbin pidfile exists. If so, this means we've just booted up
  # a new Unicorn and need to tell the old one that it can now die. To do so
  # we send it a QUIT.
  #
  # Using this method we get 0 downtime deploys.
  old_pid = "#{rails_root}/tmp/pids/unicorn.pid.oldbin"
  if File.exists?(old_pid) && server.pid != old_pid
    begin
      Process.kill("QUIT", File.read(old_pid).to_i)
    rescue Errno::ENOENT, Errno::ESRCH
      # someone else did our job for us
    end
  end
end

after_fork do |server, worker|
  ActiveRecord::Base.establish_connection
  worker.user('rails', 'rails') if Process.euid == 0 && rails_env == 'production'
end
Comments (5)
I've just released the 'unicorn-worker-killer' gem. This enables you to kill Unicorn workers based on 1) a maximum number of requests and 2) process memory size (RSS), without affecting requests.
It's really easy to use; no external tool is required. First, add the gem line to your Gemfile. Then add the configuration lines to your config.ru. It's highly recommended to randomize the thresholds to avoid killing all workers at once.
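The code snippets referred to above did not survive the page scrape; based on the unicorn-worker-killer README, the Gemfile and config.ru additions look roughly like this (the thresholds and the AppName constant are illustrative, not from the original answer):

```ruby
# Gemfile
gem 'unicorn-worker-killer'

# config.ru -- the middleware must be registered before `run`
require 'unicorn/worker_killer'

# Restart a worker after it has served between 3072 and 4096 requests;
# the exact count is randomized per worker so they don't all die at once
use Unicorn::WorkerKiller::MaxRequests, 3072, 4096

# Restart a worker once its RSS crosses roughly 192-256 MB
use Unicorn::WorkerKiller::Oom, (192 * (1024**2)), (256 * (1024**2))

require ::File.expand_path('../config/environment', __FILE__)
run AppName::Application
```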
I think you are conflating memory usage and disk space usage. It looks like Unicorn and its children were using around 500 MB of memory; look at the second "-/+ buffers/cache:" number to see the real free memory. As far as the disk space goes, my bet is on some sort of log file or something like that going nuts. You should do a du -h in the data directory to find out what exactly is using so much storage. As a final suggestion, it's a little-known fact that Ruby never returns memory back to the OS once it has allocated it. It does still use it internally, but once Ruby grabs some memory, the only way to get it to yield the unused memory back to the OS is to quit the process. For example, if you happen to have a process that spikes your memory usage to 500 MB, you won't be able to use that 500 MB again, even after the request has completed and a GC cycle has run. However, Ruby will reuse that allocated memory for future requests, so it is unlikely to grow further.
Finally, Sergei mentions God to monitor process memory. If you are interested in using it, there is already a good config file here. Be sure to read the associated article, as there are key things in the unicorn config file that this god config assumes you have.
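A concrete way to run that check, using the /data mount that shows up as 9.7G in the question's own du output (the depth of 1 is just a starting point; increase it to drill further down):

```shell
# Per-directory totals one level below /data, largest last;
# `sort -h` understands the human-readable sizes that `du -h` emits.
sudo du -h --max-depth=1 /data | sort -h
```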
As Preston mentioned, you don't have a memory problem (over 40% free); you have a disk-full problem. du reports that most of the storage is consumed in /root/data.
You could use find to identify very large files; for example, the following will show all files under that directory greater than 100MB in size.
If unicorn is still running, lsof (LiSt Open Files) can show which files are in use by your running programs or by a specific set of processes (-p PID), e.g.:
will show you open files greater than 100MB in size.
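The command lines themselves were lost in the scrape; hedged reconstructions of what such find and lsof invocations typically look like (the 100MB threshold and the pid-file path are illustrative, taken from the unicorn.rb above):

```shell
# All regular files larger than 100MB under the suspect directory
sudo find /data -type f -size +100M -exec ls -lh {} +

# Open files held by a running process; SIZE/OFF is lsof's 7th column,
# so this prints only entries larger than 100MB (104857600 bytes)
sudo lsof -p "$(cat /home/user/app_name/tmp/pids/unicorn.pid)" \
  | awk '$7 ~ /^[0-9]+$/ && $7 > 104857600'

# Files that were deleted but are still held open: df counts them,
# du cannot see them -- which matches the df/du mismatch in the question
sudo lsof +L1
```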
You can set up god to watch your unicorn workers and kill them if they eat too much memory. The Unicorn master process will then fork another worker to replace it. Problem worked around. :-)
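A minimal sketch of such a watch, assuming god's standard :memory_usage condition and the paths from the unicorn.rb above (the 300 MB threshold and start command are illustrative):

```ruby
# unicorn.god -- load with `god -c unicorn.god`
rails_root = "/home/user/app_name"

God.watch do |w|
  w.name     = "unicorn"
  w.pid_file = "#{rails_root}/tmp/pids/unicorn.pid"
  w.start    = "cd #{rails_root} && bundle exec unicorn -c config/unicorn.rb -E production -D"
  w.stop     = "kill -QUIT `cat #{w.pid_file}`"

  # Restart if resident memory stays above 300 MB
  # on 3 of the last 5 checks
  w.restart_if do |restart|
    restart.condition(:memory_usage) do |c|
      c.above = 300.megabytes
      c.times = [3, 5]
    end
  end
end
```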
Try removing newrelic from your app if you are using it. The newrelic rpm gem itself leaks memory. I had the same issue, and I scratched my head for almost 10 days to figure it out.
Hope that helps you.
I contacted the newrelic support team, and below is their reply.
Even after I updated the newrelic gem, it still leaked memory. In the end I had to remove newrelic; although it is a great tool, we cannot use it at such a cost (a memory leak).
Hope that helps you.