Unicorn eating memory

Published 2024-12-18 15:28:17 · 4368 characters · 3 views · 0 comments

I have an m1.small instance on Amazon with 8 GB of hard disk space, on which my Rails application runs. It runs smoothly for two weeks and then crashes, saying the memory is full. The app runs on Rails 3.1.1, Unicorn, and nginx.

I simply don't understand what is taking up 13 GB.
I killed Unicorn, and the 'free' command shows some free memory, while df still says 100%.
I rebooted the instance and everything started working fine.

free (before killing unicorn)

             total       used       free     shared    buffers     cached  
Mem:       1705192    1671580      33612          0     321816     405288  
-/+ buffers/cache:     944476     760716   
Swap:       917500      50812     866688 

df -l (before killing unicorn)

Filesystem           1K-blocks      Used Available Use% Mounted on  
/dev/xvda1             8256952   7837520         4 100% /  
none                    847464       120    847344   1% /dev  
none                    852596         0    852596   0% /dev/shm  
none                    852596        56    852540   1% /var/run  
none                    852596         0    852596   0% /var/lock  
/dev/xvda2           153899044    192068 145889352   1% /mnt  
/dev/xvdf             51606140  10276704  38707996  21% /data  

sudo du -hc --max-depth=1 (before killing unicorn)

28K ./root  
6.6M    ./etc  
4.0K    ./opt  
9.7G    ./data  
1.7G    ./usr  
4.0K    ./media  
du: cannot access `./proc/27220/task/27220/fd/4': No such file or directory  
du: cannot access `./proc/27220/task/27220/fdinfo/4': No such file or directory  
du: cannot access `./proc/27220/fd/4': No such file or directory  
du: cannot access `./proc/27220/fdinfo/4': No such file or directory  
0   ./proc  
14M ./boot  
120K    ./dev  
1.1G    ./home  
66M ./lib  
4.0K    ./selinux  
6.5M    ./sbin  
6.5M    ./bin  
4.0K    ./srv  
148K    ./tmp  
16K ./lost+found  
20K ./mnt  
0   ./sys  
253M    ./var  
13G .  
13G total   

free (after killing unicorn)

             total       used       free     shared    buffers     cached    
Mem:       1705192     985876     719316          0     365536     228576    
-/+ buffers/cache:     391764    1313428    
Swap:       917500      46176     871324  

df -l (after killing unicorn)

Filesystem           1K-blocks      Used Available Use% Mounted on  
/dev/xvda1             8256952   7837516         8 100% /  
none                    847464       120    847344   1% /dev  
none                    852596         0    852596   0% /dev/shm  
none                    852596        56    852540   1% /var/run  
none                    852596         0    852596   0% /var/lock  
/dev/xvda2           153899044    192068 145889352   1% /mnt  
/dev/xvdf             51606140  10276704  38707996  21% /data  

unicorn.rb

rails_env = 'production'  

working_directory "/home/user/app_name"  
worker_processes 5  
preload_app true  
timeout 60  

rails_root = "/home/user/app_name"  
listen "#{rails_root}/tmp/sockets/unicorn.sock", :backlog => 2048  
# listen 3000, :tcp_nopush => false  

pid "#{rails_root}/tmp/pids/unicorn.pid"  
stderr_path "#{rails_root}/log/unicorn/unicorn.err.log"  
stdout_path "#{rails_root}/log/unicorn/unicorn.out.log"  

GC.copy_on_write_friendly = true if GC.respond_to?(:copy_on_write_friendly=)  

before_fork do |server, worker|  
  ActiveRecord::Base.connection.disconnect!  

  ##  
  # When sent a USR2, Unicorn will suffix its pidfile with .oldbin and  
  # immediately start loading up a new version of itself (loaded with a new  
  # version of our app). When this new Unicorn is completely loaded  
  # it will begin spawning workers. The first worker spawned will check to  
  # see if an .oldbin pidfile exists. If so, this means we've just booted up  
  # a new Unicorn and need to tell the old one that it can now die. To do so  
  # we send it a QUIT.  
  #  
  # Using this method we get 0 downtime deploys.  

  old_pid = "#{rails_root}/tmp/pids/unicorn.pid.oldbin"  
  if File.exists?(old_pid) && server.pid != old_pid  
    begin  
      Process.kill("QUIT", File.read(old_pid).to_i)  
    rescue Errno::ENOENT, Errno::ESRCH  
      # someone else did our job for us  
    end  
  end  
end  


after_fork do |server, worker|  
  ActiveRecord::Base.establish_connection  
  worker.user('rails', 'rails') if Process.euid == 0 && rails_env == 'production'  
end  

Comments (5)

紅太極 2024-12-25 15:28:17

I've just released the 'unicorn-worker-killer' gem. This enables you to kill Unicorn workers based on 1) a maximum number of requests and 2) process memory size (RSS), without affecting in-flight requests.

It's really easy to use. No external tool is required. First, add this line to your Gemfile.

gem 'unicorn-worker-killer'

Then, please add the following lines to your config.ru.

# Unicorn self-process killer
require 'unicorn/worker_killer'

# Max requests per worker
use Unicorn::WorkerKiller::MaxRequests, 10240 + Random.rand(10240)

# Max memory size (RSS) per worker
use Unicorn::WorkerKiller::Oom, (96 + Random.rand(32)) * 1024**2

It's highly recommended to randomize the threshold to avoid killing all workers at once.

森末i 2024-12-25 15:28:17

I think you are conflating memory usage and disk space usage. It looks like Unicorn and its children were using around 500 MB of memory; look at the second "-/+ buffers/cache:" number to see the real free memory.

As far as disk space goes, my bet is on some sort of log file, or something like that, going nuts. You should run du -h in the data directory to find out what exactly is using so much storage.

As a final suggestion, it's a little-known fact that Ruby never returns memory back to the OS once it has allocated it. It DOES still use that memory internally, but once Ruby grabs some memory, the only way to get it to yield the unused portion back to the OS is to quit the process. For example, if you happen to have a process that spikes your memory usage to 500 MB, you won't be able to reclaim that 500 MB, even after the request has completed and a GC cycle has run. However, Ruby will reuse that allocated memory for future requests, so the process is unlikely to grow much further.
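
This behavior can be observed directly with a small, self-contained Ruby script (a sketch; it is Linux-only, since it reads /proc, and the allocation size is illustrative). RSS grows during a large allocation and typically does not fall back to its starting level after GC, even though Ruby reuses the freed heap internally:

```ruby
# Read this process's resident set size (RSS) in kB from /proc (Linux).
def rss_kb
  File.read("/proc/#{Process.pid}/status")[/VmRSS:\s+(\d+)/, 1].to_i
end

before = rss_kb
big = Array.new(1_000_000) { "x" * 100 }   # allocate roughly 100+ MB of strings
during = rss_kb
big = nil                                  # drop the only reference
GC.start                                   # force a full garbage-collection cycle

puts "RSS before: #{before} kB, during: #{during} kB, after GC: #{rss_kb} kB"
```

The "after GC" number usually stays far closer to "during" than to "before", which is exactly the effect described above.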

Finally, Sergei mentions God to monitor the process memory. If you are interested in using this, there is already a good config file here. Be sure to read the associated article as there are key things in the unicorn config file that this god config assumes you have.

爱你是孤单的心事 2024-12-25 15:28:17

As Preston mentioned you don't have a memory problem (over 40% free), you have a disk full problem. du reports most of the storage is consumed in /root/data.

You could use find to identify very large files; e.g., the following will show all files under that directory greater than 100 MB in size.

sudo find /root/data -size +100M

If unicorn is still running, lsof (LiSt Open Files) can show which files are in use by your running programs, or by a specific set of processes (-p PID), e.g.:

sudo lsof | awk  '$5 ~/REG/ && $7 > 100000000 { print }'

will show you open files greater than 100 MB in size.
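
The same large-file search can be sketched in Ruby with the stdlib Find module (the /root/data path comes from the answer above; it is guarded so the script also runs where that directory does not exist):

```ruby
require 'find'

# Walk a directory tree and return [path, size] pairs for files larger
# than min_bytes, sorted biggest first.
def large_files(dir, min_bytes)
  results = []
  Find.find(dir) do |path|
    next unless File.file?(path)
    size = File.size(path) rescue next   # skip files that vanish mid-scan
    results << [path, size] if size > min_bytes
  end
  results.sort_by { |_, size| -size }
end

if Dir.exist?('/root/data')
  large_files('/root/data', 100 * 1024**2).each do |path, size|
    puts format('%6d MB  %s', size / 1024**2, path)
  end
end
```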

残花月 2024-12-25 15:28:17

You can set up god to watch your unicorn workers and kill them if they eat too much memory. Unicorn master process will then fork another worker to replace this one. Problem worked around. :-)
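
For reference, a hedged sketch of what such a god config might look like; every path, name, and threshold here is an assumption for illustration, not taken from the question. Note that god's :memory_usage condition watches the PID it is given, so pointed at the Unicorn pid file it restarts the master; per-worker RSS limits are usually easier with something like unicorn-worker-killer, as described in the first answer.

```ruby
# Hypothetical god watch for a Unicorn master (paths/thresholds assumed).
rails_root = "/home/user/app_name"

God.watch do |w|
  w.name     = "unicorn"
  w.interval = 30.seconds
  w.pid_file = "#{rails_root}/tmp/pids/unicorn.pid"
  w.start    = "cd #{rails_root} && bundle exec unicorn -c config/unicorn.rb -E production -D"
  w.stop     = "kill -QUIT `cat #{rails_root}/tmp/pids/unicorn.pid`"
  w.behavior(:clean_pid_file)

  # Restart if memory stays above the threshold for 3 of the last 5 checks.
  w.restart_if do |restart|
    restart.condition(:memory_usage) do |c|
      c.above = 300.megabytes
      c.times = [3, 5]
    end
  end
end
```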

蓝色星空 2024-12-25 15:28:17

Try removing New Relic from your app if you are using it. The newrelic_rpm gem itself was leaking memory. I had the same issue, and I scratched my head for almost 10 days to figure it out.

Hope that helps you.

I contacted the New Relic support team, and below is their reply.

Thanks for contacting support. I am deeply sorry for the frustrating
experience you have had. As a performance monitoring tool, our
intention is "first do no harm", and we take these kind of issues very
seriously.

We recently identified the cause of this issue and have released a
patch to resolve it. (see https://newrelic.com/docs/releases/ruby). We
hope you'll consider resuming monitoring with New Relic with this fix.
If you are interested in doing so, make sure you are using at least
v3.6.8.168 from now on.

Please let us know if you have any additional questions or concerns.
We're eager to address them.

Even after I tried updating the newrelic gem, it still leaked memory. In the end I had to remove New Relic; it is a great tool, but we could not use it at such a cost (a memory leak).

Hope that helps you.
