erlang OTP Supervisor 崩溃

发布于 2024-08-31 14:05:50 字数 5195 浏览 4 评论 0原文

我正在研究 Erlang 文档,试图了解设置 OTP gen_server 和 Supervisor 的基础知识。每当我的 gen_server 崩溃时,我的主管也会崩溃。事实上,每当我在命令行上出错时,我的主管就会崩溃。

我希望 gen_server 在崩溃时能够重新启动。我希望命令行错误对我的服务器组件没有任何影响。我的主管根本不应该崩溃。

我正在使用的代码是一个基本的“回显服务器”,它会回复您发送的任何内容,以及一个每分钟最多重新启动 echo_server 5 次的主管(one_for_one)。我的代码:

echo_server.erl

-module(echo_server).
-behaviour(gen_server).

-export([start_link/0]).
-export([echo/1, crash/0]).
-export([init/1, handle_call/3, handle_cast/2]).

start_link() ->
    gen_server:start_link({local, echo_server}, echo_server, [], []).

%% public api
echo(Text) ->
    gen_server:call(echo_server, {echo, Text}).
crash() ->
    gen_server:call(echo_server, crash)..

%% behaviours
init(_Args) ->
    {ok, none}.
handle_call(crash, _From, State) ->
    X=1,
    {reply, X=2, State}.
handle_call({echo, Text}, _From, State) ->
    {reply, Text, State}.
handle_cast(_, State) ->
    {noreply, State}.

echo_sup.erl

-module(echo_sup).
-behaviour(supervisor).
-export([start_link/0]).
-export([init/1]).

start_link() ->
    supervisor:start_link(echo_sup, []).
init(_Args) ->
    {ok,  {{one_for_one, 5, 60},
       [{echo_server, {echo_server, start_link, []},
             permanent, brutal_kill, worker, [echo_server]}]}}.

使用 erlc *.erl 编译,这是一个示例运行:

Erlang R13B01 (erts-5.7.2) [source] [smp:2:2] [rq:2] [async-threads:0] [kernel-p
oll:false]

Eshell V5.7.2  (abort with ^G)
1> echo_sup:start_link().
{ok,<0.37.0>}
2> echo_server:echo("hi").
"hi"
3> echo_server:crash().   

=ERROR REPORT==== 5-May-2010::10:05:54 ===
** Generic server echo_server terminating 
** Last message in was crash
** When Server state == none
** Reason for termination == 
** {'function not exported',
       [{echo_server,terminate,
            [{{badmatch,2},
              [{echo_server,handle_call,3},
               {gen_server,handle_msg,5},
               {proc_lib,init_p_do_apply,3}]},
             none]},
        {gen_server,terminate,6},
        {proc_lib,init_p_do_apply,3}]}

=ERROR REPORT==== 5-May-2010::10:05:54 ===
** Generic server <0.37.0> terminating 
** Last message in was {'EXIT',<0.35.0>,
                           {{{undef,
                                 [{echo_server,terminate,
                                      [{{badmatch,2},
                                        [{echo_server,handle_call,3},
                                         {gen_server,handle_msg,5},
                                         {proc_lib,init_p_do_apply,3}]},
                                       none]},
                                  {gen_server,terminate,6},
                                  {proc_lib,init_p_do_apply,3}]},
                             {gen_server,call,[echo_server,crash]}},
                            [{gen_server,call,2},
                             {erl_eval,do_apply,5},
                             {shell,exprs,6},
                             {shell,eval_exprs,6},
                             {shell,eval_loop,3}]}}
** When Server state == {state,
                            {<0.37.0>,echo_sup},
                            one_for_one,
                            [{child,<0.41.0>,echo_server,
                                 {echo_server,start_link,[]},
                                 permanent,brutal_kill,worker,
                                 [echo_server]}],
                            {dict,0,16,16,8,80,48,
                                {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],
                                 []},
                                {{[],[],[],[],[],[],[],[],[],[],[],[],[],[],
                                  [],[]}}},
                            5,60,
                            [{1273,79154,701110}],
                            echo_sup,[]}
** Reason for termination == 
** {{{undef,[{echo_server,terminate,
                          [{{badmatch,2},
                            [{echo_server,handle_call,3},
                             {gen_server,handle_msg,5},
                             {proc_lib,init_p_do_apply,3}]},
                           none]},
             {gen_server,terminate,6},
             {proc_lib,init_p_do_apply,3}]},
     {gen_server,call,[echo_server,crash]}},
    [{gen_server,call,2},
     {erl_eval,do_apply,5},
     {shell,exprs,6},
     {shell,eval_exprs,6},
     {shell,eval_loop,3}]}
** exception exit: {{undef,
                        [{echo_server,terminate,
                             [{{badmatch,2},
                               [{echo_server,handle_call,3},
                                {gen_server,handle_msg,5},
                                {proc_lib,init_p_do_apply,3}]},
                              none]},
                         {gen_server,terminate,6},
                         {proc_lib,init_p_do_apply,3}]},
                    {gen_server,call,[echo_server,crash]}}
     in function  gen_server:call/2
4> echo_server:echo("hi").
** exception exit: {noproc,{gen_server,call,[echo_server,{echo,"hi"}]}}
     in function  gen_server:call/2
5>

I'm working through the Erlang documentation, trying to understand the basics of setting up an OTP gen_server and supervisor. Whenever my gen_server crashes, my supervisor crashes as well. In fact, whenever I have an error on the command line, my supervisor crashes.

I expect the gen_server to be restarted when it crashes. I expect command line errors to have no bearing whatsoever on my server components. My supervisor shouldn't be crashing at all.

The code I'm working with is a basic "echo server" that replies with whatever you send in, and a supervisor that will restart the echo_server 5 times per minute at most (one_for_one). My code:

echo_server.erl

-module(echo_server).
-behaviour(gen_server).

-export([start_link/0]).
-export([echo/1, crash/0]).
-export([init/1, handle_call/3, handle_cast/2]).

start_link() ->
    gen_server:start_link({local, echo_server}, echo_server, [], []).

%% public api
echo(Text) ->
    gen_server:call(echo_server, {echo, Text}).
crash() ->
    gen_server:call(echo_server, crash)..

%% behaviours
init(_Args) ->
    {ok, none}.
handle_call(crash, _From, State) ->
    X=1,
    {reply, X=2, State}.
handle_call({echo, Text}, _From, State) ->
    {reply, Text, State}.
handle_cast(_, State) ->
    {noreply, State}.

echo_sup.erl

-module(echo_sup).
-behaviour(supervisor).
-export([start_link/0]).
-export([init/1]).

start_link() ->
    supervisor:start_link(echo_sup, []).
init(_Args) ->
    {ok,  {{one_for_one, 5, 60},
       [{echo_server, {echo_server, start_link, []},
             permanent, brutal_kill, worker, [echo_server]}]}}.

Compiled using erlc *.erl, and here's a sample run:

Erlang R13B01 (erts-5.7.2) [source] [smp:2:2] [rq:2] [async-threads:0] [kernel-p
oll:false]

Eshell V5.7.2  (abort with ^G)
1> echo_sup:start_link().
{ok,<0.37.0>}
2> echo_server:echo("hi").
"hi"
3> echo_server:crash().   

=ERROR REPORT==== 5-May-2010::10:05:54 ===
** Generic server echo_server terminating 
** Last message in was crash
** When Server state == none
** Reason for termination == 
** {'function not exported',
       [{echo_server,terminate,
            [{{badmatch,2},
              [{echo_server,handle_call,3},
               {gen_server,handle_msg,5},
               {proc_lib,init_p_do_apply,3}]},
             none]},
        {gen_server,terminate,6},
        {proc_lib,init_p_do_apply,3}]}

=ERROR REPORT==== 5-May-2010::10:05:54 ===
** Generic server <0.37.0> terminating 
** Last message in was {'EXIT',<0.35.0>,
                           {{{undef,
                                 [{echo_server,terminate,
                                      [{{badmatch,2},
                                        [{echo_server,handle_call,3},
                                         {gen_server,handle_msg,5},
                                         {proc_lib,init_p_do_apply,3}]},
                                       none]},
                                  {gen_server,terminate,6},
                                  {proc_lib,init_p_do_apply,3}]},
                             {gen_server,call,[echo_server,crash]}},
                            [{gen_server,call,2},
                             {erl_eval,do_apply,5},
                             {shell,exprs,6},
                             {shell,eval_exprs,6},
                             {shell,eval_loop,3}]}}
** When Server state == {state,
                            {<0.37.0>,echo_sup},
                            one_for_one,
                            [{child,<0.41.0>,echo_server,
                                 {echo_server,start_link,[]},
                                 permanent,brutal_kill,worker,
                                 [echo_server]}],
                            {dict,0,16,16,8,80,48,
                                {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],
                                 []},
                                {{[],[],[],[],[],[],[],[],[],[],[],[],[],[],
                                  [],[]}}},
                            5,60,
                            [{1273,79154,701110}],
                            echo_sup,[]}
** Reason for termination == 
** {{{undef,[{echo_server,terminate,
                          [{{badmatch,2},
                            [{echo_server,handle_call,3},
                             {gen_server,handle_msg,5},
                             {proc_lib,init_p_do_apply,3}]},
                           none]},
             {gen_server,terminate,6},
             {proc_lib,init_p_do_apply,3}]},
     {gen_server,call,[echo_server,crash]}},
    [{gen_server,call,2},
     {erl_eval,do_apply,5},
     {shell,exprs,6},
     {shell,eval_exprs,6},
     {shell,eval_loop,3}]}
** exception exit: {{undef,
                        [{echo_server,terminate,
                             [{{badmatch,2},
                               [{echo_server,handle_call,3},
                                {gen_server,handle_msg,5},
                                {proc_lib,init_p_do_apply,3}]},
                              none]},
                         {gen_server,terminate,6},
                         {proc_lib,init_p_do_apply,3}]},
                    {gen_server,call,[echo_server,crash]}}
     in function  gen_server:call/2
4> echo_server:echo("hi").
** exception exit: {noproc,{gen_server,call,[echo_server,{echo,"hi"}]}}
     in function  gen_server:call/2
5>

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

命比纸薄 2024-09-07 14:05:50

从 shell 测试主管的问题是主管进程链接到 shell 进程。当 gen_server 进程崩溃时,退出信号会传播到崩溃并重新启动的 shell。

为了避免这个问题,向主管添加这样的内容:

start_in_shell_for_testing() ->
    {ok, Pid} = supervisor:start_link(echo_sup, []),
    unlink(Pid).

The problem testing supervisors from the shell is that the supervisor process is linked to the shell process. When gen_server process crashes the exit signal is propagated up to the shell which crashes and get restarted.

To avoid the problem add something like this to the supervisor:

start_in_shell_for_testing() ->
    {ok, Pid} = supervisor:start_link(echo_sup, []),
    unlink(Pid).
深府石板幽径 2024-09-07 14:05:50

我建议您调试/跟踪您的应用程序以检查发生了什么。这对于理解 OTP 中的工作原理非常有帮助。

对于您的情况,您可能需要执行以下操作。

启动跟踪器:

dbg:tracer().

跟踪您的主管和 gen_server 的所有函数调用:

dbg:p(all,c).
dbg:tpl(echo_server, x).
dbg:tpl(echo_sup, x).

检查进程正在传递哪些消息:

dbg:p(new, m).

查看进程发生了什么(崩溃等):

dbg:p(new, p).

有关跟踪的更多信息:

http://www.erlang.org/doc/man/dbg.html

http://aloiroberto.wordpress.com/2009/02/23/tracing-erlang-functions/

希望这对当前和未来的情况有所帮助。

提示: gen_server 行为期望定义并导出回调terminate/2 ;)

更新: 在定义terminate/2之后从痕迹中可以看出事故原因。它看起来是这样的:

我们 (75) 调用 crash/0 函数。这由 gen_server (78) 接收。

(<0.75.0>) call echo_server:crash()
(<0.75.0>) <0.78.0> ! {'$gen_call',{<0.75.0>,#Ref<0.0.0.358>},crash}
(<0.78.0>) << {'$gen_call',{<0.75.0>,#Ref<0.0.0.358>},crash}
(<0.78.0>) call echo_server:handle_call(crash,{<0.75.0>,#Ref<0.0.0.358>},none)

呃,句柄调用有问题。我们有一个不匹配...

(<0.78.0>) exception_from {echo_server,handle_call,3} {error,{badmatch,2}}

调用终止函数。服务器退出并且取消注册。

(<0.78.0>) call echo_server:terminate({{badmatch,2},
 [{echo_server,handle_call,3},
  {gen_server,handle_msg,5},
  {proc_lib,init_p_do_apply,3}]},none)
(<0.78.0>) returned from echo_server:terminate/2 -> ok
(<0.78.0>) exit {{badmatch,2},
 [{echo_server,handle_call,3},
  {gen_server,handle_msg,5},
  {proc_lib,init_p_do_apply,3}]}
(<0.78.0>) unregister echo_server

主管 (77) 收到来自 gen_server 的退出信号并完成其工作:

(<0.77.0>) << {'EXIT',<0.78.0>,
                      {{badmatch,2},
                       [{echo_server,handle_call,3},
                        {gen_server,handle_msg,5},
                        {proc_lib,init_p_do_apply,3}]}}
(<0.77.0>) getting_unlinked <0.78.0>
(<0.75.0>) << {'DOWN',#Ref<0.0.0.358>,process,<0.78.0>,
                      {{badmatch,2},
                       [{echo_server,handle_call,3},
                        {gen_server,handle_msg,5},
                        {proc_lib,init_p_do_apply,3}]}}
(<0.77.0>) call echo_server:start_link()

好吧,它尝试......因为它发生了 Filippo 所说的......

I would suggest you to debug/trace your application to check what's going on. It's very helpful in understanding how things work in OTP.

In your case, you might want to do the following.

Start the tracer:

dbg:tracer().

Trace all function calls for your supervisor and your gen_server:

dbg:p(all,c).
dbg:tpl(echo_server, x).
dbg:tpl(echo_sup, x).

Check which messages the processes are passing:

dbg:p(new, m).

See what's happening to your processes (crash, etc):

dbg:p(new, p).

For more information about tracing:

http://www.erlang.org/doc/man/dbg.html

http://aloiroberto.wordpress.com/2009/02/23/tracing-erlang-functions/

Hope this can help for this and future situations.

HINT: The gen_server behaviour is expecting the callback terminate/2 to be defined and exported ;)

UPDATE: After the definition of the terminate/2 the reason of the crash is evident from the trace. This is how it looks:

We (75) call the crash/0 function. This is received by the gen_server (78).

(<0.75.0>) call echo_server:crash()
(<0.75.0>) <0.78.0> ! {'$gen_call',{<0.75.0>,#Ref<0.0.0.358>},crash}
(<0.78.0>) << {'$gen_call',{<0.75.0>,#Ref<0.0.0.358>},crash}
(<0.78.0>) call echo_server:handle_call(crash,{<0.75.0>,#Ref<0.0.0.358>},none)

Uh, problem on the handle call. We have a badmatch...

(<0.78.0>) exception_from {echo_server,handle_call,3} {error,{badmatch,2}}

The terminate function is called. The server exits and it gets unregistered.

(<0.78.0>) call echo_server:terminate({{badmatch,2},
 [{echo_server,handle_call,3},
  {gen_server,handle_msg,5},
  {proc_lib,init_p_do_apply,3}]},none)
(<0.78.0>) returned from echo_server:terminate/2 -> ok
(<0.78.0>) exit {{badmatch,2},
 [{echo_server,handle_call,3},
  {gen_server,handle_msg,5},
  {proc_lib,init_p_do_apply,3}]}
(<0.78.0>) unregister echo_server

The Supervisor (77) receive the exit signal from the gen_server and it does its job:

(<0.77.0>) << {'EXIT',<0.78.0>,
                      {{badmatch,2},
                       [{echo_server,handle_call,3},
                        {gen_server,handle_msg,5},
                        {proc_lib,init_p_do_apply,3}]}}
(<0.77.0>) getting_unlinked <0.78.0>
(<0.75.0>) << {'DOWN',#Ref<0.0.0.358>,process,<0.78.0>,
                      {{badmatch,2},
                       [{echo_server,handle_call,3},
                        {gen_server,handle_msg,5},
                        {proc_lib,init_p_do_apply,3}]}}
(<0.77.0>) call echo_server:start_link()

Well, it tries... Since it happens what Filippo said...

错々过的事 2024-09-07 14:05:50

另一方面,如果必须从控制台内测试重新启动策略,请使用控制台启动主管并使用 pman 检查以终止进程。

您会看到 pman 使用相同的管理程序 Pid 进行刷新,但使用不同的工作程序 Pid,具体取决于您在重启策略中设置的 MaxR 和 MaxT。

On the other hand, if at all restart-strategy has to be tested from within console, use console to start the supervisor and check with pman to kill the process.

You would see that pman refreshes with same supervisor Pid but with different worker Pids depending upon the MaxR and MaxT you have set in restart-strategy.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文