Erlang: Unable to start a supervisor on another node

Posted on 2025-01-12 06:23:22

I have a simple supervisor that looks like this:

-module(a_sup).
-behaviour(supervisor).

%% API
-export([start_link/0, init/1]).

start_link() ->
  supervisor:start_link({local,?MODULE}, ?MODULE, []).

init(_Args) ->
  RestartStrategy = {simple_one_for_one, 5, 3600},
  ChildSpec = {
    a_gen_server,
    {a_gen_server, start_link, []},
    permanent,
    brutal_kill,
    worker,
    [a_gen_server]
  },
  {ok, {RestartStrategy,[ChildSpec]}}.

When I run this in the shell, it works perfectly fine. Now I want to run separate instances of this supervisor on different nodes called foo and bar (started with erl -sname foo and erl -sname bar), controlled from a separate node called main (started with erl -sname main). This is how I try to start it from main: rpc:call('foo@My-MacBook-Pro', a_sup, start_link, []). The call replies with ok, but the supervisor then fails immediately with this message:

{ok,<9098.117.0>}
=ERROR REPORT==== 7-Mar-2022::16:05:45.416820 ===
** Generic server a_sup terminating 
** Last message in was {'EXIT',<9098.116.0>,
                               {#Ref<0.3172713737.1597505552.87599>,return,
                                {ok,<9098.117.0>}}}
** When Server state == {state,
                            {local,a_sup},
                            simple_one_for_one,
                            {[a_gen_server],
                             #{a_gen_server =>
                                   {child,undefined,a_gen_server,
                                       {a_gen_server,start_link,[]},
                                       permanent,false,brutal_kill,worker,
                                       [a_gen_server]}}},
                            {maps,#{}},
                            5,3600,[],0,never,a_sup,[]}
** Reason for termination ==
** {#Ref<0.3172713737.1597505552.87599>,return,{ok,<9098.117.0>}}

(main@Prachis-MacBook-Pro)2> =CRASH REPORT==== 7-Mar-2022::16:05:45.416861 ===
  crasher:
    initial call: supervisor:a_sup/1
    pid: <9098.117.0>
    registered_name: a_sup
    exception exit: {#Ref<0.3172713737.1597505552.87599>,return,
                     {ok,<9098.117.0>}}
      in function  gen_server:decode_msg/9 (gen_server.erl, line 481)
    ancestors: [<9098.116.0>]
    message_queue_len: 0
    messages: []
    links: []
    dictionary: []
    trap_exit: true
    status: running
    heap_size: 610
    stack_size: 29
    reductions: 425
  neighbours:

From the message, it looks like the call expects the supervisor to be a gen_server instead? When I start a gen_server on the node in the same way, it works fine, but a supervisor does not. I can't figure out whether starting a supervisor on a remote node differs from starting one locally and, if so, what should be done to fix the issue.


Comments (1)

谁的年少不轻狂 2025-01-19 06:23:24

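As per @JoséM's suggestion, the supervisor on the remote node is linked to the ephemeral process that runs the RPC call. When that process exits after returning the result, the supervisor receives its exit signal and terminates as well, which is the 'EXIT' message shown in the error report. Since the supervisor module does not provide a start function without a link, modifying the start_link() function to unlink the supervisor from its caller, as in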

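start_link() ->
  %% supervisor:start_link/3 returns {ok, Pid}, so match on the tuple
  {ok, Pid} = supervisor:start_link({local,?MODULE}, ?MODULE, []),
  %% drop the link to the short-lived caller (the RPC process)
  unlink(Pid),
  {ok, Pid}.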

solves the issue.
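
The effect can also be reproduced on a single node, without rpc. The following is only a minimal sketch under stated assumptions: the module name link_demo and the functions run/1 and start/0 are hypothetical and not part of the original code. A throwaway process stands in for the RPC process, and the sketch checks whether the supervisor outlives it. It also keeps start_link/0 unchanged and puts the unlink into a separate start/0, which may be preferable if the same supervisor is later started under a parent supervisor that relies on the link.

```erlang
-module(link_demo).
-behaviour(supervisor).

%% link_demo, run/1 and start/0 are hypothetical names for this sketch only.
-export([run/1, start/0, start_link/0, init/1]).

%% Same shape as a_sup:start_link/0 in the question: the caller gets linked.
start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

%% Variant for short-lived callers such as an RPC process: start the
%% supervisor, then drop the link so it is not tied to the caller's lifetime.
start() ->
    {ok, Pid} = start_link(),
    unlink(Pid),
    {ok, Pid}.

%% simple_one_for_one starts no children in init/1, so the a_gen_server
%% module from the question does not have to exist for this demo.
init(_Args) ->
    SupFlags = {simple_one_for_one, 5, 3600},
    ChildSpec = {a_gen_server, {a_gen_server, start_link, []},
                 permanent, brutal_kill, worker, [a_gen_server]},
    {ok, {SupFlags, [ChildSpec]}}.

%% Starts the supervisor from a throwaway process and reports whether the
%% supervisor is still alive after that process has exited.
run(Mode) ->
    Caller = self(),
    spawn(fun() ->
                  {ok, Pid} = case Mode of
                                  linked   -> start_link();
                                  unlinked -> start()
                              end,
                  Caller ! {started, Pid}
                  %% the fun returns here, so this process exits
          end),
    receive
        {started, Sup} ->
            Ref = erlang:monitor(process, Sup),
            receive
                {'DOWN', Ref, process, Sup, _Reason} ->
                    died
            after 500 ->
                    %% still running: stop it and wait until it is gone
                    exit(Sup, kill),
                    receive {'DOWN', Ref, process, Sup, _} -> ok end,
                    survived
            end
    end.
```

Calling link_demo:run(linked) should report died, because the supervisor traps exits and terminates when the process that started it exits, while link_demo:run(unlinked) should report survived.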
