Mnesia can't connect to another node

Posted on 2024-11-28 06:08:43

I am setting up a RabbitMQ cluster and ran into an issue at one step in the process. It's straight out of the RabbitMQ clustering guide.

root@celery:~# rabbitmqctl status
Status of node celery@celery ...
[{pid,20410},
 {running_applications,[{rabbit,"RabbitMQ","2.5.1"},
                        {os_mon,"CPO  CXC 138 46","2.2.4"},
                        {sasl,"SASL  CXC 138 11","2.1.8"},
                        {mnesia,"MNESIA  CXC 138 12","4.4.12"},
                        {stdlib,"ERTS  CXC 138 10","1.16.4"},
                        {kernel,"ERTS  CXC 138 10","2.13.4"}]},
 {os,{unix,linux}},
 {erlang_version,"Erlang R13B03 (erts-5.7.4) [source] [64-bit] [smp:4:4] [rq:4] [async-threads:30] [hipe] [kernel-poll:true]\n"},
 {memory,[{total,25296704},
          {processes,9680280},
          {processes_used,9662720},
          {system,15616424},
          {atom,1099393},
          {atom_used,1082732},
          {binary,89768},
          {code,11606637},
          {ets,726848}]}]
...done.
root@celery:~# rabbitmqctl cluster_status
Cluster status of node celery@celery ...
[{nodes,[{disc,[celery@celery]}]},{running_nodes,[celery@celery]}]
...done.
root@celery:~# rabbitmqctl stop_app
Stopping node celery@celery ...
...done.
root@celery:~# rabbitmqctl reset
Resetting node celery@celery ...
...done.
root@celery:~# rabbitmqctl cluster worker1@worker1
Clustering node celery@celery with [worker1@worker1] ...
Error: {failed_to_cluster_with,[worker1@worker1],
                               "Mnesia could not connect to some nodes."}

What are the possible reasons one node wouldn't be able to connect to another?

Here's the guide I'm following: http://www.rabbitmq.com/clustering.html

Comments (4)

放低过去 2024-12-05 06:08:43

I jumped into the #rabbitmq channel on freenode. Here's the discussion that followed:

14:29 shakakai: hey all, i'm having a little issue with clustering rabbitmq http://stackoverflow.com/questions/6948624/mnesia-cant-connect-to-another-node
14:30 shakakai: has anyone run into that problem before?
14:30 daysmen has left IRC (Read error: Connection reset by peer)
14:30 antares_: shakakai: make sure that epmd is running on every node
14:30 antares_: shakakai: and that port it uses (4369) is open in your firewall
14:31 |Blaze|: shakakai: is your dns correct?  Can you ping worker1 from celery and celery from worker1
14:31 shakakai: |Blaze|: hmm...i'll check
14:31 daysmen has joined ([email protected])
14:32 shakakai: |Blaze|: this is where I'm a little confused, the rabbitmq nodename is worker1@worker1 but the fqdn to ping the box is "ping worker1.mydomain.com"
14:33 |Blaze|: can you "ping worker1"
14:34 shakakai: |Blaze|: no
14:34 |Blaze|: k, you'll need to fix that
14:34 hyperboreean has left IRC (Ping timeout: 250 seconds)
14:37 shakakai: |Blaze|: gotcha, so I setup a hosts file and i should be good
14:37 |Blaze|: yup
14:37 |Blaze|: in both directions

TL;DR

Make sure you can ping the host part of each rabbit nodename (the part after the @) from each of the boxes you are clustering. If you can't, add a hosts file entry for each of those hostnames, in both directions.
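
The check described above can be sketched as a small shell helper. This is only an illustration, assuming a Linux box where `getent` is available; `node_host` and `check_nodes` are hypothetical names, not part of RabbitMQ.

```shell
#!/bin/sh
# Sketch: verify that the host part of each node name resolves on this box.
# node_host and check_nodes are hypothetical helper names.

# Print the host part of an Erlang node name (everything after the first "@").
node_host() {
    printf '%s\n' "${1#*@}"
}

# For each node name given, report whether its host resolves via getent
# (which consults /etc/hosts and DNS). Returns non-zero if any host fails.
check_nodes() {
    rc=0
    for node in "$@"; do
        host=$(node_host "$node")
        if getent hosts "$host" >/dev/null 2>&1; then
            echo "ok: $host resolves"
        else
            echo "FAIL: $host does not resolve; add it to /etc/hosts or fix DNS"
            rc=1
        fi
    done
    return "$rc"
}
```

Running `check_nodes celery@celery worker1@worker1` on every box covers the "in both directions" part. `epmd -names` on each box is a quick way to confirm that epmd is running there.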

绿萝 2024-12-05 06:08:43

I ran into a similar problem while installing RabbitMQ with Docker.

The main cause was an incorrect /var/lib/rabbitmq/mnesia/rabbit/cluster_nodes.config file, which prevented the node from connecting.

Mnesia is a distributed, soft real-time database management system written in the Erlang programming language.

There are a couple of ways to fix this:

  1. Fix the configuration file so that it uses the correct cluster node name. From the log we can see that our node name is rabbit@cb43449d5d72:
// log info 
...
rabbitmq    |   Starting broker...2019-11-27 16:18:22.621 [info] <0.304.0>
rabbitmq    |  node           : rabbit@cb43449d5d72
...

// This is the wrong configuration file:
$ cat ./mnesia/rabbit/cluster_nodes.config
{[rabbit@cb43449d5d72,rabbit@dc3288264c34],[rabbit@dc3288264c34]}.

// Update it with the correct node name, then restart the RabbitMQ server:
$ cat ./mnesia/rabbit/cluster_nodes.config
{[rabbit@cb43449d5d72],[rabbit@cb43449d5d72]}.
  2. The simplest way is to remove the mnesia directory (this discards the node's stored state) and configure a correct node name such as rabbit@my-rabbit, with a matching /etc/hosts entry (127.0.0.1 my-rabbit). After restarting, you should see the following configuration:
$ find . -name cluster_nodes.config
./mnesia/rabbit/cluster_nodes.config
./mnesia/rabbit@my-rabbit/cluster_nodes.config

$ cat ./mnesia/rabbit@my-rabbit/cluster_nodes.config
{['rabbit@my-rabbit'],['rabbit@my-rabbit']}.
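
To spot a stale entry like the one in the first fix, something along these lines might help. It is a rough sketch that assumes the file looks like the examples above (node names of the form name@host) and that only one node is expected. `stale_nodes` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: list node names in cluster_nodes.config that are NOT the expected node.
# stale_nodes is a hypothetical helper name.
stale_nodes() {
    expected=$1
    file=$2
    if [ ! -r "$file" ]; then
        echo "cannot read $file" >&2
        return 2
    fi
    # Pull out every name@host token, de-duplicate, then drop the expected node.
    # "|| true" keeps the exit status at 0 when nothing is left over.
    grep -o '[A-Za-z0-9_]*@[A-Za-z0-9_.-]*' "$file" \
        | sort -u \
        | grep -v -x -F -e "$expected" || true
}
```

For example, `stale_nodes rabbit@cb43449d5d72 ./mnesia/rabbit/cluster_nodes.config` prints any other node names still referenced in the file. Empty output means the file only mentions the expected node.
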
攒眉千度 2024-12-05 06:08:43

There are several things to check before you can get the cluster to work:
0) Ensure you are running exactly the same RabbitMQ version on each node.
1) Set up the network so that every server can ping every other server.
2) Cookies: the .erlang.cookie file must contain exactly the same Erlang cookie on each server.
One useful trick is to run this command on one node to see whether it can reach another node from RabbitMQ:
rabbitmqctl eval 'net_adm:ping(rabbit@othernode).'

This should print pang if the other node is unreachable, or pong if it is reachable.
Be careful not to forget the dot at the end of the eval expression.

I got it working fine after several hours of unsuccessful trials.

3) Bear in mind that restarting a cluster node can fail if that node was not the last one to be stopped: it won't start until the node that was stopped last has been restarted.
When all the above (0 to 2) are correct, 3 may well be the root cause of your problem...
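
The net_adm:ping check from step 2 can be wrapped in a small loop when there are several nodes to test. This is only a sketch: `ping_expr` and `ping_nodes` are hypothetical helper names, and it assumes `rabbitmqctl` is on the PATH and the local broker is running. The node name is single-quoted inside the expression so that host names containing "-" or "." are still valid Erlang atoms.

```shell
#!/bin/sh
# Sketch: ping several nodes from the local broker via rabbitmqctl eval.
# ping_expr and ping_nodes are hypothetical helper names.

# Build the Erlang expression, including the trailing dot that eval requires.
ping_expr() {
    printf "net_adm:ping('%s').\n" "$1"
}

# Print "<node>: pong" or "<node>: pang" for each node name given.
ping_nodes() {
    for node in "$@"; do
        printf '%s: ' "$node"
        rabbitmqctl eval "$(ping_expr "$node")"
    done
}
```

Usage would look like `ping_nodes rabbit@othernode rabbit@my-rabbit`, run once on each cluster member.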

Hope this helps,
cheers,
jb

无戏配角 2024-12-05 06:08:43

One thing I've read is that the same Erlang cookie needs to be on all cluster nodes so that they can communicate. I believe it lives in /var/lib/rabbitmq/.erlang.cookie.
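
One way to compare cookies without pasting the secret around is to compare digests. This is a minimal sketch assuming `sha256sum` from GNU coreutils is installed. `cookie_digest` and `same_cookie` are hypothetical helper names.

```shell
#!/bin/sh
# Sketch: compare Erlang cookie files by digest instead of by content.
# cookie_digest and same_cookie are hypothetical helper names.

# Print the SHA-256 digest of a cookie file.
cookie_digest() {
    sha256sum "$1" | cut -d ' ' -f 1
}

# Succeed only if two cookie files have identical content.
same_cookie() {
    [ "$(cookie_digest "$1")" = "$(cookie_digest "$2")" ]
}
```

Running `cookie_digest /var/lib/rabbitmq/.erlang.cookie` on each node (as a user allowed to read the file) and comparing the output shows whether the cookies match. If the digests differ, the nodes will refuse to talk to each other.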
