Mnesia can't connect to another node

Posted 2024-11-28 06:08:43

I am setting up a RabbitMQ cluster and ran into an issue at one step in the process. It's straight out of the RabbitMQ clustering guide.

root@celery:~# rabbitmqctl status
Status of node celery@celery ...
[{pid,20410},
 {running_applications,[{rabbit,"RabbitMQ","2.5.1"},
                        {os_mon,"CPO  CXC 138 46","2.2.4"},
                        {sasl,"SASL  CXC 138 11","2.1.8"},
                        {mnesia,"MNESIA  CXC 138 12","4.4.12"},
                        {stdlib,"ERTS  CXC 138 10","1.16.4"},
                        {kernel,"ERTS  CXC 138 10","2.13.4"}]},
 {os,{unix,linux}},
 {erlang_version,"Erlang R13B03 (erts-5.7.4) [source] [64-bit] [smp:4:4] [rq:4] [async-threads:30] [hipe] [kernel-poll:true]\n"},
 {memory,[{total,25296704},
          {processes,9680280},
          {processes_used,9662720},
          {system,15616424},
          {atom,1099393},
          {atom_used,1082732},
          {binary,89768},
          {code,11606637},
          {ets,726848}]}]
...done.
root@celery:~# rabbitmqctl cluster_status
Cluster status of node celery@celery ...
[{nodes,[{disc,[celery@celery]}]},{running_nodes,[celery@celery]}]
...done.
root@celery:~# rabbitmqctl stop_app
Stopping node celery@celery ...
...done.
root@celery:~# rabbitmqctl reset
Resetting node celery@celery ...
...done.
root@celery:~# rabbitmqctl cluster worker1@worker1
Clustering node celery@celery with [worker1@worker1] ...
Error: {failed_to_cluster_with,[worker1@worker1],
                               "Mnesia could not connect to some nodes."}

What are the possible reasons one node wouldn't be able to connect to another?

Here's the guide I'm following: http://www.rabbitmq.com/clustering.html

放低过去 2024-12-05 06:08:43

I jumped into the #rabbitmq channel on freenode. Here's the discussion that followed:

14:29 shakakai: hey all, i'm having a little issue with clustering rabbitmq http://stackoverflow.com/questions/6948624/mnesia-cant-connect-to-another-node
14:30 shakakai: has anyone run into that problem before?
14:30 daysmen has left IRC (Read error: Connection reset by peer)
14:30 antares_: shakakai: make sure that epmd is running on every node
14:30 antares_: shakakai: and that port it uses (4369) is open in your firewall
14:31 |Blaze|: shakakai: is your dns correct?  Can you ping worker1 from celery and celery from worker1
14:31 shakakai: |Blaze|: hmm...i'll check
14:31 daysmen has joined ([email protected])
14:32 shakakai: |Blaze|: this is where I'm a little confused, the rabbitmq nodename is worker1@worker1 but the fqdn to ping the box is "ping worker1.mydomain.com"
14:33 |Blaze|: can you "ping worker1"
14:34 shakakai: |Blaze|: no
14:34 |Blaze|: k, you'll need to fix that
14:34 hyperboreean has left IRC (Ping timeout: 250 seconds)
14:37 shakakai: |Blaze|: gotcha, so I setup a hosts file and i should be good
14:37 |Blaze|: yup
14:37 |Blaze|: in both directions

TL;DR

Make sure you can ping the host part of each rabbit nodename (for example, worker1 in worker1@worker1) from every box you are clustering, in both directions. If you can't, add a hosts file entry for each rabbit nodename.
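
The checks suggested in the discussion can be sketched as a small shell script. This is only a sketch under assumptions: node_host and check_node are hypothetical helper names that do not come from the thread, and the script relies on getent and nc (netcat) being installed, which is common on Linux but not guaranteed.

```shell
#!/bin/sh
# Sketch: verify that the host part of a RabbitMQ node name resolves
# and that epmd (port 4369) is reachable. Helper names are hypothetical.

# Print the host part of a node name, e.g. worker1@worker1 -> worker1.
node_host() {
    printf '%s\n' "${1#*@}"
}

# Return 0 if the node's host resolves and its epmd port accepts connections.
check_node() {
    host=$(node_host "$1")
    if ! getent hosts "$host" >/dev/null; then
        echo "cannot resolve $host; add it to DNS or /etc/hosts"
        return 1
    fi
    # Assumes a netcat that supports -z (scan only) and -w (timeout).
    if ! nc -z -w 3 "$host" 4369; then
        echo "epmd port 4369 on $host is not reachable"
        return 1
    fi
    echo "$host resolves and epmd is reachable"
}
```

Running check_node worker1@worker1 on celery and check_node celery@celery on worker1 covers both directions. If resolution fails, an /etc/hosts line mapping the other box's IP address to its short name, added on each box, is the fix the thread arrived at.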

绿萝 2024-12-05 06:08:43

I ran into a similar problem while installing RabbitMQ in Docker.

The main cause was an incorrect /var/lib/rabbitmq/mnesia/rabbit/cluster_nodes.config configuration file, which left the node unable to connect.

Mnesia is a distributed, soft real-time database management system written in the Erlang programming language.

There are several ways to fix this problem:

Fix the configuration file so that it uses the correct cluster node name. From the log we can see that our node name is rabbit@cb43449d5d72:

// log info 
...
rabbitmq    |   Starting broker...2019-11-27 16:18:22.621 [info] <0.304.0>
rabbitmq    |  node           : rabbit@cb43449d5d72
...

// This is the wrong configuration file:
$ cat ./mnesia/rabbit/cluster_nodes.config
{[rabbit@cb43449d5d72,rabbit@dc3288264c34],[rabbit@dc3288264c34]}.

// Update it with the correct node name, then restart the RabbitMQ server:
$ cat ./mnesia/rabbit/cluster_nodes.config
{[rabbit@cb43449d5d72],[rabbit@cb43449d5d72]}.
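
To spot this kind of mismatch, it can help to list every node name in the file other than the one the broker reports in its log. The following is a minimal sketch, assuming the file path and node name are supplied by hand; other_nodes is a hypothetical helper, not a RabbitMQ command, and it relies on grep -o, which GNU and BSD grep provide.

```shell
#!/bin/sh
# Sketch: print node names found in a cluster_nodes.config file that
# differ from this node's own name. other_nodes is a hypothetical helper.
# Exit status is 0 when at least one other node is listed.
other_nodes() {
    # $1 = path to cluster_nodes.config
    # $2 = this node's name, e.g. rabbit@cb43449d5d72
    grep -oE '[A-Za-z0-9_-]+@[A-Za-z0-9_.-]+' "$1" | sort -u | grep -vxF -- "$2"
}
```

For the wrong file shown above, other_nodes ./mnesia/rabbit/cluster_nodes.config rabbit@cb43449d5d72 prints rabbit@dc3288264c34. On a single-node setup any output points to a stale entry; in a real multi-node cluster, other names are expected.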

The simplest way is to remove the mnesia directory (which discards the node's existing data) and configure a correct node name, such as rabbit@my-rabbit, with a matching /etc/hosts entry of 127.0.0.1 my-rabbit. After restarting, you should see the following configuration:

$ find . -name cluster_nodes.config
./mnesia/rabbit/cluster_nodes.config
./mnesia/rabbit@my-rabbit/cluster_nodes.config

$ cat ./mnesia/rabbit@my-rabbit/cluster_nodes.config
{['rabbit@my-rabbit'],['rabbit@my-rabbit']}.

攒眉千度 2024-12-05 06:08:43

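There are several things to check before a cluster will work:
0) Make sure you run exactly the same RabbitMQ version on every node.
1) Set up the network until every server can ping every other server.
2) Cookie: every server must have exactly the same Erlang cookie in its .erlang.cookie file.
A useful trick is to run this command on one node to see whether it can reach another node from RabbitMQ:
rabbitmqctl eval 'net_adm:ping(rabbit@othernode).'

It should display pang if the node is not reachable, and pong if it is.
Be careful not to forget the dot near the end of the eval expression.

After several hours of unsuccessful attempts, I found that this worked very well.

3) Keep in mind that if a cluster node was not the last one to stop, you may have trouble restarting it: it will not start until the last node to stop has been restarted.
When all of the above (0 to 2) is correct, 3 is very likely the root cause of your problem.

Hope this helps,
Cheers,
新山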
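
The cookie and ping checks from this answer could be wrapped as follows. This is a sketch, not part of the original answer: cookie_digest and ping_node are hypothetical helper names, and the cookie file path has to be supplied by whoever runs it.

```shell
#!/bin/sh
# Sketch: helpers for the cookie and reachability checks.
# Both function names are hypothetical.

# Print a checksum of an Erlang cookie file so that two nodes can be
# compared without pasting the cookie itself anywhere.
cookie_digest() {
    # $1 = path to a .erlang.cookie file
    cksum < "$1"
}

# Ask the local broker's Erlang VM to ping another node.
# net_adm:ping returns pong if the node is reachable and pang otherwise.
ping_node() {
    # $1 = node name, e.g. rabbit@othernode; quoted so names with
    # hyphens are still valid Erlang atoms. Note the trailing dot.
    rabbitmqctl eval "net_adm:ping('$1')."
}
```

Running cookie_digest against the .erlang.cookie file on each server and comparing the printed values shows whether the cookies match. The file often lives in the rabbitmq user's home directory, but the location varies by installation, so treat any specific path as an assumption.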