RabbitMQ Erlang distribution failed - windows

I have two Windows Server 2012 R2 machines located in one of the client's datacenters. Both servers are domain-joined, and both have RabbitMQ 3.6.0 installed and running as a Windows service. I have been trying to cluster these two machines for a long time without success, and I always get the following error.
On the first machine, nodeA, I run the command 'rabbitmqctl join_cluster rabbit@nodeB'. This is what I get:
Clustering node 'rabbit@nodeA' with 'rabbit@nodeB' ...
Error: unable to connect to nodes ['rabbit@nodeB']: nodedown
DIAGNOSTICS
===========
attempted to contact: ['rabbit@nodeB']
rabbit@nodeB:
* connected to epmd (port 4369) on nodeB
* epmd reports node 'rabbit' running on port 25672
* TCP connection succeeded but Erlang distribution failed
* suggestion: hostname mismatch?
* suggestion: is the cookie set correctly?
* suggestion: is the Erlang distribution using TLS?
current node details:
- node name: 'rabbitmq-cli-3892@nodeA'
- home dir: C:\Users\mydirectory
- cookie hash: l+SSu57+cRyAQ03AJdwAbQ==
I've tried this setup with Azure Virtual Machines within an Azure Virtual Network and succeeded in clustering the two VMs. However, it seems I cannot connect these two (the customer's machines) together.
This is what I have done and ensured:
There isn't any firewall blocking connections
Added host names to hosts file located on C:\Windows\system32\drivers\etc
Tried to refer to host names as FQDN without adding anything to hosts file
Tried to refer to host names with CAPITAL letters and without
Copied the same exact .erlang.cookie to C:\Windows and C:\Users\mydirectory on both machines.
I've read, understood and applied RabbitMQ Clustering Guide https://www.rabbitmq.com/clustering.html
Stopped, restarted, reinstalled RabbitMQ on both machines.
It seems I can't get it to work. On Azure machines, which were not domain-joined clustering worked beautifully. I am really running out of options... Any help?

I had the same problem. You need to install RabbitMQ as an administrator: uninstall it, then reinstall as admin, and it should work fine.

Try connecting to each RabbitMQ node via a remote shell and check whether the cookie value is the same (the cookie can be set in three different ways; .erlang.cookie is one of them).
erl -remsh 'rabbitmq-cli-3892@nodeA' -name 'test@nodeA'
erlang:get_cookie().

Related

Unable to launch rabbitmq management console in Windows

On a Windows 7 Enterprise 64-bit OS, I installed Erlang (otp_win64_20.0.exe) and RabbitMQ 3.6.9 (64-bit) as a standalone installation. I have set the ERLANG_HOME system variable. The installation was successful and the RabbitMQ service is running.
But when I try to enable rabbitmq_management, I get the following error.
C:\Program Files\RabbitMQ Server\rabbitmq_server-3.6.9\sbin>rabbitmq-plugins.bat enable rabbitmq_management
Plugin configuration unchanged.
Applying plugin configuration to rabbit@machinename... failed.
* Could not contact node rabbit@machinename.
Changes will take effect at broker restart.
* Options: --online - fail if broker cannot be contacted.
--offline - do not try to contact broker.
C:\Program Files\RabbitMQ Server\rabbitmq_server-3.6.9\sbin>rabbitmqctl status
Status of node rabbit@machinename ...
Error: unable to connect to node rabbit@machinename: nodedown
DIAGNOSTICS
===========
attempted to contact: [rabbit@machinename]
rabbit@machinename:
* connected to epmd (port 4369) on machinename
* epmd reports node 'rabbit' running on port 25672
* TCP connection succeeded but Erlang distribution failed
* Authentication failed (rejected by the remote node), please check the Erlang cookie
current node details:
- node name: 'rabbitmq-cli-45@machinename'
- home dir: C:\
- cookie hash: LLCyvm2Dd7VpUhtY9jxerg==
I have gone through various posts on Stack Overflow and still cannot figure out the root cause of this issue with the node and the management plugin.
Any help to resolve this is highly appreciated.
It looks like you have a problem with .erlang.cookie. It contains the key that allows connecting to an Erlang node. You can read more about it in the official documentation, but the simplest solution can be found here:
Installing as a non-administrator user leaves .erlang.cookie in the wrong place
This makes it impossible to use rabbitmqctl.
Workarounds:
Run the installer as an administrator or
Copy the file .erlang.cookie manually from %SystemRoot% to %HOMEDRIVE%%HOMEPATH%.
Here the source, %SystemRoot%\.erlang.cookie, is normally C:\WINDOWS\.erlang.cookie, and the destination under %HOMEDRIVE%%HOMEPATH% should be something like C:\Documents and Settings\%USERNAME%\.erlang.cookie or C:\Users\%USERNAME%\.erlang.cookie.
This should solve your problem.
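To check whether the two copies really are identical, a small script can help. This is only a sketch: the function names are made up for illustration, and it assumes the "cookie hash" printed in the diagnostics is the Base64-encoded MD5 digest of the cookie string, and that trailing whitespace in the cookie file can be ignored.

```python
import base64
import hashlib
from pathlib import Path


def cookie_hash(cookie: str) -> str:
    """Base64-encoded MD5 digest of the cookie string.

    Assumption: this matches the 'cookie hash' line in rabbitmqctl diagnostics.
    """
    digest = hashlib.md5(cookie.encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")


def read_cookie(path) -> str:
    """Read a cookie file, dropping surrounding whitespace (assumption)."""
    return Path(path).read_text(encoding="utf-8").strip()


def cookies_match(path_a, path_b) -> bool:
    """True when both files hold the same cookie value."""
    return read_cookie(path_a) == read_cookie(path_b)


# Example usage (paths taken from the answer above; adjust to your machine):
#   service_cookie = r"C:\Windows\.erlang.cookie"
#   user_cookie = r"C:\Users\<username>\.erlang.cookie"
#   print(cookies_match(service_cookie, user_cookie))
#   print(cookie_hash(read_cookie(user_cookie)))
```

If the hash printed for the user's cookie differs from the hash of the cookie in the Windows directory, the CLI and the service are most likely using different cookies.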

rabbitmqctl Error: unable to connect to node rabbit@myserver nodedown

I am running RabbitMQ v3.3.5 with Erlang OTP 17.1 on Windows 2008 R2. My Dev and QA environments are stand-alone. My staging and production environments are clustered.
I am finding this one problem happening often where the RabbitMQ service is running, the RabbitMQ management console is seeing everything, but when I try running rabbitmqctl from the command line it fails with an error saying that the node is down (tried locally and on a remote server).
This problem is resolved if I restart the Windows service.
I see no error message in the RabbitMQ error log. The last message indicated that the node was up.
Below is an example output of the issue that I recently experienced on node 2 of our staging windows cluster:
PS C:\Program Files (x86)\RabbitMQ Server\rabbitmq_server-3.3.5\sbin> .\rabbitmqctl.bat status
Status of node rabbit@MYSERVER2 ...
Error: unable to connect to node rabbit@MYSERVER2: nodedown
DIAGNOSTICS
===========
attempted to contact: [rabbit@MYSERVER2]
rabbit@MYSERVER2:
* connected to epmd (port 4369) on MYSERVER2
* epmd reports: node 'rabbit' not running at all
no other nodes on MYSERVER2
* suggestion: start the node
current node details:
- node name: rabbitmqctl2199771@MYSERVER2
- home dir: C:\Users\RabbitMQ
- cookie hash: mn6OaTX9mS4DnZaiOzg8pA==
At this point I restart the RabbitMQ service and then try again:
PS C:\Program Files (x86)\RabbitMQ Server\rabbitmq_server-3.3.5\sbin> .\rabbitmqctl.bat status
Status of node rabbit@MYSERVER2...
[{pid,3784},
{running_applications,
[{rabbitmq_management_agent,"RabbitMQ Management Agent","3.3.5"},
{rabbit,"RabbitMQ","3.3.5"},
{os_mon,"CPO CXC 138 46","2.2.15"},
{mnesia,"MNESIA CXC 138 12","4.12.1"},
{xmerl,"XML parser","1.3.7"},
{sasl,"SASL CXC 138 11","2.4"},
{stdlib,"ERTS CXC 138 10","2.1"},
{kernel,"ERTS CXC 138 10","3.0.1"}]},
{os,{win32,nt}},
{erlang_version,
"Erlang/OTP 17 [erts-6.1] [64-bit] [smp:4:4] [async-threads:30]\n"},
{memory,
[{total,35960208},
{connection_procs,2704},
{queue_procs,5408},
{plugins,111936},
{other_proc,13695792},
{mnesia,102296},
{mgmt_db,0},
{msg_index,21816},
{other_ets,884704},
{binary,25776},
{code,16672826},
{atom,602729},
{other_system,3834221}]},
{alarms,[]},
{listeners,[{clustering,25672,"::"},{amqp,5672,"::"},{amqp,5672,"0.0.0.0"}]},
{vm_memory_high_watermark,0.4},
{vm_memory_limit,3435787059},
{disk_free_limit,50000000},
{disk_free,74911649792},
{file_descriptors,
[{total_limit,8092},
{total_used,4},
{sockets_limit,7280},
{sockets_used,2}]},
{processes,[{limit,1048576},{used,139}]},
{run_queue,0},
{uptime,5}]
...done.
Any idea as to what causes this and how to automatically detect the situation?
Is this specifically a problem with running RabbitMQ on Windows?
Hostnames are case-insensitive when you are trying to resolve them. For example, LOCALHOST and localhost are the same host.
However, when Erlang constructs the name of a node (e.g. rabbit@<hostname> in the case of RabbitMQ), this name is case-sensitive. So rabbit@LOCALHOST and rabbit@localhost are two different node names, even if they run on the same host.
Recently, we (the RabbitMQ team) found out that, on Windows, the node name constructed for RabbitMQ was inconsistent. Therefore, RabbitMQ started as a Windows service could sometimes be named rabbit@MYHOST while rabbitmqctl would try to reach rabbit@myhost and fail.
Since RabbitMQ 3.6.0, the node name should be consistent.
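For anyone who wants to detect this situation automatically on an older version, the check comes down to comparing two node names. The sketch below is an illustration only: the helper names are hypothetical, and it assumes node names of the form name@host, as they appear in the rabbitmqctl output.

```python
def split_node(node: str):
    """Split 'name@host' into (name, host)."""
    name, sep, host = node.partition("@")
    if not sep or not name or not host:
        raise ValueError(f"not a node name of the form name@host: {node!r}")
    return name, host


def differs_only_by_host_case(node_a: str, node_b: str) -> bool:
    """True when the two names refer to the same host but are distinct
    Erlang node names because the host part differs in letter case."""
    name_a, host_a = split_node(node_a)
    name_b, host_b = split_node(node_b)
    return (
        name_a == name_b
        and host_a != host_b
        and host_a.lower() == host_b.lower()
    )


# Example usage:
#   differs_only_by_host_case("rabbit@MYHOST", "rabbit@myhost")
```

A monitoring script could compare the node name the service registered with the one rabbitmqctl tries to contact and raise an alert when this function returns True.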
To anyone else getting this error, this was my fix. I installed Erlang but overlooked the instructions on setting up the environment variable.
I was reading the manual install page:
https://www.rabbitmq.com/install-windows-manual.html
and found the following:
Set ERLANG_HOME to where you actually put your Erlang installation,
e.g. C:\Program Files\erlx.x.x (full path). The RabbitMQ batch files
expect to execute %ERLANG_HOME%\bin\erl.exe.
Go to Start > Settings > Control Panel > System > Advanced >
Environment Variables. Create the system environment variable
ERLANG_HOME and set it to the full path of the directory which
contains bin\erl.exe.
For some reason, the auto install assigned the wrong path name to the ERLANG_HOME variable - see image below. I simply added \bin on the end.
I had a similar problem on my Linux box and am posting the answer here because RabbitMQ on Windows may handle things similarly.
My post and solution: rabbitmqadmin - Could not connect: [Errno -2] Name or service not known
The core issue was changing the server name after RabbitMQ was configured. When installed, RabbitMQ references the server's name, making it part of its configuration. I can see this being a similar issue on Windows.
In short, you can change the server's name back to the name it had when you first installed RabbitMQ, or you can add a rabbitmq-env.conf file. I'm not sure where it would go on Windows, but the following gives details for Linux: https://www.rabbitmq.com/man/rabbitmq-env.conf.5.man.html
Note that on linux the name of the server was CaSe SENiTivE! So you may or may not have a similar issue with windows.
Hope this helps and good luck!
If you are using Linux, try changing the permissions of the /var/lib/rabbitmq/mnesia folder.

JMeter remote connection throwing "Connection refused to host"

I set up a distributed load testing environment using JMeter on Ubuntu machines.
->Master: the system running JMeter GUI, control each slave.
->Slave: the system running jmeter-server, receive command from the master and send a request to server under test.
->Target: the web server under test, get request from slaves.
Basic requirements are done:
-The firewalls on the systems are turned off
-All the planned master and Slaves are in the same subnet
-The JMeter server can access the target.
-Same version of JMeter on all the systems (version 2.3.4 ).
I did the following:
1) Tried pinging from master to slave and vice versa through the Ubuntu terminal. It works.
2) Added the following to client (master) jmeter.properties:
# Remote hosts and RMI configuration
remote_hosts=192.168.0.139:1099
# RMI port to be used by the server (must start rmiregistry with same port)
server_port=1099
3) Added the following to server (Slave) jmeter.properties:
# On the server(s)
set server_port=1234
start rmiregistry with port 1234
4) Now started the Jmeter engine on Master.
a) Started Jmeter on master machine (GUI)
b) Created test plan--> (added thread group, samplers and required listeners)
c) Now start the Slave(s) from the GUI
-click Run at the top
-select Remote start
-select the IP address
But an error popup came up:
"Connection refused to host : 192.168.0.139; nested exception is : java.net.ConnectException : Connection Refused"
What may be the reason for not connecting to the remote slave (here: 192.168.0.139)?
Do I need to do any more configuration in the jmeter.properties file or in any other files (on both slave and master)?
I think you forgot to start the slave in "slave mode".
On the command line, go to the jmeter/bin directory and execute jmeter-server.bat (on Linux, the jmeter-server script in the same directory).
That will start the slave process and keep it listening for commands.
Then you can go forward, loading and launching the script.
have a look at:
http://jmeter.apache.org/usermanual/jmeter_distributed_testing_step_by_step.pdf
Also be aware that:
- the two systems MUST run the same JMeter version
- the two systems MUST be on the same subnetwork
- the two systems SHOULD be as similar as possible: same OS, same directory tree, etc
- "remote_hosts" only requires the address. The port is specified by the "server_port" parameter.
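"Connection refused" usually means nothing is listening on the address and port the master is trying to reach, so it can be worth probing the slave's port from the master before starting a remote run. The snippet below is a generic TCP check, not part of JMeter; the function name is made up for illustration. Note that the question shows port 1099 in the master's remote_hosts and server_port=1234 on the slave, so both ports are worth testing.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and unreachable hosts.
        return False


# Example usage, run on the master (address and ports from the question):
#   print(port_is_open("192.168.0.139", 1099))
#   print(port_is_open("192.168.0.139", 1234))
```

If both calls return False while ping works, the slave process is probably not running or is bound to a different port.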

Error in Cloudera Cluster installation process?

I have installed Cloudera manager successfully. It shows Currently managed hosts as 127.0.0.1 and it is active.
When I search for hosts and install a cluster using Cloudera Manager, it shows the following error after loading.
Installation failed. Failed to receive heartbeat from agent.
Ensure that the host's hostname is configured properly.
Ensure that port 7182 is accessible on the Cloudera Manager server (check firewall rules).
Ensure that ports 9000 and 9001 are free on the host being added.
Check agent logs in /var/log/cloudera-scm-agent/ on the host being added (some of the logs can be found in the installation details).
The following image clearly shows the problem while installing my cluster on cloudera manager.
I had a similar problem, and it turned out that I had (unfortunately) skipped the ...password-less SSH key ... step.
I realised this after several hours of breaking my head over it.
At the terminal, run:
ls -al ~/.ssh
You must see files like,
abc
abc.pub
These are your public/private key pairs (not necessarily with the same names as mine above). The file name is the one you used in the "Setting up SSH public/private keys" step for your machine.
You need to copy the contents of abc.pub into a file named authorized_keys in this same folder. If it's not there, create authorized_keys.
In case you don't have your public/private key pair, see here
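The copy step can also be scripted so that the key is not added twice. This is a minimal sketch under a few assumptions: the function name is hypothetical, abc.pub is the placeholder file name used above, and restricting the file to mode 600 is done because sshd is commonly configured to reject authorized_keys files with loose permissions.

```python
from pathlib import Path


def authorize_key(pub_key_path, authorized_keys_path) -> bool:
    """Append the public key to authorized_keys unless it is already there.

    Returns True if the key was added, False if it was already present.
    """
    pub_key = Path(pub_key_path).read_text().strip()
    auth_file = Path(authorized_keys_path)
    current = auth_file.read_text() if auth_file.exists() else ""

    if pub_key in (line.strip() for line in current.splitlines()):
        return False

    # Make sure the new key starts on its own line.
    separator = "" if current == "" or current.endswith("\n") else "\n"
    with auth_file.open("a") as handle:
        handle.write(separator + pub_key + "\n")

    auth_file.chmod(0o600)  # owner read/write only
    return True


# Example usage (file names are placeholders):
#   ssh_dir = Path.home() / ".ssh"
#   authorize_key(ssh_dir / "abc.pub", ssh_dir / "authorized_keys")
```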
For Ubuntu, the problem is usually caused by the association "ubuntu 127.0.1.1" in your /etc/hosts file. For me, after changing it to "ubuntu 127.0.0.1", which is the standard local loopback, I could add the cluster successfully. Hope this helps!
I was struggling with this problem for two days. Fixing /etc/hosts as suggested by "khoadoan" worked for me.
/etc/hosts was looking like this when I had the problem
127.0.0.1 localhost
127.0.1.1 ubuntu
I changed it like this:
127.0.0.1 localhost
127.0.0.1 ubuntu
Restarted the machine.
sudo init 6
Launched the Cloudera Manager Admin page. This time the host status was already showing up "Managed = Yes". And I got an additional tab "Currently Managed Hosts(1)", where the local host was listed.
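To spot this mapping before running the installer, the hosts file can be checked with a few lines of code. This is only a sketch: the function name is made up, and the hostname "ubuntu" is the one from the example above.

```python
from typing import List


def addresses_for(hosts_text: str, hostname: str) -> List[str]:
    """Return every address that the hosts file content maps to hostname."""
    addresses = []
    for line in hosts_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        address, names = fields[0], fields[1:]
        if hostname in names:
            addresses.append(address)
    return addresses


# Example usage:
#   with open("/etc/hosts") as handle:
#       print(addresses_for(handle.read(), "ubuntu"))
```

If the result contains 127.0.1.1 for the machine's own hostname, that is the entry the answers above changed.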

Setting up RabbitMQ cluster on Windows servers

I am trying to set up a RabbitMQ cluster on Windows servers, and this requires using a shared Erlang cookie file. According to the documentation, all I need to do is ensure that the root directories on the different machines contain the same .erlang.cookie file. So I found these files on both machines and overwrote them with the same shared version.
After that all rabbitmqctl commands failed on the machine with new file version with "unable to connect to node..." error message. I tried to restart RabbitMQ Windows service, but still rabbitmqctl complained. I even reinstalled RabbitMQ on that machine, but then .erlang.cookie was reset back to the old version. Whenever I tried to use new version of cookie file, rabbitmqctl failed. When I restored an old version, it worked fine.
Basically I am stuck and can not proceed with cluster setup until I resolve this issue. Any help is appreciated.
UPDATE: Received an answer from RabbitMQ:
"rabbitmqctl will pick up the cookie from the user home directory while the service will pick it up from C:\windows. So you will need to synchronise those with each other, as well as with the other machine."
This basically means that the cookie file needs to be replaced in two places: C:\Windows and the current user's home directory.
You have the above correct. The service will use the cookie at C:\Windows, and when you use rabbitmqctl.bat to query the status it uses the cookie in your user directory (%USERPROFILE%).
When the cookies don't match, the error looks like this:
C:\Program Files (x86)\RabbitMQ Server\rabbitmq_server-2.8.2\sbin>rabbitmqctl.bat status
Status of node 'rabbit@PC-FOOBAR' ...
Error: unable to connect to node 'rabbit@PC-FOOBAR': nodedown
DIAGNOSTICS
===========
nodes in question: ['rabbit@PC-FOOBAR']
hosts, their running nodes and ports:
- PC-FOOBAR: [{rabbit,49186},{rabbitmqctl30566,63150}]
current node details:
- node name: 'rabbitmqctl30566@pc-foobar'
- home dir: U:\
- cookie hash: Vp52cEvPP1PukagWi5S/fQ==
There is one more gotcha for RabbitMQ cookies on Windows. If you have %HOMEDRIVE% and %HOMEPATH% environment variables (as we do in our current test environment, which sets the home dir above to U:\), then RabbitMQ will get the cookie from there, and if there isn't one it makes one up and writes it there. This left me banging my head on my desk for quite a while when trying to get this working. Once I found this gotcha it was obvious that the cookie files were the problem (as documented); they were just at an odd location (not documented AFAIK).
Hope this solves someone's pain setting up RabbitMQ clustering on Windows.
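To see where each side is likely to look for its cookie on a given machine, the lookup described in this thread can be written down as a small function. This is a sketch of the behaviour as reported here, not RabbitMQ's or Erlang's actual code: the function names are hypothetical, the CLI location is assumed to be %HOMEDRIVE%%HOMEPATH%, and the fallback to the Windows directory when those variables are missing is an assumption.

```python
import ntpath

COOKIE_NAME = ".erlang.cookie"


def service_cookie_path(env) -> str:
    """Cookie location used by the Windows service, per the thread above."""
    return ntpath.join(env.get("SystemRoot", r"C:\Windows"), COOKIE_NAME)


def cli_cookie_path(env) -> str:
    """Cookie location used by rabbitmqctl, per the thread above.

    Assumption: falls back to the Windows directory when HOMEDRIVE or
    HOMEPATH is not set.
    """
    drive = env.get("HOMEDRIVE")
    path = env.get("HOMEPATH")
    if drive and path:
        home = drive + path
    else:
        home = env.get("SystemRoot", r"C:\Windows")
    return ntpath.join(home, COOKIE_NAME)


# Example usage on the machine in question:
#   import os
#   print(service_cookie_path(os.environ))
#   print(cli_cookie_path(os.environ))
```

With HOMEDRIVE set to U: and HOMEPATH set to \, the CLI path comes out as U:\.erlang.cookie, which matches the "home dir: U:\" line in the diagnostics above.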
