RSH connection refused while running MPI program - shell

I'm trying to run MPI programs on 8 machines, but I get the error
connect to address 127.0.0.1 port 544: Connection refused
Trying krb4 rsh...
connect to address 127.0.0.1 port 544: Connection refused
trying normal rsh (/usr/bin/rsh)
lagrid02: Connection refused
When I run it with a machinefile option, I get the error "lagrid03: No route to host", where lagrid03 is the neighbouring node connected to the master node.
How should I fix this?

Regarding your first error: is rsh running on all the machines? You need either rsh or password-less ssh configured (and your MPI job launcher told to use ssh) before you can start jobs on other machines.
The second error indicates that lagrid03 cannot be reached with the current network configuration. My guess is that you have an /etc/hosts entry with an IP address for lagrid03, but no interface configured on that network. For a more detailed answer you'll need to post details about your network configuration.
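The two errors in the question are different failure modes: "Connection refused" means the host answered but nothing is listening on that port, while "No route to host" means the network path itself is missing. As a rough way to tell them apart from the master node, here is a minimal Python sketch. The function name probe is hypothetical, and the ports you test are up to you (514 is the standard rsh port, 544 the Kerberos rsh port seen in the error, 22 is ssh).

```python
import errno
import socket


def probe(host: str, port: int, timeout: float = 3.0) -> str:
    """Classify a TCP connection attempt to host:port."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # Host is reachable, but no service listens on this port.
        return "refused"
    except socket.timeout:
        # No answer at all, often a firewall silently dropping packets.
        return "timeout"
    except socket.gaierror:
        # The hostname could not be resolved (check /etc/hosts or DNS).
        return "unresolvable"
    except OSError as exc:
        if exc.errno in (errno.EHOSTUNREACH, errno.ENETUNREACH):
            # Matches the "No route to host" message from the question.
            return "no route"
        return f"error: {exc}"
```

For example, probe("lagrid02", 514) returning "refused" would suggest rsh is not running on lagrid02, while probe("lagrid03", 22) returning "no route" would point at the network configuration rather than at rsh or ssh.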

The issue is with authentication. If you go into the /etc/pam.d/rsh file, move rlogin and rsh to the top, and make it look like this, it should work just fine.
# For root login to succeed here with pam_securetty, "rsh" must be listed in /etc/securetty.
auth required pam_nologin.so
auth required pam_securetty.so
auth required pam_env.so
auth required pam_rhosts_auth.so
account include system-auth
session optional pam_keyinit.so force revoke
session include system-auth

Related

Kubernetes on Windows Error: Unable to connect to the server: dial tcp some ip

I have installed Docker Desktop and enabled Kubernetes in it. When I run the 'kubectl version' command in PowerShell, it says:
kubectl : Unable to connect to the server: dial tcp : connectex:
A connection attempt failed because the connected party did not properly respond
after a period of time, or established connection failed because connected host has
failed to respond.
At line:1 char:1
kubectl version
The same issue started today whenever I run anything related to kubectl on Windows, although it previously worked fine. Maybe a recent Windows or Docker update caused it.
Update:
It turned out that my network sharing options had been reset for some reason. Please try the solution described below (it works for me).
Solution:
Check your Network and Sharing settings:
Control Panel > Network and Sharing > [YOUR_NETWORK] (For me it's my Wi-Fi connection) > Properties > Sharing
On the Sharing tab make sure that you have all checkboxes checked and that you selected the correct virtual network in the "Home network connection" field. If not, please use the correct one.

Cassandra: target machine actively refused it

I am trying to run Cassandra (CQL Shell) and I am receiving the following error. I have tried the answers to existing questions that I found on Google, but nothing has fixed it so far.
Connection error: ('Unable to connect to any servers', {'127.0.0.1': error(10061, "Tried connecting to [('127.0.0.1', 9042)]. Last error: No connection could be made because the target machine actively refused it")})
Before installing Apache Cassandra, make sure a JDK is installed.
Can you check that the IP address is set correctly in the rpc_address setting of the cassandra.yaml file on your Cassandra server?
Also, make sure port 9042 is open and available for incoming traffic (if your IT department sets up the servers, this port may be blocked unless you asked for it to be opened).
Hope it helps.
I faced the same issue. One of the two options below may help:
Option 1:
In my case I had not started the Cassandra server and was trying to connect to it directly.
(a) First start the Cassandra server via cmd --> \bin>cassandra.bat -f
and then
(b) Try to connect to its node --> \bin>cqlsh.bat -u cassandra
Option 2:
Try changing the rpc_address in your cassandra.yaml file from localhost to either 127.0.0.1
or 0.0.0.0,
and then start the server again from a new CMD window.

Unable to Retrieve Directory Using ProFTPD(WHM)

After trying many solutions without success, I came here.
I am setting up WHM/cPanel to host a website. Everything was going smoothly, but I am stuck on the FTP connection (Server sent passive reply with unroutable address. Using server address instead.)
Server Details:
CentOS Linux release 7.2.1511 (Core)
WHM/cPanel Version 11.58.0.13
FTP Server: PureFTPD
Actual error while connecting
To fix this issue and get FTP working, you need to open a range of ports that FTP can use for passive connections. I assume you are using CSF.
Log in to WHM, then go to CSF >> Firewall Configuration >>
and allow 30000:50000 in both TCP_IN and TCP_OUT.
Once you have made the changes, restart the firewall.
Next, change the FTP config file so that the server uses these ports. You will find the file at /etc/pure-ftpd.conf.
Look for the following line and uncomment it:
# Port range for passive connections replies. - for firewalling.
PassivePortRange 30000 50000
Restart the FTP service and it should work.
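To check whether the change took effect, it can help to look at the server's actual reply to the PASV command in the FTP client log. The reply has the form "227 Entering Passive Mode (h1,h2,h3,h4,p1,p2)", where the data port is p1 * 256 + p2. The Python sketch below parses such a reply and reports whether the address is private (which is what clients mean by "unroutable") and whether the port falls in the 30000-50000 range configured above. The function names parse_pasv and check_pasv are hypothetical, and the sample replies are made up for illustration.

```python
import ipaddress
import re

# Six comma-separated numbers in parentheses: four address bytes, two port bytes.
PASV_RE = re.compile(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)")


def parse_pasv(reply: str):
    """Return (ip, port) from a '227 Entering Passive Mode (...)' reply."""
    match = PASV_RE.search(reply)
    if match is None:
        raise ValueError(f"not a PASV reply: {reply!r}")
    h1, h2, h3, h4, p1, p2 = (int(g) for g in match.groups())
    ip = ipaddress.ip_address(f"{h1}.{h2}.{h3}.{h4}")
    # The data port is encoded as two bytes, high byte first.
    return ip, p1 * 256 + p2


def check_pasv(reply: str, low: int = 30000, high: int = 50000):
    """Return a list of problems found in the reply (empty if none)."""
    ip, port = parse_pasv(reply)
    problems = []
    if ip.is_private:
        problems.append(f"{ip} is a private address")
    if not low <= port <= high:
        problems.append(f"port {port} is outside {low}-{high}")
    return problems


# Made-up sample reply: private address, port 117 * 256 + 48 = 30000.
print(check_pasv("227 Entering Passive Mode (10,0,0,5,117,48)"))
```

If the port is in range but the address is still private, the firewall change alone will not remove the warning, although clients that fall back to the server address (as the message says) may still connect.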

Adding Elastic IP causes shell login to fail

After associating an Elastic IP with a cloud server instance, I can no longer log in:
ssh -i "ec2.pem" ubuntu@1.2.3.4
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@ WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
The fingerprint for the ECDSA key sent by the remote host is...
Please contact your system administrator.
How can I assign a static IP (Elastic IP) to my EC2 cloud server and still be able to log in from the console?
This is merely a warning that you are connecting to a system whose SSH fingerprint differs from the one stored in your local .ssh/known_hosts file. If you know the change is expected, just delete the matching entry from that file and you can connect again.
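Rather than editing known_hosts by hand, the stale entry can be removed with ssh-keygen -R, which also handles hashed host names. Here is a small Python sketch that builds and runs that command. The helper names forget_host_command and forget_host are hypothetical, and 1.2.3.4 is the placeholder address from the question. Only do this when you are sure the key change is expected, as it is after re-pointing an Elastic IP.

```python
import subprocess


def forget_host_command(host, known_hosts=None):
    """Build the ssh-keygen command that removes all stored keys for host."""
    cmd = ["ssh-keygen", "-R", host]
    if known_hosts is not None:
        # -f selects a known_hosts file other than the default ~/.ssh/known_hosts.
        cmd += ["-f", known_hosts]
    return cmd


def forget_host(host, known_hosts=None):
    """Run the command; raises CalledProcessError if ssh-keygen fails."""
    subprocess.run(forget_host_command(host, known_hosts), check=True)


# Show the command for the address from the question without running it.
print(" ".join(forget_host_command("1.2.3.4")))
```

After the entry is removed, the next ssh connection asks you to confirm the new fingerprint, as it would for a host you have never connected to.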

distributed load testing on aws with jmeter

I have been trying to set up AWS EC2 machines to load test my web server with JMeter, but I am stuck. I have a JMeter client on my local machine and I want to set up multiple jmeter-server nodes on EC2 to do the load testing. So far I am just trying to get one server node up and running, and it hasn't worked out yet.
I run the same JMeter on my local machine and on the server. The Java versions differ slightly, but I don't think that is the problem. Most people have had problems getting the correct IP for the connection between the client and the server nodes, but after a lot of searching I got past all of those. I am stuck at the point where the server node tries to return the result and connects back to the client, my local machine. The server tries to connect to the external IP address of my local machine, but it throws a connection refused error, which apparently was caused by a connection timeout. I guess it is a firewall issue, but turning off the firewall on my local machine did not change the error. I am not sure how to get past this, and it is taking far more time than it should.
Could somebody please suggest a way to solve this? Thanks!
My local machine runs Mac OS X 10.7.5 and my server nodes run Ubuntu.
This is the error that it throws:
2013/01/29 12:23:37 ERROR - jmeter.samplers.RemoteListenerWrapper: testStarted(host) java.rmi.ConnectException: Connection refused to host: xxx.xxx.xxx.10; nested exception is:
java.net.ConnectException: Connection refused
at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:619)
at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:216)
at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:202)
at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:128)
at org.apache.jmeter.samplers.RemoteSampleListenerImpl_Stub.testStarted(Unknown Source)
at org.apache.jmeter.samplers.RemoteListenerWrapper.testStarted(RemoteListenerWrapper.java:83)
at org.apache.jmeter.engine.StandardJMeterEngine.notifyTestListenersOfStart(StandardJMeterEngine.java:226)
at org.apache.jmeter.engine.StandardJMeterEngine.run(StandardJMeterEngine.java:349)
at java.lang.Thread.run(Thread.java:636)
Caused by: java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:327)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:193)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:180)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:384)
at java.net.Socket.connect(Socket.java:546)
at java.net.Socket.connect(Socket.java:495)
at java.net.Socket.<init>(Socket.java:392)
at java.net.Socket.<init>(Socket.java:206)
at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(RMIDirectSocketFactory.java:40)
at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(RMIMasterSocketFactory.java:146)
at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:613)
... 8 more
Well, I finally solved the problem. I ended up using ssh reverse tunnels. I am not sure if there is a better way to do this, though. In case anyone has a similar problem, this is how I did it:
Create a reverse ssh tunnel from the server to the client. So, on the client side:
ssh -Nf -R [client.rmi.localport]:localhost:[client.rmi.localport on serverside] user@server
Start the server, and set client.rmi.localport there as well, to the port on which the tunnel was created.
Start the client as: ./bin/jmeter-server -Djava.rmi.server.hostname=127.0.0.1
And that's it! You have your distributed testing ready.
Solution that worked for me on Linux/OSX:
1. On the client, edit bin/jmeter.properties and add:
remote_hosts=127.0.0.1:55501
client.rmi.localport=55512
mode=Batch
num_sample_threshold=250
2. On the server, edit bin/jmeter.properties and add:
server_port=55501
server.rmi.localhostname=127.0.0.1
server.rmi.localport=55511
3. Now connect to the server using this ssh tunnel:
ssh -L 55501:127.0.0.1:55501 -L 55511:127.0.0.1:55511 -R 55512:127.0.0.1:55512 user@hostname
4. Edit the jmeter-server script so that it starts jmeter.sh:
${DIRNAME}/jmeter.sh ${RMI_HOST_DEF} -Dserver_port=${SERVER_PORT:-1099} -s -j jmeter-server.log "$@"
5. Now run on the server:
bin/jmeter-server -Djava.rmi.server.hostname=127.0.0.1
6. On the client, run jmeter with the GUI, or add -n if the GUI is not needed:
bin/jmeter.sh -Djava.rmi.server.hostname=127.0.0.1
or, with a test plan:
bin/jmeter.sh -Djava.rmi.server.hostname=127.0.0.1 -t /path/to/test-plan.jmx
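The port numbers in the steps above have to agree in three places: the client properties, the server properties, and the ssh command. The two -L forwards carry the client's connections to server_port and server.rmi.localport on the server, and the -R forward lets the server reach client.rmi.localport back on the client. As a sketch of that relationship, the Python snippet below builds the ssh command from the three port values. The function name tunnel_command is hypothetical, the defaults are the ports used in this answer, and user@hostname is a placeholder.

```python
def tunnel_command(user_host, server_port=55501, server_rmi_port=55511,
                   client_rmi_port=55512):
    """Build the ssh argv that forwards the three JMeter ports over one tunnel."""
    local = "127.0.0.1"
    return [
        "ssh",
        # Client -> server: the port listed in remote_hosts / server_port.
        "-L", f"{server_port}:{local}:{server_port}",
        # Client -> server: server.rmi.localport.
        "-L", f"{server_rmi_port}:{local}:{server_rmi_port}",
        # Server -> client: client.rmi.localport, used to send results back.
        "-R", f"{client_rmi_port}:{local}:{client_rmi_port}",
        user_host,
    ]


# Print the command for the placeholder host used in this answer.
print(" ".join(tunnel_command("user@hostname")))
```

Keeping the ports in one place like this makes it harder to change a value in jmeter.properties and forget the matching forward in the tunnel.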
It looks like you have to move your jmeter-master instance (the JMeter client) to an EC2 instance too.
As per JMeter Distributed Testing Step-by-step:
2. check all the clients are on the same subnet;
For distributed testing to work, the systems must be on the same subnet, otherwise RMI will not be able to connect.
That looks like your case: the jmeter-slaves are in one subnet (EC2) and the jmeter-master is in another (your local workstation).
I wrote a free, open source script to help do exactly this. I went through the same issues listed by the OP and, even though I did get things working in the end, it was never great and I wanted something to automate away the hassle.
