OK, so allegedly Redis Sentinels now work with TLS.
I have the Master and Slaves replicating fine with stunnel.
However, I'm unable to get the Sentinels to communicate with each other
or with the Master.
I have 1 Master, 2 Slaves and 3 Sentinels
Sample of my stunnel.conf
pid = /run/stunnel.pid
output = /etc/stunnel/stunnel.log
[Redis server]
cert = /etc/stunnel/ABC_private.pem
accept = xxx.xx.160.77:26280
connect = 127.0.0.1:26280
[Client XYZ Redis Server]
client=yes
cert = /etc/stunnel/XYZ_private.pem
accept = 127.0.0.1:8000
connect = xxx.xx.161.78:6480
# SENTINEL SERVERS
[Client 123 Sentinel Server]
client=yes
cert = /etc/stunnel/123_private.pem
accept = 127.0.0.1:8001
connect = xxx.xx.160.77:26280
Sample of my Sentinel configs
protected-mode no
bind 127.0.0.1
port 26280
sentinel monitor redisftdev 127.0.0.1 8002 2
When I run the following command on the local sentinel:
127.0.0.1:26280> sentinel sentinels redisftdev
(empty list or set)
127.0.0.1:26280>
I can connect to a remote Sentinel without a problem, but of course I get the same response:
127.0.0.1:8005> sentinel sentinels redisftdev
(empty list or set)
Sorry, I'm new to this.
OK, I got this. Yes, Sentinel works with stunnel. I'm using 4.02. I didn't announce the ports for my sentinels and slaves.
Specifically -
sentinel announce-port 8003
Where port 8003 is the client-bound 127.0.0.1 port in your stunnel.conf:
accept = 127.0.0.1:8003
connect = 12.34.56.7:6379
The same goes for the slaves, in redis.conf:
slave-announce-port 8000
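To make the moving parts concrete, here is a minimal sketch of how the announce settings line up with the tunnels (the IPs, ports, and cert path below are placeholders, not taken from the configs above): each peer runs a stunnel client section that accepts on a local loopback port and forwards to the remote Sentinel, and that Sentinel announces the tunnel port so its peers dial through their own stunnel instead of the raw TLS port.
# stunnel.conf on a peer (placeholder values)
[Client remote Sentinel]
client = yes
cert = /etc/stunnel/peer_private.pem
accept = 127.0.0.1:8003
connect = 12.34.56.7:26280
# sentinel.conf on that remote Sentinel: advertise the tunnel port
sentinel announce-port 8003
# depending on your layout you may also need to pin the advertised IP, e.g.:
# sentinel announce-ip 127.0.0.1
# redis.conf on each slave: advertise the slave's tunnel port the same way
slave-announce-port 8000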
I'm attempting to set up a connection to our Hadoop cluster via DBVisualizer.
To connect, I need to SSH into a server on the domain and then run the command against a remote server (I haven't SSH'd onto the Hadoop cluster directly).
I have (with placeholder values):
Database Server: abcd.efg
Database Port: 12345
Database: Hello
configured for the Database section
SSH Host: hijk.efg
SSH Port: 678
When I attempt a connection, it returns
Could not open client transport with JDBC Uri:
jdbc:hive2://127.0.0.1:-----
Where 127.0.0.1 and ----- appear to be the defaults instead of what I entered.
Any idea how I get the SSH tunnel to use the server configuration I specify?
The SSH Tunnel is set up locally on the client, so connecting to the port on localhost tunnels you to the SSH Host/Port, which then sets up a connection to the database server/port you have specified. This page may help:
http://confluence.dbvis.com/display/UG100/Using+an+SSH+Tunnel
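Under the hood this is equivalent to a manual local port forward; a rough sketch using the placeholder hosts from the question (the local port 10000 and the user name are arbitrary choices of mine) would be:
$ ssh -L 10000:abcd.efg:12345 -p 678 your_user@hijk.efg
# then point the JDBC URL at the local end of the tunnel:
# jdbc:hive2://127.0.0.1:10000/Hello
DBVisualizer does the same thing for you when the SSH tunnel option is enabled, which is why the URL it builds starts with 127.0.0.1.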
Best Regards,
Hans
I'm experiencing intermittent "failed to respond" errors when making outbound connections such as RPC calls. My application (Java) logs it like this:
org.apache.http.NoHttpResponseException: RPC_SERVER.com:443 failed to respond !
Outbound connection flow
Kubernetes Node -> ELB for internal NGINX -> internal NGINX ->[Upstream To]-> ELB RPC server -> RPC server instance
This problem does not occur on a regular EC2 instance (AWS).
I'm able to reproduce it on localhost by doing the following:
Run the main application, which acts as the client, on port 9200.
Run the RPC server on port 9205.
The client makes its connection to the server via port 9202.
Run $ socat TCP4-LISTEN:9202,reuseaddr TCP4:localhost:9205, which listens on port 9202 and forwards to port 9205 (the RPC server).
Add an iptables rule with $ sudo iptables -A INPUT -p tcp --dport 9202 -j DROP.
Trigger an RPC call, and it returns the same error message as described above.
Hypothesis
Caused by NAT on Kubernetes. As far as I know, NAT relies on conntrack, and conntrack can drop its entry for a TCP connection that has been idle for some period of time, which breaks the NATed connection; the client will assume the connection is still established although it isn't. (Correct me if I'm wrong.)
I have also tried scaling kube-dns to 10 replicas, and the problem still occurs.
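If the conntrack theory holds, one way to check it (a sketch, assuming the conntrack CLI from conntrack-tools is installed on the node; RPC_SERVER_IP is a placeholder) is to watch whether the tracked entry for the RPC connection disappears while the client still thinks the socket is open:
# list the currently tracked TCP entries involving the RPC endpoint
$ sudo conntrack -L -p tcp | grep RPC_SERVER_IP
# or stream conntrack events (NEW/UPDATE/DESTROY) while reproducing the call
$ sudo conntrack -E -p tcp | grep RPC_SERVER_IP
A DESTROY event for an idle connection that the Java client later tries to reuse would match the NoHttpResponseException pattern.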
Node Specification
Uses Calico as the network plugin.
$ sysctl -a | grep conntrack
net.netfilter.nf_conntrack_acct = 0
net.netfilter.nf_conntrack_buckets = 65536
net.netfilter.nf_conntrack_checksum = 1
net.netfilter.nf_conntrack_count = 1585
net.netfilter.nf_conntrack_events = 1
net.netfilter.nf_conntrack_expect_max = 1024
net.netfilter.nf_conntrack_generic_timeout = 600
net.netfilter.nf_conntrack_helper = 1
net.netfilter.nf_conntrack_icmp_timeout = 30
net.netfilter.nf_conntrack_log_invalid = 0
net.netfilter.nf_conntrack_max = 262144
net.netfilter.nf_conntrack_tcp_be_liberal = 0
net.netfilter.nf_conntrack_tcp_loose = 1
net.netfilter.nf_conntrack_tcp_max_retrans = 3
net.netfilter.nf_conntrack_tcp_timeout_close = 10
net.netfilter.nf_conntrack_tcp_timeout_close_wait = 3600
net.netfilter.nf_conntrack_tcp_timeout_established = 86400
net.netfilter.nf_conntrack_tcp_timeout_fin_wait = 120
net.netfilter.nf_conntrack_tcp_timeout_last_ack = 30
net.netfilter.nf_conntrack_tcp_timeout_max_retrans = 300
net.netfilter.nf_conntrack_tcp_timeout_syn_recv = 60
net.netfilter.nf_conntrack_tcp_timeout_syn_sent = 120
net.netfilter.nf_conntrack_tcp_timeout_time_wait = 120
net.netfilter.nf_conntrack_tcp_timeout_unacknowledged = 300
net.netfilter.nf_conntrack_timestamp = 0
net.netfilter.nf_conntrack_udp_timeout = 30
net.netfilter.nf_conntrack_udp_timeout_stream = 180
net.nf_conntrack_max = 262144
Kubelet config
[Service]
Restart=always
Environment="KUBELET_KUBECONFIG_ARGS=--kubeconfig=/etc/kubernetes/kubelet.conf --require-kubeconfig=true"
Environment="KUBELET_SYSTEM_PODS_ARGS=--pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true"
Environment="KUBELET_NETWORK_ARGS=--network-plugin=cni --cni-conf-dir=/etc/cni/net.d --cni-bin-dir=/opt/cni/bin"
Environment="KUBELET_DNS_ARGS=--cluster-dns=10.96.0.10 --cluster-domain=cluster.local"
Environment="KUBELET_AUTHZ_ARGS=--authorization-mode=Webhook --client-ca-file=/etc/kubernetes/pki/ca.crt"
Environment="KUBELET_CADVISOR_ARGS=--cadvisor-port=0"
Environment="KUBELET_CLOUD_ARGS=--cloud-provider=aws"
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_SYSTEM_PODS_ARGS $KUBELET_NETWORK_ARGS $KUBELET_DNS_ARGS $KUBELET_AUTHZ_ARGS $KUBELET_CADVISOR_ARGS $KUBELET_EXTRA_ARGS $KUBELET_CLOUD_ARGS
Kubectl version
Client Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.5", GitCommit:"17d7182a7ccbb167074be7a87f0a68bd00d58d97", GitTreeState:"clean", BuildDate:"2017-08-31T09:14:02Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.7", GitCommit:"8e1552342355496b62754e61ad5f802a0f3f1fa7", GitTreeState:"clean", BuildDate:"2017-09-28T23:56:03Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
Kube-proxy Log
W1004 05:34:17.400700 8 server.go:190] WARNING: all flags other than --config, --write-config-to, and --cleanup-iptables are deprecated. Please begin using a config file ASAP.
I1004 05:34:17.405871 8 server.go:478] Using iptables Proxier.
W1004 05:34:17.414111 8 server.go:787] Failed to retrieve node info: nodes "ip-172-30-1-20" not found
W1004 05:34:17.414174 8 proxier.go:483] invalid nodeIP, initializing kube-proxy with 127.0.0.1 as nodeIP
I1004 05:34:17.414288 8 server.go:513] Tearing down userspace rules.
I1004 05:34:17.443472 8 conntrack.go:98] Set sysctl 'net/netfilter/nf_conntrack_max' to 262144
I1004 05:34:17.443518 8 conntrack.go:52] Setting nf_conntrack_max to 262144
I1004 05:34:17.443555 8 conntrack.go:98] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_established' to 86400
I1004 05:34:17.443584 8 conntrack.go:98] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_close_wait' to 3600
I1004 05:34:17.443851 8 config.go:102] Starting endpoints config controller
I1004 05:34:17.443888 8 config.go:202] Starting service config controller
I1004 05:34:17.443890 8 controller_utils.go:994] Waiting for caches to sync for endpoints config controller
I1004 05:34:17.443916 8 controller_utils.go:994] Waiting for caches to sync for service config controller
I1004 05:34:17.544155 8 controller_utils.go:1001] Caches are synced for service config controller
I1004 05:34:17.544155 8 controller_utils.go:1001] Caches are synced for endpoints config controller
$ lsb_release -s -d
Ubuntu 16.04.3 LTS
Check the value of sysctl net.netfilter.nf_conntrack_tcp_timeout_close_wait inside the pod that contains your program. It is possible that the value on the node that you listed (3600) isn't the same as the value inside the pod.
If the value in the pod is too small (e.g. 60), and your Java client half-closes the TCP connection with a FIN when it finishes transmitting, but the response takes longer than the close_wait timeout to arrive, nf_conntrack will lose the connection state and your client program will not receive the response.
You may need to change the behavior of the client program to not use a TCP half-close, OR modify the value of net.netfilter.nf_conntrack_tcp_timeout_close_wait to be larger. See https://kubernetes.io/docs/tasks/administer-cluster/sysctl-cluster/.
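For example, a quick way to compare the two values (the pod name below is a placeholder, and the /proc path is only readable if the kernel exposes conntrack sysctls in the pod's network namespace):
# inside the pod's network namespace
$ kubectl exec -it my-app-pod -- cat /proc/sys/net/netfilter/nf_conntrack_tcp_timeout_close_wait
# on the node, for comparison
$ sysctl -n net.netfilter.nf_conntrack_tcp_timeout_close_wait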
I have an EC2 instance which is running with the following security groups:
HTTP - TCP - 80 - 0.0.0.0/0
Custom UDP Rule - UDP - 1194 - 0.0.0.0/0
SSH - TCP - 22 - 0.0.0.0/0
Custom TCP Rule - TCP - 943 - 0.0.0.0/0
HTTPS - TCP - 443 - 0.0.0.0/0
However, when I try to access http://{PUBLIC_IP} or https://{PUBLIC_IP} in the browser, I get a "{IP} refused to connect" error. I'm new to AWS. Am I missing something here? What should I do to debug?
One way to debug this particular class of problem is to use netcat in order to determine where the problem lies.
If you run netcat against port 80 on the public IP address of your instance and just get a hang (no output at all), then most likely your security group isn't allowing traffic through. Here is an example from an EC2 instance that is in a security group that doesn't allow port 80 traffic inbound:
% nc -v 55.35.300.45 80
<just hangs>
Whereas if the security group is changed to allow port 80, but the EC2 instance doesn't have any process listening on port 80, you'll get the following:
% nc -v 55.35.300.45 80
nc: connectx to 55.35.300.45 port 80 (tcp) failed: Connection refused
Given that your browser gave you a similar "connection refused", most likely the problem is that there is no web server running on your instance. You can verify this by ssh'ing into the instance and seeing if you can connect to port 80 there:
ssh ec2-user@55.35.300.45
% nc -v localhost 80
nc: connect to localhost port 80 (tcp) failed: Connection refused
If you get something like the above, you're definitely not running a webserver.
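As an additional check from inside the instance (assuming a Linux host with iproute2 or net-tools installed), you can list which sockets are actually listening and on which addresses:
$ sudo ss -tlnp
# or: sudo netstat -tlnp
A web server reachable from outside should show up bound to 0.0.0.0:80 (or [::]:80), not just 127.0.0.1:80.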
I'm not sure if it's too late to help, but I was stuck with a similar issue on my test server:
SG Inbound: ssh -> 22
HTTP -> 80
NACL: default allow/deny settings
but I still couldn't reach the server from my browser. Then I realized there was nothing running on the server that could serve the request, so I started the httpd server (web server) and it worked:
sudo yum -y install httpd
sudo service httpd start
This way you can test connectivity while you're playing with SGs and NACLs. Of course it's not the only way, just an example if you're figuring out your system's networking.
Have you installed a web server (nginx/Apache) to serve your requests? If so, please share your config files, so it will help with troubleshooting.
I think the reason is probably that you did not set up a web server on your EC2 instance, because if you try to access http://{PUBLIC_IP} or https://{PUBLIC_IP}, you need a server running in the background to serve the HTTP request, as @Niranj Rajasekaran said.
By the way, simply pinging the {PUBLIC_IP} can show whether basic connectivity to your EC2 instance is OK (assuming ICMP is allowed in the security group).
In command prompt or terminal, type
ping {PUBLIC_IP}
In my case, the server was running but listening only on 127.0.0.1, so it refused connections from external hosts. To see if this is your situation, you can run
netstat -an | grep <port number>
If it says 127.0.0.1:<port number> instead of 0.0.0.0:<port number>, you have this problem.
Usually there's a flag or an argument in your server code somewhere to set the host to 0.0.0.0:
app.run(host='0.0.0.0') # flask example
However, in my case, I had already set this, so I thought that couldn't possibly be the issue, which is how I ended up on this thread, which asks more generally about the problem. Unfortunately, I was using Docker and had set 0.0.0.0 in the container, but was mapping that explicitly to 127.0.0.1 on the host in the docker-compose port mapping:
ports:
- "127.0.0.1:<port number>:<port number>"
Changing that line to remove the host IP specification fixed the problem upon re-deploy:
ports:
- "<port number>:<port number>"
I am trying to send an email using the Postfix server on an Amazon EC2 instance.
The command is: sendmail xxxxxx@gmail.com
FROM:localhost
SUBJECT:Welcome
this is a test email....
.
However, I am getting the following error in the /var/log/maillog file.
The error is:
Jan 13 09:00:37 ip-172-31-32-76 postfix/pickup[26635]: C43AE62D00: uid=222
from=
Jan 13 09:00:37 ip-172-31-32-76 postfix/cleanup[26727]: C43AE62D00:
message-id=<20140113090037.C43AE62D00#"HOSTNAME">
Jan 13 09:00:37 ip-172-31-32-76 postfix/qmgr[26636]: C43AE62D00:
from=<"MYHOSTNAME">, size=435, nrcpt=1 (queue active)
Jan 13 09:00:37 ip-172-31-32-76 postfix/smtp[26729]:
connect to 127.0.0.1[127.0.0.1]:2525: Connection refused
Jan 13 09:00:37 ip-172-31-32-76 postfix/smtp[26729]: C43AE62D00:
to=, relay=none, delay=22, delays=22/0.02/0/0, dsn=4.4.1, status=deferred (connect to 127.0.0.1[127.0.0.1]:2525: Connection refused)
I have hidden the details for hostname and the email ID to which I want to send.
Please help me out in this regard.
I have also added port 25 to the inbound and outbound rules in the security group for my instance.
Regards,
Anurag
I think another service may be running on the same port.
Run the command netstat -tap and check whether the same port is being used by something else.
connect to 127.0.0.1[127.0.0.1]:2525: Connection refused
Something is preventing Postfix from using this port. (Port 2525 is sometimes used instead of 587 as an alternative SMTP port.)
Verify which ports are listening:
netstat -tanp | grep LISTEN
If you see sendmail (or any other MTA except for Postfix):
tcp 0 0 127.0.0.1:2525 0.0.0.0:* LISTEN 1014/sendmail
get rid of it:
service sendmail stop
yum remove sendmail
Verify the settings on the first row of the services table in:
/etc/postfix/master.cf
If it says:
smtp inet n - n - - smtpd
Postfix listens on port 25 and your security group settings make sense. If the line says:
2525 inet n - n - - smtpd
you are telling Postfix to listen on port 2525 for incoming smtpd connections.
Also make sure the line that says:
submission inet n - n - - smtpd
does not begin with a comment.
Verify iptables rules, adjust if necessary:
iptables -L -n
This could be unrelated, but I'm going to post it here because I had a hard time finding the answer to my question. I was able to get outbound email working from a Vagrant virtual box by editing /etc/resolv.conf to use Google's nameserver rather than the 10.0.x.x IP it was set to:
sudo nano /etc/resolv.conf
Change the nameserver IP:
nameserver 8.8.8.8
Then you'll need to restart postfix:
sudo /etc/init.d/postfix restart
Edited:
I have set up a single-node cluster on two different machines. I made one the master (192.168.1.1) and the other machine the slave (192.168.1.2), and I can successfully ping between the two machines. I made the following changes to turn this into a 2-node cluster. Update:
/etc/hosts on both machines, and hosts.allow with:
All : Ashish-PC 192.168.1.1 : allow
All : slave 192.168.1.2 : allow
master file with
Ashish-PC
slaves file with
Ashish-PC
slave
I am getting an error while copying the local host's public key to the remote host (slave) on port 22:
ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@slave
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: ERROR: ssh: connect to host slave port 22: Connection timed out
The same happens when I start all the DFS services on the master:
bin/start-dfs.sh
starting namenode, logging to /usr/local/hadoop/libexec/../logs/hadoop-Ashish-namenode-Ashish-PC.out
slave: ssh: connect to host slave port 22: Connection timed out
Ashish-PC: starting secondarynamenode, logging to /usr/local/hadoop/libexec/../logs/hadoop-Ashish-secondarynamenode-Ashish-PC.out
slave: ssh: connect to host slave port 22: Connection timed out
I have used Cygwin, and SSH is working fine on both PCs. I went through some suggestions to change port 22 (because of an ISP problem), but I don't want to do that just because.
Thanks in advance for your help and response.
Allow the master to communicate through Windows Firewall by adding sshd to the home as well as the public profile.
Make sure the sshd service is started on each node for communication.
This worked for me:
1.
sudo vi /etc/ssh/sshd_config
2.
Uncomment these lines (remove the leading #):
#Port 22
#Protocol 2
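After uncommenting, those lines should read as below, and sshd needs to be restarted to pick up the change (the commands are a sketch: the first assumes a Linux-style service manager, the second is the usual way to restart a Cygwin sshd running as a Windows service):
Port 22
Protocol 2
$ sudo service ssh restart
# or, for a Cygwin sshd Windows service:
net stop sshd && net start sshd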