telnet: connect to address 127.0.0.1: Connection refused on macOS

I'm trying to connect to my database from my Java application, but the connection is refused. This is the error I get:
com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:400)
at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1038)
at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:339)
at com.mysql.jdbc.ConnectionImpl.coreConnect(ConnectionImpl.java:2247)
at com.mysql.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:2280)
at com.mysql.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:2079)
at com.mysql.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:794)
at com.mysql.jdbc.JDBC4Connection.<init>(JDBC4Connection.java:44)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:400)
at com.mysql.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:399)
at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:325)
at java.sql.DriverManager.getConnection(DriverManager.java:664)
at java.sql.DriverManager.getConnection(DriverManager.java:247)
at MainTest.main(MainTest.java:23)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)
Caused by: java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:345)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at com.mysql.jdbc.StandardSocketFactory.connect(StandardSocketFactory.java:214)
at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:298)
... 20 more
While searching, I found that the problem is not in the code but in the connection to my localhost. I can reach localhost from my web browser by just going to 'localhost', but I can't telnet to localhost or 127.0.0.1; the connection is refused.
MacBook-Pro:etc alejandro-trabajo$ telnet localhost
Trying ::1...
telnet: connect to address ::1: Connection refused
Trying 127.0.0.1...
telnet: connect to address 127.0.0.1: Connection refused
Trying fe80::1...
telnet: connect to address fe80::1: Connection refused
telnet: Unable to connect to remote host
I've tried everything I found on the web: creating a new 'hosts' file, flushing the DNS cache, restarting the Apache server, uncommenting "ServerName localhost" in the httpd.conf file... I really don't know what to do.
PS: Pinging localhost works perfectly:
MacBook-Pro:etc alejandro-trabajo$ ping localhost
PING localhost (127.0.0.1): 56 data bytes
64 bytes from 127.0.0.1: icmp_seq=0 ttl=64 time=0.048 ms
64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=0.075 ms
64 bytes from 127.0.0.1: icmp_seq=2 ttl=64 time=0.126 ms
64 bytes from 127.0.0.1: icmp_seq=3 ttl=64 time=0.121 ms
^C
--- localhost ping statistics ---
4 packets transmitted, 4 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.048/0.092/0.126/0.032 ms

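Maybe this will help: check which address the server is actually listening on. In the session below, a server on port 1234 is reachable through the LAN address 192.168.1.100 but refuses connections on localhost and 127.0.0.1.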
~ $ ifconfig
...
en0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
ether 8c:85:90:94:fb:70
inet6 fe80::14aa:f1f1:ede3:b942%en0 prefixlen 64 secured scopeid 0x8
inet 192.168.1.100
~ $ telnet localhost 1234
Trying ::1...
telnet: connect to address ::1: Connection refused
Trying 127.0.0.1...
telnet: connect to address 127.0.0.1: Connection refused
telnet: Unable to connect to remote host
~ $ telnet 127.0.0.1 1234
Trying 127.0.0.1...
telnet: connect to address 127.0.0.1: Connection refused
telnet: Unable to connect to remote host
~ $ telnet 192.168.1.100 1234
Trying 192.168.1.100...
Connected to 192.168.1.100.
Escape character is '^]'.
This is a simple server
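The session above suggests that a refusal depends on the exact address and port being probed. Note also that `telnet localhost` without a port argument targets the telnet port (23), so a refusal there says little about a MySQL server, which listens on port 3306 by default. The following is a minimal Python sketch of the same check, not part of the original answer: the helper names `can_connect` and `bound_address_demo` are made up for illustration, and 127.0.0.2 is only a stand-in for "a different local address than the one the server is bound to".

```python
import socket


def can_connect(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refusals, timeouts and unreachable or unassigned addresses.
        return False


def bound_address_demo():
    """Listen on 127.0.0.1 only and report which addresses accept connections."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        server.listen(5)
        port = server.getsockname()[1]
        return {
            # The address the listener is bound to.
            "127.0.0.1": can_connect("127.0.0.1", port),
            # Assumption: 127.0.0.2 stands in for another local address.
            "127.0.0.2": can_connect("127.0.0.2", port),
        }
    finally:
        server.close()
```

To apply the idea to the question, one could call `can_connect("127.0.0.1", 3306)` and `can_connect("192.168.1.100", 3306)` and compare the results; the second address is the one from the ifconfig output above and would differ on another machine.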

Related

MQTT Java Spring app accessing ActiveMQ on docker host fails to connect in bridge mode

I have a Spring Java app that uses Paho MQTT v3 to connect to ActiveMQ.
The app works when run from Eclipse, when started via java -jar, and also inside my docker container as long as the network is in host mode. I tried host mode because bridge mode is not working (my issue: connection reset).
I want to use bridge mode because I consider host mode a security issue. The app normally runs inside the container with limited rights. For testing purposes I deactivated this and tested with uid 0, but that is not the problem.
The issue is that when I run in bridge mode I get:
2020-11-30 19:58:54.192 ERROR 13 [ main] n.w.s.s.s.MqttSender.startPublisher:53 : MqttException while starting mqtt message publisher. (resons code: 32103) : Unable to connect to server
org.eclipse.paho.client.mqttv3.MqttException: Unable to connect to server
at org.eclipse.paho.client.mqttv3.internal.TCPNetworkModule.start(TCPNetworkModule.java:80)
at org.eclipse.paho.client.mqttv3.internal.ClientComms$ConnectBG.run(ClientComms.java:722)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at org.eclipse.paho.client.mqttv3.internal.TCPNetworkModule.start(TCPNetworkModule.java:74)
... 2 common frames omitted
As I wrote, I tried this:
running without docker => ok
running in docker container in host mode => ok
running in docker container in bridge mode => nok
I determined the MQTT host (the docker host) with a script using "ip route" (alpine image).
I can see in my log that the default gateway IP is successfully determined and used.
I checked the firewalld settings and tested with netcat to see whether this could be the issue, but with nc I could not see a problem.
I checked with tcpdump and saw that the connection is established. But then the MQTT client sends
unsubscribe request
disconnect request
I suppose ActiveMQ responds with something like "unauthorized" because the client is not connecting from a private network (192...). Instead, the network in docker is something like 172.17.*.*
Apart from that, ActiveMQ listens on 0.0.0.0:1883.
I could even connect via an ssh tunnel.
I added the Paho reason code to the log and got 32103.
Does anybody have an idea what could be happening here?
This is the traffic grabbed by tcpdump
1 0.000000 172.17.0.2 172.17.0.1 TCP 74 43482 → 1883 [SYN] Seq=0 Win=29200 Len=0 MSS=1460 SACK_PERM=1 TSval=715574721 TSecr=0 WS=128
2 0.000112 172.17.0.1 172.17.0.2 TCP 74 1883 → 43482 [SYN, ACK] Seq=0 Ack=1 Win=28960 Len=0 MSS=1460 SACK_PERM=1 TSval=715574722 TSecr=715574721 WS=128
3 0.000148 172.17.0.2 172.17.0.1 TCP 66 43482 → 1883 [ACK] Seq=1 Ack=1 Win=29312 Len=0 TSval=715574722 TSecr=715574722
4 0.328363 172.17.0.2 172.17.0.1 MQTT 100 Connect Command
5 0.328505 172.17.0.1 172.17.0.2 TCP 66 1883 → 43482 [ACK] Seq=1 Ack=35 Win=29056 Len=0 TSval=715575050 TSecr=715575050
6 0.330538 172.17.0.1 172.17.0.2 MQTT 70 Connect Ack
7 0.330612 172.17.0.2 172.17.0.1 TCP 66 43482 → 1883 [ACK] Seq=35 Ack=5 Win=29312 Len=0 TSval=715575052 TSecr=715575052
8 0.341795 172.17.0.2 172.17.0.1 MQTT 83 Subscribe Request (id=1) [sensordata]
9 0.343407 172.17.0.1 172.17.0.2 MQTT 71 Subscribe Ack (id=1)
10 0.383106 172.17.0.2 172.17.0.1 TCP 66 43482 → 1883 [ACK] Seq=52 Ack=10 Win=29312 Len=0 TSval=715575105 TSecr=715575065
11 3.289301 172.17.0.2 172.17.0.1 MQTT 82 Unsubscribe Request (id=2)
12 3.290162 172.17.0.1 172.17.0.2 MQTT 70 Unsubscribe Ack (id=2)
13 3.290252 172.17.0.2 172.17.0.1 TCP 66 43482 → 1883 [ACK] Seq=68 Ack=14 Win=29312 Len=0 TSval=715578012 TSecr=715578012
14 3.293894 172.17.0.2 172.17.0.1 MQTT 68 Disconnect Req
15 3.295862 172.17.0.1 172.17.0.2 TCP 66 1883 → 43482 [FIN, ACK] Seq=14 Ack=70 Win=29056 Len=0 TSval=715578017 TSecr=715578015
16 3.335121 172.17.0.2 172.17.0.1 TCP 66 43482 → 1883 [ACK] Seq=70 Ack=15 Win=29312 Len=0 TSval=715578057 TSecr=715578017
In ActiveMQ I have only enabled the MQTT transport connector. I have not enabled security.
Do I have to configure something more in ActiveMQ?
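The question mentions deriving the docker host address from the default gateway reported by "ip route" inside the container. As a rough sketch of that step (the original script is not shown, so this is only an assumption about what it does), the gateway can be read from the line that starts with "default via". The function name `default_gateway` and the sample output are illustrative; the sample merely reuses the 172.17.0.x addresses seen in the tcpdump capture.

```python
def default_gateway(ip_route_output):
    """Return the gateway address from `ip route` output, or None if absent."""
    for line in ip_route_output.splitlines():
        fields = line.split()
        # A default route line typically looks like:
        #   default via <gateway> dev <interface>
        if len(fields) >= 3 and fields[0] == "default" and fields[1] == "via":
            return fields[2]
    return None


# Illustrative sample, not taken from the original post.
SAMPLE_OUTPUT = (
    "default via 172.17.0.1 dev eth0\n"
    "172.17.0.0/16 dev eth0 scope link  src 172.17.0.2\n"
)

gateway = default_gateway(SAMPLE_OUTPUT)
```

Logging the returned address next to the broker URL handed to the Paho client makes it easier to confirm that the failing connection really targets the gateway and not, say, localhost inside the container.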

Use CNTLM running on the host system from a Hyper-V VM

I have a Windows 10 machine that is running a CNTLM proxy on port 3128.
I start a VM using Hyper-V with a Virtual Switch configured with the external network setting.
I can ping the host machine from the VM using its IP:
$ ping $hostIp
PING $hostIP ($hostIP): 56 data bytes
64 bytes from $hostIp: seq=0 ttl=128 time=1.480 ms
64 bytes from $hostIp: seq=1 ttl=128 time=0.893 ms
64 bytes from $hostIp: seq=2 ttl=128 time=1.179 ms
64 bytes from $hostIp: seq=3 ttl=128 time=0.997 ms
64 bytes from $hostIp: seq=4 ttl=128 time=1.318 ms
...
but I can't, for example, run curl using this IP as the proxy IP:
curl -x http://$hostIp:3128 https://google.de
On the host machine this command results in
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>301 Moved</TITLE></HEAD><BODY>
<H1>301 Moved</H1>
The document has moved
here.
</BODY></HTML>
On the VM this results in
curl: (7) Failed to connect to $hostIp port 3128: Connection timed out
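A successful ping only shows that ICMP gets through; it says nothing about TCP port 3128. The kind of failure is still a useful hint: "connection refused" usually means the host answered but nothing accepted the connection on that address and port, while "connection timed out" often means the packets were silently dropped somewhere, for instance by a firewall rule. The sketch below separates these cases in Python. The function names and the labels are illustrative assumptions, and the interpretation of each label is a rule of thumb rather than a diagnosis.

```python
import errno
import socket


def classify_connect_error(exc):
    """Map a connection exception to a coarse, human-readable label."""
    if isinstance(exc, ConnectionRefusedError):
        return "refused"
    # socket.timeout is a separate class on older Python versions.
    if isinstance(exc, (socket.timeout, TimeoutError)):
        return "timed out"
    if isinstance(exc, OSError) and exc.errno in (
        errno.EHOSTUNREACH,
        errno.ENETUNREACH,
    ):
        return "unreachable"
    return "other"


def probe(host, port, timeout=3.0):
    """Try a TCP connection and return 'open' or a failure label."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except OSError as exc:
        return classify_connect_error(exc)
```

Running `probe(host_ip, 3128)` from the VM, with `host_ip` set to the host's address, would be expected to give "timed out" for the situation described in the question.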

Cannot connect to remote Neo4j server using neo4j-shell

I have Neo4j 2.3.2 installed on a server whose firewall has ports 1337, 7474, 7473 open (verified using ncat). I can access the web-based console remotely and can access the shell locally, but I cannot access the shell remotely. Namely, when running path/to/neo4j-shell -v -host destination.example.org I get
ERROR (-v for expanded information):
Connection refused
java.rmi.ConnectException: Connection refused to host: <server ip addres>; nested exception is:
java.net.ConnectException: Verbinding is geweigerd
at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:619)
at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:216)
at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:202)
at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:129)
at java.rmi.server.RemoteObjectInvocationHandler.invokeRemoteMethod(RemoteObjectInvocationHandler.java:227)
at java.rmi.server.RemoteObjectInvocationHandler.invoke(RemoteObjectInvocationHandler.java:179)
at com.sun.proxy.$Proxy1.welcome(Unknown Source)
at org.neo4j.shell.impl.AbstractClient.sayHi(AbstractClient.java:257)
at org.neo4j.shell.impl.RemoteClient.findRemoteServer(RemoteClient.java:70)
at org.neo4j.shell.impl.RemoteClient.<init>(RemoteClient.java:62)
at org.neo4j.shell.impl.RemoteClient.<init>(RemoteClient.java:45)
at org.neo4j.shell.ShellLobby.newClient(ShellLobby.java:204)
at org.neo4j.shell.StartClient.startRemote(StartClient.java:355)
at org.neo4j.shell.StartClient.start(StartClient.java:226)
at org.neo4j.shell.StartClient.main(StartClient.java:145)
Caused by: java.net.ConnectException: Verbinding is geweigerd
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:579)
at java.net.Socket.connect(Socket.java:528)
at java.net.Socket.<init>(Socket.java:425)
at java.net.Socket.<init>(Socket.java:208)
at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(RMIDirectSocketFactory.java:40)
at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(RMIMasterSocketFactory.java:147)
at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:613)
... 14 more
In /var/lib/neo4j/conf/neo4j.properties, I have
# Enable shell server so that remote clients can connect via Neo4j shell.
remote_shell_enabled=true
# The network interface IP the shell will listen on (use 0.0.0.0 for all interfaces).
remote_shell_host=0.0.0.0
# The port the shell will listen on, default is 1337.
remote_shell_port=1337
Do I need to open other ports apart from the three mentioned?
Is there any further configuration related to the shell I have missed?
neo4j-shell is based on Java RMI. There are a couple of resources out there describing how to configure a firewall for RMI. In fact, it's pretty complex, since RMI uses dynamic port allocation.
Typically I run neo4j-shell locally over an SSH connection. This also gives you authentication and encryption, and you only need to open the SSH port.
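To illustrate why dynamic port allocation is awkward for firewalls: when a program binds to port 0, the operating system picks a free port at run time, so the number is not known in advance. RMI by default exports remote objects on such an anonymous port, which would explain why opening 1337 alone is likely not enough. The Python sketch below only demonstrates the general mechanism; it does not talk to Neo4j or RMI, and the function name `ephemeral_ports` is made up for this example.

```python
import socket


def ephemeral_ports(count):
    """Bind `count` sockets to port 0 and return the ports the OS assigned."""
    sockets = []
    try:
        for _ in range(count):
            s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            sockets.append(s)
            s.bind(("127.0.0.1", 0))  # port 0: the OS chooses a free port
        # All sockets are still open here, so the ports must differ.
        return [s.getsockname()[1] for s in sockets]
    finally:
        for s in sockets:
            s.close()
```

Each call may return different numbers, which is why a firewall rule listing a few fixed ports cannot cover them, and why tunnelling through a single SSH port sidesteps the problem.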

ElasticSearch java.net.NoRouteToHostException in docker

[2015-10-11 13:08:26,587][WARN ][transport.netty ] [Joseph] exception caught on transport layer [[id: 0x7e9f652b]], closing connection
java.net.NoRouteToHostException: No route to host
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(Unknown Source)
at org.elasticsearch.common.netty.channel.socket.nio.NioClientBoss.connect(NioClientBoss.java:152)
at org.elasticsearch.common.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:105)
at org.elasticsearch.common.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:79)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
at org.elasticsearch.common.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)
at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
I get this exception when launching Elasticsearch in docker (actually, I only have this problem on a CentOS 7 docker host).
First, my Dockerfile exposes the UDP ports.
EXPOSE 9200 9300/udp 9301/udp 9302/udp 9303/udp 9304/udp 9305/udp
When I start the docker container, I open these ports via -p 9200:9200 -p 9300:9300/udp -p 9301:9301/udp -p 9302:9302/udp -p 9303:9303/udp -p 9304:9304/udp -p 9305:9305/udp
In docker ps, I can see that these ports are opened as 0.0.0.0:9300-9305->9300-9305/udp
And here are some lines from my elasticsearch.yml
cluster.name: changsha
discovery.zen.ping.unicast.hosts: [ "10.0.5.241" ]
network.publish_host: 10.0.5.241
10.0.5.241 is my docker host's IP address. What is wrong here? It succeeds on a CentOS 6 host but fails on this CentOS 7 host.
UPDATE
Following this answer, I get the following result from tcpdump -p -nn icmp.
09:26:53.277117 IP 10.0.5.241 > 172.17.0.8: ICMP host 10.0.5.241 unreachable - admin prohibited, length 68
09:26:53.277494 IP 10.0.5.241 > 172.17.0.8: ICMP host 10.0.5.241 unreachable - admin prohibited, length 68
09:26:53.277822 IP 10.0.5.241 > 172.17.0.8: ICMP host 10.0.5.241 unreachable - admin prohibited, length 68
09:26:53.278043 IP 10.0.5.241 > 172.17.0.8: ICMP host 10.0.5.241 unreachable - admin prohibited, length 68
09:26:54.277753 IP 10.0.5.241 > 172.17.0.8: ICMP host 10.0.5.241 unreachable - admin prohibited, length 68
09:27:04.280703 IP 10.0.5.241 > 172.17.0.8: ICMP host 10.0.5.241 unreachable - admin prohibited, length 68
First, find out the docker interface ip address
# ifconfig
docker0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 172.17.42.1 netmask 255.255.0.0 broadcast 0.0.0.0
ether 56:84:7a:fe:97:99 txqueuelen 0 (Ethernet)
RX packets 115761 bytes 12605533 (12.0 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 55687 bytes 22647938 (21.5 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
Then add all of the docker IP addresses to the whitelist
firewall-cmd --permanent --zone=trusted --add-source=172.17.0.0/16
firewall-cmd --reload
Problem solved
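Before reloading the firewall, it can be worth checking that the trusted source range really covers the addresses involved. The tcpdump output shows the rejected container at 172.17.0.8 and ifconfig shows docker0 at 172.17.42.1; both should fall inside 172.17.0.0/16, while the host address 10.0.5.241 should not. Here is a small Python sketch of that check using the standard ipaddress module; the helper name `covered_by` is an assumption made for this example.

```python
import ipaddress


def covered_by(source_cidr, address):
    """Return True if `address` lies inside the network `source_cidr`."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(source_cidr)


TRUSTED_SOURCE = "172.17.0.0/16"  # the range passed to --add-source above

container_ok = covered_by(TRUSTED_SOURCE, "172.17.0.8")   # container address
bridge_ok = covered_by(TRUSTED_SOURCE, "172.17.42.1")     # docker0 address
host_ok = covered_by(TRUSTED_SOURCE, "10.0.5.241")        # docker host address
```

If docker is configured with a different bridge subnet, the value of `TRUSTED_SOURCE` and the firewall-cmd source would need to change accordingly.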
If someone comes across this issue on CentOS 7.4, it's because of a conflict between the docker service and the firewalld service.
You can solve it by disabling firewalld and then restarting the docker service.
Please refer to https://sanenthusiast.com/docker-and-firewalld-mess-in-centos-7/

Hadoop timing out trying to write to Cassandra in AWS multi-region configuration

I am running a multi-DC Cassandra (open-source, not DSE) cluster in AWS, where one DC (us-west-2) is set up for analytics and the other (us-east) is the transactional store. I'm using NetworkTopologyStrategy with the EC2 snitch, and a consistency level of LOCAL_ONE in my Hadoop config. Hadoop can read from Cassandra without issue, but attempting to write produces a timeout exception.
Running nodetool status shows the DCs are properly configured:
Datacenter: us-west-2
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Owns Host ID Token Rack
UN x.x.x.x 1.01 GB 9.9% 9e7f4393-7ac9-4559-b3ff-de48be50016f -9127921345534057723 2a
UN x.x.x.x 1001.16 MB 11.4% d0760383-c3dd-474c-9261-239b71dba3f1 -9221279003374097975 2b
UN x.x.x.x 1.05 GB 11.7% 3f09fbf5-0d85-4283-9009-0ec0e29223c0 -9140104347498952504 2c
Datacenter: us-east
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Owns Host ID Token Rack
UN x.x.x.x 1.1 GB 11.3% 5bbd2de4-e1d2-4a17-9f40-034f60b35954 -9061054426204373981 1b
UN x.x.x.x 1.15 GB 11.5% e34c590e-6176-45b2-a8f9-18b4a9a80032 -9216519687724118609 1c
UN x.x.x.x 1.18 GB 10.9% fa0b0a1a-f156-40fc-a267-970d1eb9cddb -9207673937991303291 1a
UN x.x.x.x 1.46 GB 10.7% b18ae406-c9ec-42b7-a365-b0c6e2fe582f -9206671929961171506 1a
UN x.x.x.x 1.13 GB 11.4% 1ac9c1c5-55ad-4048-b1ba-3b9768933ecc -9146100851344467112 1c
UN x.x.x.x 1.53 GB 11.2% dad665bb-68d9-4811-b421-f33333261867 -9178920986366339267 1b
Stack trace using ColumnFamilyOutputFormat:
java.io.IOException: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection timed out
at org.apache.cassandra.hadoop.ColumnFamilyRecordWriter$RangeClient.run(ColumnFamilyRecordWriter.java:224)
Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection timed out
at org.apache.thrift.transport.TSocket.open(TSocket.java:185)
at org.apache.thrift.transport.TFramedTransport.open(TFramedTransport.java:81)
at org.apache.cassandra.thrift.TFramedTransportFactory.openTransport(TFramedTransportFactory.java:41)
at org.apache.cassandra.hadoop.AbstractColumnFamilyOutputFormat.createAuthenticatedClient(AbstractColumnFamilyOutputFormat.java:123)
at org.apache.cassandra.hadoop.ColumnFamilyRecordWriter$RangeClient.run(ColumnFamilyRecordWriter.java:215)
Caused by: java.net.ConnectException: Connection timed out
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:579)
at org.apache.thrift.transport.TSocket.open(TSocket.java:180)
... 4 more
... and using CqlOutputFormat:
java.io.IOException: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection timed out
at org.apache.cassandra.hadoop.cql3.CqlRecordWriter$RangeClient.run(CqlRecordWriter.java:271)
Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection timed out
at org.apache.thrift.transport.TSocket.open(TSocket.java:185)
at org.apache.thrift.transport.TFramedTransport.open(TFramedTransport.java:81)
at org.apache.cassandra.thrift.TFramedTransportFactory.openTransport(TFramedTransportFactory.java:41)
at org.apache.cassandra.hadoop.AbstractColumnFamilyOutputFormat.createAuthenticatedClient(AbstractColumnFamilyOutputFormat.java:123)
at org.apache.cassandra.hadoop.cql3.CqlRecordWriter$RangeClient.run(CqlRecordWriter.java:262)
Caused by: java.net.ConnectException: Connection timed out
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:579)
at org.apache.thrift.transport.TSocket.open(TSocket.java:180)
... 4 more
Both traces ultimately point to AbstractColumnFamilyOutputFormat.createAuthenticatedClient(host, port, conf).
I then opened that source and added some detail to the exception so it would output the host name it's connecting to, which resulted in this trace:
java.io.IOException: java.lang.Exception: Unable to connect to host [hostname]
at org.apache.cassandra.hadoop.cql3.CqlRecordWriter$RangeClient.run(CqlRecordWriter.java:271)
Caused by: java.lang.Exception: Unable to connect to host [hostname]
at org.apache.cassandra.hadoop.AbstractColumnFamilyOutputFormat.createAuthenticatedClient(AbstractColumnFamilyOutputFormat.java:139)
at org.apache.cassandra.hadoop.cql3.CqlRecordWriter$RangeClient.run(CqlRecordWriter.java:262)
Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection timed out
at org.apache.thrift.transport.TSocket.open(TSocket.java:185)
at org.apache.thrift.transport.TFramedTransport.open(TFramedTransport.java:81)
at org.apache.cassandra.thrift.TFramedTransportFactory.openTransport(TFramedTransportFactory.java:41)
at org.apache.cassandra.hadoop.AbstractColumnFamilyOutputFormat.createAuthenticatedClient(AbstractColumnFamilyOutputFormat.java:124)
... 1 more
Caused by: java.net.ConnectException: Connection timed out
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:579)
at org.apache.thrift.transport.TSocket.open(TSocket.java:180)
... 4 more
The problem is [hostname] is a machine that's not in the analytics cluster (it's in us-east). Why doesn't it know this automagically, especially when reads work properly? It seems like it's trying all the nodes in the ring regardless of DC.
For the record, writes fail using CqlOutputFormat, ColumnFamilyOutputFormat, and through Pig using CqlStorage and CassandraStorage.
I'd say, try setting write_request_timeout_in_ms in cassandra.yaml to some very high number and see if that helps. There can be an issue with the node itself, where it is not responding while still appearing to be up. If it still times out, restart the service on the node that you suspect is causing the issue.
This issue came down to two things:
1. For multi-region EC2 setups, Cassandra requires setting broadcast_address to the public IP and the listen_address to the internal IP. In most cases you'll want rpc_address to be the internal IP, but this potentially breaks Cassandra's Hadoop client, which is determining endpoints to talk to based on broadcast_address.
2. Cassandra's Hadoop client (RingCache specifically) doesn't respect data center on node discovery, and tries to discover all nodes in the ring--including non-local ones. It respects the consistency level on the actual write, but in our case it never got there due to #1.
I filed a ticket and submitted a patch to address these issues:
https://issues.apache.org/jira/browse/CASSANDRA-7252
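To make the second point concrete, here is a rough Python sketch of what datacenter-aware endpoint selection could look like. It is not the patch attached to the ticket and not Cassandra's RingCache code, only an illustration of the idea. The function name, the ring data, and the addresses (taken from documentation address ranges) are all made up; the datacenter names follow the nodetool output in the question.

```python
def local_endpoints(ring, local_dc):
    """Keep only the endpoints that belong to the given datacenter."""
    return [address for address, dc in ring if dc == local_dc]


# Hypothetical ring: (address, datacenter) pairs.
RING = [
    ("192.0.2.10", "us-west-2"),
    ("192.0.2.11", "us-west-2"),
    ("198.51.100.20", "us-east"),
    ("198.51.100.21", "us-east"),
]

analytics_nodes = local_endpoints(RING, "us-west-2")
```

With a filter like this, a writer running in the analytics datacenter would never try to open connections to us-east nodes that it cannot reach.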
