Is a local Zookeeper cluster required to run Apache Storm in local cluster mode? - apache-storm

I have been trying to get a local copy of Storm working, following the guide in the storm-starter repo, and this tutorial.
When trying to run a topology with mvn compile exec:java -Dstorm.topology=org.apache.storm.starter.ExclamationTopology, the output eventually starts looping and repeatedly logs:
28534 [Thread-9-SendThread(localhost:2000)] INFO o.a.s.s.o.a.z.ClientCnxn - Opening socket connection to server localhost/127.0.0.1:2000. Will not attempt to authenticate using SASL (unknown error)
28534 [Thread-9-SendThread(localhost:2000)] WARN o.a.s.s.o.a.z.ClientCnxn - Session 0x152f7728a6a0011 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:1.8.0_45]
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[?:1.8.0_45]
at org.apache.storm.shade.org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361) ~[storm-core-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
at org.apache.storm.shade.org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081) [storm-core-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
It seems it is trying to connect to a local Zookeeper cluster, but I have not seen the dependency or install requirement for Zookeeper in the Storm docs or in this other tutorial.
Do I need to install Zookeeper and is this just missing from the docs? Perhaps I'm mistaken and it is looking for something else at port 2000 on my localhost? If not, what is going wrong in my local setup?

If you run locally and use LocalCluster, you do not need to install Zookeeper.
If you run locally in pseudo-distributed mode (i.e., start Nimbus and a Supervisor locally) and use StormSubmitter, you do need to install Zookeeper locally.
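To test the asker's guess about port 2000, a quick probe can show whether anything is accepting connections there at all ("Connection refused" generally means nothing is listening). This is only a minimal sketch: the function name is made up for illustration, and port 2000 is simply the one that appears in the log above.

```python
import socket


def port_accepts_connections(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Port 2000 is taken from the log output in the question (an assumption,
    # not a documented default).
    print(port_accepts_connections("localhost", 2000))
```

If this prints False while the topology is running, nothing is serving on that port, which would match the reconnect loop in the log.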

Related

Failed to obtain Jenkins slave

In my scenario the Jenkins master runs on a Linux machine, and I can also access it from my local Windows machine.
I created a Windows slave using the launch method "Launch agent by connecting to the master". Following the guidelines, I created a folder on my Windows machine and copied the slave and agent jars into it.
When I try to run the slave agent, I get the error below. (Screenshot attached)
I also tried the second option: I took the command provided on the Jenkins slave page and ran it in the command prompt. It again fails with a "failed to connect" error. Please find the error message below.
I am new to this configuration.
Do I need to provide my slave machine's IP to the master machine, or do I need to install anything else for this? Can someone please help me out?
Failed to obtain http://ip:7394839:computer/winslave1/slave-agent.jnlp?encrypt=true
java.net.ConnectException: Connection timed out: connect
at java.net.DualStackPlainSocketImpl.connect0(Native Method)
at java.net.DualStackPlainSocketImpl.socketConnect(Unknown Source)
at java.net.AbstractPlainSocketImpl.doConnect(Unknown Source)
at java.net.AbstractPlainSocketImpl.connectToAddress(Unknown Source)
at java.net.AbstractPlainSocketImpl.connect(Unknown Source)
at java.net.PlainSocketImpl.connect(Unknown Source)
at java.net.SocksSocketImpl.connect(Unknown Source)
at java.net.Socket.connect(Unknown....etc
Slave-agent error
Issue resolved. In my case, each new slave gets its own IP, but no Jenkins server runs at that IP; creating a slave only adds another node to the existing server.
When we launch the slave using agent.jar with the slave IP in the URL, we get a timed-out error. If we use the master IP instead of the slave IP, the agent launches successfully.
Please find an example below.
If I try with the slave IP (below), we get the error:
java -jar D:\Jenkins\agent.jar -jnlpUrl http://120.231.140:8080/computer/My_slave_node_name_Windows10/slave-agent.jnlp -secret anHexadecimal_Long_Number5d094b1f577bc772b65b7277ac57 -workDir "D:\Jenkins"
The IP below is the master IP. The agent launched successfully.
java -jar D:\Jenkins\agent.jar -jnlpUrl http://120.241.141:8080/computer/My_slave_node_name_Windows10/slave-agent.jnlp -secret anHexadecimal_Long_Number5d094b1f577bc772b65b7277ac57 -workDir "D:\Jenkins"
The slave and master IPs differ. If I launch with the slave IP from cmd, we get a timed-out error.
The other three possibilities are:
1. Whitelist the slave IP at the master instance's security-group level (all traffic or only the required ports).
2. By default, Windows Server blocks this through the IE security settings, so disable them as follows:
"Enter Server Manager in Windows search to start Server manager application. Select Local Server. Navigate to the IE Enhanced Security Configuration property, select the current setting to open the property page, select the Off option button for the desired users, and then select OK"
3. Configure a fixed port number for the TCP JNLP connection and whitelist it in the instance security groups:
Manage Jenkins > Configure Global Security > Enable security > TCP port for JNLP agents: Fixed.
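Before changing firewall or security-group settings, it can help to see how the connection to the host in the -jnlpUrl actually fails. The sketch below (function name and return strings are my own, not anything from Jenkins) parses a URL and tries a plain TCP connection to its host and port. A "timeout" usually points to a firewall drop or a wrong IP, while "refused" means the host is reachable but nothing listens on that port.

```python
import socket
from urllib.parse import urlsplit


def probe_url_host(url: str, timeout: float = 5.0) -> str:
    """Try a TCP connection to the host and port named in url.

    Returns "ok", "refused", "timeout", or "error: <detail>".
    """
    parts = urlsplit(url)
    host = parts.hostname
    if host is None:
        return "error: no host in URL"
    try:
        # Fall back to the scheme's default port when the URL omits one.
        port = parts.port or (443 if parts.scheme == "https" else 80)
    except ValueError:
        return "error: invalid port in URL"
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "ok"
    except ConnectionRefusedError:
        return "refused"  # host reachable, nothing listening on the port
    except (socket.timeout, TimeoutError):
        return "timeout"  # typical of a firewall drop or a wrong IP
    except OSError as exc:
        return f"error: {exc}"
```

Running it once with the URL containing the slave IP and once with the master IP should show "timeout" for the first and "ok" for the second, if the situation is the same as described in the answer above.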

StreamParse: IOError: Local port: 6627 already in use, unable to open ssh tunnel to nimbus.server.local:6627

Setup:
Storm 0.10.0
Streamparse 2.1.4
Centos 6.5
Python 2.7 (Streamparse needs it)
(Yes, I know they are outdated; however, I couldn't get anything working with Storm 1.0, it's just broken with streamparse 3.)
When I attempt to launch a "streamparse submit" from either my Nimbus server or another server in my topology, I get the following error:
"IOError: Local port: 6627 already in use, unable to open ssh tunnel
to nimbus.server.local:6627."
But of course 6627 is in use on my Nimbus server; it's the Thrift port. So I tried moving the Thrift port to 6637 and restarting Nimbus, but I get the same error back from the client submitting it:
IOError: Local port: 6627 already in use, unable to open ssh tunnel to
nimbus.server.local:6627.
Even netstat -tuanp shows that nothing is listening on that port, either on Nimbus or on the box executing the submit.
I have a feeling this has something to do with the SSHD config and allowing tunneling, which isn't being handled properly by Nimbus and produces an incorrect error when trying to establish the tunnel.
Has anyone else experienced this?
This is what I ended up doing to deploy a streamparse Storm topology to the local Storm cluster:
> sparse quickstart quickstart-2.1.4
> cd quickstart-2.1.4
> sparse jar
> storm jar _build/quickstart-2.1.4-0.0.1-SNAPSHOT-standalone.jar streamparse.commands.submit_topology topologies/wordcount.clj
This worked with streamparse 2.1.4 and Storm 0.9.5
I got the same error when running a Storm topology.
I made the following changes, and then it worked fine.
In config.json, I added the following properties:
"use_ssh_for_nimbus": false,
"use_virtualenv": false,
In fabfile.py,
from fabric.api import env
env.use_ssh_config = False
env.password = '****'
from streamparse.ext.fabric import *
Then I submitted with "sparse submit".
Please let me know whether it works for you, or share your config file.
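Since the error complains about the local port on the machine doing the submit, one way to double-check what netstat reports is to try binding that port yourself. This is just a sketch with a made-up helper name; 6627 is the port from the error message above.

```python
import socket


def local_port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if this process can bind host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
            return True
        except OSError:
            # Address already in use (or not permitted for privileged ports).
            return False


if __name__ == "__main__":
    # 6627 is the port named in the IOError from the question.
    print(local_port_is_free(6627))
```

If this reports the port as free on the submitting box while the tunnel still fails, the "already in use" message is probably misleading, and turning off the SSH tunnel as in the config above is a reasonable workaround.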

Bluemix Docker Container deployment results in "No route to host"

We are deploying a Docker image using this command:
cf ic run -p 8080 -m 512 -e SPRING_PROFILES_ACTIVE=test -e logging.config=classpath:logback-docker-test.xml --name <container-name> registry.eu-gb.bluemix.net/<repository_name>/<container-name>:latest
Within that container we are starting a Java8 Spring-Boot application that uses a connection-pooling provider. The connection-pooling provider connects to an existing PostgreSQL-Database that is accessible on the standard port. We do not use any domain name to connect to PostgreSQL-Database. We only use the IP-Address and the standard postgresql port.
The deployment is working on a machine that uses the standard Docker container daemon and is also working on Amazon WebServices (AWS) without any problems and using the same deployment mechanism.
However, if we are deploying the image to the Bluemix-Container-Service we do get the following error at startup of the spring-boot application:
Caused by: java.net.NoRouteToHostException: No route to host
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at org.postgresql.core.PGStream.<init>(PGStream.java:61)
at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:129)
at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:65)
at org.postgresql.jdbc2.AbstractJdbc2Connection.<init>(AbstractJdbc2Connection.java:146)
at org.postgresql.jdbc3.AbstractJdbc3Connection.<init>(AbstractJdbc3Connection.java:35)
at org.postgresql.jdbc3g.AbstractJdbc3gConnection.<init>(AbstractJdbc3gConnection.java:22)
at org.postgresql.jdbc4.AbstractJdbc4Connection.<init>(AbstractJdbc4Connection.java:47)
at org.postgresql.jdbc42.AbstractJdbc42Connection.<init>(AbstractJdbc42Connection.java:21)
at org.postgresql.jdbc42.Jdbc42Connection.<init>(Jdbc42Connection.java:28)
at org.postgresql.Driver.makeConnection(Driver.java:415)
at org.postgresql.Driver.access$100(Driver.java:47)
at org.postgresql.Driver$ConnectThread.run(Driver.java:325)
... 1 more
We don't know why this happens, because if we telnet from another Bluemix Docker machine to the PostgreSQL database server on the desired port, everything is fine.
This is very annoying, since we currently cannot use this Docker image on Bluemix, and it is obstructing our planned roll-out.
Can you help us with details on what might be wrong and how we can fix this?
Any help will be appreciated.
Regards,
Christian
Is this error raised while the container is starting up?
If so, Docker/IBM Containers on Bluemix spend roughly 30 to 60 seconds in the networking state; during this phase the container is not able to connect to the network.
That is most likely the root cause of the error you are getting: if the Java Spring Boot application tries to connect to the PostgreSQL database while the container is still in the networking phase, it will fail with this error.
You should start your application only after the container has completed the networking phase (for example, through a bash script that checks the availability of the PostgreSQL server), or simply configure your Spring Boot application to handle this exception.
Official Bluemix support gave the hint to wait 120 seconds before starting the Java application that needs network access. The suggested way is:
CMD ["/bin/sh", "-c", "sleep 120; exec java $JVM_ARGS -cp /app org.springframework.boot.loader.JarLauncher --spring.main.show_banner=false"]
With that we have got network access and everything is fine.
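As an alternative to a fixed sleep, the availability check mentioned in the earlier answer could be sketched as a polling loop that waits until the database port accepts connections, or gives up after a deadline. This is only an illustration: the function name and defaults are my own, and the 120-second deadline just mirrors the value support suggested.

```python
import socket
import time


def wait_for_port(host: str, port: int, deadline_s: float = 120.0,
                  interval_s: float = 2.0) -> bool:
    """Poll host:port until a TCP connection succeeds or deadline_s elapses."""
    end = time.monotonic() + deadline_s
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            # Includes "No route to host" while networking is still coming up.
            if time.monotonic() >= end:
                return False
            time.sleep(interval_s)
```

A wrapper script could call wait_for_port with the PostgreSQL IP and port and only then start the Java process, so the container starts as soon as the network is ready instead of always waiting the full 120 seconds.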

Ambari Heartbeat Lost during Installation

We are trying to install HDP via Ambari 2.1.
The registration is successful, but during the install process the Ambari server (ambari-server.log) reports that it has lost the heartbeat of the agent.
Error message:
Heartbeat lost from host amabri.agent.com
The ambari-agent log reports:
Failed to connect to https://amabri-server.com:8440/connection_info due to [Errno 111] Connection refused
We are using openjdk 1.7 on RHEL 6.6 64 bit.
Any pointers on the issue would help immensely.
Try restarting the ambari-agent on that particular host.

Failure to run commands from apache karaf client

I have downloaded Apache Karaf 2.3.3 (on Felix) on several CentOS 6.4 machines. I see this issue on only a few of the machines. When I try to install a feature using the following commands:
$KARAF_HOME/bin/start
$KARAF_HOME/bin/client "features:install myfeature"
I get the following stack trace:
WARN org.apache.sshd.client.session.ClientSessionImpl - Exception caught
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:197)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
at org.apache.mina.transport.socket.nio.NioProcessor.read(NioProcessor.java:273)
at org.apache.mina.transport.socket.nio.NioProcessor.read(NioProcessor.java:44)
at org.apache.mina.core.polling.AbstractPollingIoProcessor.read(AbstractPollingIoProcessor.java:690)
at org.apache.mina.core.polling.AbstractPollingIoProcessor.process(AbstractPollingIoProcessor.java:664)
at org.apache.mina.core.polling.AbstractPollingIoProcessor.process(AbstractPollingIoProcessor.java:653)
at org.apache.mina.core.polling.AbstractPollingIoProcessor.access$600(AbstractPollingIoProcessor.java:67)
at org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run(AbstractPollingIoProcessor.java:1124)
at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
It looks like the client fails to connect to Karaf. The firewall is shut down on all of the machines. Does anyone know why this could be failing? The feature installs fine if I run Karaf in console mode with /bin/karaf and type in the same command.
My guess is that the port you defined for the remote Karaf console was already in use by another application before the Karaf installation. As such, the wrong application accepts the connection, cannot make anything of the data, and resets the connection. I would suggest stopping Karaf, checking with netstat or via telnet localhost <port> whether the port Karaf is configured to listen on is already in use, and finding the related application. As an alternative, you can configure Karaf to use a different (unused) port. See for example this page
