Storm Nimbus is not working - apache-storm

I'm new to Apache Storm and I'm currently working on a project which uses storm. While trying to understand the basics of storm, I came across nimbus and supervisors. I started setting up a remote cluster. I edited the storm.yaml file and setup the nimbus and zookeeper to localhost. I tried running my nimbus, zookeeper in my local machine. I started nimbus using "storm nimbus", but nimbus is not started whereas my zookeeper is actively running. enter image description hereI have the screenshot of what I get.
Can somebody help me in sorting out.

go to the directory where you have installed your storm go to bin folder and run:
storm nimbus
reference image:

You have to run the start command from the bin directory of the storm installation directory.
Navigate to the bin folder inside storm installation folder and then run "storm nimbus".

Related

How to connect Apache Spark with Yarn from the SparkContext?

I have developed a Spark application in Java using Eclipse.
So far, I am using the standalone mode by configuring the master's address to 'local[*]'.
Now I want to deploy this application on a Yarn cluster.
The only official documentation I found is http://spark.apache.org/docs/latest/running-on-yarn.html
Unlike the documentation for deploying on a mesos cluster or in standalone (http://spark.apache.org/docs/latest/running-on-mesos.html), there is not any URL to use within SparkContext for the master's adress.
Apparently, I have to use line commands to deploy spark on Yarn.
Do you know if there is a way to configure the master's adress in the SparkContext like the standalone and mesos mode?
There actually is a URL.
Ensure that HADOOP_CONF_DIR or YARN_CONF_DIR points to the directory which contains the (client side) configuration files for the Hadoop cluster. These configs are used to write to HDFS and connect to the YARN ResourceManager
You should have at least hdfs-site.xml, yarn-site.xml, and core-site.xml files that specify all the settings and URLs for the Hadoop cluster you connect to.
Some properties from yarn-site.xml include yarn.nodemanager.hostname and yarn.nodemanager.address.
Since the address has a default of ${yarn.nodemanager.hostname}:0, you may only need to set the hostname.

Apache Phoenix Installation not done properly

We are trying to install Phoenix 4.4.0 on HBase 1.0.0-cdh5.4.4 (CDH5.5.5 four nodes cluster) via this installation document: Phoenix installation
Based on that we copied our phoenix-server-4.4.0-HBase-1.0.jar to hbase libs on each region server and master server, so that, on each /opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/lib/hbase/lib folder in the master and three region servers.
After that we reboot the HBase service via Cloudera Manager.
Everything seems to be ok, but when we are trying to access to phoenix shell via ./sqlline.py localhost command, we get a Zookeeper error in that way:
15/09/09 14:20:51 WARN client.ZooKeeperRegistry: Can't retrieve clusterId from Zookeeper
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/hbaseid
So we are not sure that the installation is properly done. Is necessary any further configuration?
We are not even sure wether we are using the sqlline command properly.
Any help will be appreciated.
After reinstalling the 4 nodes cluster on AWS, phoenix is now working properly.
It's a pitty that we don't know exactly what was really happening, but we think that after several changes in our config, we broke something that made phoenix impossible to work.
One thing to take into consideration is that sqllline command has to be executed with an ip that is in the zookeeper quorum, and this is something we were doing wrong, since we were trying to run it from the namenode, and it wasn't in the zookeeper quorum.Once we run sqlline.py from a datanode, everything is working fine.
Btw, the installation guide that we finally followed is Phoenix Installation

Not able to deploy workers on Spark-1.2.0

I am new to spark and using spark-1.2.0 with hadoop 2.4.1. I have set up master and four slave nodes. But two of my nodes are not starting.
I have defined IP addresses of nodes in slaves file in spark-1.2.0/conf/ directory.
But when I try to run ./sbin/start-all.sh the error is as follows :
failed to launch org.apache.spark.deploy.worker.Worker
could not find or load main class org.apache.spark.deploy.worker.Worker
This is happening for two nodes. Other two are working fine.
I've also setup spark-env.sh in master as well as slaves. The master also has passwordless ssh connectiviy to the slaves.
I've also tried doing ./bin/spark-class org.apache.spark.deploy.worker.Worker spark://IP:PORT
It gives out the same error as before. Can someone help me with this. Where am I doing mistake?
So I figured out the solution. For all those who are starting new with spark, please check all the jar files in lib folder. I had spark-assembly-1.2.0-hadoop2.4.0.jar file missing in my slave.
I also encountered the same issue. If this is localmode cluster setup then you can run instead:
./sbin/start-master.sh
./sbin/start-slave.sh spark://localhost:7077
Then run:
MASTER=spark://localhost:7077 ./bin/pyspark
I was able to execute my jobs on the shell.
Do remember to setup up conf/slaves and conf/spark-env.sh as per here:
http://pulasthisupun.blogspot.com/2013/11/how-to-set-up-apache-spark-cluster-in.html
Also change localhost to your hostname.

Spark Standalone Mode: Worker not starting properly in cloudera

I am new to the spark, After installing the spark using parcels available in the cloudera manager.
I have configured the files as shown in the below link from cloudera enterprise:
http://www.cloudera.com/content/cloudera-content/cloudera-docs/CM4Ent/4.8.1/Cloudera-Manager-Installation-Guide/cmig_spark_installation_standalone.html
After this setup, I have started all the nodes in the spark by running /opt/cloudera/parcels/SPARK/lib/spark/sbin/start-all.sh. But I couldn't run the worker nodes as I got the specified error below.
[root#localhost sbin]# sh start-all.sh
org.apache.spark.deploy.master.Master running as process 32405. Stop it first.
root#localhost.localdomain's password:
localhost.localdomain: starting org.apache.spark.deploy.worker.Worker, logging to /var/log/spark/spark-root-org.apache.spark.deploy.worker.Worker-1-localhost.localdomain.out
localhost.localdomain: failed to launch org.apache.spark.deploy.worker.Worker:
localhost.localdomain: at java.lang.ClassLoader.loadClass(libgcj.so.10)
localhost.localdomain: at gnu.java.lang.MainThread.run(libgcj.so.10)
localhost.localdomain: full log in /var/log/spark/spark-root-org.apache.spark.deploy.worker.Worker-1-localhost.localdomain.out
localhost.localdomain:starting org.apac
When I run jps command, I got:
23367 Jps
28053 QuorumPeerMain
28218 SecondaryNameNode
32405 Master
28148 DataNode
7852 Main
28159 NameNode
I couldn't run the worker node properly. Actually I thought to install a standalone spark where the master and worker work on a single machine. In slaves file of spark directory, I given the address as "localhost.localdomin" which is my host name. I am not aware of this settings file. Please any one cloud help me out with this installation process. Actually I couldn't run the worker nodes. But I can start the master node.
Thanks & Regards,
bips
Please notice error info below:
localhost.localdomain: at java.lang.ClassLoader.loadClass(libgcj.so.10)
I met the same error when I installed and started Spark master/workers on CentOS 6.2 x86_64 after making sure that libgcj.x86_64 and libgcj.i686 had been installed on my server, finally I solved it. Below is my solution, wish it can help you.
It seem as if your JAVA_HOME environment parameter didn't set correctly.
Maybe, your JAVA_HOME links to system embedded java, e.g. java version "1.5.0".
Spark needs java version >= 1.6.0. If you are using java 1.5.0 to start Spark, you will see this error info.
Try to export JAVA_HOME="your java home path", then start Spark again.

where is the hadoop task manager UI

I installed the hadoop 2.2 system on my ubuntu box using this tutorial
http://codesfusion.blogspot.com/2013/11/hadoop-2x-core-hdfs-and-yarn-components.html
Everything worked fine for me and now when I do
http://localhost:50070
I can see the management UI for HDFS. Very good!!
But the I am going through another tutorial which tells me that there must be a task manager UI running at http://mymachine.com:50030 and http://mymachine.com:50060
on my machine I cannot open these ports.
I have already done
start-dfs.sh
start-yarn.sh
start-all.sh
is something wrong? why can't I see the task manager UI?
You have installed YARN (MRv2) which runs the ResourceManager. The URL http://mymachine.com:50030 is the web address for the JobTracker daemon that comes with MRv1 and hence you are not able to see it.
To see the ResourceManager UI, check your yarn-site.xml file for the following property:
yarn.resourcemanager.webapp.address
By default, it should point to : resource_manager_hostname:8088
Assuming your ResourceManager runs on mymachine, you should see the ResourceManager UI at http://mymachine.com:8088/
Make sure all your deamons are up and running before you visit the URL for the ResourceManager.
For Hadoop 2[aka YARN/MRV2] - Any hadoop installation version-ed 2.x or higher its at port number 8088. eg. localhost:8088
For Hadoop 1 - Any hadoop installation version-ed lower than 2.x[eg 1.x or 0.x] its at port number 50030. eg localhost:50030
By default HadoopUI location is as below
http://mymachine.com:50070

Resources