Apache Storm workers not starting - apache-storm

The Storm release is 1.0.1. My Nimbus and supervisor are running. When I submit the ExclamationTopology from storm-starter, the worker does not start and there is no worker.log.
I found the following error in supervisor.log.
Thanks a lot for any suggestions.

Related

How can I stop Apache Storm Nimbus, UI and Supervisor?

I run Apache Storm in a cluster and I was looking for ways to stop and/or restart Nimbus, Supervisor and UI. Would writing a service help? What should I write in this service file and where should I place it? Thank you in advance.
Yes, writing a service is the recommended way to run Storm. The commands you want to run are storm nimbus to start Nimbus (minimum 1 per cluster), storm supervisor to run the supervisor (1 per worker machine), storm ui (1 per cluster) and storm logviewer (1 per worker machine). There are other commands you can run as well; you can find them by simply running storm, which will print a list.
Regarding how to write the service, take a look at the upstart cookbook: http://upstart.ubuntu.com/cookbook/.
There's an example script you can probably use to get started: https://unix.stackexchange.com/a/84289
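For illustration, a minimal upstart job for Nimbus might look roughly like the sketch below; the /opt/storm path and the storm user are assumptions, so adjust them to your installation, and repeat the pattern for the supervisor, UI and logviewer.

# /etc/init/storm-nimbus.conf -- illustrative sketch; path and user are assumptions
description "Apache Storm Nimbus"
start on runlevel [2345]
stop on runlevel [016]
respawn
setuid storm
chdir /opt/storm
exec /opt/storm/bin/storm nimbus

With a file like this in place you can run "sudo service storm-nimbus start" (or stop/restart) on upstart-based systems.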
You can set them up as services so they start when the node boots; the same services can be used to stop them.
/etc/rc.d/SERVICE start or stop or restart
You can also stop them manually: run "ps -aux | grep nimbus" (or grep supervisor, etc.), find the process ID, and kill it with the "kill" command.
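For example:

# find the Nimbus process and note its PID from the output
ps -aux | grep nimbus
# then stop it; replace 12345 with the PID you found (use kill -9 only if a plain kill does not work)
kill 12345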

Hadoop job submission using Apache Ignite Hadoop Accelerators

Disclaimer: I am new to both Hadoop and Apache Ignite. Sorry for the lengthy background info.
Setup:
I have installed and configured the Apache Ignite Hadoop Accelerator. start-all.sh brings up the services below. I can submit Hadoop jobs; they complete and I can see results as expected. The start-all script uses the traditional core-site, hdfs-site, mapred-site, and yarn-site configuration files.
28336 NodeManager
28035 ResourceManager
27780 SecondaryNameNode
27429 NameNode
28552 Jps
27547 DataNode
I have also installed Apache Ignite 2.6.0. I am able to start Ignite nodes and connect to them using the web console. I was able to load a cache from MySQL and run SQL queries and Java programs against this cache.
For running Hadoop jobs on the Ignite-accelerated Hadoop, I created a separate ignite-conf directory, in which I have customized the core-site and mapred-site configurations per the instructions on the Apache Ignite website.
Issue:
When I run a Hadoop job using the command:
hadoop --config ~/ignite-conf jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.0.jar wordcount input output1
I get the error below (note that the same job ran successfully against plain Hadoop, without Ignite):
java.io.IOException: Failed to get new job ID.
...
...
Caused by: class org.apache.ignite.internal.client.GridClientDisconnectedException: Latest topology update failed.
...
...
Caused by: class org.apache.ignite.internal.client.GridServerUnreachableException: Failed to connect to any of the servers in list: [/:13500]
...
...
It looks like an attempt was made to look up the job tracker (port 13500) and it could not be found. From the service list above, it is obvious that no job tracker is running. However, the job ran just fine on non-Ignite Hadoop over YARN.
Can you help, please?
This is resolved in my case.
The "job tracker" here actually means the Apache Ignite in-memory services listening on port 11211.
After making this change in mapred-site.xml, the job ran!
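For reference, a sketch of what the relevant mapred-site.xml entries can look like, assuming the standard Ignite Hadoop Accelerator properties and an Ignite node on localhost (host and port are examples; 11211 is the default connector port):

<!-- mapred-site.xml: send MapReduce jobs to Ignite instead of a classic JobTracker -->
<property>
  <name>mapreduce.framework.name</name>
  <value>ignite</value>
</property>
<property>
  <name>mapreduce.jobtracker.address</name>
  <value>localhost:11211</value>
</property>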

Get list of executed job on Hadoop cluster after cluster reboot

I have a Hadoop cluster, version 2.7.4. For some reason, I have to restart my cluster. I need the job IDs of the jobs that were executed on the cluster before the reboot. The command mapred -list provides details of currently running or waiting jobs only.
You can see a list of all jobs on the YARN ResourceManager web UI.
In your browser go to http://ResourceManagerIPAddress:8088/
This is how the history looks on the YARN cluster I am currently testing on (and I restarted the services several times).
See more info here
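If you prefer the command line, the YARN CLI can also list applications in every state, and the MapReduce JobHistory Server (if you run it) keeps per-job history; what is still visible after a restart depends on ResourceManager recovery and history retention settings. The port below is the default:

# list applications in all states known to the ResourceManager
yarn application -list -appStates ALL

# the MapReduce JobHistory Server web UI (default port 19888); replace the host placeholder
# http://<historyserver-host>:19888/jobhistory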

Storm Nimbus is not working

I'm new to Apache Storm and I'm currently working on a project which uses Storm. While trying to understand the basics of Storm, I came across Nimbus and supervisors. I started setting up a remote cluster. I edited the storm.yaml file and set the nimbus and zookeeper hosts to localhost. I tried running Nimbus and ZooKeeper on my local machine. I started Nimbus using "storm nimbus", but Nimbus does not start, whereas ZooKeeper is actively running. I have attached a screenshot of what I get.
Can somebody help me sort this out?
Go to the directory where you have installed Storm, go to the bin folder, and run:
storm nimbus
You have to run the start command from the bin directory of the Storm installation.
Navigate to the bin folder inside the Storm installation folder and then run "storm nimbus".
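For example, assuming Storm is unpacked under /opt/apache-storm-1.0.1 (the path is an assumption, adjust it to your machine):

cd /opt/apache-storm-1.0.1
bin/storm nimbus &              # start Nimbus in the background
bin/storm ui &                  # optional: Storm UI, by default on port 8080
tail -f logs/nimbus.log         # watch this log if Nimbus still does not come up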

Unable to see TaskTracker and JobTracker after Hadoop 2.5.1 single-node installation

I am new to Hadoop 2.5.1. As I had already installed Hadoop 1.0.4 previously, I thought the installation process would be the same, so I followed this tutorial:
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
Everything was fine. I have even given these settings in core-site.xml:
<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:54310</value>
</property>
But I have seen this value given as 9000 on several sites.
I have also made changes in yarn-site.xml.
Still, everything works fine when I run a MapReduce job. But my question is:
when I run the command jps, it gives me this output:
hduser#secondmaster:~$ jps
5178 ResourceManager
5038 SecondaryNameNode
4863 DataNode
5301 NodeManager
4719 NameNode
6683 Jps
I don't see a TaskTracker or JobTracker in jps. Where are these daemons running?
And without these daemons, how am I able to run a MapReduce job?
Thanks,
Sreelatha K.
From Hadoop 2.0 onwards, the default processing framework has changed from classic MapReduce to YARN. You are using YARN, so you will not see a JobTracker or TaskTracker. The JobTracker and TaskTracker are replaced by the ResourceManager and NodeManager, respectively, in YARN.
But you still have the option to use the classic MapReduce framework instead of YARN.
In Hadoop 2 there is an alternative way to run MapReduce jobs, called YARN. Since you have made changes in yarn-site.xml, MapReduce processing happens on YARN, not on the traditional MapReduce framework. That is probably the reason why you don't see TaskTracker and JobTracker listed after executing the jps command. Note that ResourceManager and NodeManager are the daemons for YARN.
YARN is the next-generation resource manager, which can integrate with Apache Spark, Storm, and many other tools beyond classic MapReduce jobs.
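Which framework is used is controlled by mapreduce.framework.name; a typical YARN-based single-node setup has roughly the following entries (these are the common Hadoop 2 settings, shown here as a sketch):

<!-- mapred-site.xml: run MapReduce jobs on YARN -->
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>

<!-- yarn-site.xml: enable the shuffle auxiliary service on the NodeManagers -->
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>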
