Start HBase in CDH5 VM in standalone mode

How can I start HBase in standalone mode in a CDH5 VM? In the CDH3 VM, I used to run
'sudo sh start-hbase.sh'
in the following path:
/usr/lib/hbase/bin
But in the CDH5 VM I can only see 'start-hbase.cmd' in that path. Please let me know how I can start my HBase instance by invoking that '.cmd' file.

Use the following command to start a service in the CDH5 VM:
sudo service <service-name> start
For example:
sudo service zookeeper-server start
Alternatively, go to
/etc/init.d
and run the service's init script from there, for example sudo ./zookeeper-server start.
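To see which service names are available before starting one, a small helper can list the init scripts and print the matching start commands. This is only a sketch: list_start_commands is a hypothetical function name, and the set of scripts under /etc/init.d depends on which packages are installed in the VM. It prints the commands and does not run them.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the "sudo service NAME start" command for every
# init script in a directory whose name starts with the given prefix.
# It only prints the commands; it does not execute anything.
list_start_commands() {
  local prefix="$1"
  local init_dir="${2:-/etc/init.d}"
  local script
  for script in "$init_dir"/"$prefix"*; do
    # Skip the unexpanded glob pattern when nothing matches.
    [ -e "$script" ] || continue
    printf 'sudo service %s start\n' "$(basename "$script")"
  done
}

# Example: show the start commands for init scripts beginning with "zookeeper".
list_start_commands zookeeper
```

Running it with a different prefix (or with no match) simply prints other commands or nothing, so it is safe to try before using sudo.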

Related

start-all.sh command not found

I have just installed the Cloudera VM setup for Hadoop. But when I open the command prompt and try to start all Hadoop daemons with 'start-all.sh', I get the error "bash : start-all.sh: command not found".
I have tried 'start-dfs.sh' too, but it gives the same error. When I run 'jps', I can see that none of the daemons have been started.
You can find the start-all.sh and start-dfs.sh scripts in the bin or sbin folder. To locate them, go to the Hadoop installation folder and run this command:
find . -name 'start-all.sh' # Finds files named exactly start-all.sh under the current directory
Then you can start all the daemons by passing that path to bash: bash /path/to/start-all.sh
If you're using the QuickStart VM, then the right way to start the cluster (as @cricket_007 hinted) is to restart it in the Cloudera Manager UI. The start-all.sh scripts will not work, since they only apply to the Hadoop servers (Name Node, Data Node, Resource Manager, Node Manager ...) and not to the other services in the ecosystem (like Hive, Impala, Spark, Oozie, Hue ...).
You can refer to the YouTube video and the official documentation, "Starting, Stopping, Refreshing, and Restarting a Cluster".

CDH 5.3.2 - Need to restart impala daemon from shell/script

I am using a CDH 5.3.2 cluster and need to be able to start/stop the Impala daemons from a script. The command mentioned in the Cloudera docs
sudo service impala-server start
works fine on my CDH 5.10 local VM, but on the CDH 5.3.2 cluster I get the error "impala-server: unrecognized service". Checking /etc/init.d, I see that no such service is listed there either (while it is listed in the 5.10 version).
Then I tried to restart the service directly from the Impala bin directory:
cd /usr/bin
./impalad stop
However, I now run into the error below:
E0918 11:55:27.815739 12046 JniFrontend.java:622] FileSystem is file:///
W0918 11:55:27.817589 12046 JniFrontend.java:534] Cannot detect CDH version. Skipping Hadoop configuration checks
E0918 11:55:27.817620 12046 impala-server.cc:210] Unsupported file system. Impala only supports DistributedFileSystem but the configured filesystem is: LocalFileSystem.fs.defaultFS(file:///) might be set incorrectly
E0918 11:55:27.817631 12046 impala-server.cc:212] Aborting Impala Server startup due to improper configuration
I checked core-site.xml in Cloudera Manager and fs.defaultFS is set correctly, so I am not sure where it is picking up this value from. Any pointers on how to proceed?
The init.d service packages for starting Impala from the command line are meant for CDH users who do NOT want to use Cloudera Manager. The right way to start and stop Impala on a Cloudera Manager cluster is to use the CM API:
https://cloudera.github.io/cm_api/apidocs/v17/index.html
start cluster service API
stop cluster service API
commands API
The tutorial shows how to use the CM APIs, but for your situation you probably need something like:
$ curl -X POST -u USER:PASSWORD \
'CM_URL/api/v1/clusters/CLUSTERNAME/services/IMPALA_SERVICE/commands/stop'
replacing USER, PASSWORD, CM_URL, CLUSTERNAME, and IMPALA_SERVICE with the appropriate values. The curl command will return a command ID.
Then poll the commands API with that command ID to check whether the start/stop operation has completed:
$ curl -u USER:PASSWORD 'CM_URL/api/v1/commands/COMMAND_ID'
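The polling step can be wrapped in a small loop. The sketch below is a rough illustration rather than a tested integration: command_url, command_finished, and wait_for_command are hypothetical helper names, the host in CM_URL is a made-up placeholder, and the check assumes the JSON returned by the commands API carries an "active" boolean that turns false once the command is done. Verify that against the API documentation for your CM version.

```shell
#!/usr/bin/env bash
# Placeholders, as in the curl commands above; replace with real values.
CM_URL="${CM_URL:-http://cm-host.example.com:7180}"   # hypothetical host
CM_AUTH="${CM_AUTH:-USER:PASSWORD}"

# Build the commands API URL for one command ID (drops a trailing slash).
command_url() {
  printf '%s/api/v1/commands/%s\n' "${CM_URL%/}" "$1"
}

# Succeed if the JSON body reports the command as no longer active.
# Assumption: the response contains an "active" boolean field.
command_finished() {
  printf '%s' "$1" | grep -Eq '"active"[[:space:]]*:[[:space:]]*false'
}

# Poll until the command finishes or the attempts run out.
# Returns 0 when finished, 1 on timeout, 2 if curl itself fails.
wait_for_command() {
  local id="$1"
  local attempts="${2:-30}"
  local delay="${3:-5}"
  local body
  while [ "$attempts" -gt 0 ]; do
    body=$(curl -s -u "$CM_AUTH" "$(command_url "$id")") || return 2
    if command_finished "$body"; then
      printf '%s\n' "$body"
      return 0
    fi
    attempts=$((attempts - 1))
    sleep "$delay"
  done
  return 1
}
```

A script would call wait_for_command COMMAND_ID after the POST and then inspect the printed JSON to see whether the operation succeeded.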
However, if you still want to use the init.d service packages then you'll need to install the impala-server package.

How to restart yarn on AWS EMR

I am using Hadoop 2.6.0 (emr-4.2.0 image). I have made some changes in yarn-site.xml and want to restart YARN so that the changes take effect.
Is there a command I can use to do this?
Edit (10/26/2017): AWS has since officially published a more detailed Knowledge Center article on how to do this:
https://aws.amazon.com/premiumsupport/knowledge-center/restart-service-emr/
You can SSH into the master node of your EMR cluster and run the following commands to restart the YARN ResourceManager:
sudo /sbin/stop hadoop-yarn-resourcemanager
sudo /sbin/start hadoop-yarn-resourcemanager
EMR AMI 4.x.x uses upstart: /sbin/{start,stop,restart} are all symlinks to /sbin/initctl, which is part of upstart. See the initctl man page for more information.
Alternatively, you can follow the instructions here to propagate your changes to yarn-site.xml - yarn-change-configuration-on-yarn-site-xml
For those arriving from Google:
In order to restart a service in EMR, perform the following actions:
Find the name of the service by running the following command:
initctl list
For example, the YARN Resource Manager service is named hadoop-yarn-resourcemanager.
Stop the service by running the following command:
sudo stop hadoop-yarn-resourcemanager
Wait a few seconds, then start the service by running the following command:
sudo start hadoop-yarn-resourcemanager
Note: Stop/start is required; do not use the restart command.
Verify that the process is running by running the following command:
sudo status hadoop-yarn-resourcemanager
Check for the process using ps, and then check the logs in the log directory /var/log/ for any errors.
Source: https://aws.amazon.com/premiumsupport/knowledge-center/restart-service-emr/
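The stop, wait, start, status sequence above could be scripted roughly as follows. This is a sketch under assumptions: run and restart_emr_service are hypothetical helper names, the 5-second default pause is an arbitrary choice standing in for "a few seconds", and it presumes the upstart stop/start/status commands described in the steps are available on the node. With DRY_RUN=1 it only prints the commands.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: with DRY_RUN=1, print a command instead of running it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Stop the service, pause, start it again, then report its status.
# A failing stop is ignored on purpose so that a service which is
# already stopped can still be started.
restart_emr_service() {
  local service="$1"
  local pause="${2:-5}"   # seconds to wait between stop and start (assumed)
  run sudo stop "$service"
  run sleep "$pause"
  run sudo start "$service" || return 1
  run sudo status "$service"
}

# Preview the commands for the YARN ResourceManager without running them.
DRY_RUN=1 restart_emr_service hadoop-yarn-resourcemanager
```

Dropping DRY_RUN=1 would execute the commands for real, so it is worth previewing the output first.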
If what you want to do is to enable log-aggregation, it is actually easier to create the cluster with log-aggregation already enabled, as described in the documentation:
http://docs.aws.amazon.com/ElasticMapReduce/latest/ManagementGuide/emr-plan-debugging.html
(It is actually enabled by default if you are using emr-4.3.0).
Try restarting this service as well:
hadoop-yarn-nodemanager

Could not find and execute start-all.sh and stop-all.sh on Cloudera VM for Hadoop

How do I start and stop services from the command line on CDH4? I am new to Hadoop and have installed the VM from Cloudera. I could not find start-all.sh and stop-all.sh. How can I stop or start the task tracker or data node when I want to? It is a single-node cluster running on CentOS. I haven't made any modifications.
Moreover, I see that the directory structure differs between the various flavours. I could not locate these sh files on the VM for my installation.
[cloudera@localhost ~]$ stop-all.sh
bash: stop-all.sh: command not found
I would highly appreciate your support.
Use sudo su hdfs to start; to stop, just type exit, which will stop all the services.

Hadoop CDH3 ERROR. Could not start Hadoop datanode daemon

I'm deploying Hadoop CDH3 in pseudo-distributed mode on a VPS.
I installed CDH3 and then ran
sudo apt-get install hadoop-0.20-conf-pseudo
but when I try to start all daemons with
for service in /etc/init.d/hadoop-0.20-*; do sudo "$service" start; done
it fails with
ERROR. Could not start Hadoop datanode daemon
The same installation and startup commands work on my notebook.
I don't understand the cause; the log file is empty. The available RAM is about 900MB, with 98G of available disk space.
What could the cause be, and how can I find it? I have ruled out the configuration files as the source of the error.
Consider using Cloudera Manager; it could save you some time (especially if you use multiple nodes). There is a nice video on YouTube that shows the deployment process.
