How to Kill Hive Query, without knowing application id? - hadoop

My HiveServer2 lists a few running jobs, so I can find the various query_id values.
But there is no YARN application information on the YARN 8088 pages.
My question is: how do I kill the running job?

If you are using YARN as the resource manager, you can list all applications, whatever their state, by running the following in a shell:
yarn application -list -appStates ALL
You can change ALL to RUNNING etc. depending on what application state you are interested in seeing.
An alternative command to the above to see running applications is:
mapred job -list
In order to kill a specific application/job, with YARN you can run:
yarn application -kill <application_id>
Or otherwise:
mapred job -kill <job_id>
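When the application ID is not known up front, one option is to filter the listing by application name. The sketch below is only an illustration, not part of the original answer: app_ids_by_name is a hypothetical helper, and it assumes the listing is tab-separated with Application-Id in the first column and Application-Name in the second, which may differ between Hadoop versions.

```shell
#!/bin/sh
# Hypothetical helper: print the IDs of applications whose name contains
# the substring given as $1. Reads `yarn application -list` output on stdin.
# Assumption: columns are tab-separated, Application-Id first,
# Application-Name second, possibly padded with spaces.
app_ids_by_name() {
    awk -F'\t' -v pat="$1" '
        {
            id = $1; name = $2
            # Strip the space padding around both columns.
            gsub(/^[ ]+|[ ]+$/, "", id)
            gsub(/^[ ]+|[ ]+$/, "", name)
        }
        # Skip the summary and header lines; keep matching application rows.
        id ~ /^application_/ && index(name, pat) > 0 { print id }
    '
}
```

For example, yarn application -list -appStates RUNNING | app_ids_by_name 'HIVE-' would print candidate IDs to pass to yarn application -kill. The 'HIVE-' pattern is only an assumption about how the application is named on your cluster, so check the listing by eye before killing anything.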

Related

How to get the job id of a specific running hadoop jobs

I need to get the id of a specific hadoop job.
In my case, I launch a Sqoop command remotely and I want to check the job status with this command:
hadoop job -status job_id | grep -w 'state'
I can get this information from the GUI, but I want to do it from the command line.
Can anyone help?
You can use the YARN REST APIs, via your browser or curl from the command line. They list all currently running and previously run jobs, including Sqoop and the MapReduce jobs that Sqoop generates and executes. Try the UI first: if it is up and running, point your browser to http://<host>:8088/cluster (8088 is the default port in Apache Hadoop, but it may differ on other distributions). Alternatively, you can use yarn commands directly, e.g. yarn application -list.
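As a rough sketch of the curl route, the ResourceManager REST API exposes the application list under /ws/v1/cluster/apps and accepts a states filter. The helper name rm_apps_url and the host rm.example.com are hypothetical, and the port defaults to 8088 only as an assumption about your cluster.

```shell
#!/bin/sh
# Hypothetical helper: build the ResourceManager REST URL that lists
# applications in a given state.
#   $1: ResourceManager host
#   $2: web port (assumed default 8088)
#   $3: comma-separated application states (default RUNNING)
rm_apps_url() {
    printf 'http://%s:%s/ws/v1/cluster/apps?states=%s\n' \
        "$1" "${2:-8088}" "${3:-RUNNING}"
}

# Usage (not executed here; rm.example.com is a placeholder host):
#   curl -s -H 'Accept: application/json' "$(rm_apps_url rm.example.com)"
```

The JSON response contains one entry per application, with its ID, name and state, so the Sqoop-generated job can be picked out by name.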

Get list of executed job on Hadoop cluster after cluster reboot

I have a Hadoop 2.7.4 cluster. For some reason, I had to restart it. I need the job IDs of the jobs that were executed on the cluster before the reboot. The command mapred job -list only shows details of currently running or waiting jobs.
You can see a list of all jobs on the Yarn Resource Manager Web UI.
In your browser go to http://ResourceManagerIPAdress:8088/
The history is still listed on the YARN cluster I am currently testing on, even though I have restarted the services several times.

Do we need to put namenode in safe mode before restarting the job tracker?

I have a Hadoop cluster running Cloudera's CDH3, the equivalent of Apache Hadoop 0.20.2. I want to restart the JobTracker because some jobs are not getting killed. I tried killing them from the command line; the command executes successfully, but the jobs remain in Job Cleanup: Pending status. So I want to restart the JobTracker and see whether that cleans up the jobs. I know the command to restart the JobTracker, but I am not sure whether I need to put the NameNode in safe mode before I restart it.
You can try to kill the unwanted jobs using hadoop job -kill <Job-ID> and check the command's exit status with echo "$?". If that doesn't work, a restart is the only option.
The Hadoop JobTracker and NameNode are independent components, so there is no need to put the NameNode in safe mode before restarting the JobTracker. You can restart the JobTracker process alone (and the TaskTrackers if required).
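The exit-status check can be wrapped in a small function so it is not forgotten. This is a minimal sketch: run_and_report is a hypothetical helper that runs whatever command it is given and reports the status. A zero status only means the kill command itself succeeded; as the question notes, the job can still sit in cleanup afterwards.

```shell
#!/bin/sh
# Hypothetical helper: run the given command, report its exit status,
# and return that same status to the caller.
run_and_report() {
    "$@"
    _status=$?
    if [ "$_status" -eq 0 ]; then
        echo "ok: $*"
    else
        echo "failed ($_status): $*" >&2
    fi
    return "$_status"
}

# Usage (not executed here; substitute a real job ID):
#   run_and_report hadoop job -kill <Job-ID>
```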

hadoop - How to kill a TEZ job started by hive?

Below is what I could find. The problem is that if we reuse a JDBC Hive session, all the Hive queries run under the same Application-Id. Is there a way to kill a single DAG?
Tez jobs can be listed using: yarn application -list
Tez jobs can be killed using: yarn application -kill Application-Id

How to kill a mapred job started by hive?

I'm working with CDH 5.1. It runs normal Hadoop jobs on YARN, but Hive still works with mapred. Sometimes a big query hangs for a long time and I want to kill it.
I can find this big job in the JobTracker web console, but it doesn't provide a button to kill it.
Another way is to kill it from the command line. However, I couldn't find any running job from the command line.
I have tried 2 commands:
yarn application -list
mapred job -list
How can I kill a big query like this?
You can get the Job ID from Hive CLI when you run a job or from the Web UI. You can also list the job IDs using the application ID from resource manager. Ideally, you should get everything from
mapred job -list
or
hadoop job -list
Using the Job ID you can kill it by using the below command.
hadoop job -kill <job_id>
Another alternative would be to kill the application using
yarn application -kill <application_id>
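For MapReduce jobs running on YARN, the job ID and the application ID share the same numeric suffix, so one can be derived from the other. The sketch below illustrates that mapping; job_to_app_id and app_to_job_id are hypothetical helper names, and the mapping does not apply to jobs run by an MRv1 JobTracker, which have no YARN application.

```shell
#!/bin/sh
# Hypothetical helpers: convert between a MapReduce job ID and the
# YARN application ID that shares its timestamp and sequence number.
# Assumption: the job runs on YARN (MRv2), not on an MRv1 JobTracker.
job_to_app_id() {
    printf '%s\n' "$1" | sed 's/^job_/application_/'
}

app_to_job_id() {
    printf '%s\n' "$1" | sed 's/^application_/job_/'
}

# Usage (not executed here; substitute a real job ID):
#   yarn application -kill "$(job_to_app_id <job_id>)"
```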
