I have an MQ cluster set up with a few queue managers; some are full repositories and some are partial repositories.
A full repository is supposed to hold information (metadata?) about the entire cluster.
A partial repository holds only some information about the cluster.
How do I gather information about the entire cluster using Programmable Command Format (PCF): information about hosts, queue managers, full and partial repositories, cluster queues, etc.?
Update 1
I have tried the following code, but it does not return the cluster information.
PCFMessageAgent agent = new PCFMessageAgent(queueManager);
agent.setCheckResponses(false);
PCFMessage[] responses;
PCFMessage request = new PCFMessage(MQConstants.MQCMD_INQUIRE_CLUSTER_Q_MGR);
request.addParameter(MQConstants.MQCA_CLUSTER_Q_MGR_NAME, queueManager);
responses = agent.send(request);
String clusterName = (String)responses[0].getParameterValue(MQConstants.MQCA_CLUSTER_NAME);
String clusterInfo = (String)responses[0].getParameterValue(MQConstants.MQIACF_CLUSTER_INFO);
logger.info("Cluster Name [" + clusterName + "]");
logger.info("Cluster Information [" + clusterInfo + "]");
The last line prints out a null.
Update 2
The answer below suggests that MQCMD_INQUIRE_CLUSTER_Q_MGR is equivalent to the runmqsc DISPLAY CLUSQMGR(*) command. The following is the output from that command:
display clusqmgr(*)
4 : display clusqmgr(*)
AMQ8441: Display Cluster Queue Manager details.
CLUSQMGR(QM_FR1) CHANNEL(TO.QM_FR1)
CLUSTER(CLUSTER1)
AMQ8441: Display Cluster Queue Manager details.
CLUSQMGR(QM_FR2) CHANNEL(TO.QM_FR2)
CLUSTER(CLUSTER1)
AMQ8441: Display Cluster Queue Manager details.
CLUSQMGR(QM_PR1) CHANNEL(TO.QM_PR1)
CLUSTER(CLUSTER1)
AMQ8441: Display Cluster Queue Manager details.
CLUSQMGR(QM_PR2) CHANNEL(TO.QM_PR2)
CLUSTER(CLUSTER1)
AMQ8441: Display Cluster Queue Manager details.
CLUSQMGR(QM_PR3) CHANNEL(TO.QM_PR3)
CLUSTER(CLUSTER1)
I was expecting a similar response from the PCF code I have supplied, but I don't get this information. So the question is:
How do I get this information using PCF? The above output is from a full repository queue manager.
Use the following PCF commands:
Inquire Cluster Queue Manager (MQCMD_INQUIRE_CLUSTER_Q_MGR), which is the equivalent of the MQSC command DISPLAY CLUSQMGR. On the linked page you can see all the possible output parameters listed in the section headed ClusterQMgrAttrs. Remove the line in your code that tries to retrieve the value of MQIACF_CLUSTER_INFO, which is an input-only parameter, and replace it with any of the parameters listed in that section to retrieve whatever information you want about the cluster queue manager.
Inquire Queue (MQCMD_INQUIRE_Q) with the MQIACF_CLUSTER_INFO parameter, which is the equivalent of the MQSC command DISPLAY QUEUE(*) CLUSINFO. Note that MQIACF_CLUSTER_INFO is an input qualifier to this command that causes cluster queues, as well as local queues, to be returned as answers.
As you correctly note, only the full repository queue manager knows everything about a cluster, so you need to make your inquiries against that queue manager in order to get the full picture.
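For example, here is a minimal sketch (using the same PCFMessageAgent setup as in the question) that inquires every cluster queue manager, roughly mirroring DISPLAY CLUSQMGR(*). The use of MQIACF_Q_MGR_TYPE to distinguish full from partial repositories is an assumption based on the documented output parameters, so verify it against your version's documentation.
PCFMessageAgent agent = new PCFMessageAgent(queueManager);
// Inquire Cluster Queue Manager for every queue manager in the cluster ("*")
PCFMessage request = new PCFMessage(MQConstants.MQCMD_INQUIRE_CLUSTER_Q_MGR);
request.addParameter(MQConstants.MQCA_CLUSTER_Q_MGR_NAME, "*");
PCFMessage[] responses = agent.send(request);
for (PCFMessage response : responses) {
    String clusQmgr = ((String) response.getParameterValue(MQConstants.MQCA_CLUSTER_Q_MGR_NAME)).trim();
    String cluster  = ((String) response.getParameterValue(MQConstants.MQCA_CLUSTER_NAME)).trim();
    String channel  = ((String) response.getParameterValue(MQConstants.MQCACH_CHANNEL_NAME)).trim();
    // MQIACF_Q_MGR_TYPE should distinguish full repositories (MQQMT_REPOSITORY) from partial ones (MQQMT_NORMAL)
    int qmgrType = (Integer) response.getParameterValue(MQConstants.MQIACF_Q_MGR_TYPE);
    logger.info("CLUSQMGR[" + clusQmgr + "] CLUSTER[" + cluster + "] CHANNEL[" + channel
            + "] FULL_REPOS[" + (qmgrType == MQConstants.MQQMT_REPOSITORY) + "]");
}
agent.disconnect();
Run against one of your full repositories (QM_FR1 or QM_FR2), this should list every queue manager in CLUSTER1, much like the runmqsc output above.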
Related
When a job is submitted, when do YARN and the NameNode interact? Who does the job get sent to? Could someone explain the end-to-end flow of how the Hadoop ecosystem works?
Thanks!
NameNode: stores the metadata for all the data stored on the DataNodes and monitors the health of the DataNodes. HDFS is basically a master-slave architecture.
YARN: stands for Yet Another Resource Negotiator. YARN has two main components:
1. Scheduler
2. Applications Manager
YARN also has a master, the ResourceManager, and slaves, the NodeManagers.
For scheduling purposes, there are three schedulers:
1. FIFO 2. Capacity 3. Fair-share
There is also a component called the ApplicationMaster, which the ResourceManager launches under a NodeManager; one ApplicationMaster is assigned to each application.
The job is submitted directly by the client, the ResourceManager hands the job to the ApplicationMaster, and the NodeManager monitors the liveness of the ApplicationMaster.
Now, whenever a job comes in, the ResourceManager creates a job ID and assigns an ApplicationMaster for that job. The ResourceManager contacts the NameNode to retrieve the metadata about the data on which the task has to be performed, and the information it receives is then passed to the ApplicationMaster.
This is a basic overview of how YARN works with the NameNode. You can read about it in more detail in the YARN documentation.
Also, the NameNode interaction only happens because the Hadoop applications running within YARN talk to the NameNode; not all YARN applications need to communicate with HDFS.
Basically, there is no direct interaction between YARN and HDFS; see https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html
However, YARN jobs require some files (libraries, configuration, etc.) which usually reside on HDFS.
How can I access YARN metrics such as the status of the ResourceManager and NodeManagers?
The same question applies to running YARN containers. I would like to do it via a web interface.
You can use the YARN ResourceManager UI, which is usually accessible on port 8088 of your ResourceManager host (although the port can be configured). There you get an overview of your cluster.
Details about the nodes of the cluster can be found in this UI under the Cluster menu, Nodes submenu. There you find health information, some hardware details, the currently running jobs, and the NodeManager software version of each node. For more detail you would have to log into each node and check the NodeManager logs (for the node where the ResourceManager is running, this log is also available via the Tools menu -> Local logs, but that is not sufficient if you have more than one node in your cluster).
More details about the ResourceManager (including runtime statistics) are available under the Tools menu -> Server metrics.
If you want to access this information programmatically, you can use the ResourceManager's REST API.
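As a rough sketch, the standard ResourceManager REST endpoints can be called from any HTTP client; the host and port (rm-host:8088) below are placeholders for your own setup.
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class RmMetrics {
    public static void main(String[] args) throws Exception {
        // Cluster-wide metrics: active/unhealthy nodes, running apps, memory and vcore usage, ...
        URL url = new URL("http://rm-host:8088/ws/v1/cluster/metrics");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestProperty("Accept", "application/json");
        try (BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()))) {
            StringBuilder body = new StringBuilder();
            String line;
            while ((line = in.readLine()) != null) {
                body.append(line);
            }
            System.out.println(body);  // JSON response from the ResourceManager
        }
    }
}
Similarly, /ws/v1/cluster/nodes returns per-NodeManager status and /ws/v1/cluster/apps lists applications and their allocated containers.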
Another option might be to use Ambari. This is a Hadoop management tool that can monitor the different services within a Hadoop cluster and trigger alerts in case of unusual or unexpected events. However, it requires some installation and configuration effort.
I need to check whether an MQ queue already exists in a cluster. Is it the dspmq command or dis q(TEST.QUEUE) CLUSTER? Which command is used to check whether an IBM MQ queue already exists in a cluster?
dspmq is used to display the status of queue managers.
If you want to find out whether a cluster already has a queue in it, execute the following MQSC command: DISPLAY QCLUSTER(<queue name>) WHERE(CLUSTER EQ <cluster name>)
However, the response will only be valid if the queue manager knows about the queue:
If you execute the command on a full repository, you can trust the response, as full repositories always know everything about the cluster.
If you execute the command on a partial repository, the queue manager will only be able to tell you about the queue if an application has already attempted to use it; otherwise it won't know whether the queue exists or not.
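If you want to do the same check programmatically, here is a hedged PCF sketch along the lines of the earlier PCF answer; the queue name TEST.QUEUE and cluster name CLUSTER1 are placeholders, and the exact reason code returned for an unknown queue may differ on your version.
PCFMessageAgent agent = new PCFMessageAgent(queueManager);
PCFMessage request = new PCFMessage(MQConstants.MQCMD_INQUIRE_Q);
request.addParameter(MQConstants.MQCA_Q_NAME, "TEST.QUEUE");
// Restrict the inquiry to cluster queues, similar to DISPLAY QCLUSTER(...)
request.addParameter(MQConstants.MQIA_Q_TYPE, MQConstants.MQQT_CLUSTER);
boolean foundInCluster = false;
try {
    for (PCFMessage response : agent.send(request)) {
        String cluster = ((String) response.getParameterValue(MQConstants.MQCA_CLUSTER_NAME)).trim();
        if ("CLUSTER1".equals(cluster)) {
            foundInCluster = true;
        }
    }
} catch (PCFException e) {
    // Typically MQRC_UNKNOWN_OBJECT_NAME (2085): this queue manager does not know the queue.
    // On a partial repository that is only conclusive if the queue has already been used there.
}
logger.info("TEST.QUEUE known in CLUSTER1: " + foundInCluster);
agent.disconnect();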
I'm using the MapR distribution. When I try to run a Hive query, it shows the error: java.io.IOException: failed to run job : application rejected by queue placement policy
I have set the queue with the below command:
set mapred.job.queue.name=<>;
But it still doesn't work. Could someone help me understand?
Thanks in advance.
I had this same error.
I used hadoop queue -showacls | grep SUBMIT to find out which queues I had access to, and then used the command "set mapreduce.job.queuename".
Also check your YARN resource pool configuration to make sure the queue has adequate resources.
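In Hive you would set this with set mapreduce.job.queuename=<your queue>; in the session. For completeness, here is a hedged Java sketch of the same property when submitting a plain MapReduce job; my.queue is a placeholder and must be a queue your user is allowed to SUBMIT to.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

Configuration conf = new Configuration();
// mapreduce.job.queuename is the YARN-era name; mapred.job.queue.name is the old MRv1 property
conf.set("mapreduce.job.queuename", "my.queue");
Job job = Job.getInstance(conf, "example-job");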
Is there an example of a post-process step for EMR (Elastic MapReduce)? What I am trying to achieve is to send an email to a group of people right after Amazon's Hadoop finishes the job.
You'll want to configure the job end notification URL.
job.end.notification.url (mapreduce.job.end-notification.url in newer Hadoop versions)
AWS will hit this URL, presumably with query variables that indicate which job has completed (the job ID).
You could then have this URL on your server process your email notifications, assuming you have already stored a mapping between email addresses and job IDs.
https://issues.apache.org/jira/browse/HADOOP-1111
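As a hedged sketch, the property can be set on the job configuration before submission; Hadoop substitutes the $jobId and $jobStatus placeholders into the URL when it calls back, and the notify.example.com endpoint is of course a placeholder.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

Configuration conf = new Configuration();
// Hadoop calls this URL when the job finishes; $jobId and $jobStatus are expanded by the framework
conf.set("job.end.notification.url",
        "http://notify.example.com/jobdone?id=$jobId&status=$jobStatus");
Job job = Job.getInstance(conf, "emr-job-with-notification");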
An easier way is to use Amazon CloudWatch (monitoring system) and Amazon Simple Notification Service (SNS) to monitor your EMR jobs and notify you and others of their status.
For example, you can set an alarm on the cluster's IsIdle metric. It is set to 1 once the job is done (or has failed), and you can then get an SNS notification as an email (or even an SMS). You can set similar alarms on the JobsFailed count and other metrics.
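A rough sketch with the AWS SDK for Java (v1) of creating such an alarm; the cluster ID, SNS topic ARN, and alarm name are placeholders, and you should double-check the metric dimensions against the EMR documentation linked below.
import com.amazonaws.services.cloudwatch.AmazonCloudWatch;
import com.amazonaws.services.cloudwatch.AmazonCloudWatchClientBuilder;
import com.amazonaws.services.cloudwatch.model.ComparisonOperator;
import com.amazonaws.services.cloudwatch.model.Dimension;
import com.amazonaws.services.cloudwatch.model.PutMetricAlarmRequest;
import com.amazonaws.services.cloudwatch.model.Statistic;

AmazonCloudWatch cw = AmazonCloudWatchClientBuilder.defaultClient();
cw.putMetricAlarm(new PutMetricAlarmRequest()
        .withAlarmName("emr-cluster-idle")
        .withNamespace("AWS/ElasticMapReduce")
        .withMetricName("IsIdle")
        .withDimensions(new Dimension().withName("JobFlowId").withValue("j-XXXXXXXXXXXXX"))
        .withStatistic(Statistic.Average)
        .withPeriod(300)                      // EMR metrics are emitted at 5-minute granularity
        .withEvaluationPeriods(1)
        .withThreshold(1.0)
        .withComparisonOperator(ComparisonOperator.GreaterThanOrEqualToThreshold)
        .withAlarmActions("arn:aws:sns:us-east-1:123456789012:emr-notifications"));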
For the complete list of EMR-related metrics, see the EMR documentation.
More information is available here: http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/UsingEMR_ViewingMetrics.html