'yarn application -list' doesn't show any results - hadoop

I have run some Spark applications on a YARN cluster. The applications show up on the "All applications" page in the YARN UI (http://host:8088/cluster), but the yarn application -list command doesn't return any results. What could be the cause of this?

When you use the "-list" option without the "-appTypes" or "-appStates" options, default filtering is applied for "application-types" and "states" (shown in parentheses in the output line below). If none of your applications match the default filter, you will not get any results.
Total number of applications (application-types: [] and states: [SUBMITTED, ACCEPTED, RUNNING]):0
If you see the help for "-list", it states the following:
"List applications. Supports optional use of -appTypes to filter applications based on application type, and -appStates to filter applications based on application state".
This seems to be a bit misleading.
If you don't specify "-appStates", by default it takes states as "SUBMITTED", "ACCEPTED" and "RUNNING", for filtering. Please check the code below from "listApplications()" method of "org.apache.hadoop.yarn.client.cli.ApplicationCLI.java".
private void listApplications()
{
    ............
    if (allAppStates) {
        for (YarnApplicationState appState : YarnApplicationState.values()) {
            appStates.add(appState);
        }
    } else {
        if (appStates.isEmpty()) {
            appStates.add(YarnApplicationState.RUNNING);
            appStates.add(YarnApplicationState.ACCEPTED);
            appStates.add(YarnApplicationState.SUBMITTED);
        }
    }
    ............
}
As per the code above, the following logic is applied:
Call "yarn application -list": Shows all applications in states "SUBMITTED", "ACCEPTED" and "RUNNING"
For example, my output is below (there are zero applications in the default states):
CMD> yarn application -list
Total number of applications (application-types: [] and states: [SUBMITTED, ACCEPTED, RUNNING]):0
Call "yarn application -list -appStates ALL": Shows all the applications (in any state)
For example, my output is below (there are 268 applications in total; also note the filtering criteria applied to "states"):
CMD> yarn application -list -appStates ALL
Total number of applications (application-types: [] and states: [NEW, NEW_SAVING, SUBMITTED, ACCEPTED, RUNNING, FINISHED, FAILED, KILLED]):268
Call "yarn application -list -appStates FINISHED": Shows all the applications (in FINISHED state)
For example, my output is below (there are 136 applications in the FINISHED state):
CMD> yarn application -list -appStates FINISHED
Total number of applications (application-types: [] and states: [FINISHED]):136
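The defaulting described above can be sketched in a few lines of self-contained Java. This is only an illustration, not Hadoop's code: the class AppStateFilterSketch, its State enum and the sample applications are made up for the example, and the enum simply repeats the eight states that the "-appStates ALL" output lists.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the CLI's state defaulting; not a Hadoop class.
final class AppStateFilterSketch {

    // The eight states shown in the "-appStates ALL" output.
    enum State { NEW, NEW_SAVING, SUBMITTED, ACCEPTED, RUNNING, FINISHED, FAILED, KILLED }

    // Returns the states that end up being used for filtering.
    static Set<State> effectiveStates(Set<State> requested, boolean all) {
        if (all) {
            return EnumSet.allOf(State.class);
        }
        if (requested.isEmpty()) {
            // No -appStates given: fall back to the three "active" states.
            return EnumSet.of(State.SUBMITTED, State.ACCEPTED, State.RUNNING);
        }
        return EnumSet.copyOf(requested);
    }

    // Counts the applications whose state passes the effective filter.
    static long countMatching(List<State> appStates, Set<State> requested, boolean all) {
        Set<State> filter = effectiveStates(requested, all);
        return appStates.stream().filter(filter::contains).count();
    }

    public static void main(String[] args) {
        // Made-up sample: three completed applications, none still active.
        List<State> apps = Arrays.asList(State.FINISHED, State.FINISHED, State.KILLED);
        System.out.println(countMatching(apps, EnumSet.noneOf(State.class), false)); // 0
        System.out.println(countMatching(apps, EnumSet.noneOf(State.class), true));  // 3
        System.out.println(countMatching(apps, EnumSet.of(State.FINISHED), false));  // 2
    }
}
```

With only completed applications in the sample, the default filter matches nothing, which is the same situation as a cluster whose Spark jobs have all finished.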

It turns out that I had enabled log aggregation in YARN but had set yarn.nodemanager.remote-app-log-dir to a custom HDFS directory (/tmp/yarnlogs). Logs were therefore being aggregated at /tmp/yarnlogs in HDFS, but the yarn command was still searching for logs at the default HDFS location (/tmp/logs). Changing the property back to its default value fixed it for me.
NOTE:
If the log aggregation directory is misconfigured, it also causes an error when trying to access job history from the web UI, which looks like:
Log aggregation has not completed or is not enabled

Related

HashiCorp Consul is not publishing all the metrics

Consul isn't publishing all the metrics defined in its documentation at https://www.consul.io/docs/agent/telemetry.html#transaction-timing. It shows only raft metrics, but not txn or kvs metrics. Has anyone observed this problem?
Command to enable Prometheus-style metrics:
consul agent -dev -hcl 'telemetry{prometheus_retention_time="24h" disable_hostname=true}'
Watch metrics:
watch -n 1 -d "curl -s localhost:8500/v1/agent/metrics?format=prometheus|grep -v ^# | grep -E 'kvs|txn|raft'"
Metrics are exported only once they are available, i.e. if there have been no transactions or KV store operations, you will not see these metrics in the output.
I managed to see kvs metrics with the example you provided. While running the Consul agent with the command from the question, open http://127.0.0.1:8500/ in a browser and click the Key/Value option in the top menu (you should end up at http://127.0.0.1:8500/ui/dc1/kv). Click Create to add a new key/value pair. After clicking Save, you should see something like this in the terminal running the watch command:
consul_fsm_kvs{op="set",quantile="0.5"} 0.3572689890861511
consul_fsm_kvs{op="set",quantile="0.9"} 0.3572689890861511
consul_fsm_kvs{op="set",quantile="0.99"} 0.3572689890861511
consul_fsm_kvs_sum{op="set"} 0.3572689890861511
consul_fsm_kvs_count{op="set"} 1
consul_kvs_apply{quantile="0.5"} 2.6777150630950928
consul_kvs_apply{quantile="0.9"} 2.6777150630950928
consul_kvs_apply{quantile="0.99"} 2.6777150630950928
consul_kvs_apply_sum 2.6777150630950928
consul_kvs_apply_count 1
If there are no further transactions, some of these values will be set to NaN, depending on the Prometheus metric type.
Similarly, to see txn metrics, you need to create a Consul transaction.
Hope that helps you set up monitoring.
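If you would rather do the filtering in code than with grep, the two grep stages of the watch command can be sketched roughly as below. The class MetricFilterSketch and the "example_other_metric" line are made up for illustration; the two consul_ lines are copied from the output above. Fetching the text from /v1/agent/metrics?format=prometheus is left out here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper mirroring: grep -v ^# | grep -E 'kvs|txn|raft'
final class MetricFilterSketch {

    private static final Pattern WANTED = Pattern.compile("kvs|txn|raft");

    // Keeps non-comment lines that mention kvs, txn or raft anywhere in the line.
    static List<String> filter(String prometheusText) {
        return Arrays.stream(prometheusText.split("\n"))
                .filter(line -> !line.startsWith("#"))
                .filter(line -> WANTED.matcher(line).find())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Sample input: one comment line, two lines from the output above,
        // and one made-up metric that should be dropped.
        String sample = String.join("\n",
                "# TYPE consul_kvs_apply summary",
                "consul_kvs_apply_count 1",
                "example_other_metric 5",
                "consul_fsm_kvs_count{op=\"set\"} 1");
        filter(sample).forEach(System.out::println);
    }
}
```

An empty result from this filter means the same thing as empty grep output: no KV store or transaction operation has been recorded yet, so there is nothing to export.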

Hadoop example application hangs on Windows single node

I have installed single-node Hadoop on Windows and it is apparently working.
Unfortunately, I can't run a test application on it.
When I do
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.0.jar grep input output 'dfs[a-z.]+'
as described on its page, it never returns to the command prompt. On the referenced job page I see
YarnApplicationState: ACCEPTED: waiting for AM container to be
allocated, launched and register with RM.
Diagnostics: [Mon Apr 23 00:46:44 +0300 2018] Application is added to
the scheduler and is not yet activated. Skipping AM assignment as
cluster resource is empty. Details : AM Partition =
<DEFAULT_PARTITION>; AM Resource Request = <memory:2048, vCores:1>;
Queue Resource Limit for AM = <memory:0, vCores:0>; User AM Resource
Limit of the queue = <memory:0, vCores:0>; Queue AM Resource Usage =
<memory:0, vCores:0>;
What does this mean, and how can I get the job moving?

Hive Browser Throwing Error

I am trying to run a basic query in the Hive editor in the Hue browser, but it returns the following error, whereas my Hive CLI works fine and is able to execute queries. Could someone help me?
Fetching results ran into the following error(s):
Bad status for request TFetchResultsReq(fetchType=1,
operationHandle=TOperationHandle(hasResultSet=True,
modifiedRowCount=None, operationType=0,
operationId=THandleIdentifier(secret='r\t\x80\xac\x1a\xa0K\xf8\xa4\xa0\x85?\x03!\x88\xa9',
guid='\x852\x0c\x87b\x7fJ\xe2\x9f\xee\x00\xc9\xeeo\x06\xbc')),
orientation=4, maxRows=-1):
TFetchResultsResp(status=TStatus(errorCode=0, errorMessage="Couldn't
find log associated with operation handle: OperationHandle
[opType=EXECUTE_STATEMENT,
getHandleIdentifier()=85320c87-627f-4ae2-9fee-00c9ee6f06bc]",
sqlState=None,
infoMessages=["*org.apache.hive.service.cli.HiveSQLException:Couldn't
find log associated with operation handle: OperationHandle
[opType=EXECUTE_STATEMENT,
getHandleIdentifier()=85320c87-627f-4ae2-9fee-00c9ee6f06bc]:24:23",
'org.apache.hive.service.cli.operation.OperationManager:getOperationLogRowSet:OperationManager.java:229',
'org.apache.hive.service.cli.session.HiveSessionImpl:fetchResults:HiveSessionImpl.java:687',
'sun.reflect.GeneratedMethodAccessor14:invoke::-1',
'sun.reflect.DelegatingMethodAccessorImpl:invoke:DelegatingMethodAccessorImpl.java:43',
'java.lang.reflect.Method:invoke:Method.java:606',
'org.apache.hive.service.cli.session.HiveSessionProxy:invoke:HiveSessionProxy.java:78',
'org.apache.hive.service.cli.session.HiveSessionProxy:access$000:HiveSessionProxy.java:36',
'org.apache.hive.service.cli.session.HiveSessionProxy$1:run:HiveSessionProxy.java:63',
'java.security.AccessController:doPrivileged:AccessController.java:-2',
'javax.security.auth.Subject:doAs:Subject.java:415',
'org.apache.hadoop.security.UserGroupInformation:doAs:UserGroupInformation.java:1657',
'org.apache.hive.service.cli.session.HiveSessionProxy:invoke:HiveSessionProxy.java:59',
'com.sun.proxy.$Proxy19:fetchResults::-1',
'org.apache.hive.service.cli.CLIService:fetchResults:CLIService.java:454',
'org.apache.hive.service.cli.thrift.ThriftCLIService:FetchResults:ThriftCLIService.java:672',
'org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults:getResult:TCLIService.java:1553',
'org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults:getResult:TCLIService.java:1538',
'org.apache.thrift.ProcessFunction:process:ProcessFunction.java:39',
'org.apache.thrift.TBaseProcessor:process:TBaseProcessor.java:39',
'org.apache.hive.service.auth.TSetIpAddressProcessor:process:TSetIpAddressProcessor.java:56',
'org.apache.thrift.server.TThreadPoolServer$WorkerProcess:run:TThreadPoolServer.java:285',
'java.util.concurrent.ThreadPoolExecutor:runWorker:ThreadPoolExecutor.java:1145',
'java.util.concurrent.ThreadPoolExecutor$Worker:run:ThreadPoolExecutor.java:615',
'java.lang.Thread:run:Thread.java:745'], statusCode=3), results=None,
hasMoreRows=None)
This error could be due either to HiveServer2 not running or to Hue not having access to hive_conf_dir.
Check whether HiveServer2 has been started and is running. It uses port 10000 by default.
netstat -ntpl | grep 10000
If it is not running, start HiveServer2:
$HIVE_HOME/bin/hiveserver2
Also check the Hue configuration file hue.ini. The hive_conf_dir property must be set under the [beeswax] section. If it is not set, add this property under [beeswax]:
hive_conf_dir=$HIVE_HOME/conf
Restart supervisor after making these changes.
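If netstat is not available on the machine running Hue, a plain TCP connect test gives roughly the same information as the port check above. The sketch below is an assumption-laden illustration: the class PortCheckSketch is made up, "localhost" assumes HiveServer2 runs on the same host, and 10000 is the default port mentioned above.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: checks whether anything accepts TCP connections on a port.
final class PortCheckSketch {

    // Returns true if a connection to host:port succeeds within timeoutMs.
    static boolean isListening(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Connection refused, timeout, unknown host, etc.
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed host; 10000 is the HiveServer2 default port.
        boolean up = isListening("localhost", 10000, 1000);
        System.out.println(up
                ? "Something is listening on port 10000"
                : "Nothing is listening on port 10000");
    }
}
```

A successful connection only shows that some process is listening on the port; it does not prove that the process is HiveServer2 or that Hue can reach it from its own host.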

Informatica error 1417 :: Task not yet registered with this service process

I am getting the following error while running a workflow in Informatica.
Session task instance [worklet.session] : [TM_6775 The master DTM process was unable to connect to the master service process to update the session status with the following message: error message [ERROR: The session run for [Session task instance [worklet.session]] and [ folder id = 206, workflow id = 16042, workflow run id = 65095209, worklet run id = 65095337, task instance id = 13272 ] is not yet registered with this service process.] and error code [1417].]
This error occurs randomly for many other sessions when they are run through the workflow as a whole. However, if I "start task" on the failed task afterwards, it runs successfully.
Any help is much appreciated.
Just an idea to try if you use versioning: check that everything is checked in correctly. If the mapping, workflow or worklet is checked out, then you and Informatica will run different versions, which may cause the behaviour to differ when you start it manually.
Informatica will always use the checked-in version and you will always use the checked-out version.

How can I disable console messages when running maven commands?

I'm in the process of executing Maven commands to run tests in the console (MacOSX). Recently, development efforts have produced extraneous messages in the console (info, debug, warning, etc.). I'd like to know how to suppress messages like this:
INFO c.c.m.s.c.p.ApplicationProperties - Loading application properties from: app-config/shiro.properties
I've used this code to remove messages from the dbunit tests:
ch.qos.logback.classic.Logger logger = (ch.qos.logback.classic.Logger) LoggerFactory.getLogger("org.dbunit");
logger.setLevel(Level.ERROR);
However, I'm unsure how to disable these additional (often verbose and irritating) messages from showing up on the console so that I can see the output more easily. Additional messages appear as above and these:
DEBUG c.c.m.s.c.f.dao.AbstractDBDAO - Adding filters to the Main Search query.
WARN c.c.m.s.c.p.JNDIConfigurationProperties - Unable to find JNDI value for name: xxxxx
INFO c.c.m.a.t.d.DatabaseTestFixture - * executing sql: xxxxx
The successful answer was:
SOLUTION: The fix was to add a 'logback-test.xml' file to the root of my test folder. I used the default contents (as instructed by the documentation - thanks #Charlie). Once the file existed there, the issue was fixed.
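For readers who want a starting point, a minimal logback-test.xml might look like the sketch below. It is not the exact file from the answer: the ERROR root level is an assumption chosen to silence the INFO, DEBUG and WARN lines shown in the question, the appender name and pattern are illustrative, and the usual place for the file in a standard Maven layout would be src/test/resources so that it ends up on the test classpath.

```xml
<!-- Sketch only: root level ERROR is an assumption, not the documentation's default. -->
<configuration>
  <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
    <encoder>
      <pattern>%d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n</pattern>
    </encoder>
  </appender>

  <!-- Only ERROR and above reach the console during tests. -->
  <root level="ERROR">
    <appender-ref ref="STDOUT" />
  </root>
</configuration>
```

Raising or lowering the root level, or adding a logger element for a single package, would let you keep some of the messages while hiding the rest.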
