Unable to schedule falcon process - Could not perform authorization operation, java.io.IOException: Couldn't set up IO streams - hortonworks-data-platform

Hi,
I am trying to schedule a Falcon process using the Falcon CLI and the falcon service user on a Kerberized cluster. I am getting the following error message:
ERROR: Bad Request;default/org.apache.falcon.FalconWebException::org.apache.falcon.FalconException: Entity schedule failed for process: testHiveProc
The Falcon application log shows the following:
Caused by: org.apache.falcon.FalconException: E0501 : E0501: Could not perform authorization operation, Failed on local exception: java.io.IOException: Couldn't set up IO streams; Host Details :
Any suggestions?
Thanks.

Root cause:
Oozie was running out of processes because of the large number of scheduled jobs.
Short term solution:
Restart the Oozie server.
Long term solution:
- Increase the ulimit (process limit) for the Oozie service user (see the sketch below)
- Limit the number of scheduled jobs in Oozie
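A minimal sketch of the long-term fix, assuming the Oozie server runs as a local oozie user; the user name and the limit values are illustrative rather than values from this thread:

# Check the process limit (nproc) seen by the user that runs the Oozie server:
su - oozie -c 'ulimit -u'

# Raise it, e.g. with a drop-in file under /etc/security/limits.d/
# (the numbers are placeholders; size them for your cluster):
cat <<'EOF' | sudo tee /etc/security/limits.d/oozie.conf
oozie  soft  nproc  16384
oozie  hard  nproc  32768
EOF

# Restart the Oozie server (for example via Ambari on HDP) so the new limit
# takes effect.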

Related

Hue says "Resource Manager not available" but the Resource Manager is running fine

When I run the quick start, I get the following error message:
Potential misconfiguration detected. Fix and restart Hue.
Resource Manager : Failed to contact an active Resource Manager: YARN RM returned a failed response: HTTPConnectionPool(host='localhost', port=8088): Max retries exceeded with url: /ws/v1/cluster/apps?user=hue (Caused by NewConnectionError(': Failed to establish a new connection: [Errno 111] Connection refused',))
Hive : Failed to access Hive warehouse: /user/hive/warehouse
HBase Browser : The application won't work without a running HBase Thrift Server v1.
Impala : No available Impalad to send queries to.
Oozie Editor/Dashboard : The app won't work without a running Oozie server
Pig Editor : The app won't work without a running Oozie server
Spark : The app won't work without a running Livy Spark Server
I don't know why Hue reports an error for the Resource Manager.
I haven't installed the other components yet.
My Resource Manager is running, and its API responds fine at http://RMHOST:8088/ws/v1/cluster/apps?user=hue
The response is:
{
"apps": null
}
Is there anything I missed?
I changed localhost to my IP address (something like 192.168.x.x) in resourcemanager_host, resourcemanager_api_url, and proxy_api_url, and that fixed the problem.
I don't know why it works.
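For reference, a minimal sketch of the kind of change described above, assuming a default hue.ini layout; the file path and the 192.168.x.x host are placeholders, and the three property names are the ones mentioned in the question:

# Edit hue.ini (commonly /etc/hue/conf/hue.ini) so the YARN cluster entries
# point at the host that actually runs the ResourceManager instead of localhost:
#
#   [hadoop]
#     [[yarn_clusters]]
#       [[[default]]]
#         resourcemanager_host=192.168.x.x
#         resourcemanager_api_url=http://192.168.x.x:8088
#         proxy_api_url=http://192.168.x.x:8088
#
# Then restart Hue and re-check the API it polls:
curl 'http://192.168.x.x:8088/ws/v1/cluster/apps?user=hue'

The likely reason the change works is that with localhost Hue was looking for a ResourceManager on its own host, not on the machine where the ResourceManager actually runs.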

Operation not permitted while launching an MR job

I have changed my Kerberized cluster to unkerberized.
After that, the exception below occurred while launching MR jobs:
Application application_1458558692044_0001 failed 1 times due to AM Container for appattempt_1458558692044_0001_000001 exited with exitCode: -1000
For more detailed output, check application tracking page: http://hdp1.impetus.co.in:8088/proxy/application_1458558692044_0001/ Then, click on links to logs of each attempt.
Diagnostics: Operation not permitted
Failing this attempt. Failing the application.
I was able to continue my work as follows: delete the YARN directories defined by the yarn.nodemanager.local-dirs property from all the nodes, then restart the YARN processes (see the sketch below).
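A hedged sketch of that cleanup; the config path and the example directory are assumptions, so use whatever yarn.nodemanager.local-dirs actually points to on your nodes:

# Find the NodeManager local directories configured on this node:
grep -A1 'yarn.nodemanager.local-dirs' /etc/hadoop/conf/yarn-site.xml

# On each node, stop the NodeManager, clear those directories, then start the
# NodeManager again (via Ambari or yarn-daemon.sh).
# /hadoop/yarn/local is only an example value; use what the grep shows.
sudo rm -rf /hadoop/yarn/local/*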

Task error because symlink creation failed

My Hadoop cluster is running on 8 CentOS 6.3 machines and the Hadoop version is CDH 4.3 (installed from Cloudera Manager 4.6).
Recently I found that some of my jobs had failed tasks. The failed tasks succeed on the next attempt. However, there are many failures (about 1,000 out of 50,000 tasks), and I'm afraid this will cause performance or other potential issues.
All of the failed tasks have the same call stack:
java.lang.Throwable: Child Error
at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:250)
Caused by: java.io.IOException: Creation of symlink from /var/log/hadoop-0.20-mapreduce/userlogs/job_201311140947_0002/attempt_201311140947_0002_m_051950_0 to /hdfs7/mapred/local/userlogs/job_201311140947_0002/attempt_201311140947_0002_m_051950_0 failed.
at org.apache.hadoop.mapred.TaskLog.createTaskAttemptLogDir(TaskLog.java:126)
at org.apache.hadoop.mapred.DefaultTaskController.createLogDir(DefaultTaskController.java:72)
at org.apache.hadoop.mapred.TaskRunner.prepareLogFiles(TaskRunner.java:295)
at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:215)
I tried to create the symlink manually on the same path and encountered no problems. I'm wondering what causes this issue.

Making a Cassandra Connection inside Hadoop MapReduce Task

I am successfully using the DataStax Java Driver to access Cassandra inside my Java code just before I start a MapReduce Job.
cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
However, I need to check additional information to decide on a per-record basis how to reduce each record. If I attempt to use the same code inside a Hadoop Reducer class, it fails to connect with the error:
INFO mapred.JobClient: Task Id :
attempt_201310280851_0012_r_000000_1, Status : FAILED
com.datastax.driver.core.exceptions.NoHostAvailableException:
All host(s) tried for query failed (tried: /127.0.0.1 ([/127.0.0.1]
Unexpected error during transport initialization
(com.datastax.driver.core.TransportException: [/127.0.0.1] Error writing)))
at com.datastax.driver.core.ControlConnection.reconnectInternal(ControlConnection.java:186)
at com.datastax.driver.core.ControlConnection.connect(ControlConnection.java:81)
at com.datastax.driver.core.Cluster$Manager.init(Cluster.java:662)
at com.datastax.driver.core.Cluster$Manager.access$100(Cluster.java:604)
at com.datastax.driver.core.Cluster.<init>(Cluster.java:69)
at com.datastax.driver.core.Cluster.buildFrom(Cluster.java:96)
at com.datastax.driver.core.Cluster$Builder.build(Cluster.java:585)
The MapReduce input and output successfully read from and write to Cassandra. As I mentioned, I can connect before I run the job, so I do not think it is an issue with the Cassandra server.
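No answer is recorded here, but it is worth remembering that the reducer runs on a task-tracker node, so 127.0.0.1 there refers to that node rather than to the machine the job was launched from. A quick connectivity check from one of the task nodes, with a hypothetical host name and the driver's default native-transport port:

# Run on a task-tracker node to confirm the Cassandra contact point is
# reachable on the port the DataStax Java driver uses (9042 by default):
nc -vz cassandra-host.example.com 9042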

"Child Error" in Executing stream Job on multi node Hadoop cluster (cloudera distribution CDH3u0 Hadoop 0.20.2)

I am working on an 8-node Hadoop cluster and trying to execute a simple streaming job with the configuration below:
hadoop jar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2-cdh3u0.jar \
  -D mapred.map.max.tacker.failures=10 \
  -D mared.map.max.attempts=8 \
  -D mapred.skip.attempts.to.start.skipping=8 \
  -D mapred.skip.map.max.skip.records=8 \
  -D mapred.skip.mode.enabled=true \
  -D mapred.max.map.failures.percent=5 \
  -input /user/hdfs/ABC/ \
  -output "/user/hdfs/output1/" \
  -mapper "perl -e 'while (<>) { chomp; print; }; exit;" \
  -reducer "perl -e 'while (<>) { ~s/LR\>/LR\>\n/g; print ; }; exit;"
I am using Cloudera's distribution for Hadoop, CDH3u0, with Hadoop 0.20.2. The problem is that the job fails every time with the following error:
java.lang.Throwable: Child Error
at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:242)
Caused by: java.io.IOException: Task process exit with nonzero status of 1.
at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:229)
STDERR on the datanodes:
Exception in thread "main" java.io.IOException: Exception reading file:/mnt/hdfs/06/local/taskTracker/hdfs/jobcache/job_201107141446_0001/jobToken
at org.apache.hadoop.security.Credentials.readTokenStorageFile(Credentials.java:146)
at org.apache.hadoop.mapreduce.security.TokenCache.loadTokens(TokenCache.java:159)
at org.apache.hadoop.mapred.Child.main(Child.java:107)
Caused by: java.io.FileNotFoundException: File file:/mnt/hdfs/06/local/taskTracker/hdfs/jobcache/job_201107141446_0001/jobToken does not exist.
To find the cause of the error I have checked the following things, and it still crashes for reasons I cannot understand:
1. All the temp directories are in place.
2. Memory is far more than the job requires (it is a small job).
3. Permissions have been verified.
4. Nothing fancy was done in the configuration, just the usual settings.
The weirdest thing is that the job sometimes runs successfully but fails most of the time. Any guidance or help regarding this issue would be really appreciated. I have been working on this error for the last 4 days and have not been able to figure anything out. Please help!
Thanks & Regards,
Atul
I have faced the same problem; it happens when the TaskTracker is not able to allocate the specified memory to the child JVM for the task.
Try executing the same job again when the cluster is not busy running many other jobs alongside this one; it should go through. Alternatively, set speculative execution to true, in which case Hadoop will execute the same task on another TaskTracker (see the sketch below).
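A hedged sketch of that retry using the classic MRv1 property names; the heap size, the output path, and the trivial cat mapper/reducer are placeholders rather than values from this thread:

# Retry with speculative execution enabled and a smaller child JVM heap:
hadoop jar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2-cdh3u0.jar \
  -D mapred.map.tasks.speculative.execution=true \
  -D mapred.reduce.tasks.speculative.execution=true \
  -D mapred.child.java.opts=-Xmx256m \
  -input /user/hdfs/ABC/ \
  -output /user/hdfs/output_retry/ \
  -mapper cat \
  -reducer cat

Speculative execution launches duplicate attempts of the same task on other TaskTrackers, which is the behaviour the suggestion above relies on.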
