Getting write permission from HDFS after updating flink-1.4.0 to flink-1.4.2 - hadoop

Environment
Flink-1.4.2
Hadoop 2.6.0-cdh5.13.0 with 4 nodes in service; security is off.
Ubuntu 16.04.3 LTS
Java 8
Description
I have a Java job in flink-1.4.0 that writes to a specific path in HDFS.
After updating to flink-1.4.2, I'm getting the following error from Hadoop, complaining that the user doesn't have write permission to the given path:
WARN org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:xng (auth:SIMPLE) cause:org.apache.hadoop.security.AccessControlException: Permission denied: user=user1, access=WRITE, inode="/user":hdfs:hadoop:drwxr-xr-x
NOTE:
If I run the same job on flink-1.4.0, the error disappears, regardless of which flink version (1.4.0 or 1.4.2) the job's dependencies are built against.
Also, if I run the job's main method from my IDE and pass the same parameters, I don't get the above error.
Question
Any idea what's wrong, or how to fix it?
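One thing worth checking (a hedged sketch, not a confirmed fix): the error complains that user1 has no WRITE access to /user itself, which usually means the output path resolves directly under /user or the job now runs as a different HDFS user than before. With security off, one common remedy is to give that user a home directory it owns; the user name below is taken from the error message:
# Run as the HDFS superuser: create and hand over a home directory for the user in the error
sudo -u hdfs hdfs dfs -mkdir -p /user/user1
sudo -u hdfs hdfs dfs -chown user1:user1 /user/user1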

Related

Spark 2.0.1 not finding file passed in through archives flag

I was running a Spark job which makes use of other files that are passed in through the --archives flag of spark-submit:
spark-submit .... --archives hdfs:///user/{USER}/{some_folder}.zip .... {file_to_run}.py
Spark is currently running on YARN, and when I tried this with Spark version 1.5.1 it was fine.
However, when I ran the same commands with Spark 2.0.1, I got:
ERROR yarn.ApplicationMaster: User class threw exception: java.io.IOException: Cannot run program "/home/{USER}/{some_folder}/.....": error=2, No such file or directory
Since the resource is managed by YARN, it is hard to manually check whether the archive gets successfully decompressed and exists when the job runs.
I wonder if anyone has experienced a similar issue.
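For what it's worth, a hedged sketch of a detail that often matters here: on YARN, --archives unpacks the zip into each container's working directory (under an alias if you append one with '#'), not under /home/{USER}, so the program inside the archive should be referenced by a relative path. The placeholders below stand in for the elided parts of the command:
spark-submit ... --archives hdfs:///user/{USER}/{some_folder}.zip#{some_folder} ... {file_to_run}.py
# Inside the job, call ./{some_folder}/<program> (relative to the container working directory)
# instead of the absolute /home/{USER}/... path shown in the error.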

java.io.EOFException: Premature EOF: no length prefix available in Spark on Hadoop

I'm getting this weird exception. I'm using Spark 1.6.0 on Hadoop 2.6.4 and submitting the Spark job to a YARN cluster.
16/07/23 20:05:21 WARN hdfs.DFSClient: DFSOutputStream ResponseProcessor exception for block BP-532134798-128.110.152.143-1469321545728:blk_1073741865_1041
java.io.EOFException: Premature EOF: no length prefix available
at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2203)
at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:176)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:867)
16/07/23 20:49:09 ERROR server.TransportRequestHandler: Error sending result RpcResponse{requestId=4719626006875125240, body=NioManagedBuffer{buf=java.nio.HeapByteBuffer[pos=0 lim=81 cap=81]}} to ms0440.utah.cloudlab.us/128.110.152.175:58944; closing connection
java.nio.channels.ClosedChannelException
I was getting this error when running on Hadoop 2.6.0 and thought the exception might be a bug like this, but even after moving to Hadoop 2.6.4 I'm getting the same error. There is no memory problem; my cluster is fine with respect to HDFS and memory. I went through this and this, but no luck.
Note: 1. I'm using Apache Hadoop and Spark, not CDH/HDP. 2. I'm able to copy data into HDFS and even able to execute another job on this cluster.
Check the file permissions of the dfs directory:
find /path/to/dfs -group root
In general, the owner and group should be hdfs.
Since I had started the HDFS service as the root user, some dfs block files were created with root ownership.
I solved the problem by changing them back to the correct ownership:
sudo chown -R hdfs:hdfs /path/to/dfs
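A quick hedged follow-up check (same placeholder path as above) to confirm nothing under the dfs directory is still owned by or grouped under another user:
find /path/to/dfs ! -user hdfs -o ! -group hdfs
ls -ld /path/to/dfs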

Hadoop 1.0.4 - file permission issue in running map reduce jobs

I am new to Hadoop and need to set up a sandbox environment on Windows to showcase to a client. I have followed the steps below:
Install Cygwin on all machines
Set up SSH
Install Hadoop 1.0.4
Configure Hadoop
Apply the patch for the HADOOP-7682 bug
After a lot of trial and error I was able to run all the components successfully (namenode, datanode, tasktracker and jobtracker). But now I am facing a problem while running map-reduce jobs: I get a permission error on the tmp directory. When I run the word count example using the following command
bin/hadoop jar hadoop*examples*.jar wordcount wcountjob wcountjob/gutenberg-output
13/03/28 23:43:29 INFO mapred.JobClient: Task Id :
attempt_201303282342_0001_m_000003_2, Status : FAILED Error
initializing attempt_201303282342_0001_m_000003_2:
java.io.IOException: Failed to set permissions of path:
c:\cygwin\usr\local\tmp\taskTracker\uswu50754 to 0700
at org.apache.hadoop.fs.FileUtil.checkReturnValue(FileUtil.java:689)
at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:662)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:509)
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:344)
at org.apache.hadoop.mapred.JobLocalizer.createLocalDirs(JobLocalizer.java:144)
at org.apache.hadoop.mapred.DefaultTaskController.initializeJob(DefaultTaskController.java:182)
at org.apache.hadoop.mapred.TaskTracker$4.run(TaskTracker.java:1228)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
at org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1203)
at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1118)
at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2430)
at java.lang.Thread.run(Thread.java:662)
I have tried setting the permissions manually, but that also doesn't work. My understanding is that this is due to the Java libraries being used, which try to reset the permissions and fail. The permission patch that solved the tasktracker problem doesn't seem to solve this one.
Has anybody found a solution for this?
Can anybody point me to a download location for Hadoop 0.20.2, which seems to be immune to this problem?
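Not a fix, but a hedged diagnostic that may narrow things down: from a Cygwin shell, check whether the directory in the error can be created and set to 0700 at all by the user running the TaskTracker (the path below is the Cygwin view of the one in the stack trace). If the shell succeeds but Hadoop still fails, the failure is more likely in how Hadoop 1.x applies permissions through the Java file API on Windows than in the directory itself.
mkdir -p /usr/local/tmp/taskTracker/uswu50754
chmod 700 /usr/local/tmp/taskTracker/uswu50754
ls -ld /usr/local/tmp/taskTracker/uswu50754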

Impala on Cloudera CDH "Could not create logging file: Permission denied"

I installed Impala via a parcel in Cloudera Manager 4.5 on a CDH 4.2.0-1.cdh4.2.0.p0.10 cluster.
When I try to start the service, it fails on all nodes with this message:
perl -pi -e 's#{{CMF_CONF_DIR}}#/run/cloudera-scm-agent/process/800-impala-IMPALAD#g' /run/cloudera-scm-agent/process/800-impala-IMPALAD/impala-conf/impalad_flags
'[' impalad = impalad ']'
exec /opt/cloudera/parcels/IMPALA-0.6-1.p0.109/lib/impala/../../bin/impalad --flagfile=/run/cloudera-scm-agent/process/800-impala-IMPALAD/impala-conf/impalad_flags
Could not create logging file: Permission denied
COULD NOT CREATE A LOGGINGFILE 20130326-204959.15015!log4j:ERROR setFile(null,true) call failed.
java.io.FileNotFoundException: /var/log/impalad/impalad.INFO (Permission denied)
at java.io.FileOutputStream.openAppend(Native Method)
...
at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:685)
at org.apache.hadoop.fs.FileSystem.<clinit>(FileSystem.java:92)
+ date
Complete StdErr Log
I'm unsure whether the permission issue is the cause of Impala not running, or whether something else crashes and the permission issue only comes up because the crash log cannot be written.
Any help would be great!
Run Impala from the debug binaries as described here:
https://issues.cloudera.org/browse/IMPALA-160
It seems to be related to the JVM on the kernel shipped with 12.04.1 LTS.
Original answer: https://groups.google.com/a/cloudera.org/forum/?fromgroups=#!topic/impala-user/4MRZYbn5hI0
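Separately, if the log directory genuinely isn't writable, a hedged check along these lines (assuming impalad runs as the impala user and logs to /var/log/impalad, as the stack trace suggests) is cheap to try:
ls -ld /var/log/impalad
sudo chown -R impala:impala /var/log/impalad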

Permission denied error for logged in user for Apache Pig

I am getting the following error when I try to run pig -help.
Exception in thread "main" java.io.IOException: Permission denied
at java.io.UnixFileSystem.createFileExclusively(Native Method)
at java.io.File.checkAndCreate(File.java:1717)
at java.io.File.createTempFile0(File.java:1738)
at java.io.File.createTempFile(File.java:1815)
at org.apache.hadoop.util.RunJar.main(RunJar.java:115)
Here is my configuration-
Apache Hadoop - 1.0.3
Apache Pig - 0.10.0
OS - Ubuntu 64-bit
User for whom the error is seen: "sumod". This is an admin-level account. I have also created a directory for him in HDFS.
User for whom this error is NOT seen: "hadoop". I created this user for Hadoop jobs. He is not an admin user, but he belongs to "supergroup" on HDFS.
The paths are properly set for both users.
I do not have to start Hadoop to run the "pig -help" command. I only want to make sure that Pig is installed properly.
I am following the Apache docs, and my understanding is that I do not have to be the hadoop user to run Pig; I can be a general system user.
Why am I getting these errors? What am I doing wrong?
I had seen the same exception. The reason in my case was that the user I was running Pig as did not have write permission on ${hadoop.tmp.dir}.
Please check the permissions of the directory where the Pig script is placed.
Whenever a Pig script is executed, errors are logged to a log file, which is written to your present working directory.
Say your Pig script is in dir1 and your pwd is dir2: since you are executing as user sumod, sumod should have write permission on dir2.
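A hedged way to check both suspects above from the shell, assuming the stock Hadoop 1.x default of hadoop.tmp.dir=/tmp/hadoop-${user.name} (adjust if core-site.xml overrides it):
ls -ld /tmp/hadoop-sumod   # hadoop.tmp.dir: RunJar unpacks the Pig jar here, so sumod needs write access
ls -ld .                   # the current directory must be writable for the pig_*.log file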
