Permission denied error for logged in user for Apache Pig - hadoop

I am getting the following error when I try to run pig -help.
Exception in thread "main" java.io.IOException: Permission denied
at java.io.UnixFileSystem.createFileExclusively(Native Method)
at java.io.File.checkAndCreate(File.java:1717)
at java.io.File.createTempFile0(File.java:1738)
at java.io.File.createTempFile(File.java:1815)
at org.apache.hadoop.util.RunJar.main(RunJar.java:115)
Here is my configuration-
Apache Hadoop - 1.0.3
Apache Pig - 0.10.0
OS - Ubuntu 64-bit
User for whom the error is seen - "sumod". This is an admin-level account. I have also created a directory for him in HDFS.
User for whom this error is NOT seen - "hadoop". I created this user for Hadoop jobs. He is not an admin user, but he belongs to "supergroup" on HDFS.
The paths are properly set for both users.
I do not have to start Hadoop to run the "pig -help" command; I only want to make sure that Pig is installed properly.
I am following the Apache docs, and my understanding is that I do not have to be the hadoop user to run Pig and can be a regular system user.
Why am I getting these errors? What am I doing wrong?

I have seen the same exception. The reason in my case was that the user I was running Pig as did not have write permission on ${hadoop.tmp.dir}.
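A minimal way to verify this, assuming a Hadoop 1.x layout with the configuration under $HADOOP_HOME/conf; /data/hdfstmp below is only a placeholder for whatever hadoop.tmp.dir resolves to on your machine:
# see whether hadoop.tmp.dir is overridden in core-site.xml
grep -A1 hadoop.tmp.dir $HADOOP_HOME/conf/core-site.xml
# check that the affected user can actually write there
sudo -u sumod touch /data/hdfstmp/.write_test && rm /data/hdfstmp/.write_test
# if the touch fails, grant that user (or its group) write access to the directory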

Please check the permissions of the directory where the Pig script is placed.
Whenever a Pig script is executed, errors are logged to a log file, which is written in your present working directory.
Say your Pig script is in dir1 and your present working directory is dir2; since you are executing as user sumod, sumod must have write permission in dir2.
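For example, a quick check from the shell, run from the directory you launch Pig in and as the user that hits the error:
# the current directory must be writable so Pig can create its log file here
ls -ld "$(pwd)"
touch .pig_log_write_test && rm .pig_log_write_test   # fails with "Permission denied" if it is not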

Related

Getting a write permission error from HDFS after updating flink-1.4.0 to flink-1.4.2

Environment
Flink-1.4.2
Hadoop 2.6.0-cdh5.13.0 with 4 nodes in service and Security is off.
Ubuntu 16.04.3 LTS
Java 8
Description
I have a Java job in flink-1.4.0 which writes to HDFS at a specific path.
After updating to flink-1.4.2 I'm getting the following error from Hadoop, complaining that the user doesn't have write permission to the given path:
WARN org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:xng (auth:SIMPLE) cause:org.apache.hadoop.security.AccessControlException: Permission denied: user=user1, access=WRITE, inode="/user":hdfs:hadoop:drwxr-xr-x
NOTE:
If I run the same job on flink-1.4.0, the error disappears regardless of which Flink version (1.4.0 or 1.4.2) the job's dependencies are built against.
Also, if I run the job's main method from my IDE and pass the same parameters, I don't get the above error.
Question
Any idea what's wrong, or how to fix it?
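Not an authoritative answer, but the log line itself narrows things down: the job runs as user1 and tries to write under /user, which is owned by hdfs:hadoop with mode drwxr-xr-x, so only hdfs may create entries there. One sketch of a remedy, assuming simple (non-Kerberos) auth as shown in the log and that the hdfs superuser account exists on the cluster:
# give user1 a writable home directory on HDFS (run with superuser rights)
sudo -u hdfs hdfs dfs -mkdir -p /user/user1
sudo -u hdfs hdfs dfs -chown user1 /user/user1
# or, since auth is SIMPLE, submit the job as a user that already has write access
export HADOOP_USER_NAME=someUserWithAccess   # hypothetical name, replace with a real one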

Spark/Hadoop can't read root files

I'm trying to read, through Spark, a file inside a folder that only I (and root) can read/write. First I start the shell with:
spark-shell --master yarn-client
then I:
val base = sc.textFile("file:///mount/bases/FOLDER_LOCKED/folder/folder/file.txt")
base.take(1)
And got the following error:
2018-02-19 13:40:20,835 WARN scheduler.TaskSetManager:
Lost task 0.0 in stage 0.0 (TID 0, mydomain, executor 1):
java.io.FileNotFoundException: File file: /mount/bases/FOLDER_LOCKED/folder/folder/file.txt does not exist
at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:611)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:824)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:601)
...
I suspect that because YARN/Hadoop was launched as the user hadoop, it cannot go into this folder to read the file. How could I solve this?
Note: this folder cannot be opened up to other users because it holds private data.
EDIT 1: /mount/bases is network storage mounted over a CIFS connection.
EDIT 2: HDFS and YARN were launched as the user hadoop.
Since hadoop was the user that launched HDFS and YARN, it is the user that will try to open the file during a job, so it must be authorized to access this folder. Fortunately, Hadoop checks which user is executing the job before allowing access to a folder/file, so you do not take on extra risk with this.
Well, if it had been an access-related issue with the file, you would have got "access denied" as the error. In this particular scenario, I think the file you are trying to read is not present at all, or has a slightly different name (typo). Just check the file name.
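A quick sanity check, assuming the path from the question and that the executors run as the user hadoop; with a file:// URI the file must be readable on every node that runs an executor, not only on the machine where the driver runs:
# is the file there, and can the hadoop user actually read it?
ls -l /mount/bases/FOLDER_LOCKED/folder/folder/file.txt
sudo -u hadoop head -c1 /mount/bases/FOLDER_LOCKED/folder/folder/file.txt >/dev/null && echo readable
# repeat on each worker node: the CIFS mount (and its permissions) must exist there as well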

Hadoop: Pseudo Distributed mode for multiple users

I appreciate your help in advance.
I have setup Hadoop in Pseudo Distributed mode using the root user credentials. I want to provide access to multiple users (let us say hadoop1, hadoop2, etc) to be able to submit and run MapReduce jobs on this cluster. How do we get this done?
What I have done so far?
> - Setup Hadoop to run in Pseudo-distributed mode
> - Used "root" user credentials to set this up.
> - Added users hadoop1 and hadoop2 to a group called "hadoop".
> - Added root also to be part of the group "hadoop".
> - Created a folder called hdfstmp and set this as the path for hadoop.tmp.dir.
> - Started the cluster using bin/start-all.sh
> - Ran MapReduce jobs using hadoop1 and hadoop2 users.
I got the error below:
Exception in thread "main" java.io.IOException: Permission denied
at java.io.UnixFileSystem.createFileExclusively(Native Method)
at java.io.File.createNewFile(File.java:1006)
at java.io.File.createTempFile(File.java:1989)
at org.apache.hadoop.util.RunJar.main(RunJar.java:119)
To overcome this error, I gave the group "hadoop" rwx permissions on the folder hdfstmp. The permissions on this folder now look like drwxrwxr-x.
I then submitted MapReduce jobs using the hadoop1 and hadoop2 user logins. The jobs ran fine without any errors.
However, if I do a stop-all.sh and then a start-all.sh, the DataNode (and occasionally even the NameNode) does not start up. When I check the logs, I see an error like the one below:
2013-09-21 16:43:54,518 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Invalid directory in dfs.data.dir: Incorrect permission for /data/hdfstmp/dfs/data, expected: rwxr-xr-x, while actual: rwxrwxr-x
Now, without the change to the group permissions on the hdfstmp directory, my MR jobs submitted by different users do not run. But when the NameNode gets restarted, I get the issue above.
How do i overcome this issue? What is the best practice for the same?
Also, is there a way to monitor the jobs that are being submitted by the different users? I am assuming the Web UI should allow me to do this. Please confirm.
I appreciate any assistance you can provide me on this issue. Thanks.
Regards
Adding a dedicated Hadoop system user
We will use a dedicated Hadoop user account for running Hadoop. While that's not required, it is recommended because it helps separate the Hadoop installation from other software applications and user accounts running on the same machine (think: security, permissions, backups, etc.).
#addgroup hadoop
#adduser --ingroup hadoop hadoop1
#adduser --ingroup hadoop hadoop2
This will add the users hadoop1 and hadoop2 and the group hadoop to your local machine.
Change the ownership of your Hadoop installation directory:
chown -R hadoop1:hadoop hadoop
And lastly, change the permissions of the Hadoop temporary directory.
If your temp directory is /app/hadoop/tmp:
#mkdir -p /app/hadoop/tmp
#chown hadoop1:hadoop /app/hadoop/tmp
And if you want to tighten up security, change the permissions from 755 to 750:
#chmod 750 /app/hadoop/tmp
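As for the DataNode refusing to start after a restart: the log in the question says it expects rwxr-xr-x on dfs.data.dir but finds rwxrwxr-x, a side effect of the recursive group write on hdfstmp. A possible way out, assuming hadoop.tmp.dir is /data/hdfstmp as in that log, is to tighten just the HDFS storage directories back down while leaving the rest group-writable for job submission:
# restore the permissions the DataNode insists on for its block storage
chmod 755 /data/hdfstmp/dfs/data
# if the NameNode complains the same way, do the same for its directory
chmod 755 /data/hdfstmp/dfs/name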

Steps to install Hive

I have Hadoop configured on my Red Hat system. I am getting the following error when $HIVE_HOME/bin/hive is executed:
Exception in thread "main" java.io.IOException: Permission denied
at java.io.UnixFileSystem.createFileExclusively(Native Method)
at java.io.File.checkAndCreate(File.java:1704)
at java.io.File.createTempFile(File.java:1792)
at org.apache.hadoop.util.RunJar.main(RunJar.java:115)
Hive uses a "metastore"; it creates this directory when you invoke it for the first time. The metastore directory is usually created in the current working directory (i.e. where you are running the hive command from).
Which directory are you invoking the hive command from? Do you have write permission there?
try this:
cd     # <--- this will take you to your home dir (you will have write permissions there)
hive
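For what it's worth, with the default embedded Derby metastore you can see what Hive drops into the working directory after a successful run (an illustration under default settings; a remote metastore behaves differently):
cd                             # somewhere you can write
hive -e 'show tables;'         # any trivial statement will do
ls -d metastore_db derby.log   # created by Hive in the current working directory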

Apache Pig 0.10 with CDH3u0 Hadoop failed to work as normal user

I have used Pig before, but I am new to Hadoop/Pig installation.
I have CDH3u0 Hadoop installed running Pig 0.8.
I downloaded Pig 0.10 and installed it in a separate directory.
I am able to start Pig as the root user, but it fails to start as a normal user with the following error:
Exception in thread "main" java.io.IOException: Permission denied
at java.io.UnixFileSystem.createFileExclusively(Native Method)
at java.io.File.checkAndCreate(File.java:1704)
at java.io.File.createTempFile(File.java:1792)
at org.apache.hadoop.util.RunJar.main(RunJar.java:146)
Any pointer to the problem would be greatly appreciated.
Also, the log file defaults to the Pig installation directory; is there a way to make the log directory default to the user's home directory without using the -l option?
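On the log location: one thing worth trying, as a sketch assuming Pig 0.10 and that $PIG_HOME points at your 0.10 install. The relevant property is pig.logfile, which takes a fixed path, so a world-writable spot such as /tmp is the simple choice when several users share one install:
# send Pig's error logs to /tmp instead of the (root-owned) install directory
echo "pig.logfile=/tmp/" >> $PIG_HOME/conf/pig.properties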
