Adding ec2-user to use hadoop

I have set up AWS EMR and SSH'ed into the master node. I want to copy a file into HDFS. The line in my program that does this is:
os.system('/home/hadoop/bin/hdfs dfs -put %s PATH_to_HADOOP' % tmp_output)
I need to replace PATH_to_HADOOP with a path in my HDFS file system.
Listing the HDFS root gives:
[ec2-user@ip-172-31-0-185 input]$ /home/hadoop/bin/hdfs dfs -ls /
Found 2 items
drwxr-xr-x - hadoop supergroup 0 2014-04-14 22:21 /hbase
drwxrwx--- - hadoop supergroup 0 2014-04-14 22:19 /tmp
Then I try to create a directory:
[ec2-user@ip-172-31-0-185 input]$ /home/hadoop/bin/hdfs dfs -mkdir /tmp/stockmarkets
mkdir: Permission denied: user=ec2-user, access=EXECUTE, inode="/tmp":hadoop:supergroup:drwxrwx---
So, to let ec2-user use hadoop, I followed these instructions:
http://cloudcelebrity.wordpress.com/2013/06/05/handling-permission-denied-error-on-hdfs/
But when I run the following (substituting ec2-user for their ubuntu)
sudo adduser ec2-user hadoop
instead of a confirmation that the user was added, I get:
Usage: useradd [options] LOGIN
Options:
  -b, --base-dir BASE_DIR       base directory for the home directory of the
                                new account
  -c, --comment COMMENT         GECOS field of the new account
  -d, --home-dir HOME_DIR       home directory of the new account
  -D, --defaults                print or change default useradd configuration
  -e, --expiredate EXPIRE_DATE  expiration date of the new account
  -f, --inactive INACTIVE       password inactivity period of the new account
  -g, --gid GROUP               name or ID of the primary group of the new
                                account
  -G, --groups GROUPS           list of supplementary groups of the new
                                account
  -h, --help                    display this help message and exit
  -k, --skel SKEL_DIR           use this alternative skeleton directory
  -K, --key KEY=VALUE           override /etc/login.defs defaults
  -l, --no-log-init             do not add the user to the lastlog and
                                faillog databases
  -m, --create-home             create the user's home directory
  -M, --no-create-home          do not create the user's home directory
  -N, --no-user-group           do not create a group with the same name as
                                the user
  -o, --non-unique              allow to create users with duplicate
                                (non-unique) UID
  -p, --password PASSWORD       encrypted password of the new account
  -r, --system                  create a system account
  -s, --shell SHELL             login shell of the new account
  -u, --uid UID                 user ID of the new account
  -U, --user-group              create a group with the same name as the user
  -Z, --selinux-user SEUSER     use a specific SEUSER for the SELinux user mapping
So I am completely confused and stuck. Please help.

SSH in as hadoop@(publicIP) for Amazon EMR.
From there you can do anything you like with HDFS without having to "su." I just did an mkdir and ran distcp and a streaming job. I do everything as hadoop@, as per the EMR instructions.

If you look at the permissions on the HDFS directory /tmp, you can see that /tmp is owned by user hadoop and that ec2-user doesn't have permission to create files or directories inside /tmp.
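To make that reading of the error message concrete, here is a small Python sketch that pulls the fields out of a "Permission denied" line and reports which permission class applies. The regular expression is based only on the message format quoted in the question, and the function name and the user_groups parameter are my own; it also ignores the HDFS superuser, who bypasses these checks.

```python
import re

# Field layout taken from the error line quoted in the question.
DENIED = re.compile(
    r'user=(?P<user>[^,]+), access=(?P<access>\w+), '
    r'inode="(?P<path>[^"]+)":(?P<owner>[^:]+):(?P<group>[^:]+):'
    r'(?P<mode>[d-][rwxtT-]{9})'
)

# Position of each access type inside an "rwx" triplet.
COLUMN = {"READ": 0, "WRITE": 1, "EXECUTE": 2}


def explain_denial(message, user_groups=()):
    """Return (permission class, triplet, allowed) for a denial message.

    user_groups is the (assumed) list of groups the denied user belongs to.
    Combined access types such as READ_EXECUTE are split on "_";
    ALL and NONE are not handled in this sketch.
    """
    m = DENIED.search(message)
    if m is None:
        raise ValueError("not an HDFS permission-denied message")
    bits = m.group("mode")[1:]  # drop the leading 'd' or '-'
    if m.group("user") == m.group("owner"):
        cls, triplet = "owner", bits[0:3]
    elif m.group("group") in user_groups:
        cls, triplet = "group", bits[3:6]
    else:
        cls, triplet = "other", bits[6:9]
    needed = m.group("access").split("_")
    allowed = all(triplet[COLUMN[a]] not in "-T" for a in needed)
    return cls, triplet, allowed
```

For the message in the question, ec2-user is neither the owner (hadoop) nor, presumably, in supergroup, so the "other" triplet --- applies and EXECUTE is refused.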
To open up the permissions on /tmp, run the following command:
[ec2-user@ip-172-31-0-185 input]$ sudo -su hadoop /home/hadoop/bin/hdfs dfs -chmod 777 /tmp
Now try creating the directory inside the /tmp HDFS location:
[ec2-user@ip-172-31-0-185 input]$ /home/hadoop/bin/hdfs dfs -mkdir /tmp/stockmarkets
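On the Python side, the os.system call in the question silently ignores a failed -put, and formatting the file name into a shell string breaks on names containing spaces. A minimal sketch using subprocess instead is below; the function names are mine, and /tmp/stockmarkets is just the directory from the question.

```python
import subprocess

HDFS_BIN = "/home/hadoop/bin/hdfs"  # path used in the question


def build_put_command(local_path, hdfs_dir, hdfs_bin=HDFS_BIN):
    """Build the argument list for `hdfs dfs -put <local_path> <hdfs_dir>`."""
    # A list (rather than one shell string) avoids quoting problems.
    return [hdfs_bin, "dfs", "-put", local_path, hdfs_dir]


def put_file(local_path, hdfs_dir):
    """Copy a local file into HDFS; raises CalledProcessError on failure."""
    subprocess.run(build_put_command(local_path, hdfs_dir), check=True)
```

In the original program this would be called as put_file(tmp_output, "/tmp/stockmarkets"), and a permission error would then surface as an exception instead of passing unnoticed.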

Related

Permission is denied when moving file from repository to another

Assume that I want to move a csv file from /home/user to /hdfs/data/adhoc/PR/02/RDO0/OUTPUT/
So :
hadoop fs mkdir -m 777 /hdfs/data/adhoc/PR/02/RDO0/OUTPUT/
hadoop fs -moveFromLocal RDO07J420.csv $OUTPUT_FILE_OCRE/MGM7J420-${OPC_DISO8601}.csv
But, I get this problem :
moveFromLocal: Permission denied: user=fs191, access=WRITE,
inode="/hdfs/data/adhoc/PR/02/RDO0/OUTPUT/MGM7J420-.csv.COPYING":RDO0-mdoPR:bfRDO0:drwxr-x---
Your local user does not have write permission on that HDFS directory.
Try
sudo -u hdfs hadoop fs -moveFromLocal RDO07J420.csv $OUTPUT_FILE_OCRE/MGM7J420-${OPC_DISO8601}.csv
hdfs is the HDFS superuser and therefore has write permission, but I suggest managing users and permissions properly instead:
http://www.informit.com/articles/article.aspx?p=2755708&seqNum=3

Unable to change read write permissions to hdfs directory

I am trying to copy a text file into an HDFS location.
I ran into an access error, so I tried changing the permissions.
But I am unable to change them either and get the error below:
chaithu@localhost:~$ hadoop fs -put test.txt /user
put: Permission denied: user=chaithu, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x
chaithu@localhost:~$ hadoop fs -chmod 777 /user
chmod: changing permissions of '/user': Permission denied. user=chaithu is not the owner of inode=user
chaithu@localhost:~$ hadoop fs -ls /
Found 2 items
drwxrwxrwt - hdfs supergroup 0 2017-12-20 00:23 /tmp
drwxr-xr-x - hdfs supergroup 0 2017-12-20 10:24 /user
Kindly help me change the permissions so that all users have full read and write access to the HDFS folder.
First off, you shouldn't write into the /user folder directly, nor set 777 on it.
You're going to need a user directory for your current user to even run a MapReduce job, so you need to sudo su - hdfs first to become the HDFS superuser.
Then run these to create an HDFS home directory for your user account:
hdfs dfs -mkdir -p /user/chaithu
hdfs dfs -chown -R chaithu /user/chaithu
hdfs dfs -chmod -R 770 /user/chaithu
Then exit from the hdfs user, and chaithu can now write to its own HDFS directory.
hadoop fs -put test.txt
With no destination given, that puts the file in the current user's HDFS home directory (/user/chaithu).
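If you have to do this for several accounts, the three commands above can be generated from the user name. This is only a sketch: the function name is mine, it merely builds the argument lists, and the commands still have to be run as the HDFS superuser.

```python
def home_dir_commands(user, mode="770"):
    """Return the mkdir/chown/chmod argument lists for /user/<user>."""
    home = f"/user/{user}"
    return [
        ["hdfs", "dfs", "-mkdir", "-p", home],       # create the directory
        ["hdfs", "dfs", "-chown", "-R", user, home],  # hand it to the user
        ["hdfs", "dfs", "-chmod", "-R", mode, home],  # owner and group only
    ]
```

Each list can be passed to subprocess.run(cmd, check=True) from a session that is already running as hdfs.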
Or, if that's too much work for you, write to /tmp instead.
A lazy option is to override your user name with that of the superuser:
export HADOOP_USER_NAME=hdfs
hadoop fs -put test.txt /user
This works because Hadoop, by default, neither authenticates users nor enforces account access, so never do this in production.
And finally, you can always just turn permissions off completely in hdfs-site.xml (again, only useful during development):
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
If you look at your hdfs dfs -ls output, you will see that only the HDFS superuser has write permission on that path.
You have two options here.
One is to change the ownership as the superuser so that chaithu becomes the owner or is in the owning group, something like hdfs dfs -chown -R hdfs:chaithu /path; you then get access through that ownership. A dirtier way is to run hdfs dfs -chmod -R 777 /path as the superuser, but from a security standpoint 777 is not good.
The second is to use ACLs, which let you grant a specific user access that you can later revoke.
Please go through this link for a better understanding:
More on ACLs
This is basic and important to learn. Try the options suggested above and let me know if they don't work; I can help more based on the error you get.
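As a rough illustration of the ACL route, the sketch below builds an hdfs dfs -setfacl command that adds an entry for one user. The function name and the /data/shared path are hypothetical, and depending on the Hadoop version ACLs may first have to be enabled on the NameNode (dfs.namenode.acls.enabled).

```python
def setfacl_command(path, user, perms="rwx", recursive=False):
    """Build `hdfs dfs -setfacl [-R] -m user:<user>:<perms> <path>`."""
    # perms must be an rwx-style triplet such as "rwx", "r-x" or "rw-".
    ok = len(perms) == 3 and all(p in (c, "-") for p, c in zip(perms, "rwx"))
    if not ok:
        raise ValueError("perms must look like 'rwx', 'r-x', 'rw-', ...")
    cmd = ["hdfs", "dfs", "-setfacl"]
    if recursive:
        cmd.append("-R")
    cmd += ["-m", f"user:{user}:{perms}", path]
    return cmd
```

For example, setfacl_command("/data/shared", "chaithu", "rw-") yields the arguments for granting chaithu read and write access; like chown and chmod, it has to be run by the owner of the path or by the superuser.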

hadoop user file permissions

I have a problem setting Hadoop file permissions in Hortonworks and Cloudera.
My requirement is:
1. Create a new user with a new group.
2. Create a user directory in HDFS (e.g. /user/myuser).
3. This folder (in this case /user/myuser) must be accessible only to the user and its group, not to other users and other groups.
I used the following commands (on CentOS 6):
1. Create the group >>> groupadd mygroup
2. Create a new user who belongs to the new group >>> useradd -g mygroup myuser
3. Create the user directory in HDFS >>> hadoop fs -mkdir /user/myuser
4. Change ownership of the folder >>> hadoop fs -chown -R myuser:mygroup /user/myuser
5. Set permissions on the user folder >>> hadoop fs -chmod -R 700 /user/myuser
6. I also set the sticky bit on /tmp >>> hadoop fs -chmod -R 1777 /tmp
Here is the problem: even with these permissions set, users in other groups can access my data. Please tell me how to solve this. I turned on HDFS file permissions by setting ( dfs.permission.enabled=true ).
I believe you set the wrong property name to enable permissions. You need to set the following property in hdfs-site.xml:
dfs.permissions.enabled = true
This is a good resource for HDFS permissions
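A quick way to catch this kind of typo is to scan hdfs-site.xml for the property names involved. The sketch below is only an illustration: the function name is mine, it treats dfs.permissions.enabled and the older dfs.permissions (which, as far as I know, newer releases still accept as a deprecated alias) as recognized, and it flags the spelling from the question.

```python
import xml.etree.ElementTree as ET

RECOGNIZED = ("dfs.permissions.enabled", "dfs.permissions")
MISSPELLED = "dfs.permission.enabled"  # spelling used in the question


def check_permission_keys(xml_text):
    """Report which permission-related property names appear in the XML."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        props[name] = (prop.findtext("value") or "").strip()
    return {
        "recognized": {k: props[k] for k in RECOGNIZED if k in props},
        "misspelled": MISSPELLED in props,
    }
```

Pass it the contents of your hdfs-site.xml; if "misspelled" comes back True and "recognized" is empty, the setting from the question is simply being ignored.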
You should also repeat your user and group creation steps on the master node (active namenode).
After that, run
hdfs dfsadmin -refreshUserToGroupsMappings
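It may also help to double-check what the octal modes in the question actually grant. Python's stat.filemode renders a mode the same way a directory listing does; the helper name below is mine.

```python
import stat


def describe_dir_mode(octal_mode):
    """Render an octal mode string such as "700" as ls-style text for a directory."""
    return stat.filemode(stat.S_IFDIR | int(octal_mode, 8))


print(describe_dir_mode("700"))   # drwx------
print(describe_dir_mode("770"))   # drwxrwx---
print(describe_dir_mode("1777"))  # drwxrwxrwt
```

Note that 700 gives the group no access at all, so if the directory really should be usable by the user and its group, 770 may be closer to the stated requirement.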

No such file or directory error when using Hadoop fs --copyFromLocal command

I have a local VM that has Hortonworks Hadoop and hdfs installed on it. I ssh'ed into the VM from my machine and now I am trying to copy a file from my local filesystem into hdfs through following set of commands:
[root@sandbox ~]# sudo -u hdfs hadoop fs -mkdir /folder1/
[root@sandbox ~]# sudo -u hdfs hadoop fs -copyFromLocal /root/folder1/file1.txt /hdfs_folder1/
When I execute it I get the following error: copyFromLocal: `/root/folder1/file1.txt': No such file or directory
I can see the file right there in the /root/folder1/ directory, but the hdfs command throws the error above. I also tried to cd to /root/folder1/ and then run the command, but I get the same error. Why is the file not found when it is right there?
When you run sudo -u hdfs hadoop fs ..., the command tries to read the local file /root/folder1/file1.txt as the hdfs user, and that user normally has no permission to enter /root, so the file appears not to exist.
One option:
Run chmod 755 -R /root. It changes the permissions on the directory and its files recursively, but opening up the permissions on root's home directory is not recommended.
Then you can run copyFromLocal with sudo -u hdfs to copy the file from the local file system to HDFS.
A better practice is to create an HDFS home directory for root and copy the files directly as root:
sudo -u hdfs hadoop fs -mkdir /user/root
sudo -u hdfs hadoop fs -chown root:root /user/root
hadoop fs -copyFromLocal
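To see which local directory is getting in the way, you can walk up from the file and look for directories that lack the execute (search) bit for "other" users. This is a simplified sketch with a function name of my own: it only inspects the "other" bits, so it assumes the hdfs user is neither the owner nor in the group of those directories, and it must be run by a user who can stat them (root, here).

```python
import os
import stat


def blocked_ancestors(path):
    """List ancestor directories of `path` that lack the 'other' execute bit."""
    blocked = []
    current = os.path.dirname(os.path.abspath(path))
    while True:
        # Without o+x, unrelated users cannot traverse this directory.
        if not os.stat(current).st_mode & stat.S_IXOTH:
            blocked.append(current)
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent
    return blocked
```

On a typical installation I would expect blocked_ancestors("/root/folder1/file1.txt") to include /root, which is why copying the file to a directory the hdfs user can reach also works.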
I had the same problem running a Hortonworks 4 node cluster. As mentioned, user "hdfs" doesn't have permission to access the /root directory. The solution is to copy the file from root's folder to somewhere the "hdfs" user can access. In the standard Hortonworks installation this is /home/hdfs.
As root, run the following:
mkdir /home/hdfs/folder1
cp /root/folder1/file1.txt /home/hdfs/folder1
Now switch to the hdfs user and work from a directory that user can access:
su hdfs
cd /home/hdfs/folder1
Now you can read the file as the hdfs user:
hdfs dfs -put file1.txt /hdfs_folder1

How to undo starting hadoop as root (superuser)

When setting up hadoop, I did not know what I was doing and I accidentally ended up starting hadoop as the superuser.
Is there any way to fix this, or is it better to remove hadoop and set it up again?
You can remove hadoop from the root group: gpasswd -d hadoop root
Then check whether the user is still in the root group: groups hadoop
If you want to prevent "hadoop" from getting any admin rights, you can further make sure that "hadoop" is not in the sudo or admin groups:
gpasswd -d hadoop sudo
gpasswd -d hadoop admin
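If you would rather verify the result from a script, Python's grp module can list the groups that name a user as a member. The function names are mine, and the list of privileged groups is just the ones mentioned in this answer; note that a user's primary group usually does not show up in these member lists.

```python
import grp


def groups_listing(user, group_db=None):
    """Names of groups whose member list contains `user`."""
    if group_db is None:
        group_db = grp.getgrall()  # read the system group database
    return sorted(g.gr_name for g in group_db if user in g.gr_mem)


def still_privileged(user, privileged=("root", "sudo", "admin"), group_db=None):
    """Subset of the privileged groups that still list `user` as a member."""
    return [g for g in groups_listing(user, group_db) if g in privileged]
```

An empty list from still_privileged("hadoop") would suggest that the account is no longer in any of those groups.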
