How to undo starting hadoop as root (superuser) - hadoop

When setting up Hadoop, I did not know what I was doing and accidentally ended up starting Hadoop as the superuser.
Is there any way to fix this, or is it better to remove Hadoop and set it up again?

You can remove the hadoop user from the root group:
gpasswd -d hadoop root
Check whether the user is still in the root group:
id -nG hadoop
If you want to prevent "hadoop" from getting any admin rights, you can also make sure that "hadoop" is not in the sudo or admin groups:
gpasswd -d hadoop sudo
gpasswd -d hadoop admin
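Before removing anything, it can help to see which of these groups the account actually belongs to. The following is a minimal sketch, assuming a Linux system where the service account is literally named hadoop; the helper name group_list_contains is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if GROUP ($2) appears as a whole word in a
# space-separated group list ($1), such as the output of "id -nG".
group_list_contains() {
    for g in $1; do
        [ "$g" = "$2" ] && return 0
    done
    return 1
}

# Only inspect the account if it exists on this machine.
if id hadoop >/dev/null 2>&1; then
    for grp in root sudo admin; do
        if group_list_contains "$(id -nG hadoop)" "$grp"; then
            echo "hadoop is in $grp: consider gpasswd -d hadoop $grp"
        fi
    done
fi
```

The script only reports memberships; the gpasswd commands still have to be run by hand as root.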

Related

Permission denied: user=basi, access=WRITE, inode="/":

I'm new to Hadoop and Pig. I have installed Pig under my local user on Ubuntu and Hadoop as hduser. Pig works fine in local mode for small datasets. I started Pig in mapreduce mode and tried to implement a word count, but I get the permission denied error below.
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=basi, access=WRITE, inode="/":hduser:supergroup:drwxr-xr-x
I started Hadoop in pseudo-distributed mode.
I started Pig as my local user: pig -x mapreduce
grunt> A = LOAD '/Wordcount.txt' AS (line:Chararray);
grunt> B = FOREACH A GENERATE FLATTEN(TOKENIZE(line)) AS word;
grunt> grouped = group B by word;
grunt> wc = FOREACH grouped GENERATE group, COUNT(B);
grunt> DUMP wc
/Wordcount.txt is a file in HDFS.
It's not clear how you loaded /Wordcount.txt into the root folder, but the error says you're trying to write into the HDFS root directory, which is only writable by the hduser account, not by basi, your local user.
One option is to switch to the other user.
Otherwise, don't use the root of HDFS as the dumping ground for all files; use your dedicated /user directory (for example /user/basi).
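Setting up that dedicated directory could look roughly like this. It is a sketch, not a definitive procedure: the helper name hdfs_home_commands is hypothetical, and it only prints the commands so they can be reviewed and then run as the HDFS superuser (hduser in this setup).

```shell
#!/bin/sh
# Hypothetical helper: print the commands that create an HDFS home directory
# for USER ($1). Nothing is executed here; run the printed commands as the
# HDFS superuser (assumed to be hduser, as in the question).
hdfs_home_commands() {
    user=$1
    echo "hdfs dfs -mkdir -p /user/$user"
    echo "hdfs dfs -chown $user /user/$user"
}

hdfs_home_commands basi
```

Once /user/basi exists and belongs to basi, the input file can be uploaded there and loaded in Pig from '/user/basi/Wordcount.txt' instead of from the HDFS root.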
It is not Pig but Hadoop related. It happened to me with Spark. Probably you installed your Hadoop manually. You need to create supergroup and add hduser into supergroup.
sudo groupadd supergroup
sudo usermod -aG supergroup hduser
Then try again.
Proceed as below:
chmod 777 /Wordcount.txt
chmod changes the permissions of the text file to rwxrwxrwx for owner, group, and others respectively.
Then provide the complete location of the text file in the LOAD command, similar to the following:
grunt> A = LOAD '/directory/abc/Wordcount.txt' AS (line:Chararray);
Then run the code again.
Hope this helps.
In Pig, the DUMP command first writes its output to /tmp/temp.... and then the client reads from it. My guess is that your cluster does not have /tmp in HDFS. If that is the case, please try creating the /tmp directory (usually with permission 1777).
(Edited: Reading answers of others, I think the one about /user makes sense. Without it, you won't even be able to submit any jobs.)

hadoop user file permissions

I have a problem setting Hadoop file permissions in Hortonworks and Cloudera.
My requirement is:
1. Create a new user with a new group.
2. Create a user directory in HDFS (e.g. /user/myuser).
3. This folder (in this case /user/myuser) must be accessible only to the user and its group, not to other users or groups.
I used the following commands (on CentOS 6):
1. Create the group >>> groupadd mygroup
2. Create a new user who belongs to the new group >>> useradd -g mygroup myuser
3. Create the user directory in HDFS >>> hadoop fs -mkdir /user/myuser
4. Change ownership of the folder >>> hadoop fs -chown -R myuser:mygroup /user/myuser
5. Set permissions on the user folder >>> hadoop fs -chmod -R 700 /user/myuser
6. I also set the sticky bit on /tmp >>> hadoop fs -chmod -R 1777 /tmp
Here is the problem: even with these permissions set, users in other groups can still access my data. Please tell me the solution. I turned on HDFS file permissions by setting dfs.permission.enabled=true.
I believe you set the wrong property to enable permissions. You need to set the following property in hdfs-site:
dfs.permissions.enabled = true
This is a good resource for HDFS permissions
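The difference between the two spellings is a single "s", so it is easy to miss. As a rough sketch, the check below looks for either spelling in a configuration file. The helper name permissions_key_status and the path /etc/hadoop/conf/hdfs-site.xml are assumptions, and the match is a plain text search rather than real XML parsing.

```shell
#!/bin/sh
# Hypothetical helper: report which spelling of the permissions key the
# given hdfs-site.xml ($1) uses. Plain text match, not an XML parser.
permissions_key_status() {
    if grep -q '<name>dfs\.permissions\.enabled</name>' "$1"; then
        echo "ok"
    elif grep -q '<name>dfs\.permission\.enabled</name>' "$1"; then
        echo "misspelled"
    else
        echo "not set"
    fi
}

# Assumed location; the real path depends on the distribution.
conf=/etc/hadoop/conf/hdfs-site.xml
if [ -r "$conf" ]; then
    permissions_key_status "$conf"
fi
```

On a node with the Hadoop client installed, hdfs getconf -confKey dfs.permissions.enabled prints the value that the loaded configuration resolves for the key.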
You should repeat the user and group creation steps on the master node (the active NameNode), because by default that is where HDFS looks up a user's groups.
After that, run:
hdfs dfsadmin -refreshUserToGroupsMappings

No such file or directory error when using Hadoop fs --copyFromLocal command

I have a local VM that has Hortonworks Hadoop and hdfs installed on it. I ssh'ed into the VM from my machine and now I am trying to copy a file from my local filesystem into hdfs through following set of commands:
[root@sandbox ~]# sudo -u hdfs hadoop fs -mkdir /folder1/
[root@sandbox ~]# sudo -u hdfs hadoop fs -copyFromLocal /root/folder1/file1.txt /hdfs_folder1/
When I execute it, I get the following error: copyFromLocal: `/root/folder1/file1.txt': No such file or directory
I can see the file right there in the /root/folder1/ directory, but the hdfs command throws the error above. I also tried to cd to /root/folder1/ and then execute the command, but I get the same error. Why is the file not found when it is right there?
Because you run sudo -u hdfs hadoop fs ..., the command tries to read the file /root/folder1/file1.txt as the hdfs user, which does not have permission to enter /root.
You can do this:
Run chmod -R 755 /root. It changes permissions on the directory and its files recursively. However, opening up permissions on root's home directory is not recommended.
Then you can run copyFromLocal as sudo -u hdfs to copy the file from the local file system to HDFS.
A better practice is to create user space for root in HDFS and copy files directly as root.
sudo -u hdfs hadoop fs -mkdir /user/root
sudo -u hdfs hadoop fs -chown root:root /user/root
hadoop fs -copyFromLocal
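To confirm that local directory permissions are what hides the file from the hdfs user, the parent directories of the file can be inspected. This is a minimal sketch under several assumptions: GNU stat (Linux), the hdfs user being neither owner nor group member of the directories involved, and no ACLs in play. Both function names are hypothetical.

```shell
#!/bin/sh
# Succeeds if users outside the owner and group ("other") may traverse DIR.
# Assumes GNU stat: "stat -c %A" prints a mode string such as drwxr-x---.
other_can_traverse() {
    mode=$(stat -c %A "$1") || return 2
    case "$mode" in
        ?????????[xt]*) return 0 ;;   # 10th character is the "other" execute bit
        *) return 1 ;;
    esac
}

# Print every parent directory of FILE that "other" users cannot traverse.
report_blocked_parents() {
    dir=$(dirname "$1")
    while :; do
        other_can_traverse "$dir" || echo "not traversable by other: $dir"
        case "$dir" in
            /|.) break ;;
        esac
        dir=$(dirname "$dir")
    done
}

# Path taken from the question; checked only if it exists on this machine.
if [ -e /root/folder1/file1.txt ]; then
    report_blocked_parents /root/folder1/file1.txt
fi
```

If /root shows up in the output, the hdfs user cannot even look inside it, which would explain why the command reports that the file does not exist.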
I had the same problem running a Hortonworks 4-node cluster. As mentioned, the user "hdfs" doesn't have permission to access root's home directory. The solution is to copy the data from the root folder to a location the "hdfs" user can access. In the standard Hortonworks installation this is /home/hdfs.
As root, run the following:
mkdir /home/hdfs/folder1
cp /root/folder1/file1.txt /home/hdfs/folder1
Now switch to the hdfs user and work from a directory that user can access:
su hdfs
cd /home/hdfs/folder1
Now you can access the files as the hdfs user:
hdfs dfs -put file1.txt /hdfs_folder1

Hadoop Webhdfs Delete option over Amazon EMR failed

I'm trying to see if the DELETE operation works over WebHDFS:
http://ec2-ab-cd-ef-hi.compute-1.amazonaws.com:14000/webhdfs/v1/user/barak/barakFile.csv?op=DELETE&user.name=hadoop
but i get an error:
{"RemoteException":{"message":"Invalid HTTP GET operation [DELETE]",
"exception":"IOException","javaClassName":"java.io.IOException"}}
This file has all privileges (777).
[hadoop@ip-172-99-9-99 ~]$ hadoop fs -ls hdfs:///user/someUser
Found 2 items
-rwxrwxrwx 1 hadoop hadoop 344 2015-12-10 08:33 hdfs:///user/someUser/someUser.csv
What else should I check in order to allow the delete operation over WebHDFS on Amazon EMR?
A URL opened this way is sent as an HTTP GET, but op=DELETE has to be sent with the HTTP DELETE method. Use curl -i -X DELETE like this:
curl -i -X DELETE "http://ec2-**-**-**-***.compute-1.amazonaws.com:14000/webhdfs/v1/user/hadoop/hdfs-site.xml?op=DELETE&user.name=hadoop"
I had the needed privileges for the file but not all the needed privileges for the directory. Changing permissions for the entire path solved it.
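The same request can be assembled from its parts, which makes it easier to see that the operation name in the query string and the HTTP method must agree. This is a sketch only: webhdfs_url is a hypothetical helper, namenode.example.com is a placeholder host, and port 14000 is simply the one used in the question. The script prints the curl command instead of running it.

```shell
#!/bin/sh
# Hypothetical helper: build a WebHDFS URL from host, port, HDFS path,
# operation name and user name.
webhdfs_url() {
    host=$1
    port=$2
    path=$3
    op=$4
    user=$5
    echo "http://$host:$port/webhdfs/v1$path?op=$op&user.name=$user"
}

# Placeholder host; replace it with the real master node before running curl.
url=$(webhdfs_url namenode.example.com 14000 /user/barak/barakFile.csv DELETE hadoop)
echo "curl -i -X DELETE \"$url\""
```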

Adding ec2-user to use hadoop

I have set up AWS EMR and SSH into the master node. I want to copy a file into HDFS. The line of code in my program that does this is:
os.system('/home/hadoop/bin/hdfs dfs -put %s PATH_to_HADOOP' % tmp_output)
I want to enter the path to my hdfs file system.
I do
[ec2-user@ip-172-31-0-185 input]$ /home/hadoop/bin/hdfs dfs -ls /
Found 2 items
drwxr-xr-x - hadoop supergroup 0 2014-04-14 22:21 /hbase
drwxrwx--- - hadoop supergroup 0 2014-04-14 22:19 /tmp
I try
[ec2-user@ip-172-31-0-185 input]$ /home/hadoop/bin/hdfs dfs -mkdir /tmp/stockmarkets
mkdir: Permission denied: user=ec2-user, access=EXECUTE, inode="/tmp":hadoop:supergroup:drwxrwx---
So, to allow ec2-user to use Hadoop, I followed these instructions:
http://cloudcelebrity.wordpress.com/2013/06/05/handling-permission-denied-error-on-hdfs/
But after I run the following (substituting ec2-user for their ubuntu):
sudo adduser ec2-user hadoop
instead of getting a message that the user was added, I get:
Usage: useradd [options] LOGIN
Options:
-b, --base-dir BASE_DIR base directory for the home directory of the
new account
-c, --comment COMMENT GECOS field of the new account
-d, --home-dir HOME_DIR home directory of the new account
-D, --defaults print or change default useradd configuration
-e, --expiredate EXPIRE_DATE expiration date of the new account
-f, --inactive INACTIVE password inactivity period of the new account
-g, --gid GROUP name or ID of the primary group of the new
account
-G, --groups GROUPS list of supplementary groups of the new
account
-h, --help display this help message and exit
-k, --skel SKEL_DIR use this alternative skeleton directory
-K, --key KEY=VALUE override /etc/login.defs defaults
-l, --no-log-init do not add the user to the lastlog and
faillog databases
-m, --create-home create the user's home directory
-M, --no-create-home do not create the user's home directory
-N, --no-user-group do not create a group with the same name as
the user
-o, --non-unique allow to create users with duplicate
(non-unique) UID
-p, --password PASSWORD encrypted password of the new account
-r, --system create a system account
-s, --shell SHELL login shell of the new account
-u, --uid UID user ID of the new account
-U, --user-group create a group with the same name as the user
-Z, --selinux-user SEUSER use a specific SEUSER for the SELinux user mapping
So I am completely confused. Please help.
SSH in as hadoop@(publicIP) for Amazon EMR.
From there you can do anything you like with HDFS without having to "su". I just did a mkdir and ran distcp and a streaming job. I do everything as hadoop@, as per the EMR instructions.
If you look at the permissions for the HDFS directory /tmp, you can see that /tmp is owned by user hadoop and that ec2-user doesn't have permission to create files or directories inside /tmp.
To open up the permissions on /tmp, use the following command:
[ec2-user@ip-172-31-0-185 input]$ sudo -su hadoop /home/hadoop/bin/hdfs dfs -chmod 777 /tmp
Now try to create the directory inside the /tmp HDFS location:
[ec2-user@ip-172-31-0-185 input]$ /home/hadoop/bin/hdfs dfs -mkdir /tmp/stockmarkets
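The "Permission denied" message from the question already contains everything needed for this diagnosis: who asked, which access was refused, and the owner, group and mode of the directory. As an illustration, the sketch below splits such a message into labeled fields. The function name parse_denied is hypothetical, and the pattern assumes the message layout shown in the question.

```shell
#!/bin/sh
# Hypothetical helper: split an HDFS "Permission denied" message ($1) into
# labeled fields. Prints nothing if the message has a different layout.
parse_denied() {
    printf '%s\n' "$1" | sed -nE 's/.*user=([^,]+), access=([A-Z_]+), inode="([^"]*)":([^:]+):([^:]+):([-a-zA-Z]+).*/requester=\1 access=\2 path=\3 owner=\4 group=\5 mode=\6/p'
}

# Message copied from the question above.
parse_denied 'mkdir: Permission denied: user=ec2-user, access=EXECUTE, inode="/tmp":hadoop:supergroup:drwxrwx---'
```

For this message the fields read: requester ec2-user, access EXECUTE, path /tmp, owner hadoop, group supergroup, mode drwxrwx---. The last three characters of the mode are the permissions for everyone who is neither the owner nor in the group, and here they are all dashes, so ec2-user is refused.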
