Cannot create directory on HDFS in Hadoop

I cannot operate on my HDFS because of a permission problem. I executed hadoop fs -ls / and it returned:
drwx------ - ubuntu supergroup 0 2015-09-02 09:58 /tmp
The OS user is user1.
How can I change the HDFS user to user1? I set dfs.permissions.enabled to false in hdfs-site.xml and formatted HDFS again, but the problem still exists.
Could anybody help me?

This is not a matter of the OS user, user1.
You communicate with the Hadoop server via a Hadoop client.
So you should check your Hadoop client's configuration; on my machine this is hadoop-site.xml.
In hadoop-site.xml, set the user and groups that the client should present to HDFS:
<property>
<name>hadoop.job.ugi</name>
<value>yourusername,yourgroup</value>
<description>user name and groups used by the client</description>
</property>

Related

Difference between Superuser and supergroup in Hadoop

What is supergroup and superuser in Hadoop/HDFS?
Superuser
Based on the official Hadoop documentation:
The super-user is the user with the same identity as the name node process itself. Loosely, if you started the name node, then you are the super-user. The super-user can do anything in that permissions checks never fail for the super-user.
Supergroup
Supergroup is the group of superusers. This group is used to ensure that the Hadoop client has superuser access. It can be configured using the dfs.permissions.superusergroup property in the hdfs-site.xml file.
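For example, a minimal hdfs-site.xml snippet (the group name hadoopadmins here is only an illustration; the default value is supergroup):
<property>
<name>dfs.permissions.superusergroup</name>
<value>hadoopadmins</value>
</property>
Any user whose group membership, as resolved on the NameNode, includes that group is then treated as an HDFS superuser.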
References
Hadoop superuser and supergroup

Hadoop Hive: How to allow regular user continuously write data and create tables in warehouse directory?

I am running Hadoop 2.2.0.2.0.6.0-101 on a single node.
I am trying to run a Java MRD program from Eclipse, as a regular user, that writes data to an existing Hive table. I get an exception:
org.apache.hadoop.security.AccessControlException: Permission denied: user=dev, access=WRITE, inode="/apps/hive/warehouse/testids":hdfs:hdfs:drwxr-xr-x
This happens because the regular user has no write permission to the warehouse directory; only the hdfs user does:
drwxr-xr-x - hdfs hdfs 0 2014-03-06 16:08 /apps/hive/warehouse/testids
drwxr-xr-x - hdfs hdfs 0 2014-03-05 12:07 /apps/hive/warehouse/test
To circumvent this I changed the permissions on the warehouse directory so that everybody now has write permission:
[hdfs@localhost wks]$ hadoop fs -chmod -R a+w /apps/hive/warehouse
[hdfs@localhost wks]$ hadoop fs -ls /apps/hive/warehouse
drwxrwxrwx - hdfs hdfs 0 2014-03-06 16:08 /apps/hive/warehouse/testids
drwxrwxrwx - hdfs hdfs 0 2014-03-05 12:07 /apps/hive/warehouse/test
This helps to some extent: the MRD program can now write to the warehouse directory as a regular user, but only once. When trying to write data into the same table a second time I get:
ERROR security.UserGroupInformation: PriviledgedActionException as:dev (auth:SIMPLE) cause:org.apache.hcatalog.common.HCatException : 2003 : Non-partitioned table already contains data : default.testids
Now, if I delete the output table and create it anew in the Hive shell, I again get the default permissions that do not allow a regular user to write data into this table:
[hdfs@localhost wks]$ hadoop fs -ls /apps/hive/warehouse
drwxr-xr-x - hdfs hdfs 0 2014-03-11 12:19 /apps/hive/warehouse/testids
drwxrwxrwx - hdfs hdfs 0 2014-03-05 12:07 /apps/hive/warehouse/test
Please advise on the correct Hive configuration steps that will allow a program, run as a regular user, to do the following operations in the Hive warehouse:
Programmatically create / delete / rename Hive tables?
Programmatically read / write data from Hive tables?
Many thanks!
If you maintain the table from outside Hive, then declare the table as external:
An EXTERNAL table points to any HDFS location for its storage, rather than being stored in a folder specified by the configuration property hive.metastore.warehouse.dir.
A Hive administrator can create the table, point it at an HDFS location owned by your own user, and grant Hive permission to read from there.
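As a rough sketch (the table name, column, and path below are made up for illustration), the administrator could create something like:
hive -e "CREATE EXTERNAL TABLE testids_ext (id BIGINT) LOCATION '/user/dev/testids_ext'"
Dropping an external table later removes only the metadata; the files under the LOCATION stay in place and remain owned by your user.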
As a general comment, there are no ways for an unprivileged user to perform an unauthorized privileged action. Any such way is technically an exploit and you should never rely on it: even if it is possible today, it will likely be closed soon. Hive authorization (and HCatalog authorization) is orthogonal to HDFS authorization.
Your application is also incorrect, irrespective of the authorization issues. You are trying to write 'twice' to the same table, which means your application does not handle partitions correctly. Start from An Introduction to Hive's Partitioning.
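As a sketch of the partitioned approach (the schema and partition column are illustrative), each run would then write into its own partition instead of re-writing the whole table:
hive -e "CREATE TABLE testids_part (id BIGINT) PARTITIONED BY (load_date STRING)"
Your HCatalog output would then specify a new load_date value per run, so the "non-partitioned table already contains data" error no longer applies.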
You can configure hdfs-site.xml with, for example:
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
This configuration will disable permission checking on HDFS, so a regular user can perform operations on HDFS.
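Note that dfs.permissions is read by the NameNode at startup, so it needs a restart to take effect; on a plain Apache-style Hadoop 2.x install (paths depend on your layout) that would look roughly like:
$HADOOP_HOME/sbin/hadoop-daemon.sh stop namenode
$HADOOP_HOME/sbin/hadoop-daemon.sh start namenode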
I hope this helps.

Not able to access HDFS from the DataNodes in the cluster

I have installed Cloudera CDH4 on a 3-node cluster. I am facing a problem when trying to access data in HDFS through the slave nodes (DataNodes).
When I try to create a new folder in HDFS with
hadoop fs -mkdir Flume (the folder name)
I am not able to put data into, or create a folder in, the cluster's HDFS from either of the slaves, but it works from the master node. Flume, Hive, Pig and all the other processes are running on the slaves (DataNodes).
Tried
restarting the cluster
namenode format
Still not working!!
Secondly, when I run
hadoop fs -ls /
the results are not from HDFS but from the local current directory of the slave node where I am running this command.
Also, how can I check whether HDFS is installed and working properly on the slave nodes (DataNodes) in the cluster, apart from creating a directory in HDFS?
Could anybody help?
Please verify the property fs.default.name in core-site.xml (on the slave nodes as well):
<property>
<name>fs.default.name</name>
<value>hdfs://namenode:9000</value>
</property>
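A quick way to check from a slave is to compare an explicit-URI listing against the default one; if the property is being picked up, both commands should return the same HDFS listing (host and port here follow the snippet above, adjust to your cluster):
hadoop fs -ls hdfs://namenode:9000/
hadoop fs -ls /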

hadoop hdfs points to file:/// not hdfs://

So I installed Hadoop via Cloudera Manager (CDH3u5) on CentOS 5. When I run the command
hadoop fs -ls /
I expected to see the contents of hdfs://localhost.localdomain:8020/
However, it returned the contents of file:///
Now, it goes without saying that I can access my hdfs:// through
hadoop fs -ls hdfs://localhost.localdomain:8020/
But when it came to installing other applications such as Accumulo, Accumulo would automatically detect the Hadoop filesystem as file:///
The question is, has anyone run into this issue, and how did you resolve it?
I had a look at "HDFS thrift server returns content of local FS, not HDFS", which was a similar issue, but it did not solve this one.
Also, I do not get this issue with Cloudera Manager cdh4.
By default, Hadoop is going to use local mode. You probably need to set fs.default.name to hdfs://localhost.localdomain:8020/ in $HADOOP_HOME/conf/core-site.xml.
To do this, add the following to core-site.xml:
<property>
<name>fs.default.name</name>
<value>hdfs://localhost.localdomain:8020/</value>
</property>
The reason Accumulo is confused is that it's using the same default configuration to figure out where HDFS is... and it's defaulting to file:///
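After adding the property (and restarting/redeploying the client configuration), a quick sanity check is that the plain and fully qualified listings agree:
hadoop fs -ls /
hadoop fs -ls hdfs://localhost.localdomain:8020/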
We should specify the DataNode data directory and the NameNode metadata directory, and make sure fs.default.name points at the NameNode:
dfs.name.dir / dfs.namenode.name.dir (in hdfs-site.xml)
dfs.data.dir / dfs.datanode.data.dir (in hdfs-site.xml)
fs.default.name (in core-site.xml)
Set these (a sketch follows below) and format the name node.
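A minimal sketch of those properties (the local paths are placeholders; adjust them and the host/port to your layout):
In hdfs-site.xml:
<property>
<name>dfs.namenode.name.dir</name>
<value>/data/hadoop/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/data/hadoop/datanode</value>
</property>
In core-site.xml:
<property>
<name>fs.default.name</name>
<value>hdfs://localhost.localdomain:8020</value>
</property>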
To format HDFS Name Node:
hadoop namenode -format
Confirm when prompted to format the name node. Then restart the HDFS service and deploy the client configuration to access HDFS.
If you have already done the above steps, ensure the client configuration is deployed correctly and that it points to the actual cluster endpoints.

Permission denied at hdfs

I am new to the Hadoop distributed file system. I have done a complete single-node installation of Hadoop on my machine, but when I try to upload data to HDFS it gives a Permission Denied error message.
Terminal output with the command:
hduser@ubuntu:/usr/local/hadoop$ hadoop fs -put /usr/local/input-data/ /input
put: /usr/local/input-data (Permission denied)
hduser@ubuntu:/usr/local/hadoop$
After using sudo and adding hduser to sudoers:
hduser@ubuntu:/usr/local/hadoop$ sudo bin/hadoop fs -put /usr/local/input-data/ /inwe
put: org.apache.hadoop.security.AccessControlException: Permission denied: user=root, access=WRITE, inode="":hduser:supergroup:rwxr-xr-x
hduser@ubuntu:/usr/local/hadoop$
I solved this problem temporarily by disabling the DFS permission check, by adding the property below
to conf/hdfs-site.xml:
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
I had a similar situation and here is my approach, which is somewhat different:
HADOOP_USER_NAME=hdfs hdfs dfs -put /root/MyHadoop/file1.txt /
What you actually do is read the local file according to your local permissions, but when placing the file on HDFS you are authenticated as the user hdfs. You can do this with another ID (beware of real authentication scheme configuration, but this is usually not the case).
Advantages:
Permissions are kept on HDFS.
You don't need sudo.
You don't actually need a local user 'hdfs' at all.
You don't need to copy anything or change permissions because of previous points.
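The same idea works for a whole shell session by exporting the variable instead of prefixing every command (the file path is just the example from above):
export HADOOP_USER_NAME=hdfs
hdfs dfs -put /root/MyHadoop/file1.txt /
hdfs dfs -ls /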
You are experiencing two separate problems here:
hduser@ubuntu:/usr/local/hadoop$ hadoop fs -put /usr/local/input-data/ /input put: /usr/local/input-data (Permission denied)
Here, the user hduser does not have access to the local directory /usr/local/input-data. That is, your local permissions are too restrictive; you should change them.
hduser@ubuntu:/usr/local/hadoop$ sudo bin/hadoop fs -put /usr/local/input-data/ /inwe put: org.apache.hadoop.security.AccessControlException: Permission denied: user=root, access=WRITE, inode="":hduser:supergroup:rwxr-xr-x
Here, the user root (since you are using sudo) does not have access to the HDFS directory /input. As you can see: hduser:supergroup:rwxr-xr-x says only hduser has write access. Hadoop doesn't really respect root as a special user.
To fix this, I suggest you change the permissions on the local data:
sudo chmod -R og+rx /usr/local/input-data/
Then, try the put command again as hduser.
I've solved this problem by using the following steps:
su hdfs
hadoop fs -put /usr/local/input-data/ /input
exit
Start a shell as hduser (from root) and run your command
sudo -u hduser bash
hadoop fs -put /usr/local/input-data/ /input
[update]
Also note that the hdfs user is the super user and has all r/w privileges.
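Since hdfs is the superuser, another option is to have it hand ownership of the target directory to hduser once, after which the plain put works without sudo (a rough sketch; the directory name is taken from the question):
sudo -u hdfs hadoop fs -mkdir /input
sudo -u hdfs hadoop fs -chown hduser:supergroup /input
hadoop fs -put /usr/local/input-data/ /input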
For Hadoop 3.x, if you try to create a file on HDFS when unauthenticated (e.g. user=dr.who), you will get this error.
This is not recommended for systems that need to be secure; however, if you'd like to disable file permissions entirely in Hadoop 3, the hdfs-site.xml setting has changed to:
<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
</property>
https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
