No such file or directory: hdfs - hadoop

I deployed Kubernetes on a single node using minikube and then installed Hadoop and HDFS with Helm. It's working well.
The problem is that when I try to copy a file from local to HDFS with $ hadoop fs -copyFromLocal /data/titles.csv /data I get: No such file or directory
This is the path on the local filesystem:

You've shown a screenshot of your host filesystem's GUI details panel.
Unless you mount the /data folder inside the Kubernetes pod, there is no /data folder to put from.
In other words, you would get a similar error from a plain ls /data inside the pod, and this isn't an HDFS problem, since "local" means different things in different contexts.
You have at least three different "local" filesystems: your host, the namenode pod, the datanode pod, and possibly also the minikube node itself (if you're using a VM driver).
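One way around it is to copy the file into a pod first and run the upload from there. A sketch (the pod name hdfs-namenode-0 is an assumption; check yours with kubectl get pods):

```shell
# Copy from the host into the pod's local filesystem, then into HDFS.
# "hdfs-namenode-0" is a hypothetical pod name -- substitute your own.
kubectl cp /data/titles.csv hdfs-namenode-0:/tmp/titles.csv
kubectl exec hdfs-namenode-0 -- hadoop fs -mkdir -p /data
kubectl exec hdfs-namenode-0 -- hadoop fs -copyFromLocal /tmp/titles.csv /data
```

Alternatively, minikube mount /data:/data makes the host folder visible inside the minikube node, from which a pod can consume it as a hostPath volume.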

Related

How to change the container log location in a Dataproc cluster?

What is the correct way to change the container log location in a Dataproc cluster during cluster creation?
The default path is /var/log/hadoop-yarn/userlogs and I want to change it to a local SSD mount such as /mnt/1/hadoop/yarn/userlogs. I tried adding
--properties=yarn:yarn.nodemanager.log-dirs
to the gcloud dataproc clusters create command, but got this error:
bash: --properties=yarn:yarn.nodemanager.log-dirs=/mnt/1/hadoop/yarn: No such file or directory
This is most likely because the local SSD gets mounted after the cluster is created. Can someone please help?
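For what it's worth, the bash: ...: No such file or directory prefix means the shell tried to execute the flag as a standalone command, which usually points to a broken line continuation rather than a mount-timing issue. A sketch of the intended invocation (the cluster name is hypothetical):

```shell
# Quote the value and keep the backslash as the last character on its line.
gcloud dataproc clusters create my-cluster \
    --properties='yarn:yarn.nodemanager.log-dirs=/mnt/1/hadoop/yarn/userlogs'
```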

0 datanodes when copying file from local to hadoop

My OS is Windows 10, with Ubuntu 20.04.3 LTS (GNU/Linux 4.4.0-19041-Microsoft x86_64) installed on top of it via WSL.
When I copy a local file to Hadoop, I receive an error: 0 datanodes available.
I am able to copy a file from Hadoop to a local folder, and I can see the file in the local directory using $ ls -l.
I am also able to create directories and files in Hadoop, but if I restart the Ubuntu terminal, those directories and files no longer exist; it shows empty.
The steps I followed:
1. start-all.sh
2. jps (datanodes missing)
3. Copy the local file to Hadoop: ERROR, 0 datanodes available
4. Copy files from Hadoop to the local directory: successful
If you stop or restart the WSL2 terminal without running stop-dfs.sh or stop-all.sh first, you run the risk of corrupting the namenode. If that happens, it needs to be reformatted with hdfs namenode -format; do not just rm the namenode directory.
After formatting, you can restart the datanodes and they should become healthy again.
The same logic applies in a production environment, which is why you should always have a standby namenode for failover.
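The shutdown/recovery cycle above can be sketched as follows (assuming the standard Hadoop sbin scripts are on the PATH):

```shell
stop-dfs.sh              # always stop HDFS before closing the WSL2 session
hdfs namenode -format    # only if the namenode metadata is already corrupted
start-dfs.sh
jps                      # NameNode, DataNode and SecondaryNameNode should all be listed
```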

Mounting a part of Hdfs using HDFS Nfs Gateway

1. I am trying to mount HDFS to a local directory using the NFS Gateway. I was successful in mounting the root "/". What I want to achieve is mounting only a part of HDFS (say /user) on the local file system. Is there any way this can be done?
2. I also want to know how to mount HDFS on a separate machine that is not part of the cluster. What I ideally want is a machine acting as an NFS drive, on which I have mounted a part of HDFS, with all clients uploading files to that remote directory.
I am using the Hortonworks VM 2.2. Any suggestion would be very helpful.
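For the second point, a typical NFSv3 mount from a machine outside the cluster looks like the sketch below (gateway-host and /mnt/hdfs are assumptions; the client must also be permitted by the gateway's export policy):

```shell
sudo mkdir -p /mnt/hdfs
# NFSv3 over TCP without locking, as the HDFS NFS Gateway expects
sudo mount -t nfs -o vers=3,proto=tcp,nolock gateway-host:/ /mnt/hdfs
ls /mnt/hdfs/user        # work under /user even though "/" is the export
```

For the first point, newer Hadoop releases let you export a subtree instead of "/" via the nfs.export.point property in hdfs-site.xml; whether the Hortonworks 2.2 VM supports it depends on the bundled Hadoop version.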

Hadoop removing a mount point folder from Cloudera

I've searched and read up on Cloudera Hadoop mount point file systems, but I cannot find anything on removing them.
I have two SSD drives in each of 6 machines, and when I initially installed Cloudera Hadoop it added all file systems, while I only need two mount points to run a few teragen and terasort jobs.
I need to remove everything except for /dev/nvme0n1 and /dev/nvme1n1.
In Cloudera Manager you can modify the list of drives used for HDFS data at:
Clusters > HDFS > Configuration > DataNode Default Group (or whatever you may have renamed this to) > DataNode Data Directory
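Under the hood that Cloudera Manager field maps to the dfs.datanode.data.dir property in hdfs-site.xml; a sketch with assumed paths (the actual values depend on where the two NVMe drives are mounted):

```xml
<!-- Hypothetical paths: keep only the directories on the two NVMe mounts -->
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/mnt/nvme0n1/dfs/dn,/mnt/nvme1n1/dfs/dn</value>
</property>
```

After removing directories from the list, restart the DataNode role for the change to take effect.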

Hadoop DFSClient installation

I run a Hadoop cluster and I'm interested in installing one more machine with the DFSClient only.
This machine (let's call it machine X) will not be part of the cluster.
Machine X will run the DFSClient, and I should be able to see HDFS from it.
In order to install the DFSClient, I copied the Hadoop home directory from one of the cluster's nodes to machine X (including the .jar files and configuration).
Then I run:
hadoop fs -ls /
I get the local root directory (not the HDFS root).
What am I doing wrong?
Copy core-site.xml and hdfs-site.xml from the cluster and place them in a folder under your local Linux account's home directory. Then ensure that fs.defaultFS (formerly fs.default.name) points to the remote namenode. Then try hadoop --config <your_config_folder> fs -ls /, where your_config_folder is where you placed the files.
Technically it should work if the following are done:
1. You have copied the configuration files (*.xml) from the Hadoop cluster.
2. HADOOP_HOME is set to the copied Hadoop path.
3. Machine X has network access to the cluster.
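The steps above boil down to a minimal client-side core-site.xml such as the sketch below (the namenode host and port are assumptions; use the values from your cluster's own core-site.xml):

```xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.example.com:8020</value>
  </property>
</configuration>
```

With that file in, say, ~/hadoop-conf, hadoop --config ~/hadoop-conf fs -ls / should list the HDFS root instead of the local one.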
