Mounting a part of HDFS using the HDFS NFS Gateway - hadoop

1. I am trying to mount HDFS to a local directory using the NFS Gateway. I was successful in mounting the root "/". What I want to achieve is mounting only a part of HDFS (say /user) on the local file system. Is there any way this can be done?
2. I also want to know how to mount HDFS on a separate machine that is not part of the cluster. What I ideally want is a machine acting as an NFS drive, on which I have mounted a part of HDFS, with all clients uploading files to that remote directory.
I am using the Hortonworks VM 2.2. Any suggestion would be very helpful.
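For reference, once the NFS Gateway is running, mounting from a machine outside the cluster is an ordinary NFSv3 mount. A rough sketch follows; the gateway hostname and mount point are placeholders, and whether a sub-path such as /user can be exported directly depends on the gateway version and configuration (e.g. the nfs.export.point setting):
# on the client machine (it does not need to be part of the cluster)
sudo mkdir -p /mnt/hdfs
sudo mount -t nfs -o vers=3,proto=tcp,nolock,sync <nfs-gateway-host>:/ /mnt/hdfs
# files copied into /mnt/hdfs/user then land under /user in HDFS
cp myfile.csv /mnt/hdfs/user/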

Related

No Such file or directory : hdfs

I deployed Kubernetes on a single node using minikube and then installed Hadoop and HDFS with Helm. It's working well.
The problem is that when I try to copy a file from local to HDFS with $ hadoop fs -copyFromLocal /data/titles.csv /data, I get this: No such file or directory
This is the path on local:
You've shown a screenshot of your host's filesystem GUI details panel.
Unless you mount the /data folder inside the k8s pod, there will be no /data folder to put from.
In other words, you should get a similar error with just ls /data, and this isn't an HDFS problem, since "local" means different things in different contexts.
You have at least three different "local" filesystems: your host, the namenode pod, the datanode pod, and possibly also the minikube driver (if using a VM).
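For example, one way to make the file visible to a pod (a rough sketch; the pod name hdfs-client and the paths are assumptions, not taken from the question) is to copy it into the pod first and run the HDFS command from there:
# copy the file from the host into the pod's own filesystem
kubectl cp /data/titles.csv hdfs-client:/tmp/titles.csv
# now the file exists "locally" from the pod's point of view, so the upload works
kubectl exec hdfs-client -- hadoop fs -copyFromLocal /tmp/titles.csv /data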

Copy a file from WSL to HDFS running on Docker

I'm trying to copy a file from my local drive to HDFS.
I'm running Hadoop in Docker. I'm trying to do some MapReduce exercises, so I want to copy a data file from a local drive (let's say my D: drive) to HDFS.
I tried the command below, but it fails with ssh: connect to host localhost port 22: Connection refused:
scp -P 50070 /mnt/d/project/recreate.out root@localhost:/root
Since I'm new to Hadoop and big data, my explanation may be terrible. Please bear with me.
I'm trying to do the above from Windows Subsystem for Linux (WSL).
Regards,
crf
SCP won't move data into Hadoop, and port 50070 (the NameNode web UI) is not accepting connections over that protocol (SSH).
You need to set up and use a command similar to hdfs dfs -copyFromLocal. You can install the HDFS CLI on the Windows host command prompt, too, so you don't need WSL to upload files...
When using Docker, I would suggest doing this (the steps are sketched as commands after this list):
Add a volume mount from your host to some Hadoop container, outside of the datanode and namenode directories (in other words, don't overwrite the data that is there, and mounting files here will not "upload to HDFS")
docker exec into this running container
Run the above hdfs command, uploading from the mounted volume
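A rough sketch of those steps, assuming the container is called namenode, the image is my-hadoop-image, and /mnt/d/project on the WSL side is the staging directory (all of these names are examples, not taken from the question):
# start the container with a host directory mounted somewhere harmless,
# i.e. not over the namenode/datanode data directories
docker run -d --name namenode -v /mnt/d/project:/staging my-hadoop-image
# open a shell inside the running container
docker exec -it namenode bash
# from inside the container, the staged file is "local", so upload it to HDFS
hdfs dfs -copyFromLocal /staging/recreate.out /root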

Copying a directory from a remote HDFS file system to my local machine

I have a directory in an HDFS environment that I want to copy to my local computer. I am accessing HDFS using ssh (with a password).
I tried many of the suggested copy commands, but they did not work.
What I tried:
scp 'username@hn0-sc-had:Downloads/*' ~/Downloads
as mentioned in this link.
What am I doing wrong?
SCP will copy from the remote Linux server.
HDFS does not live on a single server, nor is it a "local filesystem", so SCP is not the right tool to copy from it directly.
Your options include (the first is sketched as commands after this list):
SSH to remote server
Use hdfs dfs -copyToLocal in order to pull files from HDFS
Use SCP from your computer to get the files you just downloaded on the remote server
Or
Configure a local Hadoop CLI using XML files from remote server
Use hdfs dfs -copyToLocal directly against HDFS from your own computer
Or
Install HDFS NFS Gateway
Mount an NFS volume on your local computer, and copy files from it
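The first option, spelled out as commands (the hostname and username are taken from the question; the HDFS path is a placeholder, so adjust as needed):
# 1. SSH to the remote server
ssh username@hn0-sc-had
# 2. on that server, pull the directory out of HDFS onto its local disk
hdfs dfs -copyToLocal /hdfs/path/to/Downloads ~/Downloads
# 3. back on your own computer, copy the files down with scp
scp -r 'username@hn0-sc-had:Downloads/*' ~/Downloads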

Is it possible to create a HDFS folder in memory?

Hadoop now supports mounting a ramdisk (tmpfs). Is it possible to create an HDFS folder purely in tmpfs, so that all files in "/usr/memory" are located in the ramdisk?
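For reference, HDFS's memory-storage support works through the RAM_DISK storage type and the LAZY_PERSIST storage policy rather than through a directly tmpfs-backed folder; a rough sketch of the setup (the paths are examples):
# in hdfs-site.xml, each datanode lists a tmpfs mount tagged [RAM_DISK], e.g.
#   dfs.datanode.data.dir = [RAM_DISK]/mnt/dn-ramdisk,[DISK]/hadoop/hdfs/data
# then mark the HDFS directory so new files are written to memory first
hdfs dfs -mkdir -p /usr/memory
hdfs storagepolicies -setStoragePolicy -path /usr/memory -policy LAZY_PERSIST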

How to copy files from a remote server to hdfs location

I want to copy files from a remote server, using sftp, to an HDFS location directly, without copying the files to local. The HDFS location is on a secured cluster. Please suggest whether this is feasible and how to proceed in that case.
I would also like to know if there is any other way to connect and copy, apart from sftp.
I think the most convenient way (given that your remote machine is able to connect to the hadoop cluster) is to make that remote machine act as an HDFS client. Just ssh to that machine, install the hadoop distribution, configure it properly, then run:
hadoop fs -put /local/path /hdfs/path
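Since the question mentions a secured cluster, the client will typically also need a Kerberos ticket before the put; a hedged sketch (the keytab path and principal are examples):
# obtain a Kerberos ticket for the secured cluster first
kinit -kt /etc/security/keytabs/myuser.keytab myuser@EXAMPLE.COM
# then push the file straight from that machine into HDFS
hadoop fs -put /local/path /hdfs/path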
