unable to load hadoop fs - hadoop

I have installed Hadoop on Ubuntu 4.4.3. I followed all the steps written in here. When I ran the hadoop fs -ls command, I got the following output.
hduser@ubuntu:/usr/local/hadoop/sbin$ hadoop fs -ls /
Found 26 items
drwx------ - root root 16384 2010-04-04 05:08 /lost+found
drwxr-xr-x - root root 4096 2012-08-25 09:12 /bin
drwxr-xr-x - root root 4096 2009-10-28 13:55 /srv
-rw-r--r-- 1 root root 7986235 2012-08-25 09:29 /initrd.img
dr-xr-xr-x - root root 0 2013-09-01 15:57 /proc
drwx------ - root root 4096 2013-09-01 11:04 /root
drwxrwxrwx - root root 4096 2012-08-26 05:12 /opt
drwxr-xr-x - root root 4096 2010-04-04 05:29 /mnt
drwxr-xr-x - root root 4096 2009-10-28 13:55 /usr
drwxr-xr-x - root root 4096 2010-04-04 05:09 /cdrom
drwxr-xr-x - root root 0 2013-09-01 15:57 /sys
drwxr-xr-x - hduser hadoop 4096 2013-08-25 03:47 /app
drwxr-xr-x - root root 4096 2010-11-24 10:50 /var
-rw-r--r-- 1 root root 4050496 2012-07-25 09:53 /vmlinuz
-rw-r--r-- 1 root root 3890400 2009-10-16 11:03 /vmlinuz.old
drwxr-xr-x - root root 4096 2010-11-27 08:37 /.cache
drwxr-xr-x - root root 4096 2013-09-01 22:26 /media
-rw-r--r-- 1 root root 7233695 2012-08-25 08:53 /initrd.img.old
drwxr-xr-x - root root 12288 2013-09-01 22:46 /etc
drwxr-xr-x - root root 4096 2013-08-25 03:30 /home
drwxr-xr-x - root root 3980 2013-09-01 15:57 /dev
drwxr-xr-x - root root 12288 2012-08-25 22:07 /lib
drwxrwxrwt - root root 4096 2013-09-01 23:53 /tmp
drwxr-xr-x - root root 4096 2012-08-25 09:29 /boot
drwxr-xr-x - root root 4096 2009-10-19 16:05 /selinux
drwxr-xr-x - root root 4096 2012-08-25 09:04 /sbin
When I run the same command in our office lab, I don't get this output.
Can anyone tell me where I am going wrong?

Try bin/hadoop fs -ls /. The scripts should be in the bin folder. Have you followed the link you posted properly? I can't find sbin anywhere in it. Please point me to it if I am wrong.

The reason is that you did not configure the "fs.default.name" property (fs.defaultFS in Hadoop 2 and later) correctly in your core-site.xml. When it is missing or misconfigured, the filesystem falls back to the default, which is your local file system. So the command is correctly listing the root of your local file system.
Please check your core-site.xml carefully. You also need to start the DFS before using HDFS.
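A quick way to see which filesystem the client actually resolves is sketched below. It assumes a Hadoop 2.x or later install with the hdfs command on the PATH; describe_fs is a hypothetical helper for illustration, not part of Hadoop.

```shell
#!/bin/sh
# describe_fs URI: hypothetical helper that classifies a default-filesystem URI.
describe_fs() {
  case "$1" in
    hdfs://*) echo "HDFS" ;;
    file:*|"") echo "local filesystem" ;;
    *) echo "other" ;;
  esac
}

# Only query Hadoop when the hdfs command is actually installed.
if command -v hdfs >/dev/null 2>&1; then
  uri=$(hdfs getconf -confKey fs.defaultFS)
  echo "fs.defaultFS = $uri ($(describe_fs "$uri"))"
fi
```

If this reports the local filesystem, hadoop fs -ls / will keep listing the machine's own root directory, as in the question.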


Docker: Got "permission denied" error at volume mounting directory

I wrote a docker-compose.yml like this:
version: "3"
services:
  notebook:
    image: jupyter/datascience-notebook
    ports:
      - "8888:8888"
    volumes:
      - jupyterlabPermanent:/hahaha
    environment:
      JUPYTER_ENABLE_LAB: "yes"
      TZ: "Asia/Tokyo"
    command:
      start-notebook.sh --NotebookApp.token=''
volumes:
  jupyterlabPermanent:
To be clear about the pieces involved:
/hahaha: the container-side directory, located at the container's root directory.
jupyterlabPermanent: the volume mounted at /hahaha in the container.
dockerjulia_jupyterlabPermanent\_data: the host-side directory backing the jupyterlabPermanent volume, which holds the data stored in /hahaha. Its full path is \\wsl$\docker-desktop-data\version-pack-data\community\docker\volumes\dockerjulia_jupyterlabPermanent\_data.
When I run touch in bash inside the /hahaha directory, I get "Permission denied":
# bash command line at /hahaha
(base) jovyan@4bcdaa228d9e:/hahaha$ touch test.txt
touch: cannot touch 'test.txt': Permission denied
Because of this, nothing done in the container can be stored in /hahaha or the jupyterlabPermanent volume, so saving data does not work in this environment.
How can I solve this?
I searched a bit and found that I need to change the permission settings, but I don't understand how.
I am using Docker Desktop for Windows with WSL 2 on Windows 10 Home.
You need root access on the volume to change the permissions. So let's run a plain Ubuntu container and mount the volume:
docker run -it --rm -v jupyterlabPermanent:/hahaha ubuntu
Now we can change the group ownership to GID 100, which is the group the jovyan user is a member of, and change the permissions to 775 so that group members can write to it:
chown :100 /hahaha
chmod 775 /hahaha
Now you can exit the Ubuntu container and run the jupyter container and you should be able to write to the volume.
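The same fix can also be run as one non-interactive command. The sketch below wraps it in a hypothetical helper, fix_volume_perms, and the volume name you pass is an assumption: with docker-compose the real name usually carries the project prefix (the question shows dockerjulia_jupyterlabPermanent), which docker volume ls will confirm. The second half only illustrates mode 775 on a scratch directory and does not touch Docker.

```shell
#!/bin/sh
# fix_volume_perms VOLUME: hypothetical wrapper around the steps above.
# Mounts the volume in a throwaway Ubuntu container (which runs as root),
# hands it to GID 100 and makes it group-writable.
fix_volume_perms() {
  docker run --rm -v "$1":/target ubuntu \
    sh -c 'chown :100 /target && chmod 775 /target'
}

# Local illustration of what mode 775 looks like, on a scratch directory.
scratch=$(mktemp -d)
chmod 775 "$scratch"
mode=$(stat -c '%a' "$scratch")   # GNU stat; prints the octal mode
echo "$scratch has mode $mode"
rmdir "$scratch"
```

Usage would be along the lines of fix_volume_perms dockerjulia_jupyterlabPermanent, run from the WSL or host shell while the notebook container is stopped.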
Thank you for answering my question. The main problem was that I didn't know about the concepts of "owner" and "permission" in Linux. After an hour of research and learning, I figured out what the problem was.
My solution 1
My first solution is to run the following command on the host console:
docker exec -it -u 0 CONTAINER_NAME /bin/bash
Adding the -u option with 0, the user ID of root, lets you enter the container as root.
When I ran ll in the container's top-level directory, everything there turned out to be owned by root, including hahaha. (This means the hahaha mount directory for the volume was created at the top level, owned by root.)
(base) jovyan@4bcdaa228d9e:/$ ll
total 64
drwxr-xr-x 1 root root 4096 Jan 26 00:19 ./
drwxr-xr-x 1 root root 4096 Jan 26 00:19 ../
lrwxrwxrwx 1 root root 7 Jan 6 01:47 bin -> usr/bin/
drwxr-xr-x 2 root root 4096 Apr 15 2020 boot/
drwxr-xr-x 5 root root 340 Jan 26 00:19 dev/
-rwxr-xr-x 1 root root 0 Jan 26 00:19 .dockerenv*
drwxr-xr-x 1 root root 4096 Jan 26 00:19 etc/
drwxr-xr-x 2 root root 4096 Jan 27 22:18 hahaha/
drwxr-xr-x 1 root root 4096 Jan 24 20:30 home/
lrwxrwxrwx 1 root root 7 Jan 6 01:47 lib -> usr/lib/
lrwxrwxrwx 1 root root 9 Jan 6 01:47 lib32 -> usr/lib32/
lrwxrwxrwx 1 root root 9 Jan 6 01:47 lib64 -> usr/lib64/
lrwxrwxrwx 1 root root 10 Jan 6 01:47 libx32 -> usr/libx32/
drwxr-xr-x 2 root root 4096 Jan 6 01:47 media/
drwxr-xr-x 2 root root 4096 Jan 6 01:47 mnt/
drwxr-xr-x 1 root root 4096 Jan 25 02:49 opt/
dr-xr-xr-x 217 root root 0 Jan 26 00:19 proc/
drwx------ 2 root root 4096 Jan 6 01:50 root/
drwxr-xr-x 5 root root 4096 Jan 6 01:50 run/
lrwxrwxrwx 1 root root 8 Jan 6 01:47 sbin -> usr/sbin/
drwxr-xr-x 2 root root 4096 Jan 6 01:47 srv/
dr-xr-xr-x 11 root root 0 Jan 26 00:19 sys/
drwxrwxrwt 2 root root 4096 Jan 6 01:50 tmp/
drwxr-xr-x 1 root root 4096 Jan 6 01:47 usr/
drwxr-xr-x 1 root root 4096 Jan 6 01:50 var/
Therefore jovyan had no permission to create files in /hahaha, which, like the rest of the top-level directory, is owned by root. Entering the container as root gets around this.
My solution 2
The second solution is to rewrite the docker-compose.yml as follows:
version: "3"
services:
  notebook:
    image: jupyter/datascience-notebook
    ports:
      - "8888:8888"
    volumes:
      #
      - jupyterlabPermanent:/home/jovyan/hahaha # before -> jupyterlabPermanent:/hahaha
    environment:
      JUPYTER_ENABLE_LAB: "yes"
      TZ: "Asia/Tokyo"
    command:
      start-notebook.sh --NotebookApp.token=''
volumes:
  jupyterlabPermanent:
With this change, docker-compose creates the container with the volume mounted at /home/jovyan/hahaha.
The files and folders under /home/jovyan are owned by jovyan (not by root), so jovyan can create files in /home/jovyan/hahaha freely. (There is no need to enter the container as root.)
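Both solutions come down to who owns the mount point and whether the current user may write to it. A minimal sketch for checking that from inside the container follows; report_access is a hypothetical helper and it relies on GNU stat, which the Ubuntu-based image should provide.

```shell
#!/bin/sh
# report_access DIR: print owner, group, octal mode, and whether the
# user running this script can write to DIR.
report_access() {
  owner=$(stat -c '%U:%G %a' "$1")   # GNU stat format string
  if [ -w "$1" ]; then
    echo "$1 $owner writable"
  else
    echo "$1 $owner read-only"
  fi
}
```

Running report_access /hahaha and report_access /home/jovyan as jovyan should make the difference between the two mount locations visible without having to attempt a touch.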

Retiring the once only volume, holding important looking files

/volume1 was once my only volume, and it has been joined by /volume2 in preparation for retiring /volume1.
Having relocated all my content, I can see lots of files I cannot explain. Unusually, they are all prefixed with #, e.g.
/volume1$ ls -als
total 430144
0 drwxr-xr-x 1 root root 344 May 2 16:19 .
4 drwxr-xr-x 24 root root 4096 May 2 16:18 ..
0 drwxr-xr-x 1 root root 156 Jun 29 15:57 #appstore
0 drwx------ 1 root root 0 Apr 11 04:03 #autoupdate
0 drwxr-xr-x 1 root root 14 May 2 16:19 #clamav
332 -rw------- 1 root root 339245 Jan 23 13:50 #cnid_dbd.core.gz
0 drwxr-xr-x 1 admin users 76 Aug 19 2020 #database
0 drwx--x--x 1 root root 174 Jun 29 15:57 #docker
0 drwxrwxrwx+ 1 root root 24 Jan 23 15:27 #eaDir
420400 -rw------- 1 root root 430485906 Jan 4 05:06 #G1.core.gz
0 drwxrwxrwx 1 root root 12 Jan 21 13:47 #img_bkp_cache
0 drwxr-xr-x 1 root root 14 Dec 29 18:45 #maillog
0 drwxr-xr-x 1 root root 60 Dec 29 18:39 #MailScanner
0 drwxrwxr-x 1 root root 106 Oct 7 2018 #optware
7336 -rw------- 1 root root 7510134 Jan 24 01:33 #Plex.core.gz
0 drwxr-xr-x 1 postfix root 166 Oct 12 2020 #postfix
2072 -rw------- 1 root root 2118881 Jan 17 03:47 #rsync.core.gz
0 drwxr-xr-x 1 root root 88 May 2 16:19 #S2S
0 drwxr-xr-x 1 root root 0 Jan 23 13:50 #sharesnap
0 drwxrwxrwt 1 root root 48 Jun 29 15:57 #tmp
I have two questions:
1. What does the # prefix signify?
2. How can I move/remove them, given that something's going to miss these files?
From experimentation it seems the answers are:
1. Nothing. It appears to be a convention used by the Synology packaging system.
2. With one exception, I didn't need to consider the consequences of removing the file system on which these stood. The #appstore directory clearly holds the installed Synology packages, and after pulling /volume1 they showed in the Package Center as "needing repair". Once they were repaired, the same # prefixed directories appeared in the new volume, and the configuration was retained, so it appears these directories hold only the immutable software components.
The exception: I use ipkg mostly for fetchmail. I took a listing of the installed packages as well as the fetchmailrc, and then reinstalled the same packages once "Easy Bootstrap Installer" was ready for use (repair didn't work on this, but uninstall and reinstall worked fine).
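Taking the package listing before pulling the volume could look roughly like this. It is a sketch under assumptions: package_names is a hypothetical filter, the output file name is arbitrary, and the "name - version - description" line layout of ipkg list_installed should be checked against your own output first.

```shell
#!/bin/sh
# package_names: hypothetical filter that keeps the package name from each
# "name - version - description" line and skips anything else
# (for example a trailing status line).
package_names() {
  awk '$2 == "-" { print $1 }'
}

# Before pulling the volume, and only if ipkg is present:
if command -v ipkg >/dev/null 2>&1; then
  ipkg list_installed | package_names > "$HOME/ipkg-packages.txt"
fi
# After the bootstrap is reinstalled:
#   xargs ipkg install < "$HOME/ipkg-packages.txt"
```

The fetchmailrc still has to be copied separately, since it is configuration rather than a package.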

certbot creating a new certificate every day

I have a script that auto-renews a Let's Encrypt certificate when a renewal becomes available. We run this script every day at 17:00:
#!/bin/sh
/usr/bin/certbot --cert-name sitename.com --text --agree-tos certonly -a webroot --keep-until-expiring --webroot-path /var/www/path/public -d sitename.com -d www.sitename.com
Recently I've seen that a new certificate is being generated every day, each in a new directory with a -00XX suffix.
There has been no change to this file since it was created (19th August).
So /etc/letsencrypt/archive looks like this:
drwxr-xr-x. 2 root root 4096 Nov 11 09:45 sitename.com
drwxr-xr-x. 2 root root 4096 Aug 19 16:39 sitename.com-0001
drwxr-xr-x. 2 root root 4096 Aug 19 16:43 sitename.com-0002
drwxr-xr-x. 2 root root 4096 Oct 16 17:00 sitename.com-0003
drwxr-xr-x. 2 root root 4096 Oct 17 17:00 sitename.com-0004
drwxr-xr-x. 2 root root 4096 Oct 18 17:00 sitename.com-0005
drwxr-xr-x. 2 root root 4096 Oct 19 17:00 sitename.com-0006
drwxr-xr-x. 2 root root 4096 Oct 20 17:00 sitename.com-0007
drwxr-xr-x. 2 root root 4096 Oct 23 17:00 sitename.com-0008
drwxr-xr-x. 2 root root 4096 Oct 24 17:00 sitename.com-0009
drwxr-xr-x. 2 root root 4096 Oct 25 17:01 sitename.com-0010
drwxr-xr-x. 2 root root 4096 Oct 26 17:00 sitename.com-0011
drwxr-xr-x. 2 root root 4096 Oct 27 17:00 sitename.com-0012
drwxr-xr-x. 2 root root 4096 Oct 30 17:00 sitename.com-0013
drwxr-xr-x. 2 root root 4096 Oct 31 17:00 sitename.com-0014
drwxr-xr-x. 2 root root 4096 Nov 1 17:00 sitename.com-0015
drwxr-xr-x. 2 root root 4096 Nov 2 17:00 sitename.com-0016
drwxr-xr-x. 2 root root 4096 Nov 3 17:00 sitename.com-0017
drwxr-xr-x. 2 root root 4096 Nov 6 17:00 sitename.com-0018
drwxr-xr-x. 2 root root 4096 Nov 7 17:00 sitename.com-0019
drwxr-xr-x. 2 root root 4096 Nov 8 17:00 sitename.com-0020
drwxr-xr-x. 2 root root 4096 Nov 9 17:01 sitename.com-0021
drwxr-xr-x. 2 root root 4096 Nov 10 17:00 sitename.com-0022
I believe that -0001 and -0002 were created because of a misconfiguration when the certificate was first generated.
But can anybody help explain why a new certificate and directory have been created each day since October 16th?
I managed to figure out the problem.
After running
/usr/bin/certbot certificates
I got this error:
Renewal configuration file /etc/letsencrypt/renewal/sitename.com.conf produced an unexpected error: renewal config file {} is missing a required file reference. Skipping.
And it appears that there is a conf file for each of the -00XX directories.
Upon looking at /etc/letsencrypt/renewal/sitename.com.conf I found that the file was empty.
So I took the latest -00XX conf file and removed the 00XX suffix from the text lines.
The conf file should appear like this:
# renew_before_expiry = 30 days
version = 1.0.0
archive_dir = /etc/letsencrypt/archive/sitename.com
cert = /etc/letsencrypt/live/sitename.com/cert.pem
privkey = /etc/letsencrypt/live/sitename.com/privkey.pem
chain = /etc/letsencrypt/live/sitename.com/chain.pem
fullchain = /etc/letsencrypt/live/sitename.com/fullchain.pem
# Options used in the renewal process
[renewalparams]
authenticator = webroot
account = XXXXXXXXXXXXXXXXXXXXXXX
webroot_path = /var/www/path/public,
server = https://acme-v02.api.letsencrypt.org/directory
But this will still create a new directory and certificate each day.
You will also need to add this at the end of your conf file, under [renewalparams], with one line for each domain associated with your certificate:
[[webroot_map]]
sitename.com = /var/www/path/public
www.sitename.com = /var/www/path/public
And that should prevent certificates from being generated every day.
I believe that the problem is a result of removing domains from a certificate and manually renewing.
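Since the root cause here was an empty renewal file, a small check for zero-byte files in the renewal directory may help catch the same situation early. This is only a sketch: empty_renewal_confs is a hypothetical helper, and the default path is the one quoted in the error message above.

```shell
#!/bin/sh
# empty_renewal_confs [DIR]: list zero-byte *.conf files directly in DIR.
# DIR defaults to the renewal directory named in the post.
empty_renewal_confs() {
  find "${1:-/etc/letsencrypt/renewal}" -maxdepth 1 -type f -name '*.conf' -empty
}
```

Called with no argument as root, it prints any renewal config that certbot certificates would also complain about; no output means no empty files were found.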

Developing inside docker on WSL2-Ubuntu from vscode

I am trying to run Docker inside WSL (I am running Ubuntu in WSL). I am also new to Docker. The doc says:
To get the best out of the file system performance when bind-mounting files:
Store source code and other data that is bind-mounted into Linux containers (i.e., with docker run -v <host-path>:<container-path>) in the Linux filesystem, rather than the Windows filesystem.
Linux containers only receive file change events (“inotify events”) if the original files are stored in the Linux filesystem.
Performance is much higher when files are bind-mounted from the Linux filesystem, rather than remoted from the Windows host. Therefore avoid docker run -v /mnt/c/users:/users (where /mnt/c is mounted from Windows).
Instead, from a Linux shell use a command like docker run -v ~/my-project:/sources <my-image> where ~ is expanded by the Linux shell to $HOME.
I also came across following:
Run sudo docker run -v "$HOME:/host" --name "[name_work]" -it docker.repo/[name]. With, [$HOME:/host], you can access your home directory in /host dir in docker image. This allows you to access your files on the local machine inside the docker. So you can edit your source code in your local machine using your favourite editor and run them directly inside the docker. Make sure that you have done this correct. Otherwise, you may need to copy files from the local machine to docker, for each edit (a painful job).
I do not understand the format of the parameter passed to the -v option or what it does. I think it allows access to Ubuntu directories inside Docker, so $HOME:/host would map Ubuntu's home directory to /host inside the container.
Q1. But what is /host?
Q2. Can I follow both of the quotes above together? That is, are they compatible? I guess yes: all the first one says is that I should not mount from a Windows directory like /mnt/<driveletter>/.... If I mount a Linux directory like $USER/..., it will give better performance, right?
I tried out running it to understand it:
~$ docker run -v "$HOME:/host" --name "mydokr" -it docker.repo.in/dokrimg
root@f814974a1cfb:/home# ls
root@f814974a1cfb:/home# ll
total 8
drwxr-xr-x 2 root root 4096 Apr 15 11:09 ./
drwxr-xr-x 1 root root 4096 Sep 22 07:16 ../
root@f814974a1cfb:/home# pwd
/home
root@f814974a1cfb:/home# cd ..
root@f814974a1cfb:/# ll
total 64
drwxr-xr-x 1 root root 4096 Sep 22 07:16 ./
drwxr-xr-x 1 root root 4096 Sep 22 07:16 ../
-rwxr-xr-x 1 root root 0 Sep 22 07:16 .dockerenv*
lrwxrwxrwx 1 root root 7 Jul 3 01:56 bin -> usr/bin/
drwxr-xr-x 2 root root 4096 Apr 15 11:09 boot/
drwxr-xr-x 5 root root 360 Sep 22 07:16 dev/
drwxr-xr-x 1 root root 4096 Sep 22 07:16 etc/
drwxr-xr-x 2 root root 4096 Apr 15 11:09 home/
drwxr-xr-x 5 1000 1001 4096 Sep 22 04:52 host/
lrwxrwxrwx 1 root root 7 Jul 3 01:56 lib -> usr/lib/
lrwxrwxrwx 1 root root 9 Jul 3 01:56 lib32 -> usr/lib32/
lrwxrwxrwx 1 root root 9 Jul 3 01:56 lib64 -> usr/lib64/
lrwxrwxrwx 1 root root 10 Jul 3 01:56 libx32 -> usr/libx32/
drwxr-xr-x 2 root root 4096 Jul 3 01:57 media/
drwxr-xr-x 2 root root 4096 Jul 3 01:57 mnt/
drwxr-xr-x 2 root root 4096 Jul 3 01:57 opt/
dr-xr-xr-x 182 root root 0 Sep 22 07:16 proc/
drwx------ 1 root root 4096 Aug 24 03:54 root/
drwxr-xr-x 1 root root 4096 Aug 11 10:24 run/
lrwxrwxrwx 1 root root 8 Jul 3 01:56 sbin -> usr/sbin/
drwxr-xr-x 2 root root 4096 Jul 3 01:57 srv/
dr-xr-xr-x 11 root root 0 Sep 22 03:32 sys/
-rw-r--r-- 1 root root 1610 Aug 24 03:56 test_logPath.log
drwxrwxrwt 1 root root 4096 Aug 24 03:57 tmp/
drwxr-xr-x 1 root root 4096 Aug 11 10:24 usr/
drwxr-xr-x 1 root root 4096 Jul 3 02:00 var/
root@f814974a1cfb:/home# cd ../host
root@f814974a1cfb:/host# ll
total 36
drwxr-xr-x 5 1000 1001 4096 Sep 22 04:52 ./
drwxr-xr-x 1 root root 4096 Sep 22 07:16 ../
-rw-r--r-- 1 1000 1001 220 Sep 22 03:38 .bash_logout
-rw-r--r-- 1 1000 1001 3771 Sep 22 03:38 .bashrc
drwxr-xr-x 3 1000 1001 4096 Sep 22 04:56 .docker/
drwxr-xr-x 2 1000 1001 4096 Sep 22 03:38 .landscape/
-rw-r--r-- 1 1000 1001 0 Sep 22 03:38 .motd_shown
-rw-r--r-- 1 1000 1001 921 Sep 22 04:52 .profile
-rw-r--r-- 1 1000 1001 0 Sep 22 03:44 .sudo_as_admin_successful
drwxr-xr-x 5 1000 1001 4096 Sep 22 04:52 .vscode-server/
-rw-r--r-- 1 1000 1001 183 Sep 22 04:52 .wget-hsts
So I don't understand what is happening here. I know Docker has its own file system.
Q3. Is what I am seeing at /home and /host indeed the container's own file system?
Q4. Also, what happened to -v $HOME:/host here?
Q5. How can I do what the second quote describes:
This allows you to access your files on the local machine inside the docker. So you can edit your source code in your local machine using your favourite editor and run them directly inside the docker.
Q6. How do I connect VS Code to this container? From WSL-Ubuntu, I could just run code . to launch VS Code, but the same does not seem to work here:
root@f814974a1cfb:/home# code .
bash: code: command not found
This link says:
A devcontainer.json file can be used to tell VS Code how to configure the development container, including the Dockerfile to use, ports to open, and extensions to install in the container. When VS Code finds a devcontainer.json in the workspace, it automatically builds (if necessary) the image, starts the container, and connects to it.
But I think this describes creating a new container from VS Code, not connecting to an already existing one. I am also not able to find my devcontainer.json. I downloaded this container image using docker pull.
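On the format of the -v argument asked about above: it is a host-path:container-path pair, and /host is just the directory name chosen for the container side (Docker creates it if the image does not have it). The sketch below only splits such a spec to make the two halves visible; mount_source and mount_target are hypothetical helpers and handle just the simple two-part form.

```shell
#!/bin/sh
# Split a "host-path:container-path" bind-mount spec into its two halves.
mount_source() { printf '%s\n' "${1%%:*}"; }   # text before the first colon
mount_target() { printf '%s\n' "${1#*:}"; }    # text after the first colon

spec="$HOME:/host"   # $HOME is expanded by the host shell before docker runs
echo "host side:      $(mount_source "$spec")"
echo "container side: $(mount_target "$spec")"
```

This matches the listing in the question: /host in the container shows .bashrc, .profile and .vscode-server owned by UID 1000, which are the files of the WSL user's home directory rather than files from the image.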

Virtual machine disk usage keeps increasing and has reached a low disk space issue

I installed Apache Ambari on a three-VM cluster about a month ago. It used about 47 GB out of 183 GB then, but usage has been growing by 1 to 2 GB daily even though I have not installed anything else. Could you guide me on how to remove data or free up space on the VMs?
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/VolGroup-lv_root
490G 22G 444G 5% /
tmpfs 3.9G 8.0K 3.9G 1% /dev/shm
/dev/sda1 477M 48M 404M 11% /boot
vagrant 183G 181G 2.3G 99% /vagrant
VM 1 uses 49 GB of space
VM 2 uses 29 GB of space
VM 3 uses 79 GB of space
Root-level directory listing for VM 3:
dr-xr-xr-x. 2 root root 4096 May 12 08:05 bin
dr-xr-xr-x. 5 root root 1024 Apr 27 2013 boot
drwxr-xr-x 3 root root 4096 May 12 08:40 cgroups_test
drwxr-xr-x 18 root root 3680 Jun 14 10:37 dev
drwxr-xr-x. 102 root root 4096 Jun 14 10:37 etc
drwxr-xr-x 5 root root 4096 May 12 10:11 hadoop
drwxr-xr-x. 19 root root 4096 May 22 08:39 home
dr-xr-xr-x. 9 root root 4096 May 12 08:05 lib
dr-xr-xr-x. 10 root root 12288 May 12 08:05 lib64
drwx------. 2 root root 16384 Apr 27 2013 lost+found
drwxr-xr-x. 3 root root 4096 Apr 27 2013 media
drwxr-xr-x. 2 root root 4096 Sep 23 2011 mnt
drwxr-xr-x. 4 root root 4096 Apr 27 2013 opt
dr-xr-xr-x 111 root root 0 Jun 14 10:37 proc
dr-xr-x---. 5 root root 4096 Jun 13 13:28 root
dr-xr-xr-x. 2 root root 12288 May 12 08:05 sbin
drwxr-xr-x. 2 root root 4096 Apr 27 2013 selinux
drwxr-xr-x. 2 root root 4096 Sep 23 2011 srv
-rw-r--r-- 1 root root 3221225472 Jun 14 10:38 swapfile
drwxr-xr-x 13 root root 0 Jun 14 10:37 sys
drwxrwxrwt. 42 root root 4096 Jun 16 06:53 tmp
drwxr-xr-x. 15 root root 4096 May 12 08:04 usr
drwxr-xr-x 1 vagrant vagrant 4096 May 12 05:44 vagrant
drwxr-xr-x. 19 root root 4096 May 17 07:48 var
[root@c6403 /]# pwd
/
Please tell me what I am doing wrong, or how I can increase the free space on my VM.
The first thing to say is that the Ambari Vagrant environment is not intended for production use. This configuration should be used for study and/or testing. Running a Hadoop cluster on virtual machines on a single physical host imposes major performance and reliability drawbacks (e.g. implicitly broken failover/data replication). For details, see this question.
For production use, you should either install Ambari directly on physical machines or provision 1-2 virtual machines per physical host of your cluster.
If you are still going to stay with virtual machines and dig into troubleshooting, try installing the ncdu utility in your VM.
The typical ncdu output looks like:
ncdu 1.7 ~ Use the arrow keys to navigate, press ? for help
--- /data ----------------------------------------------------------------------------------------------------------
163.3GiB [##########] /docimages
84.4GiB [##### ] /data
82.0GiB [##### ] /sldata
56.2GiB [### ] /prt
40.1GiB [## ] /slisam
30.8GiB [# ] /isam
18.3GiB [# ] /mail
10.2GiB [ ] /export
3.9GiB [ ] /edi
1.7GiB [ ] /io
1.2GiB [ ] /dmt
896.7MiB [ ] /src
821.5MiB [ ] /upload
691.1MiB [ ] /client
686.8MiB [ ] /cocoon
542.5MiB [ ] /hist
358.1MiB [ ] /savsrc
228.9MiB [ ] /help
108.1MiB [ ] /savbin
101.2MiB [ ] /dm
40.7MiB [ ] /download
Similar output (but without sorting) can be obtained by running this command:
du -sh /*
This way you can see what takes the most space in your virtual machine. Probably most of the space is taken up by logs in /var/log/. Also explore the /usr/hdp directory using ncdu, because a lot of HDP stack files are stored there.
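If ncdu cannot be installed, the du listing can be sorted to get closer to the ncdu view. A minimal sketch, assuming GNU sort (for the -h flag); largest_entries is a hypothetical helper name.

```shell
#!/bin/sh
# largest_entries DIR [N]: list the N biggest entries directly under DIR,
# largest first (N defaults to 10). Unreadable paths are skipped silently.
largest_entries() {
  du -sh -- "$1"/* 2>/dev/null | sort -rh | head -n "${2:-10}"
}
```

For example, largest_entries /var/log 5 run as root would show the five largest items under /var/log, and repeating it one level down narrows in on whatever keeps growing.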
