Convert between byte count and "human-readable" string - shell

Is there a shell command that simply converts back and forth between a number string in bytes and the "human-readable" number string offered by some commands via the -h option?
To clarify the question: ls -l without the -h option (some output suppressed)
> ls -l
163564736 file1.bin
13209 file2.bin
gives the size in bytes, while with the -h option (some output suppressed)
> ls -lh
156M file1.bin
13K file2.bin
the size is human-readable, in kilobytes and megabytes.
Is there a shell command that simply turns 163564736 into 156M and 13209 into 13K and also does the reverse?

numfmt
To human-readable:
echo "163564736" | numfmt --to=iec
From human-readable:
echo "156M" | numfmt --from=iec

There is no standard (cross-platform) tool to do it, but a solution using awk is described here.
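In case that link goes stale, here is a minimal awk sketch of the bytes-to-human direction (illustrative only; it uses 1024-based units like --to=iec):
echo "163564736" | awk '{
    split("B K M G T P", unit, " ")
    i = 1
    while ($1 >= 1024 && i < 6) { $1 /= 1024; i++ }
    printf "%.0f%s\n", $1, unit[i]
}'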

Related

Does ls -l print the file size in blocks or bytes?

I created a new file on a unix server.
Now, ls -l prints the following:
-rwxr-xr-x 1 mccorm14 student 49508 Oct 26 11:29 4th
Does ls -l print the file size (49508) in blocks or bytes? If it is blocks, how do I convert it to bytes?
ls -l lists file sizes in bytes. No conversion necessary. For readability you can use -h to print human-readable sizes like 49K.
You may be thinking of du, which counts in blocks by default. For du, try -h to print human-readable sizes or -k to set the block size to 1 KiB.
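To see the difference side by side (illustrative file name and sizes: a 13209-byte file on a filesystem with 4 KiB blocks):
$ ls -l file2.bin
-rw-r--r-- 1 user user 13209 Oct 26 11:29 file2.bin
$ du -h file2.bin
16K file2.bin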

Get total size of a list of files in UNIX

I want to run a find command that will find a certain list of files and then iterate through that list of files to run some operations. I also want to find the total size of all the files in that list.
I'd like to make the list of files FIRST, then do the other operations. Is there an easy way I can report just the total size of all the files in the list?
In essence I am trying to find a one-liner for the 'total_size' variable in the code snippet below:
#!/bin/bash
loc_to_look='/foo/bar/location'
file_list=$(find $loc_to_look -type f -name "*.dat" -size +100M)
total_size=???
echo 'total size of all files is: '$total_size
for file in $file_list; do
# do a bunch of operations
done
You should simply be able to pass $file_list to du:
du -ch $file_list | tail -1 | cut -f 1
du options:
-c display a total
-h human readable (e.g. 17M)
du will print an entry for each file, followed by the total (with -c), so we use tail -1 to trim to only the last line and cut -f 1 to trim that line to only the first column.
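Plugged into the question's snippet, that looks like the following (note the unquoted $file_list relies on filenames without spaces, as in the original):
total_size=$(du -ch $file_list | tail -1 | cut -f 1)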
The methods explained here have a hidden bug: when the file list is long, it exceeds the shell's command-size limit. Better to use this variant with du:
find <some_directories> <filters> -print0 | du <options> --files0-from=- --total -s|tail -1
find produces a null-terminated file list, and du reads it from stdin and counts.
This is independent of the shell's command-size limit.
Of course, you can add switches to du to get the logical file size, because by default du tells you how much physical space the files take.
But I think that is not a question for programmers but for Unix admins :) so for Stack Overflow it is off topic.
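Applied to the variables from the question, that looks like this (a sketch; GNU du is assumed for --files0-from):
total_size=$(find "$loc_to_look" -type f -name "*.dat" -size +100M -print0 |
    du -ch --files0-from=- | tail -1 | cut -f 1)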
This code adds up the sizes reported by the trusty ls for all files (it excludes all directories... apparently they're 8 KB per folder/directory):
cd /; find . -type f -exec ls -s \; | awk '{sum+=$1;} END {print sum/1000;}'
Note: execute as root. The result is roughly in megabytes, since ls -s reports allocated 1 KiB blocks rather than exact byte counts.
The problem with du is that it adds up the size of the directory nodes as well. That is an issue when you want to sum up only the file sizes. (By the way, it feels strange to me that du has no option for ignoring directories.)
In order to add the size of files under the current directory (recursively), I use the following command:
ls -laUR | grep -e "^\-" | tr -s " " | cut -d " " -f5 | awk '{sum+=$1} END {print sum}'
How it works: ls lists all the files recursively ("R"), including hidden files ("a"), showing their size ("l") and without sorting them ("U"). (Sorting can be expensive when directories contain many files.) grep keeps only the lines that start with "-" (the regular files, so directories and other entries are ignored). tr then squeezes runs of spaces into one, so the aligned columns of the ls output become single-space-separated fields on each line. cut extracts the 5th field of each line, which holds the file size. Finally, the awk script sums these values into the sum variable and prints the result.
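As an aside, awk can do the squeezing and cutting itself, so the same idea shortens to (equivalent in spirit, not taken from the original answer):
ls -laUR | awk '/^-/ {sum+=$5} END {print sum}'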
ls -l | tr -s ' ' | cut -d ' ' -f <field number> is something I use a lot.
The 5th field is the size. Put that command in a for loop and add the size to an accumulator and you'll get the total size of all the files in a directory. Easier than learning AWK. Plus in the command substitution part, you can grep to limit what you're looking for (^- for files, and so on).
total=0
for size in $(ls -l | tr -s ' ' | cut -d ' ' -f 5) ; do
    total=$(( total + size ))
done
echo "${total}"
The method provided by @Znik helps with the bug encountered when the file list is too long.
However, on Solaris (which is a Unix), du does not have the -c or --total option, so it seems there is a need for a counter to accumulate file sizes.
In addition, if your file names contain special characters, this will not go well through the pipe (see Properly escaping output from pipe in xargs).
Based on the initial question, the following works on Solaris (with a small amendment to the way the variable is created):
file_list=($(find "$loc_to_look" -type f -name "*.dat" -size +100M))
printf '%s\0' "${file_list[@]}" | xargs -0 du -k | awk '{total=total+$1} END {print total}'
The output is in KiB.

Command to list all file types and their average size in a directory

I am working on a specific project where I need to work out the make-up of a large extract of documents so that we have a baseline for performance testing.
Specifically, I need a command that can recursively go through a directory and, for each file type, inform me of the number of files of that type and their average size.
I've looked at solutions like:
Unix find average file size,
How can I recursively print a list of files with filenames shorter than 25 characters using a one-liner? and https://unix.stackexchange.com/questions/63370/compute-average-file-size, but nothing quite gets me to what I'm after.
This du and awk combination should work for you:
du -a mydir/ | awk -F'[.[:space:]]' '/\.[a-zA-Z0-9]+$/ { a[$NF]+=$1; b[$NF]++ }
END{for (i in a) print i, b[i], (a[i]/b[i])}'
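The three output columns are the extension, the file count, and the average size (in du's default block units; with GNU du you could add -b to count bytes instead). Hypothetical output:
txt 12 3.5
jpg 5 240.2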
To give you something to start with: with the script below, you will get a list of files and their sizes, line by line.
#!/usr/bin/env bash
DIR=ABC
cd "$DIR"
find . -type f | while IFS= read -r line
do
    # size=$(stat --format="%s" "$line")   # for systems with the stat command
    size=$(perl -e 'print -s $ARGV[0], "\n"' "$line")   # @Mark Setchell provided this command, but I have no OS X system to test it on
    echo "$size $line"
done
Output sample
123 ./a.txt
23 ./fds/afdsf.jpg
Then it is your homework: with the above output, it should be easy to get each file type and its average size.
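If you want one step more, here is a hedged sketch of that homework: pipe the script's output (size and name pairs, as above) through awk to group by extension (sizes.sh is a hypothetical name for the script above):
./sizes.sh | awk '{
    file = $2
    sub(/.*\//, "", file)               # strip the directory part
    n = split(file, parts, ".")
    ext = (n > 1) ? parts[n] : "none"   # files without a dot get "none"
    sum[ext] += $1; cnt[ext]++
}
END { for (e in sum) print e, cnt[e], sum[e]/cnt[e] }'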
You can use "du" maybe:
du -a -c *.txt
Sample output:
104 M1.txt
8 in.txt
8 keys.txt
8 text.txt
8 wordle.txt
136 total
The output is in blocks (512 bytes on POSIX systems, 1 KiB with GNU du), but you can change the unit with -k or -m.

How to know the size of a very small file?

suppose I have a very simple ASCII file that only contains
11111111
now, I want to use a command to find how many bytes it really has, not how many bytes the system allocated for it. I tried
ls -s
and
du
but they only output
4
I think that's how many blocks the system allocates for this file. How can I use a command to find the size of such a small file?
You need to use du -b to see the size of the file in bytes.
$ du -b file
9 file
wc -c will do:
$ echo "11111111" > file
$ wc -c file
9 file
You can use the stat command to get information on a file. For instance, the size of file in bytes is:
$ echo "11111111" > file
$ stat -c %s file
9
Type man stat to see all of the other useful things it can tell you about a file.
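Note that the stat invocation above is the GNU coreutils one; on BSD and macOS the equivalent is, as far as I recall:
$ stat -f %z file
9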

Bash scripting: How can I patch files? (write a given string in a given position of a file)

I’m writing a script to change the UUID of an NTFS partition (AFAIK, no existing tool does this). That means writing 8 bytes from 0x48 to 0x4F (72-79 decimal) of /dev/sdaX (X being the # of my partition).
If I wanted to change it to a random UUID, I could use this:
dd if=/dev/urandom of=/dev/sdaX bs=8 count=1 seek=9 conv=notrunc
Or I could change /dev/urandom to /dev/sdaY to clone the UUID from another partition. (With bs=8, seek=9 skips 9 × 8 = 72 bytes, so the write lands exactly at offset 0x48.)
But... what if I want to craft a personalized UUID? I already have it stored (and regex-checked) in a $UUID variable in hexadecimal string format (16 characters), like this:
UUID="2AE2C85D31835048"
I was thinking about this approach:
echo "$UUID" | xxd -r -p | dd of=/dev/sdaX ...
This is just a sketch... I’m not sure about the exact options to make it work. My question is:
Is the echo $var | xxd -r | dd really the best approach?
What would be the exact command and options to make it work?
As for the answers, I’m also looking for:
An explanation of all the options used, and what they do.
If possible, an alternative command to test it in a file and/or screen before changing the partition.
I already have a 100-byte dump file called ntfs.bin that I can use for tests and check results using
xxd ntfs.bin
So any solution that provides me a way to check results using xxd in screen so I can compare with original ntfs.bin file would be highly appreciated.
Try:
UUID="2AE2C85D31835048"
echo "$UUID" | xxd -r -p | wc -c
echo "$UUID" | xxd -r -p | dd of=file obs=1 oseek=72 conv=block,notrunc cbs=8
