what does the line of code mean out of curiosity - bash

Just out of curiosity, can anyone tell me what the 1 does at the end of this statement?
md5=$(md5sum "${my_iso_file}" | cut -d ' ' -f 1)

-f is a "field-list" option for the cut command. The 1 is the value provided for that option, meaning that cut should only print field 1.
Source: http://www.ss64.com/bash/cut.html

It stores the MD5 checksum of an ISO file in the variable. md5sum prints
<md5sum> <filename>
and cut returns the first word.
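To see what changing the number after -f does, here is a small sketch that runs cut on a fixed sample line rather than on real md5sum output. The hash and path are only sample values. md5sum separates hash and file name with two spaces, so with a single-space delimiter field 2 is empty and the file name ends up in field 3:

```shell
# Sample line in the shape md5sum prints: hash, two spaces, file name.
line='3abb17b66815bc7946cefe727737d295  ./iso/somefile.iso'

# -d ' ' splits on single spaces; -f N selects the N-th field.
first=$(printf '%s\n' "$line" | cut -d ' ' -f 1)   # the hash
second=$(printf '%s\n' "$line" | cut -d ' ' -f 2)  # empty: it sits between the two spaces
third=$(printf '%s\n' "$line" | cut -d ' ' -f 3)   # the file name

printf 'field 1: %s\nfield 2: %s\nfield 3: %s\n' "$first" "$second" "$third"
```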

Related

Unix Shell script code meaning for beginner

I am a novice learner of Unix and shell scripting.
Can anyone explain the meaning of this line and how it works:
Record_count=$(wc -l ${table_dir} "/" $table_file_name | cut -d' ' f1)
I am not sure of what "/" does here.
Let's go step by step.
First step. This wc -l ${table_dir} "/" $table_file_name doesn't work as it's written but I understand it means return the number of lines (wc -l) of the file ${table_dir}/${table_file_name}. It returns something that looks like this (imagining that your_table_file_name.txt has 5 lines):
$ wc -l "${table_dir}/${table_file_name}"
5 your_table_dir/your_table_file_name.txt
Second step. I think this cut -d' ' f1 has a typo and is actually cut -d ' ' -f1. This splits a line on the space character (cut -d ' ') and returns only the first item of the sequence (-f1).
So, when you apply it to your line 5 your_table_dir/your_table_file_name.txt, it returns 5.
Third step. So wc -l "${table_dir}/${table_file_name}" | cut -d ' ' -f1 returns the number of lines in ${table_dir}/${table_file_name}.
Final step. In a shell script, foo=$(some_command) means: assign the output of the command some_command to the variable called foo.
So your whole line Record_count=$(wc -l "${table_dir}/${table_file_name}" | cut -d ' ' -f1) assigns the line count of the file ${table_dir}/${table_file_name} to the variable Record_count.
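The steps above can be tried end to end with a throwaway file. This is only a sketch: the directory and file are created just for the demonstration. It also shows an alternative that may be worth considering, because some wc implementations (BSD/macOS, for example) pad the count with leading spaces, in which case cut -d ' ' -f1 returns an empty string. Feeding the file on stdin and deleting blanks sidesteps that:

```shell
# Hypothetical directory and file, created only for this demonstration.
table_dir=$(mktemp -d)
table_file_name='your_table_file_name.txt'
printf 'a\nb\nc\nd\ne\n' > "${table_dir}/${table_file_name}"

# The corrected line from the question.
Record_count=$(wc -l "${table_dir}/${table_file_name}" | cut -d ' ' -f1)

# Alternative: read the file on stdin so wc prints no file name,
# then delete any padding blanks.
Record_count_stdin=$(wc -l < "${table_dir}/${table_file_name}" | tr -d ' ')

echo "cut version:   ${Record_count}"
echo "stdin version: ${Record_count_stdin}"
rm -r "${table_dir}"
```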

Use 'df -h' to check % remaining disk space of a specific folder

I am using the 'df -h' command to get disk space details for my directory.
Now I want to run this check automatically from a batch job or script, so I am wondering whether I can check disk space only for the specific folders I care about. For example, I only need to check that /nas/home does not go above 75%.
How can I achieve this?
My work till now:
I am using
df -h > DiskData.txt
... this outputs to a text file
grep "/nas/home" "DiskData.txt"
... which gives me the output:
*500G 254G 247G 51% /nas/home*
Now I want to extract the number just before the '%' sign (51 in this case) to achieve what I want.
This command will give you the usage percentage of the /nas/home directory:
df /nas/home | awk '{ print $4 }' | tail -n 1| cut -d'%' -f1
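So you can store the value in a variable and then apply an if/else condition.
var=$(df /nas/home | awk '{ print $4 }' | tail -n 1 | cut -d'%' -f1)
# $4 is the Use% column when df wraps the line as in the question's output;
# in unwrapped output (e.g. df -P) the percentage is column 5
if [ "$var" -gt 75 ]; then
    : # send email
fi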
another variant:
df --output=pcent /nas/home | tail -n 1 | tr -d '[:space:]|%'
--output=pcent shows only the percentage column (requires coreutils >= 8.21)
A more concise way without extensive piping could be:
df -h /nas/home | perl -ane 'print substr $F[3],0,-1 if $.==2'
Returns: 51 for your example.
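Putting the pieces together, a threshold check might look like the sketch below. It assumes `df -P` (POSIX output format, one line per filesystem, percentage in column 5). The helper name usage_percent and the canned sample line (device name and block counts) are made up so the example runs on any machine; on a real system the value would come from df itself, as noted in the comment.

```shell
# Read `df -P` style output on stdin and print the use percentage
# (without the % sign) from the first data line.
usage_percent() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Limit taken from the question.
threshold=75

# On a real system: used=$(df -P /nas/home | usage_percent)
# Here a canned sample is used; the device name and block counts are invented.
sample='Filesystem 1024-blocks Used Available Capacity Mounted on
nas:/export/home 524288000 266338304 258998272 51% /nas/home'
used=$(printf '%s\n' "$sample" | usage_percent)

if [ "$used" -gt "$threshold" ]; then
    echo "WARNING: /nas/home is at ${used}% (limit ${threshold}%)"
else
    echo "OK: /nas/home is at ${used}%"
fi
```

Run from cron, the WARNING branch is where a mail command would go.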

Match List of Numbers in For Loop in Bash

I have a script that loops over a curl command, which pulls in data from an API.
LIST_OF_ID=$(curl -s -X POST -d "username=$USER&password=$PASS&action=action" http://link.to/api.php)
for PHONE_NUMBER in $(echo $LIST_OF_ID | tr '_' ' ' | awk '{print $2}');
do
VOIP_ID=$(echo $LIST_OF_ID | tr '_' ' ' | awk '{print $1}')
done
I also have a variable of 16 numbers in the range of "447856321455"
NUMBERS=$(cat << EOF
441111111111
441111111112
441111111113
... etc
EOF
)
The output on the API call is:
652364_441111111112
As you may notice I have taken the output and cut it into 2 parts and put it in a variable.
What I need is to get the 6-digit code from the output whenever the number in the output matches one of the numbers in the variable.
I've attempted it using if statements but I can't wrap my head around the correct way of doing it.
Any help would be appreciated.
Thank you.
I would do it using join rather than a loop in bash. Like this:
curl -s -X POST -d "$PARAMS" "$URL" | sort -t _ -k 2,2 \
| join -t _ -2 2 -o 2.1 <(sort numbers.txt) -
What this does is take the output from curl, sorted on its second field, and join it with the sorted contents of numbers.txt (you could use $NUMBERS too). It uses _ as the separator and joins on column 2 of file 2, which is -, meaning stdin (from curl). It then outputs field 2.1, which is the six-digit ID.
Read why-is-using-a-shell-loop-to-process-text-considered-bad-practice and then do something like this:
curl ... |
awk -v numbers="$NUMBERS" -F'_' '
BEGIN { split(numbers,tmp,/[[:space:]]+/); for (i in tmp) nums[tmp[i]] }
$2 in nums
'
But to be honest I can't really tell what you are trying to do, as the numbers in your sample input don't seem to match each other (what does in the range of "447856321455" mean, how does it relate to $NUMBERS containing 441111111111 through 441111111113, and how does any of that relate to match the 6 digit code?) and the expected output is missing.
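For what it's worth, here is a self-contained sketch of the awk approach under one reading of the question: print the ID part of every API line whose number appears in $NUMBERS. The second line of api_output is invented as a non-matching case, and the variable names api_output and matched_ids are only for illustration. The newlines in $NUMBERS are flattened to spaces before being passed with -v, since some awk implementations reject literal newlines in -v values.

```shell
# Known numbers, one per line (the three values shown in the question).
NUMBERS='441111111111
441111111112
441111111113'

# Stand-in for the curl output: ID_number pairs. Only the first line
# appears in the question; the second is invented as a non-match.
api_output='652364_441111111112
100001_449999999999'

matched_ids=$(printf '%s\n' "$api_output" |
    awk -v numbers="$(printf '%s' "$NUMBERS" | tr '\n' ' ')" -F'_' '
        # Build a lookup table keyed by each known number.
        BEGIN { n = split(numbers, tmp, /[[:space:]]+/); for (i = 1; i <= n; i++) nums[tmp[i]] }
        # Print only the ID part when the number is known.
        $2 in nums { print $1 }
    ')

echo "$matched_ids"
```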

Bash script to search csv file column and count how many times a value shows up

I am really new to bash and I am trying to search a column of a csv file for a value and count the matches. I found the following online, but it prints the whole column, whereas I want to count how many times an R shows up without printing everything.
awk -F "\"*,\"*" '{print $2}' $file
The csv file is like:
12345,R,N,N,Y,N,N,N,Bob Builder
I am looking for R in column 2. Can anybody point me in the right direction?
The following should do what you want (where file.csv is your csv file):
Case sensitive version:
cut -f 2 -d , file.csv | grep -c R
Case insensitive version:
cut -f 2 -d , file.csv | grep -ic R
Explanation
cut -f 2 -d , file.csv
This takes each line of file.csv and extracts the specified fields. The -f 2 option means extract field 2 and the -d , means use a ',' as the field delimiter. The output of this is then piped to grep.
grep -c R
This looks for lines containing 'R'. Since it is passed the output of the previous cut command, it is looking for an 'R' in field two. The -c option means count the number of matching lines.
Using awk only:
awk -F, '$2 == "R" {cnt++} END{print cnt+0}' file
For fun, a Perl-only version; this one counts every value in every column.
perl -F, -anle 'map{$cnt{$_}{$F[$_]}++}0..$#F;END{print $cnt{1}{R}}'
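As a quick check of the exact-match idea, the sketch below builds a small sample file and counts rows whose second column is exactly R. Only the first row comes from the question; the other two are invented, and the second one deliberately has capital R's in the name column to show they are not counted.

```shell
# Temporary sample file; only the first row is from the question.
csv_file=$(mktemp)
cat > "$csv_file" <<'EOF'
12345,R,N,N,Y,N,N,N,Bob Builder
12346,N,N,N,Y,N,N,N,Rita Rivet
12347,R,N,N,Y,N,N,N,Sam Spanner
EOF

# Exact match on column 2 only; cnt + 0 prints 0 rather than an
# empty line when nothing matches.
r_count=$(awk -F, '$2 == "R" { cnt++ } END { print cnt + 0 }' "$csv_file")
echo "$r_count"
rm -f "$csv_file"
```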

Only get hash value using md5sum (without filename)

I use md5sum to generate a hash value for a file.
But I only need to receive the hash value, not the file name.
md5=`md5sum ${my_iso_file}`
echo ${md5}
Output:
3abb17b66815bc7946cefe727737d295 ./iso/somefile.iso
How can I 'strip' the file name and only retain the value?
A simple array assignment works... Note that the first element of a Bash array can be addressed by just the name without the [0] index, i.e., $md5 contains only the 32 characters of md5sum.
md5=($(md5sum file))
echo $md5
# 53c8fdfcbb60cf8e1a1ee90601cc8fe2
Using AWK:
md5=`md5sum ${my_iso_file} | awk '{ print $1 }'`
You can use cut to split the line on spaces and return only the first such field:
md5=$(md5sum "$my_iso_file" | cut -d ' ' -f 1)
On Mac OS X:
md5 -q file
md5="$(md5sum "${my_iso_file}")"
md5="${md5%% *}" # remove the first space and everything after it
echo "${md5}"
Another way is to do:
md5sum filename | cut -f 1 -d " "
cut will split the line at each space and return only the first field.
By leaning on head:
md5_for_file=`md5sum ${my_iso_file}|head -c 32`
One way:
set -- $(md5sum $file)
md5=$1
Another way:
md5=$(md5sum $file | while read sum file; do echo $sum; done)
Another way:
md5=$(set -- $(md5sum $file); echo $1)
(Do not try that with backticks unless you're very brave and very good with backslashes.)
The advantage of these solutions over other solutions is that they only invoke md5sum and the shell, rather than other programs such as awk or sed. Whether that actually matters is then a separate question; you'd probably be hard pressed to notice the difference.
If you need to print it and don't need a newline, you can use:
printf $(md5sum filename)
md5=$(md5sum < $file | tr -d ' -')
md5=`md5sum ${my_iso_file} | cut -b-32`
md5sum puts a backslash before the hash if there is a backslash in the file name. The first 32 characters or anything before the first space may not be a proper hash.
It will not happen when using standard input (file name will be just -), so pixelbeat's answer will work, but many others will require adding something like | tail -c 32.
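A small sketch of that difference, assuming GNU md5sum is installed. The function name md5_of and the test file are hypothetical. The file name contains a backslash, so hashing by name is expected to yield a value with a leading backslash, while hashing via stdin yields the plain 32-character hash:

```shell
# Print only the MD5 hash of the file given as $1.
# Reading the file on stdin means md5sum never sees the file name,
# so it has nothing to escape.
md5_of() {
    md5sum < "$1" | cut -d ' ' -f 1
}

# Hypothetical test file whose name contains a backslash.
tmp_dir=$(mktemp -d)
tricky="${tmp_dir}/back\\slash.iso"
printf 'hello' > "$tricky"

by_name=$(md5sum "$tricky" | cut -d ' ' -f 1)   # may carry a leading backslash
by_stdin=$(md5_of "$tricky")

echo "by name:  $by_name"
echo "by stdin: $by_stdin"
rm -r "$tmp_dir"
```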
if you're concerned about screwy filenames :
md5sum < "${file_name}" | awk NF=1
f244e67ca3e71fff91cdf9b8bd3aa7a5
other messier ways to deal with this :
md5sum "${file_name}" | awk NF=NF OFS= FS=' .*$'
or
| awk '_{ exit }++_' RS=' '
f244e67ca3e71fff91cdf9b8bd3aa7a5
to do it entirely inside awk :
mawk 'BEGIN {
__ = ARGV[ --ARGC ]
_ = sprintf("%c",(_+=(_^=_<_)+_)^_+_*++_)
RS = FS
gsub(_,"&\\\\&",__)
( _=" md5sum < "((_)(__)_) ) | getline
print $(_*close(_)) }' "${file_name}"
f244e67ca3e71fff91cdf9b8bd3aa7a5
Well, I had the same problem today, but I was trying to get the file MD5 hash when running the find command.
I took the most-voted answer and wrapped it in a function called md5 to run from the find command. My goal was to calculate the hash for all files in a folder and output it as hash:filename.
md5() { md5sum "$1" | awk '{ printf "%s",$1 }'; }
export -f md5
find -type f -exec bash -c 'md5 "$0"' {} \; -exec echo -n ':' \; -print
I put this together from pieces found here and in 'find -exec' a shell function in Linux.
For the sake of completeness, a way with sed using a regular expression and a capture group:
md5=$(md5sum "${my_iso_file}" | sed -r 's:\\*([^ ]*).*:\1:')
The regular expression is capturing everything in a group until a space is reached. To get a capture group working, you need to capture everything in sed.
(More about sed and capture groups here: How can I output only captured groups with sed?)
As the delimiter in sed I use colons rather than slashes; any character that does not occur in the expression itself works as the delimiter.
Another way:
md5=$(md5sum "${my_iso_file}" | sed 's/ .*//')
md5=$(md5sum < index.html | head -c -4)

Resources