what does this bash script line of code mean - shell

I am new to shell scripting and I found the following line of code in a given script.
Could someone explain to me, with an example, what the following line of code means?
Path=`echo $line | awk -F '|' '{print $1}'`

echo $line will print the value of the variable $line, the | symbol means that the output of this will be passed (or piped) to another program/command/script. I will not attempt to explain awk here, but what is done above is that the output from the echo $line is taken and processed with it.
The -F option, as per the awk man page, means
-F fs Use fs for the input field separator
so the string after it is used to split the input given to awk into fields. For example, if your variable $line has the value a|b, it is split into two fields, a and b. What is done with these fields is specified within the '{}' expression.
There is almost no limit to what can be done in there. Here the only thing done is to print the first field, which is accessed with $1 and is a in the above example ($2 would be b, as you can guess).
Finally, the output of this whole operation is then stored in the variable Path.
to summarize:
line="a|b"
echo $line | awk -F '|' '{print $1}'
> a
Path=`echo $line | awk -F '|' '{print $1}'`
echo $Path
> a
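One detail the summary above glosses over is quoting. Here is a small sketch, assuming a hypothetical value of line that contains two consecutive spaces, showing how the unquoted echo $line can alter the value before awk ever sees it:

```shell
# Hypothetical value with two consecutive spaces in the first field.
line='/tmp/my  dir|rest'

# Unquoted: the shell word-splits $line, and echo rejoins the words
# with a single space, so the double space is lost.
unquoted=$(echo $line | awk -F '|' '{print $1}')

# Quoted: the value reaches awk unchanged.
quoted=$(printf '%s\n' "$line" | awk -F '|' '{print $1}')

printf '[%s]\n[%s]\n' "$unquoted" "$quoted"
```

If the values in your script can contain spaces or glob characters, quoting "$line" is probably the safer habit.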

echo $line | awk -F '|' '{print $1}'
Explanation:
echo -> display a line of text
$line -> parameter expansion; replaced by the value of the variable line
| -> pipe; sends the output of echo to the input of awk
awk -> Invoke awk program
-F '|' -> Field separator as | for the data feed
'{print $1}' -> Print the first field
Example
echo 'a|b|c' | awk -F '|' '{print $1}'
will print a

I think this is just a complicated way to express
echo ${line%%|*}
i.e. write to stdout the part of the content of the variable line which goes up to - but not including - the first vertical bar.
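To check that claim, here is a minimal sketch comparing the two forms on a sample value (the variable names from_awk, from_expansion, no_bar and unchanged are only for illustration):

```shell
line='a|b|c'

# The original approach: split on | with awk and keep the first field.
from_awk=$(printf '%s\n' "$line" | awk -F '|' '{print $1}')

# Parameter expansion: strip the longest suffix matching "|*".
from_expansion=${line%%|*}

# With no bar in the value, the expansion leaves it untouched.
no_bar='plain'
unchanged=${no_bar%%|*}

echo "$from_awk $from_expansion $unchanged"
```

The expansion avoids starting two extra processes, which can matter inside a loop.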

Path=`echo $line | awk -F '|' '{print $1}'`
^ ^ ^ ^
| | | |
| | | print 1st column
| | |
| | input field separator
| |
| echo variable line
|
variable Path
-F'|' - by default awk splits each record (line) into fields on runs of whitespace (spaces and tabs); with -F'|' it splits on the pipe character instead
The above can also be written as
Path=$( awk -F '|' '{ print $1 }' <<< "$line" )
Suppose
$ line="1|2|3"
$ Path=$( awk -F '|' '{ print $1 }' <<< "$line" )
$ echo $Path; # you get first column
1
Same as
$ Path=$( cut -d'|' -f1 <<< "$line" )
$ echo $Path;
1

The default field separator is whitespace (runs of spaces and tabs); -F '|' changes the separator to '|'.
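A quick sketch of the difference, using made-up input strings: with the default separator a run of spaces or a tab counts as one separator, while -F '|' keeps the spaces inside each field.

```shell
# Default separator: three fields, despite the repeated spaces and the tab.
default_count=$(printf 'one   two\tthree\n' | awk '{print NF}')

# With -F '|' the spaces are part of the field content.
pipe_second=$(printf 'one two|three four\n' | awk -F '|' '{print $2}')

echo "$default_count"
echo "$pipe_second"
```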

Related

How to grab fields in inverted commas

I have a text file which contains the following lines:
"user","password_last_changed","expires_in"
"jeffrey","2021-09-21 12:54:26","90 days"
"root","2021-09-21 11:06:57","0 days"
How can I grab the two fields jeffrey and 90 days from inside the inverted commas and save them in variables?
If awk is an option, you could save an array and then save the elements as individual variables.
$ IFS="\"" read -ra var <<< $(awk -F, '/jeffrey/{ print $1, $NF }' input_file)
$ var2="${var[3]}"
$ echo "$var2"
90 days
$ var1="${var[1]}"
$ echo "$var1"
jeffrey
while read -r line; do # read in line by line
name=$(echo "$line" | awk -F, '{ print $1 }' | sed 's/"//g') # grab first col and strip "
expire=$(echo "$line" | awk -F, '{ print $3 }' | sed 's/"//g') # grab third col and strip "
echo "$name" "$expire" # do your business
done < yourfile.txt
IFS=","
arr=( $(cat txt | head -2 | tail -1 | cut -d, -f 1,3 | tr -d '"') )
echo "${arr[0]}"
echo "${arr[1]}"
The result is stored in an array, so you can access the elements by index.
Maybe the method below, using sed and awk, will help:
#!/bin/sh
# tr -d '"' strips the surrounding double quotes from the field
username=$(sed -n '/jeffrey/p' demo.txt | awk -F',' '{print $1}' | tr -d '"')
echo "$username"
expires_in=$(sed -n '/jeffrey/p' demo.txt | awk -F',' '{print $3}' | tr -d '"')
echo "$expires_in"
Output:
jeffrey
90 days
Note:
This method only works if the username appears on a single line.
As far as I know, usernames are not duplicated.
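Another option, not taken from the answers above, is to use the double quote itself as the field separator so that no quote stripping is needed. This is only a sketch: it assumes every field is quoted exactly as in the sample, and the temporary file and the names result, name and expires are made up for the example.

```shell
# Recreate the sample data in a temporary file (path is illustrative).
input_file=$(mktemp)
cat > "$input_file" <<'EOF'
"user","password_last_changed","expires_in"
"jeffrey","2021-09-21 12:54:26","90 days"
"root","2021-09-21 11:06:57","0 days"
EOF

# Splitting on " makes $2 the user and $6 the expiry; $1, $3, $5 are
# the empty string or the commas between the quoted values.
result=$(awk -F'"' '$2 == "jeffrey" { print $2 "|" $6 }' "$input_file")

# Separate the two values with parameter expansion.
name=${result%%|*}
expires=${result#*|}

echo "$name"
echo "$expires"
rm -f "$input_file"
```

Matching on $2 == "jeffrey" rather than /jeffrey/ avoids picking up lines where the name only appears as part of another value.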

sort fields within a line

input:
87 6,1,9,13
3 9,4,14,35,38,13
31 3,1,6,5
(i.e. a tab-delimited column where the second field is a comma-delimited list of unordered integers.)
desired output:
87 1,6,9,13
3 4,9,13,14,35,38
31 1,3,5,6
Goal:
For each line separately, sort the comma-separated list in the second field.
Note: the rows should not be re-ordered.
What I've tried:
sort - Since the order of the rows should not change, then sort is simply not applicable.
awk - since the greater file is tab-delimited, not comma-delimited, it cannot parse the second column as multiple "sub-fields"
There might be a perl way? I know nothing about perl though...
It can be done by simple perl oneliner:
perl -F'/\t/' -alne'$s=join",",sort{$a<=>$b}split",",$F[1];print"$F[0]\t$s"'
and shell (bash) one as well:
while read a b;do echo -e "$a\t$(echo $b|tr , '\n'|sort -n|tr '\n' ,|sed 's/,$//')"; done
while read LINE; do
echo -e "$(echo $LINE | awk '{print $1}')\t$(echo $LINE | awk '{print $2}' | tr ',' '\n' | sort -n | paste -s -d,)";
done < input
Obviously a lot going on here so here we go:
input contains your input
$(echo $LINE | awk '{print $1}') prints the first field, pretty straightforward
$(echo $LINE | awk '{print $2}' | tr ',' '\n' | sort -n | paste -s -d,) prints the second field, but breaks it down into lines by replacing the commas by newlines (tr ',' '\n'), then sort numerically, then assemble the lines back to comma-delimited values (paste -s -d,).
$ cat input
87 6,1,9,13
3 9,4,14,35,38,13
31 3,1,6,5
$ while read LINE; do echo -e "$(echo $LINE | awk '{print $1}')\t$(echo $LINE | awk '{print $2}' | tr ',' '\n' | sort -n | paste -s -d,)"; done < input
87 1,6,9,13
3 4,9,13,14,35,38
31 1,3,5,6
Another way, shown here sorting the characters of a string (asort requires GNU awk):
echo happybirthday|awk '{split($0,A);asort(A); for (i=1;i<=length(A);i++) {print A[i]}}' FS=""|tr -d '\n';echo
aabdhhipprtyy
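For the original tab-delimited input, here is a sketch of the read loop that lets read do the field splitting instead of calling awk twice per line. The sample rows are inlined with printf so the snippet is self-contained, and the names tab, input and sorted are only for illustration.

```shell
tab=$(printf '\t')

# Sample rows from the question: key, a tab, then the comma-separated list.
input=$(printf '87\t6,1,9,13\n3\t9,4,14,35,38,13\n31\t3,1,6,5\n')

sorted=$(printf '%s\n' "$input" | while IFS="$tab" read -r key list; do
    # One number per line, sort numerically, then join with commas again.
    sorted_list=$(printf '%s\n' "$list" | tr ',' '\n' | sort -n | paste -s -d, -)
    printf '%s\t%s\n' "$key" "$sorted_list"
done)

printf '%s\n' "$sorted"
```

The rows come out in their original order because each line is handled on its own; only the list inside the second field is sorted.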

using date variable inside sed command

I am storing date inside a variable and using that in the sed as below.
DateTime=`date "+%m/%d/%Y"`
Plc_hldr1=`head -$i place_holder.txt | tail -1 | awk -F ' ' '{ print $1 }'`
Plc_hldr2=`head -$i place_holder.txt | tail -1 | awk -F ' ' '{ print $2 }'`
sed "s/$Plc_hldr1/$DateTime/;s/$Plc_hldr2/$Total/" html_format.htm >> /u/raskar/test/html_final.htm
While running the sed command I am getting the below error.
sed: 0602-404 Function s/%%DDMS1RT%%/01/02/2014/;s/%%DDMS1C%%/1235/ cannot be parsed.
I suppose this is happening as the date contains the following output which includes slashes '/'
01/02/2014
I tried with different quotes around the date. How do I make it run?
Change the separator to something else that won't appear in your patterns, for example:
sed "s?$Plc_hldr1?$DateTime?;s?$Plc_hldr2?$Total?"
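A self-contained sketch of the same idea, with | as the delimiter. The template string and the value of Total are made up for the example; the placeholders are the ones from the error message.

```shell
DateTime='01/02/2014'   # contains slashes, which broke the s/.../.../ form
Total='1235'            # assumed value
template='Date: %%DDMS1RT%% Count: %%DDMS1C%%'

# Any character that does not occur in the pattern or the replacement
# can serve as the delimiter of the s command.
result=$(printf '%s\n' "$template" | sed "s|%%DDMS1RT%%|$DateTime|;s|%%DDMS1C%%|$Total|")

echo "$result"
```

Whatever delimiter you pick must not appear in the substituted values either, so choose one based on what your data can contain.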
Not directly related to the question, but you could replace
Plc_hldr1=`head -$i place_holder.txt | tail -1 | awk -F ' ' '{ print $1 }'`
Plc_hldr2=`head -$i place_holder.txt | tail -1 | awk -F ' ' '{ print $2 }'`
by
Plc_hldr1=`sed -n "$i {s/ .*//p;q}" place_holder.txt`
Plc_hldr2=`sed -n "$i {s/[^ ]\{1,\} \{1,\}\([^ ]\{1,\}\) .*/\1/p;q}" place_holder.txt`
and with AIX/ksh
sed -n "$i {s/\([^ ]\{1,\} \{1,\}[^ ]\{1,\}\) .*/\1/p;q}" place_holder.txt | read Plc_hldr1 Plc_hldr2
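If awk is acceptable, both placeholders can also be read in one pass. This is a sketch rather than a drop-in replacement: the contents of the placeholder file are assumed (two whitespace-separated placeholders per line), and the temporary file stands in for place_holder.txt.

```shell
# Stand-in for place_holder.txt with assumed contents.
placeholder_file=$(mktemp)
printf '%s\n' '%%HEADER1%% %%HEADER2%%' '%%DDMS1RT%% %%DDMS1C%%' > "$placeholder_file"
i=2

# Print fields 1 and 2 of line number i, then stop reading the file.
# The here-document lets read run in the current shell in any POSIX sh.
read -r Plc_hldr1 Plc_hldr2 <<EOF
$(awk -v n="$i" 'NR == n { print $1, $2; exit }' "$placeholder_file")
EOF

echo "$Plc_hldr1"
echo "$Plc_hldr2"
rm -f "$placeholder_file"
```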

Print out onto same line with ":" separating variables

I have the following piece of code and would like to display HOST and RESULT side by side with a : separating them.
HOST=`grep pers results.txt | cut -d':' -f2 | awk '{print $1}'`
RESULT=`grep cleanup results.txt | cut -d':' -f2 | awk '{print $1}' | sed -e 's/K/000/' -e 's/M/000000/'`
echo ${HOST}${RESULT}
Please can anyone assist with the final command to display these, I am just getting all of hosts and then all of results.
You probably want this:
HOST=( `grep pers results.txt | cut -d':' -f2 | awk '{ print $1 }'` ) #keep the output of the command in an array
RESULT=( `grep cleanup results.txt | cut -d':' -f2 | awk '{ print $1 }' | sed -e 's/K/000/' -e 's/M/000000/'` )
for i in "${!HOST[@]}"; do
echo "${HOST[$i]}:${RESULT[$i]}"
done
A version that works without arrays, using an extra file descriptor to read from 2 sources at a time.
while read host; read result <&3; do
echo "$host:$result"
done < <( grep peers results.txt | cut -d: -f2 | awk '{print $1}' ) \
3< <( grep cleanup results.txt | cut -d':' -f2 | awk '{print $1}' | sed -e 's/K/000/' -e 's/M/000000/')
It's still not quite POSIX, as it requires process substitution. You could instead use explicit FIFOs. (The version below also attempts to shorten the pipelines that produce the hosts and results. It's probably possible to combine this into a single awk command, since you can either do the substitution in awk or pipe to sed from within awk. But this is all off-topic, so I leave it as an exercise for the reader.)
mkfifo hostsrc
mkfifo resultsrc
awk -F: '/peers/ {split($2, a, " "); print a[1]}' results.txt > hostsrc &
awk -F: '/cleanup/ {split($2, a, " "); print a[1]}' results.txt | sed -e 's/K/000/' -e 's/M/000000/' > resultsrc &
while read host; read result <&3; do
echo "$host:$result"
done < hostsrc 3< resultsrc
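For what it's worth, the single-awk variant mentioned above might look roughly like this. The layout of results.txt is not shown in the question, so the sample file below is purely an assumption (alternating "peers" and "cleanup" lines with the value after the first colon), and the names results_file and pairs are made up.

```shell
# Assumed sample data; the real results.txt may look different.
results_file=$(mktemp)
cat > "$results_file" <<'EOF'
peers: alpha up
cleanup: 12K freed
peers: beta up
cleanup: 3M freed
EOF

pairs=$(awk -F: '
    # First word after the colon on a "peers" line is the host.
    /peers/   { split($2, a, " "); host[++h] = a[1] }
    # First word after the colon on a "cleanup" line is the size;
    # expand the K and M suffixes as the sed commands do.
    /cleanup/ { split($2, a, " "); r = a[1]
                sub(/K/, "000", r); sub(/M/, "000000", r)
                res[++n] = r }
    END       { for (i = 1; i <= h; i++) print host[i] ":" res[i] }
' "$results_file")

printf '%s\n' "$pairs"
rm -f "$results_file"
```

This pairs the Nth host with the Nth result, which is the same assumption the array and FIFO versions make.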

When using awk to parse a CSV file, why does it ignore empty cells?

I have some scripts which use awk to parse a CSV file. I have noticed that, if a cell is empty, awk simply moves to the next cell. This means, if I ask it to read column 4, but that cell is empty, it prints the data from column 5, e.g.:
echo "1#2#3##5" | awk -F "#*" '{print $4}'
My expected result is that it will print nothing, because column 4 is empty.
Why is awk skipping column 4?
How can I get awk to not ignore empty columns?
The problem is not what you think. awk is not ignoring empty cells; it is parsing that line as 4 fields instead of 5.
[me@home]$ echo "1#2#3##5" | awk -F "#*" '{print NF}'
4
That's because you're using #* as your field separator. It is treated as a regular expression, so any run of consecutive # characters (#, ##, ###, ...) counts as a single field separator.
Try using -F "#" instead.
[me@home]$ echo "1#2#3##5" | awk -F "#" '{print NF}'
5
[me@home]$ echo "1#2#3##5" | awk -F "#" '{print $4}'
[me@home]$ echo "1#2#3##5" | awk -F "#" '{print $5}'
5
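The same comparison as a small script. Note that it uses #+ (one or more #) instead of the #* from the question, purely to make the "run of separators" behaviour explicit; the variable names are only for illustration.

```shell
line='1#2#3##5'

# Regular expression separator: the ## run counts as one separator.
nf_run=$(printf '%s\n' "$line" | awk -F '#+' '{print NF}')
fourth_run=$(printf '%s\n' "$line" | awk -F '#+' '{print $4}')

# Single-character separator: every # separates, so field 4 is empty.
nf_single=$(printf '%s\n' "$line" | awk -F '#' '{print NF}')
fourth_single=$(printf '%s\n' "$line" | awk -F '#' '{print $4}')

echo "regex separator:  NF=$nf_run, \$4=[$fourth_run]"
echo "single separator: NF=$nf_single, \$4=[$fourth_single]"
```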
