How to get first and last element from a csv column - bash

I have a csv file with the following format:
Time, Field1, Field2,
1000, 1, 2,
1001, 3, 4,
1002, 5, 6,
I want to get the first and last element from the time column and store them in variables in my bash script.
So, based on this example, I need:
start=1000
end=1002
How can I do this?

You have a lot of alternatives. Here are some of them:
Using head, tail and cut
start=$(head -n2 file.csv | tail -n1 | cut -d',' -f1)
end=$(tail -n1 file.csv | cut -d',' -f1)
Using awk
start=$(awk -F',' 'NR==2{print $1}' file.csv)
end=$(awk -F',' 'END{print $1}' file.csv)
One-liner using awk (thanks to this answer)
read start end <<< "$(awk -F',' 'NR==2{s=$1} END{print s, $1}' file.csv)"
Another one-liner using awk
read -d '' start end < <(awk -F',' 'NR==2{print $1} END{print $1}' file.csv)
(Here read never sees its NUL delimiter, so it returns non-zero, but both variables are still assigned.)
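Whichever variant you choose, a quick sanity check against the sample file.csv from the question:
$ start=$(awk -F',' 'NR==2{print $1}' file.csv)
$ end=$(awk -F',' 'END{print $1}' file.csv)
$ echo "start=$start end=$end"
start=1000 end=1002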

You can use a while loop like this:
while IFS=',' read -r c _; do
    ((end=c))                         # every line overwrites end, so it ends up holding the last value
    ((start==0 && c>0)) && start=$c   # the header field evaluates to 0, so start gets the first numeric value
done < file.csv
Check variables:
declare -p start end
declare -- start="1000"
declare -- end="1002"

Also, try this:
start_end() {
    start=$(head -n 2 csv.file | tail -n 1 | awk -F',' '{print $1}')
    end=$(tail -n 1 csv.file | awk -F',' '{print $1}')
}
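Then call the function and use the variables (assuming csv.file holds the sample data):
$ start_end
$ echo "$start $end"
1000 1002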

You can use sed -n 's/,.*//;2p;$p' file.csv to strip everything from the first comma onward and print only the second and last lines, i.e. the first and last values of the Time column. From its output, you can read each line into a variable like so:
{
    read start
    read end
} < <(sed -n 's/,.*//;2p;$p' file.csv)
The first read reads the first line of that output into the variable start, while the second read reads the second line into the variable end.
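For reference, the sed command alone prints the two values on separate lines (with the sample file.csv above):
$ sed -n 's/,.*//;2p;$p' file.csv
1000
1002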

How to grab fields in inverted commas

I have a text file which contains the following lines:
"user","password_last_changed","expires_in"
"jeffrey","2021-09-21 12:54:26","90 days"
"root","2021-09-21 11:06:57","0 days"
How can I grab the two fields jeffrey and 90 days from between the inverted commas and save them in variables?
If awk is an option, you could read the output into an array and then save the elements as individual variables.
$ IFS='"' read -ra var <<< "$(awk -F, '/jeffrey/{ print $1, $NF }' input_file)"
$ var2="${var[3]}"
$ echo "$var2"
90 days
$ var1="${var[1]}"
$ echo "$var1"
jeffrey
while read -r line; do                                              # read line by line
    name=$(echo "$line" | awk -F, '{ print $1 }' | sed 's/"//g')    # grab first col and strip "
    expire=$(echo "$line" | awk -F, '{ print $3 }' | sed 's/"//g')  # grab third col and strip "
    echo "$name" "$expire"                                          # do your business
done < yourfile.txt
IFS=","
arr=( $(cat txt | head -2 | tail -1 | cut -d, -f 1,3 | tr -d '"') )
echo "${arr[0]}"
echo "${arr[1]}"
The result is into an array, you can access to the elements by index.
Maybe the method below, using sed and awk, will help you:
#!/bin/sh
username=$(sed -n '/jeffrey/p' demo.txt | awk -F',' '{print $1}')
echo "$username"
expires_in=$(sed -n '/jeffrey/p' demo.txt | awk -F',' '{print $3}')
echo "$expires_in"
Output :
jeffrey
90 days
Note:
The above method works only if the username is distinct; as far as I know, usernames are not duplicated.

What does this bash script line of code mean?

I am new to shell scripting and I found the following line of code in a given script.
Could someone explain to me, with an example, what the following line of code means?
Path=`echo $line | awk -F '|' '{print $1}'`
echo $line prints the value of the variable line; the | symbol means that this output is passed (piped) to another program/command/script. I will not attempt to explain awk in full here, but in the line above the output of echo $line is taken and processed by it.
The option -F fs, as per the awk man page, means:
-F fs    Use fs for the input field separator
so the string after it is used to split the input given to awk into fields. For example, if your variable line has the value a|b, it will be split into two fields, a and b. What is to be done with these fields is specified within the '{}' expression.
Again, what can be done in there is next to infinite; here the only thing done is to print the first field, which can be accessed with $1 (a in the above example; $2 would be b, as you might guess).
Finally, the output of this whole operation is then stored in the variable Path.
To summarize:
line="a|b"
echo $line | awk -F '|' '{print $1}'
> a
Path=`echo $line | awk -F '|' '{print $1}'`
echo $Path
> a
echo $line | awk -F '|' '{print $1}'
Explanation:
echo -> display a line of text
$line -> parameter expansion: expands to the value of the variable line
| -> a pipe; a pipeline is a sequence of one or more commands separated by the control operator |
awk -> Invoke awk program
-F '|' -> Field separator as | for the data feed
'{print $1}' -> Print the first field
Example
echo 'a|b|c' | awk -F '|' '{print $1}'
will print a
I think this is just a complicated way to express
echo ${line%%|*}
i.e. write to stdout the part of the content of the variable line which goes up to - but not including - the first vertical bar.
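A quick illustration of that expansion:
$ line="a|b|c"
$ echo "${line%%|*}"
a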
Path=`echo $line | awk -F '|' '{print $1}'`
^     ^                ^      ^
|     |                |      |
|     |                |      print 1st column
|     |                input field separator
|     echo variable line
variable Path
-F '|' - by default awk splits each record/line/row into fields on whitespace, but with -F '|' it splits on the pipe character instead.
The above can be written as:
Path=$( awk -F '|' '{ print $1 }' <<< "$line" )
Suppose, say:
$ line="1|2|3"
$ Path=$( awk -F '|' '{ print $1 }' <<< "$line" )
$ echo $Path; # you get first column
1
Same as
$ Path=$( cut -d'|' -f1 <<< "$line" )
$ echo $Path;
1
The default field separator is whitespace; -F '|' changes it to the pipe character.
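A short demonstration of the difference:
$ echo 'a|b|c' | awk '{print $1}'
a|b|c
$ echo 'a|b|c' | awk -F'|' '{print $1}'
a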

sort fields within a line

input:
87 6,1,9,13
3 9,4,14,35,38,13
31 3,1,6,5
(i.e. a tab-delimited column where the second field is a comma-delimited list of unordered integers.)
desired output:
87 1,6,9,13
3 4,9,13,14,35,38
31 1,3,5,6
Goal:
for each line separately, sort the comma-separated list appearing in the second field, i.e. sort the list in the 2nd column of each line, one line at a time.
Note: the rows should not be re-ordered.
What I've tried:
sort - since the order of the rows must not change, sort on its own is simply not applicable.
awk - since the file as a whole is tab-delimited, not comma-delimited, awk cannot directly parse the second column as multiple "sub-fields".
There might be a perl way? I know nothing about perl though...
It can be done with a simple perl one-liner:
perl -F'/\t/' -alne '$s = join ",", sort { $a <=> $b } split ",", $F[1]; print "$F[0]\t$s"' input
and a shell (bash) one as well:
while read a b; do echo -e "$a\t$(echo $b | tr , '\n' | sort -n | tr '\n' , | sed 's/,$//')"; done < input
while read LINE; do
    echo -e "$(echo $LINE | awk '{print $1}')\t$(echo $LINE | awk '{print $2}' | tr ',' '\n' | sort -n | paste -s -d,)"
done < input
Obviously there is a lot going on here, so let's break it down:
input contains your input
$(echo $LINE | awk '{print $1}') prints the first field, pretty straightforward
$(echo $LINE | awk '{print $2}' | tr ',' '\n' | sort -n | paste -s -d,) prints the second field, but breaks it down into lines by replacing the commas by newlines (tr ',' '\n'), then sort numerically, then assemble the lines back to comma-delimited values (paste -s -d,).
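To see just that inner pipeline at work on a single list:
$ echo '9,4,14,35,38,13' | tr ',' '\n' | sort -n | paste -s -d,
4,9,13,14,35,38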
$ cat input
87 6,1,9,13
3 9,4,14,35,38,13
31 3,1,6,5
$ while read LINE; do echo -e "$(echo $LINE | awk '{print $1}')\t$(echo $LINE | awk '{print $2}' | tr ',' '\n' | sort -n | paste -s -d,)"; done < input
87 1,6,9,13
3 4,9,13,14,35,38
31 1,3,5,6
Another way, using gawk's asort (shown here sorting the characters of a string; FS="" makes each character its own field, and asort returns the element count):
$ echo happybirthday | awk '{split($0,A); n=asort(A); for (i=1;i<=n;i++) printf "%s", A[i]; print ""}' FS=""
aabdhhipprtyy

How to split a text file on a delimiter into multiple files in unix?

I have a text file that looks like this:
input_file
1|abc
2|def
3|ghi
n|etc...
I need to split this up into two files on the pipe delimiter. So this is the expected output:
File_1:
1
2
3
n
File_2:
abc
def
ghi
etc
I do not know how many lines the input file will have. How do you achieve this in ksh or bash?
Thank you.
awk would be suitable for this task:
awk -F\| '{print $1 > "File_1"; print $2 > "File_2"}' input_file
This splits your text on the "|" and prints each column to the respective file.
If there were more than two fields, you may prefer to use a loop instead:
awk -F\| '{for(i=1;i<=NF;++i) print $i > ("File_" i)}' input_file
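A quick check on the sample input_file (the parentheses around "File_" i keep the redirection target unambiguous across awk implementations):
$ awk -F\| '{for(i=1;i<=NF;++i) print $i > ("File_" i)}' input_file
$ head File_1 File_2
==> File_1 <==
1
2
3
n

==> File_2 <==
abc
def
ghi
etc...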
cut -d '|' -f 1 input_file > File_1
cut -d '|' -f 2 input_file > File_2
Only with bash:
while IFS='|' read -r A B; do echo "$A" >>File_1; echo "$B" >>File_2; done <input_file
Here is another solution using other standard command-line tools:
cat input_file | cut -d '|' -f1 > File_1
cat input_file | cut -d '|' -f2 > File_2
Or you can put them together in one line
cat input_file | tee >(cut -d '|' -f1 > File_1) | cut -d '|' -f2 > File_2

Print out onto same line with ":" separating variables

I have the following piece of code and would like to display HOST and RESULT side by side with a : separating them.
HOST=`grep pers results.txt | cut -d':' -f2 | awk '{print $1}'`
RESULT=`grep cleanup results.txt | cut -d':' -f2 | awk '{print $1}' | sed -e 's/K/000/' -e 's/M/000000/'`
echo ${HOST}${RESULT}
Please can anyone assist with the final command to display these? I am just getting all of the hosts and then all of the results.
You probably want this:
HOST=( `grep pers results.txt | cut -d':' -f2 | awk '{ print $1 }'` )   # keep the output of the command in an array
RESULT=( `grep cleanup results.txt | cut -d':' -f2 | awk '{ print $1 }' | sed -e 's/K/000/' -e 's/M/000000/'` )
for i in "${!HOST[@]}"; do
    echo "${HOST[$i]}:${RESULT[$i]}"
done
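For comparison, paste can also glue two line streams together with a : separator; a sketch using the same two pipelines:
paste -d: <( grep pers results.txt | cut -d':' -f2 | awk '{print $1}' ) \
          <( grep cleanup results.txt | cut -d':' -f2 | awk '{print $1}' | sed -e 's/K/000/' -e 's/M/000000/' )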
A version that works without arrays, using an extra file handle to read from two sources at a time.
while read host; read result <&3; do
    echo "$host:$result"
done < <( grep pers results.txt | cut -d: -f2 | awk '{print $1}' ) \
   3< <( grep cleanup results.txt | cut -d: -f2 | awk '{print $1}' | sed -e 's/K/000/' -e 's/M/000000/' )
It's still not quite POSIX, as it requires process substitution. You could instead use explicit fifos. (Also, this is an attempt to shorten the pipelines that produce the hosts and results. It's probably possible to combine everything into a single awk command, since you can either do the substitution in awk or pipe to sed from within awk; a sketch follows the fifo version below.)
mkfifo hostsrc
mkfifo resultsrc
awk -F: '/pers/ {split($2, a, " "); print a[1]}' results.txt > hostsrc &
awk -F: '/cleanup/ {split($2, a, " "); print a[1]}' results.txt | sed -e 's/K/000/' -e 's/M/000000/' > resultsrc &
while read host; read result <&3; do
    echo "$host:$result"
done < hostsrc 3< resultsrc
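The combined single-awk idea mentioned above might look something like this (a sketch, assuming the pers and cleanup lines appear in matching order in results.txt):
awk -F: '/pers/    { split($2, h, " "); hosts[++nh] = h[1] }
         /cleanup/ { split($2, r, " "); v = r[1]; sub(/K/, "000", v); sub(/M/, "000000", v); results[++nr] = v }
         END       { for (i = 1; i <= nh; i++) print hosts[i] ":" results[i] }' results.txt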
