Length of a specific field, and showing the record in much easier way - bash

My goal is to find out the length of the second field and if the length is more than five characters, then I need to show the entire record using shell scripts/command.
echo "From the csv file"
cat latency.csv |
while read line
do
    latency=`echo $line | cut -d"," -f2 | tr -d " "`
    length=$(echo ${#latency})
    if [ $length -gt 5 ]
    then
        echo $line
    fi
done
There is nothing wrong with my code, but being UNIX/Linux, I thought there should be a simpler way of doing such things.
Is there one such simpler method?

awk -F, 'length($2)>5' file
this should work
updated
awk -F, '{a=$0;gsub(/ /,"",$2);if(length($2)>5)print a}' file

awk -F, '{
t = $2
gsub(/ /, x, t)
if (length(t) > 5)
print
}' latency.csv
Or:
perl -F, -ane'
print if
$F[1] =~ tr/ //dc > 5
' latency.csv
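To try the awk approach without a real latency.csv, here is a minimal sketch. The sample rows and the function name `long_second_field` are made up for illustration and are not from the question; the filter itself is the same idea as in the answers above (strip spaces from a copy of field 2, then test its length).

```shell
# Hypothetical helper: print records whose 2nd comma-separated field,
# with spaces removed, is longer than 5 characters.
# Reads the files given as arguments, or stdin when there are none.
long_second_field() {
    awk -F, '{
        t = $2              # work on a copy so the record prints unchanged
        gsub(/ /, "", t)    # drop spaces, like tr -d " " in the question
        if (length(t) > 5) print
    }' "$@"
}

# Made-up sample data (an assumption, not the asker's file).
sample='host1, 12.5
host2, 1234.567
host3,   123456
host4, 1 2 3'

printf '%s\n' "$sample" | long_second_field
```

With this sample only the host2 and host3 rows are printed, since their second fields have 8 and 6 non-space characters.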

Related

How to grab fields in inverted commas

I have a text file which contains the following lines:
"user","password_last_changed","expires_in"
"jeffrey","2021-09-21 12:54:26","90 days"
"root","2021-09-21 11:06:57","0 days"
How can I grab two fields jeffrey and 90 days from inverted commas and save in a variable.
If awk is an option, you could save an array and then save the elements as individual variables.
$ IFS="\"" read -ra var <<< $(awk -F, '/jeffrey/{ print $1, $NF }' input_file)
$ var2="${var[3]}"
$ echo "$var2"
90 days
$ var1="${var[1]}"
$ echo "$var1"
jeffrey
while read -r line; do # read in line by line
    name=$(echo "$line" | awk -F, '{ print $1 }' | sed 's/"//g') # grab first column and strip the quotes
    expire=$(echo "$line" | awk -F, '{ print $3 }' | sed 's/"//g') # grab third column and strip the quotes
    echo "$name" "$expire" # do your business
done < yourfile.txt
IFS=","
arr=( $(cat txt | head -2 | tail -1 | cut -d, -f 1,3 | tr -d '"') )
echo "${arr[0]}"
echo "${arr[1]}"
The result is stored in an array; you can access the elements by index.
Maybe the method below, using the sed and awk commands, will help:
#!/bin/sh
username=$(sed -n '/jeffrey/p' demo.txt | awk -F',' '{print $1}' | tr -d '"')
echo "$username"
expires_in=$(sed -n '/jeffrey/p' demo.txt | awk -F',' '{print $3}' | tr -d '"')
echo "$expires_in"
Output :
jeffrey
90 days
Note:
This method works only if the username is unique.
As far as I know, usernames are not duplicated.
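Another way to look at this, offered as a sketch rather than a definitive answer: if awk splits on the double quote itself, the quoted values land in the even-numbered fields ($2, $4, $6), so no separate quote stripping is needed. The function name `user_expiry` and the `|` used to join the two values are assumptions made for this example, and it assumes no field contains an embedded double quote.

```shell
# Hypothetical helper: print "user|expires_in" for one exact user name.
# Splitting on the double quote makes the payload fields $2, $4 and $6
# (the odd-numbered fields hold the commas and the line edges).
user_expiry() {
    awk -F'"' -v u="$1" '$2 == u { print $2 "|" $6 }' "$2"
}

# Sample file reproducing the lines from the question.
tmpfile=$(mktemp)
cat > "$tmpfile" <<'EOF'
"user","password_last_changed","expires_in"
"jeffrey","2021-09-21 12:54:26","90 days"
"root","2021-09-21 11:06:57","0 days"
EOF

result=$(user_expiry jeffrey "$tmpfile")
name=${result%%|*}      # text before the first |
expires=${result#*|}    # text after the first |
echo "$name"
echo "$expires"
rm -f "$tmpfile"
```

Because the comparison is `$2 == u` instead of a `/jeffrey/` pattern, a row that merely contains the name somewhere else on the line is not picked up.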

Trim line to the first comma (bash)

I have a line from which I need to cut the branch name to the first comma:
commit 2bea9e0351dae65f18d2de11621049b465b1e868 (HEAD, origin/MGB-322, refs/pipelines/36877)
I need to cut out MGB-322.
The number of characters in a line is always different.
awk -F "origin/" '{print $2}' - this is how I cut out
MGB-322, refs/pipelines/36877)
But how to tell it to trim to the first comma?
I tried doing it via substr,
awk -F "origin/" '{print substr ($2,1, index $2 ,)}'
But it is not clear how to correctly specify the comma in index
With any awk. Use / and , as field separator:
awk '{print $3}' FS='[/,]' file
Output:
MGB-322
See: 8 Powerful Awk Built-in Variables – FS, OFS, RS, ORS, NR, NF, FILENAME, FNR
Fixing the OP's code: this assumes origin occurs only once in the line; if it can occur more than once, change $NF to $2 in the following code. Written and tested at https://ideone.com/xjv2we
awk -F"origin/" '{print $NF}' Input_file
sed could also be helpful here. This generic solution is based on the first occurrence of a comma and of /, as per the OP's thread title. I wrote this on mobile, so I couldn't test it yet; it should work, though, and I will test it later.
sed 's/[^,]*,[^/]*\/\([^,]*\).*/\1/' Input_file
"I need to cut out MGB-322."
You can use cut in two steps:
echo "${line}" | cut -d"/" -f2 | cut -d"," -f1
I would prefer one step with awk (already answered by others) or sed
echo "${line}" | sed -r 's/.*origin.(.*), refs.*/\1/'
Why spawn procs? bash's built-in parameter parsing will handle this.
If
$: line="commit 2bea9e0351dae65f18d2de11621049b465b1e868 (HEAD, origin/MGB-322, refs/pipelines/36877)"
then
$: [[ "$line" =~ .*origin.(.*), ]] && echo "${BASH_REMATCH[1]}"
MGB-322
or maybe
$: tmp=${line#*, origin/}; echo ${tmp%,*}
MGB-322
or even
$: IFS=",/" read _ _ x _ <<< "$line" && echo $x
MGB-322
c.f. https://www.gnu.org/software/bash/manual/html_node/Shell-Parameter-Expansion.html
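The parameter-expansion variant can be wrapped in a small function. This is only a sketch: the name `branch_from_decoration` is made up here, and it assumes the branch name is followed by a comma, as in the line from the question.

```shell
# Hypothetical helper: extract the text that follows the first "origin/"
# up to the next comma, using only parameter expansion (no subprocesses).
branch_from_decoration() {
    local rest=${1#*origin/}          # drop everything through the first "origin/"
    [ "$rest" != "$1" ] || return 1   # nothing was removed: no "origin/" in the input
    printf '%s\n' "${rest%%,*}"       # keep the text before the first comma
}

line="commit 2bea9e0351dae65f18d2de11621049b465b1e868 (HEAD, origin/MGB-322, refs/pipelines/36877)"
branch_from_decoration "$line"
```

Here `#` removes the shortest matching prefix and `%%` removes the longest matching suffix, so a branch name that itself contains a slash is kept whole. If origin/... happens to be the last entry in the parentheses, the closing `)` would stay attached and would need trimming separately.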

Print line based on 2nd field value, without using a loop

I try to retrieve a line from a file without using a loop.
myFile.txt
val1;a;b;c
val2;b;d;e
val3;c;r;f
I would like to get the line where the second column is b.
If I do grep "b" myFile.txt then both the first and second lines will be output.
If I do cat myFile.txt | cut -d ';' -f2 | grep "b" then the output will just be b whereas I'd like to get the full line val2;b;d;e.
Is there a way of reaching the desired result without using a loop as below? My file is huge, so looping through it again and again wouldn't be good.
while read line; do
    if [ `echo $line | cut -d ';' -f2` = "b" ]; then
        echo $line
    fi
done < myFile.txt
Given your input file, The below one-liner should work:
awk -F";" '$2 == "b" {print}' myFile.txt
Explanation:
awk -F";" ##Field Separator as ";"
'$2 == "b" ##Searches for "b" in the second column($2)
{print}' ##prints the searched line
Using:
grep:
grep '^[^;]*;b;' myFile.txt
sed:
sed '/^[^;]*;b;/!d' myFile.txt
Output is the same for both:
val2;b;d;e
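If the value to look for lives in a shell variable, one possible sketch is to hand it to awk with -v instead of hard-coding "b". The function name `rows_with_second_field` is an assumption for this example; the sample file reproduces myFile.txt from the question.

```shell
# Hypothetical helper: print every record whose 2nd ";"-separated field
# equals the given value. Appending "" to $2 forces a string comparison,
# so the value is never treated as a regex or compared numerically.
rows_with_second_field() {
    awk -F';' -v want="$1" '($2 "") == want' "$2"
}

# Sample file with the contents shown in the question.
tmpfile=$(mktemp)
printf '%s\n' 'val1;a;b;c' 'val2;b;d;e' 'val3;c;r;f' > "$tmpfile"

rows_with_second_field b "$tmpfile"
rm -f "$tmpfile"
```

The file is read once, however many lines it has. One caveat: awk interprets backslash escapes in -v values, so a search value containing backslashes would need extra care.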

what does this bash script line of code mean

I am new to shell scripting and I found following line of code in a given script.
Could someone explain me with an example what the following line of code means
Path=`echo $line | awk -F '|' '{print $1}'`
echo $line will print the value of the variable $line, the | symbol means that the output of this will be passed (or piped) to another program/command/script. I will not attempt to explain awk here, but what is done above is that the output from the echo $line is taken and processed with it.
The -F option, as per the awk man page, means
-F fs Use fs for the input field separator
so the string after it will be used to split the input string given to awk into different fields. For example, if your variable $line has the value a|b, it will be split into two fields, a and b. What is to be done with them is specified within the '{}' expression.
Again, what can be done in there is next to infinite, here the only thing that is done is to print the first field which can be accessed with $1, or a in the above example ($2 would be b as can be guessed).
Finally, the output of this whole operation is then stored in the variable Path.
to summarize:
line="a|b"
echo $line | awk -F '|' '{print $1}'
> a
Path=`echo $line | awk -F '|' '{print $1}'`
echo $Path
> a
echo $line | awk -F '|' '{print $1}'
Explanation:
echo -> display a line of text
$line -> parameter expansion; replaced by the value of the variable line
| -> A pipeline is a sequence of one or more commands separated by one of the control operators |
awk -> Invoke awk program
-F '|' -> Field separator as | for the data feed
'{print $1}' -> Print the first field
Example
echo 'a|b|c' | awk -F '|' '{print $1}'
will print a
I think this is just a complicated way to express
echo ${line%%|*}
i.e. write to stdout the part of the content of the variable line which goes up to - but not including - the first vertical bar.
Path=`echo $line | awk -F '|' '{print $1}'`
 ^     ^                ^       ^
 |     |                |       |
 |     |                |       print 1st column
 |     |                |
 |     |                input field separator
 |     |
 |     echo variable line
 |
 variable Path
-F'|' - by default awk splits a record/line/row into columns on whitespace, but with -F'|' it splits on the pipe character
The above can also be written as
Path=$( awk -F '|' '{ print $1 }' <<< "$line" )
Suppose
$ line="1|2|3"
$ Path=$( awk -F '|' '{ print $1 }' <<< "$line" )
$ echo $Path; # you get first column
1
Same as
$ Path=$( cut -d'|' -f1 <<< "$line" )
$ echo $Path;
1
The default field separator is whitespace; -F '|' changes the separator to '|'.
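To see that the variants mentioned in this thread agree, here is a small side-by-side sketch. The variable names `with_awk`, `with_cut` and `with_expansion` are made up for the comparison.

```shell
line='a|b|c'

# Three ways to take everything before the first "|".
with_awk=$(printf '%s\n' "$line" | awk -F '|' '{ print $1 }')
with_cut=$(printf '%s\n' "$line" | cut -d '|' -f 1)
with_expansion=${line%%|*}    # pure shell, no subprocess started

echo "$with_awk $with_cut $with_expansion"

# The second field, for comparison; awk numbers fields from 1.
second=$(printf '%s\n' "$line" | awk -F '|' '{ print $2 }')
echo "$second"
```

All three give a for this input, and the last line prints b. The parameter expansion is the cheapest since it runs inside the shell, while awk becomes more convenient once several fields or conditions are involved.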

How can I specify a row in awk in for loop?

I'm using the following awk command:
my_command | awk -F "[[:space:]]{2,}+" 'NR>1 {print $2}' | egrep "^[[:alnum:]]"
which successfully returns my data like this:
fileName1
file Name 1
file Nameone
f i l e Name 1
So as you can see some file names have spaces. This is fine as I'm just trying to echo the file name (nothing special). The problem is calling that specific row within a loop. I'm trying to do it this way:
i=1
for num in $rows
do
fileName=$(my_command | awk -F "[[:space:]]{2,}+" 'NR==$i {print $2}' | egrep "^[[:alnum:]])"
echo "$num $fileName"
$((i++))
done
But my output is always null
I've also tried using awk -v record=$i and then printing $record but I get the below results.
f i l e Name 1
EDIT
Sorry for the confusion: rows is a variable that list ids like this 11 12 13
and each one of those ids ties to a file name. My command without doing any parsing looks like this:
id    File Info      OS
11    File Name1     OS1
12    Fi leNa me2    OS2
13    FileName 3     OS3
I can only use the id field to run the command that I need, but I want to use the File Info field to notify the user of the actual File that the command is being executed against.
I think your $i does not expand as expected. You should quote your arguments this way:
fileName=$(my_command | awk -F "[[:space:]]{2,}+" "NR==$i {print \$2}" | egrep "^[[:alnum:]]")
And you forgot the other ).
EDIT
As an update to your requirement you could just pass the rows to a single awk command instead of a repetitive one inside a loop:
#!/bin/bash
ROWS=(11 12)
function my_command {
    # This function just emulates my_command and should be removed later.
    echo "  id    File Info      OS
  11    File Name1     OS1
  12    Fi leNa me2    OS2
  13    FileName 3     OS3"
}
awk -- '
    BEGIN {
        input = ARGV[1]
        while ((getline line < input) > 0) {
            sub(/^ +/, "", line)       # strip leading spaces
            split(line, a, /  +/)      # columns are separated by two or more spaces
            for (i = 2; i < ARGC; ++i) {
                if (a[1] == ARGV[i]) {
                    printf "%s %s\n", a[1], a[2]
                    break
                }
            }
        }
        exit
    }
' <(my_command) "${ROWS[@]}"
That awk command could be condensed to one line as:
awk -- 'BEGIN { input = ARGV[1]; while ((getline line < input) > 0) { sub(/^ +/, "", line); split(line, a, /  +/); for (i = 2; i < ARGC; ++i) { if (a[1] == ARGV[i]) { printf "%s %s\n", a[1], a[2]; break; }; }; }; exit; }' <(my_command) "${ROWS[@]}"
Or better yet just use Bash instead as a whole:
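#!/bin/bash
shopt -s extglob    # required for the +( ) pattern below
ROWS=(11 12)
while IFS=$' ' read -r LINE; do
    # Turn every run of two or more spaces into a single | and split on it.
    IFS='|' read -ra FIELDS <<< "${LINE// +( )/|}"
    for R in "${ROWS[@]}"; do
        if [[ ${FIELDS[0]} == "$R" ]]; then
            echo "${R} ${FIELDS[1]}"
            break
        fi
    done
done < <(my_command)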
It should give an output like:
11 File Name1
12 Fi leNa me2
Shell variables aren't expanded inside single-quoted strings. Use the -v option to set an awk variable to the shell variable:
fileName=$(my_command | awk -v i=$i -F "[[:space:]]{2,}+" 'NR==i {print $2}' | egrep "^[[:alnum:]]")
This method avoids having to escape all the $ characters in the awk script, as required in konsolebox's answer.
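A self-contained sketch of the -v approach, for experimenting: `my_command` below is only a stand-in that prints the table from the question, and `nth_file_name` is a made-up name. The separator is written as two literal spaces followed by + (meaning "two or more spaces") instead of `{2,}`, on the assumption that the columns are padded with plain spaces; that form also works in awk versions without interval expressions.

```shell
# Stand-in for the real my_command (an assumption for this example).
my_command() {
    printf '%s\n' \
        'id    File Info      OS' \
        '11    File Name1     OS1' \
        '12    Fi leNa me2    OS2' \
        '13    FileName 3     OS3'
}

# Hypothetical helper: print field 2 of record number $1.
# -v copies the shell value into the awk variable i, so the awk
# program can stay inside single quotes.
nth_file_name() {
    my_command | awk -v i="$1" -F '  +' 'NR == i { print $2 }'
}

nth_file_name 3
```

Record 1 is the header line, so record 3 is the row for id 12 and its second field keeps the embedded single spaces.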
As you already heard, you need to populate an awk variable from your shell variable to be able to use the desired value within the awk script, so this:
awk -F "[[:space:]]{2,}+" 'NR==$i {print $2}' | egrep "^[[:alnum:]]"
should be this:
awk -v i="$i" -F "[[:space:]]{2,}+" 'NR==i {print $2}' | egrep "^[[:alnum:]]"
Also, though, you don't need awk AND grep, since awk can do anything grep can do, so you can change this part of your script:
awk -v i="$i" -F "[[:space:]]{2,}+" 'NR==i {print $2}' | egrep "^[[:alnum:]]"
to this:
awk -v i="$i" -F "[[:space:]]{2,}+" '(NR==i) && ($2~/^[[:alnum:]]/){print $2}'
and you don't need a + after a numeric range so you can change {2,}+ to just {2,}:
awk -v i="$i" -F "[[:space:]]{2,}" '(NR==i) && ($2~/^[[:alnum:]]/){print $2}'
Most importantly, though, instead of invoking awk once for every invocation of my_command, you can just invoke it once for all of them, i.e. instead of this (assuming this does what you want):
i=1
for num in $rows
do
    fileName=$(my_command | awk -v i="$i" -F "[[:space:]]{2,}" '(NR==i) && ($2~/^[[:alnum:]]/){print $2}')
    echo "$num $fileName"
    i=$((i + 1))
done
you can do something more like this:
for num in $rows
do
    my_command
done |
awk -F '[[:space:]]{2,}' '$2~/^[[:alnum:]]/{print NR, $2}'
I say "something like" because you don't tell us what "my_command", "rows" or "num" are so I can't be precise but hopefully you see the pattern. If you give us more info we can provide a better answer.
It's pretty inefficient to rerun my_command (and awk) every time through the loop just to extract one line from its output. Especially when all you're doing is printing out part of each line in order. (I'm assuming that my_command really is exactly the same command and produces the same output every time through your loop.)
If that's the case, this one-liner should do the trick:
paste -d' ' <(printf '%s\n' $rows) <(my_command |
    awk -F '[[:space:]]{2,}+' '($2 ~ /^[[:alnum:]]/) {print $2}')
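Given the EDIT in the question (rows holds ids such as 11 12 13, and each id appears in the first column), a lookup by id instead of by line position may be closer to what is wanted. The following is a sketch under those assumptions: `my_command` is a stand-in that prints the table from the question, `names_for_ids` is a made-up name, and the columns are assumed to be padded with two or more plain spaces.

```shell
# Stand-in for the real my_command (an assumption for this example).
my_command() {
    printf '%s\n' \
        'id    File Info      OS' \
        '11    File Name1     OS1' \
        '12    Fi leNa me2    OS2' \
        '13    FileName 3     OS3'
}

rows="11 13"

# Hypothetical helper: run my_command and awk once, remember the wanted
# ids in an array, and print "id file-name" for every matching row.
names_for_ids() {
    my_command | awk -F '  +' -v ids="$1" '
        BEGIN {
            n = split(ids, list, " ")
            for (k = 1; k <= n; k++) want[list[k]] = 1
        }
        NR > 1 && ($1 in want) { print $1, $2 }   # NR > 1 skips the header
    '
}

names_for_ids "$rows"
```

This prints one line per requested id that exists in the table, in table order, and does not depend on the ids and the output rows being in the same sequence.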

Resources