Using AWK for a variable inside a loop - bash

So I get some URLs from another team, and I need to identify a defined pattern in each URL and save the value after the pattern in a variable.
Can this be achieved?
**Input file: Just an example**
https://stackoverflow.com/questions/hakuna
https://stackoverflow.com/questions/simba
I wrote a simple for loop for this purpose
for i in `cat inputFile`
do
storeVal=awk -v $i -F"questions/" '{print$2}'
echo "The Name for the day is ${storeVal}"
length=`secondScript.sh ${storeVal}`
if [[ $length -gt 10 ]]
then
thirdScript.sh ${storeVal}
elif [[ $length -lt 10 ]]
then
fourthScript.sh ${storeVal}
else
echo "The length of for ${storeVal} is undefined"
done
Desired Output:
The Name for the day is hakuna
The length of for hakuna is greater than 10
Command1 hakuna executed
Command2 hakuna executed
The Name for the day is simba
Command1 simba executed
Command2 simba executed
An extra point to note:
The reason why I need to store the awk cut value in a variable is because I need to use that variable in multiple places with the loop.

Since it sounds like you want to run a command for every line in the input file, you can just use the built-in functionality of the shell:
while IFS=/ read -ra pieces; do
printf '%s\n' "${pieces[@]}" # prints each piece on a separate line
done < inputFile
If you always want the last part of the url (i.e. after the last /) on each line, then you can use "${pieces[-1]}":
while IFS=/ read -ra pieces; do
variable=${pieces[-1]} # do whatever you want with this variable
printf 'The Name for the day is %s\n' "$variable" # e.g. print it
done < inputFile
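Negative array indexes such as ${pieces[-1]} need a fairly recent bash. As a hedged alternative, the same last segment can be taken with a parameter expansion that strips the longest prefix ending in /. This is only a sketch; the function name last_segment is made up for the illustration, and the input is inlined with a here-document instead of inputFile.

```shell
#!/usr/bin/env bash
# Sketch: print the part of a URL after the last "/".
# last_segment is a hypothetical helper name, not part of the answer above.
last_segment() {
    # ${1##*/} removes the longest prefix matching "*/"
    printf '%s\n' "${1##*/}"
}

while IFS= read -r line; do
    variable=$(last_segment "$line")
    printf 'The Name for the day is %s\n' "$variable"
done <<'EOF'
https://stackoverflow.com/questions/hakuna
https://stackoverflow.com/questions/simba
EOF
```

Because the loop reads from a redirection rather than a pipe, variable is still set after the loop ends.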

Since you want to extract everything after the questions/ bit, you can simply use shell parameter expansion (prefix removal) for that.
while IFS= read -r line; do
storeVal=${line##*questions/}
echo "The Name for the day is ${storeVal}"
done < 'inputFile'
Sample input file:
$ cat inputFile
https://stackoverflow.com/questions/hakuna
https://stackoverflow.com/questions/simba
https://stackoverflow.com/questions/simba/lion
Output of script:
The Name for the day is hakuna
The Name for the day is simba
The Name for the day is simba/lion
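To tie this back to the loop in the question, here is a rough sketch that extracts the value once and then reuses it in several places. The scripts secondScript.sh, thirdScript.sh and fourthScript.sh are not shown in the question, so classify_name below is a hypothetical stand-in that only looks at the string length; the temporary input file is also an assumption.

```shell
#!/usr/bin/env bash
# Sketch: extract the value once, then reuse it in several places in the loop.
# classify_name is a hypothetical stand-in for the helper scripts in the
# question; it only checks the string length against 10.
classify_name() {
    length=${#1}                       # string length via parameter expansion
    if [ "$length" -gt 10 ]; then
        echo "long"
    elif [ "$length" -lt 10 ]; then
        echo "short"
    else
        echo "exactly 10"
    fi
}

process_urls() {
    while IFS= read -r line; do
        storeVal=${line##*questions/}  # value after the pattern
        echo "The Name for the day is ${storeVal}"
        echo "${storeVal} is $(classify_name "$storeVal")"
    done < "$1"
}

inputFile=$(mktemp)
printf '%s\n' \
    'https://stackoverflow.com/questions/hakuna' \
    'https://stackoverflow.com/questions/simba-the-lion' > "$inputFile"
process_urls "$inputFile"
rm -f "$inputFile"
```

Note that every if needs a closing fi before done, which the loop in the question is missing.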

Related

Simple bash program which compares values

I have a file which contains various data (date, time, speed, distance from the front, distance from the back). The file looks like this, just with more rows:
2003.09.23.,05:05:21:64,134,177,101
2009.03.10.,17:46:17:81,57,102,57
2018.01.05.,00:30:37:04,354,145,156
2011.07.11.,23:21:53:43,310,125,47
2011.06.26.,07:42:10:30,383,180,171
I'm trying to write a simple Bash program, which tells the dates and times when the 'distance from the front' is less than the provided parameter ($1)
So far I wrote:
#!/bin/bash
if [ $# -eq 0 -o $# -gt 1 ]
then
echo "wrong number of parameters"
fi
i=0
fdistance=()
input='auto.txt'
while IFS= read -r line
do
year=${line::4}
month=${line:5:2}
day=${line:8:2}
hour=${line:12:2}
min=${line:15:2}
sec=${line:18:2}
hthsec=${line:21:2}
fdistance=$(cut -d, -f 4)
if [ "$fdistance[$i]" -lt "$1" ]
then
echo "$year[$i]:$month[$i]:$day[$i],$hour[$i]:$min[$i]:$sec[$i]:$hthsec[$i]"
fi
i=`expr $i + 1`
done < "$input"
but this gives the error "whole expression required" and doesn't work at all.
If you have the option of using awk, the entire process can be reduced to:
awk -F, -v dist=150 '$4<dist {split($1,d,"."); print d[1]":"d[2]":"d[3]","$2}' file
In the example above, for any record whose distance (field 4, $4) is less than the dist variable, split() breaks the date field (field 1, $1) on "." into the array d, so the first three elements are the year, month and day. Those three elements are then printed separated by ":" (which drops the stray "." at the end of the field), followed by the time (field 2, $2) unchanged.
Example Use/Output
With your sample data in file, you can do:
$ awk -F, -v dist=150 '$4<dist {split($1,d,"."); print d[1]":"d[2]":"d[3]","$2}' file
2009:03:10,17:46:17:81
2018:01:05,00:30:37:04
2011:07:11,23:21:53:43
Which provides the records in the requested format where the distance is less than 150. If you call awk from within your script you can pass the 150 in from the 1st argument to your script.
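As a minimal sketch of passing the limit in from a script argument, the one-liner can be wrapped in a function that forwards its first argument to awk with -v. The name front_closer_than and the temporary data file are assumptions made for this illustration; inside a real script you would call it with "$1" and your own file.

```shell
#!/usr/bin/env bash
# Sketch: wrap the awk one-liner so the limit comes from a shell argument.
# front_closer_than is a hypothetical function name; its arguments are
# the distance limit and the data file.
front_closer_than() {
    awk -F, -v dist="$1" \
        '$4 < dist { split($1, d, "."); print d[1] ":" d[2] ":" d[3] "," $2 }' "$2"
}

data=$(mktemp)
cat > "$data" <<'EOF'
2003.09.23.,05:05:21:64,134,177,101
2009.03.10.,17:46:17:81,57,102,57
2018.01.05.,00:30:37:04,354,145,156
2011.07.11.,23:21:53:43,310,125,47
2011.06.26.,07:42:10:30,383,180,171
EOF
result=$(front_closer_than 150 "$data")
printf '%s\n' "$result"
rm -f "$data"
```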
You can also accomplish this task by substituting a ':' for each '.' in the first field with gsub() and outputting a substring of the first field with substr() that drops the last character, e.g.
awk -F, -v dist=150 '$4<dist {gsub(/[.]/,":",$1); print substr($1,1,length($1)-1)","$2}' file
(same output)
While parsing the data is a great exercise for learning string handling in shell or bash, in practice awk will be orders of magnitude faster than a shell script. When processing a million-line file, the difference in runtime can be seconds with awk compared to minutes (or hours) with a shell script.
If this is an exercise to learn string handling in your shell, just put this in your hip pocket for later, understanding that awk is the real Swiss Army knife for text processing (well worth the effort to learn).
Would you try the following:
#!/bin/bash
if (( $# != 1 )); then
echo "usage: $0 max_distance_from_the_front" >& 2 # output error message to the stderr
exit 1
fi
input="auto.txt"
while IFS=, read -r mydate mytime speed fdist bdist; do # split csv and assign variables
mydate=${mydate%.}; mydate=${mydate//./:} # reformat the date string
if (( fdist < $1 )); then # if the front distance is less than $1
echo "$mydate,$mytime" # then print the date and time
fi
done < "$input"
Sample output with the same parameter as Keldorn:
$ ./test.sh 130
2009:03:10,17:46:17:81
2011:07:11,23:21:53:43
There are a few odd things in your script:
Why is fdistance an array? It is not necessary (and done wrong here), since the file is read line by line.
What is fdistance=$(cut -d, -f 4) supposed to cut? What is its input?
(Note: when the parameters are invalid, it is better to end the script right away. I added that in the example below.)
Here is a working version (apart from the parsing of the date, but that is not what your question was about so I skipped it):
#!/usr/bin/env bash
if [ $# -eq 0 -o $# -gt 1 ]
then
echo "wrong number of parameters"
exit 1
fi
input='auto.txt'
while IFS= read -r line
do
fdistance=$(echo "$line" | awk '{split($0,a,","); print a[4]}')
if [ "$fdistance" -lt "$1" ]
then
echo "$line"
fi
done < "$input"
Sample output:
$ ./test.sh 130
2009.03.10.,17:46:17:81,57,102,57
2011.07.11.,23:21:53:43,310,125,47
$

How do i compare a given string with multiple lines of text in bash?

What I want to do is assign the 3rd field (fields are separated by :) from each line in Nurses.txt to a variable and compare it with another string that the user supplies when running the script.
Nurses.txt has this content in it:
12345:Ana Correia:CSLisboa:0:1
98765:Joao Vieira:CSPorto:0:1
54321:Joana Pereira:CSSantarem:0:1
65432:Jorge Vaz:CSSetubal:0:1
76543:Diana Almeida:CSLeiria:0:1
87654:Diogo Cruz:CSBraga:0:1
32198:Bernardo Pato:CSBraganca:0:1
21654:Maria Mendes:CSBeja:0:1
88888:Alice Silva:CSEvora:0:1
96966:Gustavo Carvalho:CSFaro:0:1
And this is the script I have so far, add_nurses.sh:
#!/bin/bash
CS=$(awk -F "[:]" '{print $3}' nurses.txt)
if [["$CS" == "$3"]] ;
then
echo "Error. There is already a nurse registered in that zone";
else
echo "There are no nurses registered in that zone";
fi
When I try to run the script and give it some arguments as shown here:
./add_nurses "Ana Correia" 12345 "CSLisboa" 0
It's supposed to return "Error. There is already a nurse registered in that zone" but instead it just tells me I have an Output error in Line #6...
A simpler and shorter way to do this job is
if grep -q "^[^:]*:[^:]*:$3:" nurses.txt; then
echo "Error. There is already a nurse registered in that zone"
else
echo "There are no nurses registered in that zone"
fi
The grep call can be simplified as grep -Fq ":$3:" if there is no risk of collision with other fields.
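Because grep treats $3 as part of a regular expression (or, with -F, matches it anywhere between two colons), an exact comparison on the third field is sometimes safer. Below is a minimal sketch using awk; zone_taken is a made-up function name, the file path is passed as an argument, and the temporary file only holds two of the sample rows.

```shell
#!/usr/bin/env bash
# Sketch: succeed (exit status 0) only if field 3 equals the zone exactly.
# zone_taken is a hypothetical helper; $1 is the zone, $2 the nurses file.
zone_taken() {
    awk -F: -v zone="$1" '$3 == zone { found = 1 } END { exit !found }' "$2"
}

nurses=$(mktemp)
trap 'rm -f "$nurses"' EXIT
cat > "$nurses" <<'EOF'
12345:Ana Correia:CSLisboa:0:1
98765:Joao Vieira:CSPorto:0:1
EOF

if zone_taken "CSLisboa" "$nurses"; then
    echo "Error. There is already a nurse registered in that zone"
else
    echo "There are no nurses registered in that zone"
fi
```

One caveat of this approach: awk -v interprets backslash escapes in the value, so zone names containing backslashes would need different handling.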
Alternatively, in pure bash without using any external command line utilities:
#!/bin/bash
while IFS=: read -r id name region rest && [[ $region != "$3" ]]; do
:
done < nurses.txt
if [[ $region = "$3" ]]; then
echo "Error. There is already a nurse registered in that zone"
else
echo "There are no nurses registered in that zone"
fi
An alternative way to read the colon separated file would not need awk at all, just bash built-in commands:
read to read from a file into variables
with the -r option to prevent backslash interpretation
IFS as Internal Field Separator to specify the colon : as field separator
#!/bin/bash
# parse parameters to variables
add_nurse=$1
add_id=$2
add_zone=$3
# read colon separated file; IFS=: applies only to the read command,
# and the field order follows nurses.txt (id:name:zone:...)
while IFS=: read -r id nurse zone d1 d2; do
echo "Nurse: $nurse (ID $id)" "Registered Zone: $zone" "$d1" "$d2"
if [ "$nurse" == "$add_nurse" ] ; then
echo "Found specified nurse '$add_nurse' already registered for zone '$zone'."
exit 1
fi
if [ "$zone" == "$add_zone" ] ; then
echo "Found another nurse '$nurse' already registered for specified zone '$add_zone'."
exit 1
fi
done < nurses.txt
# no records found matching nurse or zone
echo "No nurse is registered for specified zone."
See also:
bash - Read cells in csv file - Unix & Linux Stack Exchange
Going by the OP's description, the goal is to check the user input against the fields in nurses.txt to determine whether a nurse is already registered in a given zone. I came up with this solution.
#!/usr/bin/env bash
user_input=("$@")
mapfile -t text_input < <(awk -F':' '{print $2, $1, $3, $4}' nurses.txt)
pattern_from_text_input=$(IFS='|'; printf '%s' "@(${text_input[*]})")
if [[ ${user_input[*]} == $pattern_from_text_input ]]; then
printf 'Error. There is already a nurse "%s" registered in that zone!' "$1" >&2
else
printf 'There are no nurse "%s" registered in that zone.' "$1"
fi
run the script with a debug flag -x e.g.
bash -x ./add_nurses ....
to see what the script is actually doing.
The script works with the sample arguments in the given order; otherwise an option parser might be required.
It requires bash4+ version because of mapfile aka readarray. For completeness a while read loop and an array assignment is an alternative to mapfile.
while read -r lines; do
text_input+=("$lines")
done < <(awk -F':' '{print $2, $1, $3, $4}' nurses.txt)
First, the content of $CS is a list of items, not a single item, so to compare the input against all of them you need to iterate over the list. Otherwise the condition will never be true.
Second, [[ and ]] must be separated from their arguments by spaces. Written as [["$CS" == "$3"]], bash reads [[... as the name of a command to run instead of as a test.
I updated your script, to make it work for the case you described above
#!/bin/bash
CS=$(awk -F "[:]" '{print $3}' nurses.txt)
for item in `echo $CS`
do
[ "$item" == "$3" ] && echo "Error. There is already a nurse registered in that zone" && exit 1
done
echo "There are no nurses registered in that zone";
Output
➜ $ ./add_nurses.sh "Ana Correia" 12345 "CSLisboa" 0
Error. There is already a nurse registered in that zone
➜ $ ./add_nurses.sh "Ana Correia" 12345 "CSLisboadd" 0
There are no nurses registered in that zone
As already stated in comments and answer:
use single brackets with space inside to test variables: [ "$CS" == "$3" ]
if you use awk to get the 3rd field of the file, it returns a whole column, one value per line, rather than a single value: verify the output with echo "$CS"
So you must use a loop to test each value.
If you iterate over each value of the 3rd nurse's column you can apply almost the same if-test. Only difference are the consequences:
in the case when a value does not match you will continue with the next value
if a value matches you could leave the loop, also the bash-script
#!/bin/bash
# array declaration follows pattern: array=(elements)
CS_array=($(awk -F "[:]" '{print $3}' nurses.txt))
# view the awk output: one array element per line
printf '%s\n' "${CS_array[@]}"
# use a for-each loop to check each string-element of the array
for CS in "${CS_array[@]}" ;
do
# your existing if with corrected test brackets
if [ "$CS" == "$3" ] ;
then
echo "Error. There is already a nurse registered in that zone"
# exit to break the loop if a nurse was found
exit 1
# no else needed, only a 'not found' after all have looped without match
fi
done
echo "There are no nurses registered in that zone"
Notice how the array is passed to the loop:
the "" (double quotes) around the expansion make each element a separate string, even if it contains spaces (as a nurse's name might)
the ${} (dollar curly-braces) enclose an expression that is more than just a variable name
the expression CS_array[@] gets every element ([@]) of the array (CS_array)
You could also experiment with the array (different attributes):
echo "${#CS_array[*]}" # size of array with prepended hash
echo "${CS_array[*]}" # word splitting based on $IFS
echo "${CS_array[0]}" # first element of the array, 0 based
Detailed tutorial on arrays in bash: A Complete Guide on How To Use Bash Arrays
See also:
Loop through an array of strings in Bash?

extract information from a file in unix using shell script

I have the file below, which contains some data:
name:Mark
age:23
salary:100
I want to read only name and age and assign them to variables in a shell script.
How can I achieve this?
I am able to read all of the file's data with the script below, but not a particular field:
#!/bin/bash
file="/home/to/person.txt"
val=$(cat "$file")
echo $val
please suggest.
Rather than running multiple greps or bash loops, you could just run a single read that reads the output of a single invocation of awk:
read age salary name <<< $(awk -F: '/^age/{a=$2} /^salary/{s=$2} /^name/{n=$2} END{print a,s,n}' file)
Results
echo $age
23
echo $salary
100
echo $name
Mark
If the awk script sees an age, it sets a to the age. If it sees a salary , it sets s to the salary. If it sees a name, it sets n to the name. At the end of the input file, it outputs what it has seen for the read command to read.
Using grep: \K is part of Perl-compatible regular expressions (enabled by -P). It requires the text to its left to match but leaves it out of the reported match, so with -o only the text after the key and colon is printed.
name=$(grep -oP 'name:\K.*' person.txt)
age=$(grep -oP 'age:\K.*' person.txt)
salary=$(grep -oP 'salary:\K.*' person.txt)
Or use an awk one-liner; this may break if a line contains an extra : or spaces.
declare $(awk '{sub(/:/,"=")}1' person.txt )
This gives the following result:
sh-4.1$ echo $name
Mark
sh-4.1$ echo $age
23
sh-4.1$ echo $salary
100
You could try this
if your data is in a file: data.txt
name:vijay
age:23
salary:100
then you could use a script like this
#!/bin/bash
# read will read a line until it hits a record separator i.e. newline, at which
# point it will return true, and store the line in variable $REPLY
while read
do
if [[ $REPLY =~ ^name:.* || $REPLY =~ ^age:.* ]]
then
eval ${REPLY%:*}=${REPLY#*:} # strip suffix and prefix
fi
done < data.txt # read data.txt from STDIN into the while loop
echo $name
echo $age
output
vijay
23
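If you would rather not run eval on file contents, a case statement can whitelist the keys you accept. This is only a sketch under the same assumptions as above (key:value lines, only name and age wanted); the temporary file stands in for data.txt.

```shell
#!/usr/bin/env bash
# Sketch: read "key:value" lines and keep only name and age, without eval.
# The file content mirrors the example above; the path is a temporary file.
data=$(mktemp)
trap 'rm -f "$data"' EXIT
printf '%s\n' 'name:vijay' 'age:23' 'salary:100' > "$data"

name=''
age=''
while IFS=: read -r key value; do
    case $key in
        name) name=$value ;;   # value keeps any further ":" characters
        age)  age=$value ;;
        *)    ;;               # ignore every other key, e.g. salary
    esac
done < "$data"

echo "$name"
echo "$age"
```

Since only the listed keys are ever assigned, a stray or malicious line in the file cannot create or overwrite other variables.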
If you can store the data in JSON or a similar format, it becomes very easy to access complex data.
data.json
{
"name":"vijay",
"salary":"100",
"age": 23
}
then you can use jq to parse json and get data easily
jq -r '.name' data.json
vijay

Parsing .csv file in bash, not reading final line

I'm trying to parse a csv file I made with Google Spreadsheet. It's very simple for testing purposes, and is basically:
1,2
3,4
5,6
The problem is that the csv doesn't end in a newline character so when I cat the file in BASH, I get
MacBook-Pro:Desktop kkSlider$ cat test.csv
1,2
3,4
5,6MacBook-Pro:Desktop kkSlider$
I just want to read line by line in a BASH script using a while loop that every guide suggests, and my script looks like this:
while IFS=',' read -r last first
do
echo "$last $first"
done < test.csv
The output is:
MacBook-Pro:Desktop kkSlider$ ./test.sh
1 2
3 4
Any ideas on how I could have it read that last line and echo it?
Thanks in advance.
You can force the input to your loop to end with a newline thus:
#!/bin/bash
(cat test.csv ; echo) | while IFS=',' read -r last first
do
echo "$last $first"
done
Unfortunately, this may result in an empty line at the end of your output if the input already has a newline at the end. You can fix that with a little addition:
#!/bin/bash
(cat test.csv ; echo) | while IFS=',' read -r last first
do
if [[ $last != "" ]] ; then
echo "$last $first"
fi
done
Another method relies on the fact that the values are being placed into the variables by the read but they're just not being output because of the while statement:
#!/bin/bash
while IFS=',' read -r last first
do
echo "$last $first"
done <test.csv
if [[ $last != "" ]] ; then
echo "$last $first"
fi
That one works without creating another subshell to modify the input to the while statement.
Of course, I'm assuming here that you want to do more inside the loop than just output the values with a space rather than a comma. If that's all you wanted to do, there are other tools better suited than a bash read loop, such as:
tr "," " " <test.csv
cat file |sed -e '${/^$/!s/$/\n/;}'| while IFS=',' read -r last first; do echo "$last $first"; done
If the last (unterminated) line needs to be processed differently from the rest, @paxdiablo's version with the extra if statement is the way to go; but if it's going to be handled like all the others, it's cleaner to process it in the main loop.
You can roll the "if there was an unterminated last line" into the main loop condition like this:
while IFS=',' read -r last first || [ -n "$last" ]
do
echo "$last $first"
done < test.csv
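To see the effect, here is a small self-contained sketch that builds a file without a trailing newline and counts the rows the loop handles. The temporary file and the count variable are additions for the demonstration only.

```shell
#!/usr/bin/env bash
# Sketch: show that "|| [ -n ... ]" picks up a last line with no trailing newline.
csv=$(mktemp)
trap 'rm -f "$csv"' EXIT
printf '1,2\n3,4\n5,6' > "$csv"     # deliberately no newline after 5,6

count=0
while IFS=',' read -r last first || [ -n "$last" ]; do
    echo "$last $first"
    count=$((count + 1))
done < "$csv"
echo "rows read: $count"
```

This works because read still fills the variables when it hits end of file mid-line; it just returns a non-zero status, which the extra test compensates for. If the first field of the last line could be empty, testing "$last$first" instead may be a safer condition.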

Read a config file in BASH without using "source"

I'm attempting to read a config file that is formatted as follows:
USER = username
TARGET = arrows
I realize that if I got rid of the spaces, I could simply source the config file, but for security reasons I'm trying to avoid that. I know there is a way to read the config file line by line. I think the process is something like:
Read lines into an array
Filter out all of the lines that start with #
search for the variable names in the array
After that I'm lost. Any and all help would be greatly appreciated. I've tried something like this with no success:
backup2.config>cat ~/1
grep '^[^#].*' | while read one two;do
echo $two
done
I pulled that from a forum post I found, just not sure how to modify it to fit my needs since I'm so new to shell scripting.
http://www.linuxquestions.org/questions/programming-9/bash-shell-program-read-a-configuration-file-276852/
Would it be possible to automatically assign a variable by looping through both arrays?
for (( i = 0 ; i < ${#VALUE[@]} ; i++ ))
do
"${NAME[i]}"=VALUE[i]
done
echo $USER
Such that calling $USER would output "username"? The above code isn't working but I know the solution is something similar to that.
The following script iterates over each line in your input file (vars in my case) and does a pattern match against =. If the equal sign is found it will use Parameter Expansion to parse out the variable name from the value. It then stores each part in its own array, name and value respectively.
#!/bin/bash
i=0
while read line; do
if [[ "$line" =~ ^[^#]*= ]]; then
name[i]=${line%% =*}
value[i]=${line#*= }
((i++))
fi
done < vars
echo "total array elements: ${#name[@]}"
echo "name[0]: ${name[0]}"
echo "value[0]: ${value[0]}"
echo "name[1]: ${name[1]}"
echo "value[1]: ${value[1]}"
echo "name array: ${name[@]}"
echo "value array: ${value[@]}"
Input
$ cat vars
sdf
USER = username
TARGET = arrows
asdf
as23
Output
$ ./varscript
total array elements: 2
name[0]: USER
value[0]: username
name[1]: TARGET
value[1]: arrows
name array: USER TARGET
value array: username arrows
First, USER is a shell environment variable, so it might be better if you used something else. Using lowercase or mixed case variable names is a way to avoid name collisions.
#!/bin/bash
configfile="/path/to/file"
shopt -s extglob
while IFS='= ' read lhs rhs
do
if [[ $lhs != *( )#* ]]
then
# you can test for variables to accept or other conditions here
declare $lhs=$rhs
fi
done < "$configfile"
This sets the vars in your file to the value associated with it.
echo "Username: $USER, Target: $TARGET"
would output
Username: username, Target: arrows
Another way to do this using keys and values is with an associative array:
Add this line before the while loop:
declare -A settings
Remove the declare line inside the while loop and replace it with:
settings[$lhs]=$rhs
Then:
# set keys
user=USER
target=TARGET
# access values
echo "Username: ${settings[$user]}, Target: ${settings[$target]}"
would output
Username: username, Target: arrows
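Put together, the associative-array variant might look roughly like the sketch below. It assumes bash 4 or later for declare -A, uses a temporary file in place of the real config file, and swaps the extglob test for a plain case statement that skips blank lines and comments.

```shell
#!/usr/bin/env bash
# Sketch (needs bash 4+ for declare -A): load "KEY = value" lines into an
# associative array instead of creating shell variables.
configfile=$(mktemp)
trap 'rm -f "$configfile"' EXIT
printf '%s\n' '# a comment' 'USER = username' 'TARGET = arrows' > "$configfile"

declare -A settings
while IFS='= ' read -r lhs rhs; do
    case $lhs in
        ''|'#'*) continue ;;   # skip blank lines and comment lines
    esac
    settings[$lhs]=$rhs
done < "$configfile"

echo "Username: ${settings[USER]}, Target: ${settings[TARGET]}"
```

Keeping the settings in an array means a key such as USER in the file never overwrites the real environment variable of the same name.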
I have a script which only takes a very limited number of settings, and processes them one at a time, so I've adapted SiegeX's answer to whitelist the settings I care about and act on them as it comes to them.
I've also removed the requirement for spaces around the = in favour of ignoring any that exist using the trim function from another answer.
function trim()
{
local var=$1;
var="${var#"${var%%[![:space:]]*}"}"; # remove leading whitespace characters
var="${var%"${var##*[![:space:]]}"}"; # remove trailing whitespace characters
echo -n "$var";
}
while read line; do
if [[ "$line" =~ ^[^#]*= ]]; then
setting_name=$(trim "${line%%=*}");
setting_value=$(trim "${line#*=}");
case "$setting_name" in
max_foos)
prune_foos $setting_value;
;;
max_bars)
prune_bars $setting_value;
;;
*)
echo "Unrecognised setting: $setting_name";
;;
esac;
fi
done <"$config_file";
Thanks SiegeX. I think the later updates you mentioned are not reflected at this URL.
I had to edit the regex to remove the quotes to get it working. With quotes, the array returned is empty.
i=0
while read line; do
if [[ "$line" =~ ^[^#]*= ]]; then
name[i]=${line%% =*}
value[i]=${line##*= }
((i++))
fi
done < vars
A still better version is:
i=0
while read line; do
if [[ "$line" =~ ^[^#]*= ]]; then
name[i]=`echo $line | cut -d'=' -f 1`
value[i]=`echo $line | cut -d'=' -f 2`
((i++))
fi
done < vars
The first version has issues if there are no spaces around "=" in the config file. Also, if the value is missing, the name and value end up populated with the same string. The second version has neither problem. In addition, leading and trailing spaces of the line are trimmed, although spaces directly next to "=" are kept.
This version reads values that contain = themselves. The earlier version cuts the value off at the next occurrence of =.
i=0
while read line; do
if [[ "$line" =~ ^[^#]*= ]]; then
name[i]=`echo $line | cut -d'=' -f 1`
value[i]=`echo $line | cut -d'=' -f 2-`
((i++))
fi
done < vars
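The same split at the first = can also be done without echo and cut, using only parameter expansion. The sketch below is one way to do it; trim_spaces and split_setting are hypothetical helper names, and the example.com URL is just a made-up value that contains an = sign.

```shell
#!/usr/bin/env bash
# Sketch: split "name = value" at the first "=" using only parameter expansion.
# trim_spaces and split_setting are hypothetical helper names.
trim_spaces() {
    s=$1
    s=${s#"${s%%[! ]*}"}    # drop leading spaces
    s=${s%"${s##*[! ]}"}    # drop trailing spaces
    printf '%s' "$s"
}

split_setting() {
    setting_name=$(trim_spaces "${1%%=*}")   # text before the first "="
    setting_value=$(trim_spaces "${1#*=}")   # text after the first "=", later "=" kept
}

split_setting 'URL = http://example.com/?a=b'
echo "name:  $setting_name"
echo "value: $setting_value"
```

Here ${1%%=*} removes the longest suffix starting at an =, and ${1#*=} removes the shortest prefix ending in an =, so both cut at the first = and any later = stays in the value.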
