extract information from a file in unix using shell script - shell

I have the file below, which contains some data:
name:Mark
age:23
salary:100
I want to read only the name and age and assign each to a variable in a shell script.
How can I achieve this?
I am able to read all of the file's data with the script below, but not a particular field:
#!/bin/bash
file="/home/to/person.txt"
val=$(cat "$file")
echo $val
Please advise.

Rather than running multiple greps or bash loops, you could just run a single read that reads the output of a single invocation of awk:
read age salary name <<< $(awk -F: '/^age/{a=$2} /^salary/{s=$2} /^name/{n=$2} END{print a,s,n}' file)
Results
echo $age
23
echo $salary
100
echo $name
Mark
If the awk script sees an age, it sets a to the age. If it sees a salary, it sets s to the salary. If it sees a name, it sets n to the name. At the end of the input file, it prints what it has seen, in the order age, salary, name, for the read command to pick up.

Using grep: \K is part of Perl-compatible regular expressions, which grep enables with -P. It resets the start of the reported match, so the text to its left must be present but is left out of the output; with -o, only what follows the prefix is printed.
name=$(grep -oP 'name:\K.*' person.txt)
age=$(grep -oP 'age:\K.*' person.txt)
salary=$(grep -oP 'salary:\K.*' person.txt)
Or use an awk one-liner. Only the first : on each line is turned into =, and the command substitution is unquoted, so this breaks if a value contains whitespace.
declare $(awk '{sub(/:/,"=")}1' person.txt )
This gives the following result:
sh-4.1$ echo $name
Mark
sh-4.1$ echo $age
23
sh-4.1$ echo $salary
100
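Where grep -P is not available (it is a GNU grep option), the same prefix-stripping idea can be sketched with sed. This is only an illustration: the temporary file stands in for person.txt, and the ^ anchor is an addition so that a key is matched only at the start of a line.

```shell
# Hypothetical sample file, recreated here so the sketch is self-contained.
person=$(mktemp)
printf '%s\n' 'name:Mark' 'age:23' 'salary:100' > "$person"

# -n suppresses the default output; s/^key://p prints only the lines
# where the prefix was found, with the prefix removed.
name=$(sed -n 's/^name://p' "$person")
age=$(sed -n 's/^age://p' "$person")

echo "$name"
echo "$age"
rm -f "$person"
```

Like the grep version, this reads the file once per key, which is fine for a handful of keys.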

You could try this.
If your data is in a file named data.txt:
name:vijay
age:23
salary:100
then you could use a script like this
#!/bin/bash
# read will read a line until it hits a record separator i.e. newline, at which
# point it will return true, and store the line in variable $REPLY
while read
do
if [[ $REPLY =~ ^name:.* || $REPLY =~ ^age:.* ]]
then
printf -v "${REPLY%%:*}" '%s' "${REPLY#*:}" # key before the first colon, value after it; no eval needed
fi
done < data.txt # read data.txt from STDIN into the while loop
echo $name
echo $age
output
vijay
23
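A variation on the read loop, offered as a sketch rather than a drop-in replacement: let read split each line on the colon, and name every wanted key explicitly in a case statement, so that text from the file is never turned into a variable name. The temporary file stands in for data.txt.

```shell
# Hypothetical sample file, recreated so the sketch runs on its own.
data=$(mktemp)
printf '%s\n' 'name:vijay' 'age:23' 'salary:100' > "$data"

# IFS=: makes read split at the colon; key gets the first field and
# value gets the remainder, so a value such as 10:30 stays intact.
while IFS=: read -r key value; do
    case $key in
        name) name=$value ;;
        age)  age=$value ;;
    esac
done < "$data"

echo "$name"
echo "$age"
rm -f "$data"
```

Keys that are not listed in the case statement (salary here) are simply skipped.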

If you can store the data as JSON or in a similar format, it becomes very easy to access even complex data.
data.json
{
"name":"vijay",
"salary":"100",
"age": 23
}
Then you can use jq to parse the JSON and get the data easily:
jq -r '.name' data.json
vijay

Related

Using AWK for a variable inside a loop

I get some URLs from another team, and I need to identify a defined pattern in each URL and save the value after the pattern in a variable.
Can this be achieved?
**Input file: Just an example**
https://stackoverflow.com/questions/hakuna
https://stackoverflow.com/questions/simba
I wrote a simple for loop for this purpose
for i in `cat inputFile`
do
storeVal=awk -v $i -F"questions/" '{print$2}'
echo "The Name for the day is ${storeVal}"
length=`secondScript.sh ${storeVal}`
if [[ $length -gt 10 ]]
then
thirdScript.sh ${storeVal}
elif [[ $length -lt 10 ]]
then
fourthScript.sh ${storeVal}
else
echo "The length of for ${storeVal} is undefined"
done
Desired Output:
The Name for the day is hakuna
The length of for hakuna is greater than 10
Command1 hakuna executed
Command2 hakuna executed
The Name for the day is simba
Command1 simba executed
Command2 simba executed
One extra point to note: I need to store the value extracted by awk in a variable because I use it in multiple places within the loop.
Since it sounds like you want to run a command for every line in the input file, you can just use the built-in functionality of the shell:
while IFS=/ read -ra pieces; do
printf '%s\n' "${pieces[@]}" # prints each piece on a separate line
done < inputFile
If you always want the last part of the url (i.e. after the last /) on each line, then you can use "${pieces[-1]}":
while IFS=/ read -ra pieces; do
variable=${pieces[-1]} # do whatever you want with this variable
printf 'The Name for the day is %s\n' "$variable" # e.g. print it
done < inputFile
Since you want to extract everything after the questions/ part, you can simply use shell parameter expansion for that.
while IFS= read -r line; do
storeVal=${line##*questions/}
echo "The Name for the day is ${storeVal}"
done < 'inputFile'
Sample input file:
$ cat inputFile
https://stackoverflow.com/questions/hakuna
https://stackoverflow.com/questions/simba
https://stackoverflow.com/questions/simba/lion
Output of script:
The Name for the day is hakuna
The Name for the day is simba
The Name for the day is simba/lion
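The prefix and suffix forms of parameter expansion used in these answers can be compared side by side. The sketch below uses one of the sample URLs; the variable names are made up for illustration.

```shell
url='https://stackoverflow.com/questions/simba/lion'

after=${url##*questions/}   # longest match of *questions/ removed from the front
last=${url##*/}             # everything up to and including the last / removed
parent=${url%/*}            # shortest match of /* removed from the end
length=${#after}            # number of characters in $after

printf '%s\n' "$after" "$last" "$parent" "$length"
# prints:
# simba/lion
# lion
# https://stackoverflow.com/questions/simba
# 10
```

All four forms are handled by the shell itself, so no awk or cut process is started per line.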

envsubst based on string

I have the following two lines in my text file
$MyEnv
someText$MyEnv
I want to use envsubst to replace only the second occurrence of the MyEnv variable. How can I use the string "someText" to distinguish between the first and second occurrences and substitute the environment variable only there?
So after running envsubst < file1 > file2
file2 should contain:
$MyEnv
someTextValueofMyEnv
How is this possible?
The following code will substitute all environment variables on the second line of the input. The requested envsubst command is the only non-builtin that is used.
L=0
while read line; do
L=$((L+1))
if [ $L = 2 ]; then
echo "$line" |envsubst
else
echo "$line"
fi
done < file1 > file2
Start reading from the last line, since it dictates the inputs and outputs: the contents of file1 are read line by line, populating $line on each iteration of the while loop, and everything the loop echoes is redirected into file2.
We have a line counter $L which increments at the beginning of the loop. If we're on line 2, we send the line through envsubst. Otherwise, we just report it.
You also asked how you could use the string "someText" to distinguish between occurrences. I'm not exactly sure what you mean by this, but consider this:
while read line; do
# $line contains the string 'someText$MyEnv'
# (literally: $line does not match itself when removing that string)
if [ "$line" != "${line#*someText\$MyEnv}" ]; then
echo "$line" |envsubst
else
echo "$line"
fi
done < file1 > file2
Note: envsubst will only substitute exported variables. The envsubst command is not portable; it's part of GNU gettext and it is not a part of either the POSIX standard utilities or the Linux Standard Base commands (LSB).
To be fully portable (and fully using sh builtins!), you'd need to use eval, which is unsafe without lots of extra checks.
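For the narrow case in the question (one known variable, substituted only on lines that start with someText), plain parameter expansion may be enough, with neither envsubst nor eval. The sketch below rests on those assumptions: the temporary directory and the value of MyEnv are made up, and only the first $MyEnv on a matching line is replaced.

```shell
# Hypothetical input mirroring the two lines from the question.
work=$(mktemp -d)
printf '%s\n' '$MyEnv' 'someText$MyEnv' > "$work/file1"

MyEnv="ValueofMyEnv"   # assumed value, for illustration only

while IFS= read -r line; do
    case $line in
        someText*\$MyEnv*)
            # text before the literal $MyEnv, then the value, then the rest
            printf '%s\n' "${line%%\$MyEnv*}${MyEnv}${line#*\$MyEnv}" ;;
        *)
            printf '%s\n' "$line" ;;
    esac
done < "$work/file1" > "$work/file2"

cat "$work/file2"
```

A general template with many variables is a different problem; this only covers a single, known name.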

Parsing a config file in bash

Here's my config file (dansguardian-config):
banned-phrase duck
banned-site allaboutbirds.org
I want to write a bash script that will read this config file and create some other files for me. Here's what I have so far, it's mostly pseudo-code:
while read line
do
# if line starts with "banned-phrase"
# add rest of line to file bannedphraselist
# fi
# if line starts with "banned-site"
# add rest of line to file bannedsitelist
# fi
done < dansguardian-config
I'm not sure if I need to use grep, sed, awk, or what.
Hope that makes sense. I just really hate DansGuardian lists.
With awk:
$ cat config
banned-phrase duck frog bird
banned-phrase horse
banned-site allaboutbirds.org duckduckgoose.net
banned-site froggingbirds.gov
$ awk '$1=="banned-phrase"{for(i=2;i<=NF;i++)print $i >"bannedphraselist"}
$1=="banned-site"{for(i=2;i<=NF;i++)print $i >"bannedsitelist"}' config
$ cat bannedphraselist
duck
frog
bird
horse
$ cat bannedsitelist
allaboutbirds.org
duckduckgoose.net
froggingbirds.gov
Explanation:
In awk, by default each line is split into fields on whitespace, and field i is referenced as $i: the first field on each line is $1, the second is $2, and so on up to $NF, where NF is the variable holding the number of fields on the current line.
So the script is simple:
Check the first field against our required strings $1=="banned-phrase"
If the first field matched then loop over all the other fields for(i=2;i<=NF;i++) and print each field print $i and redirect the output to the file >"bannedphraselist".
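To see the field variables on their own, here is a small sketch that feeds awk one of the sample config lines; the shell variable names are made up for illustration.

```shell
line='banned-phrase duck frog bird'

count=$(printf '%s\n' "$line" | awk '{print NF}')    # number of fields
first=$(printf '%s\n' "$line" | awk '{print $1}')    # first field
last=$(printf '%s\n' "$line" | awk '{print $NF}')    # last field

printf '%s\n' "$count" "$first" "$last"
# prints:
# 4
# banned-phrase
# bird
```

The loop in the answer runs i from 2 to NF, which for this line covers duck, frog and bird.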
You could do
sed -n 's/^banned-phrase *//p' dansguardian-config > bannedphraselist
sed -n 's/^banned-site *//p' dansguardian-config > bannedsitelist
Although that means reading the file twice. I doubt that the possible performance loss matters though.
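If the second pass does matter, sed can write both lists in a single pass through the w flag of the s command. This is a sketch under the assumption that the config looks like the two-line sample from the question; the temporary directory is only there to keep the example self-contained.

```shell
# Hypothetical working directory and sample config.
work=$(mktemp -d)
cd "$work" || exit 1
printf '%s\n' 'banned-phrase duck' 'banned-site allaboutbirds.org' > dansguardian-config

# Each -e expression strips its prefix and, where the substitution
# happened, writes the result to the file named after w.
# -n keeps sed from also printing every line to stdout.
sed -n -e 's/^banned-phrase *//w bannedphraselist' \
       -e 's/^banned-site *//w bannedsitelist' dansguardian-config

cat bannedphraselist
cat bannedsitelist
```

The file name after w runs to the end of the expression, which is why each substitution gets its own -e.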
You can read multiple variables at once; by default they're split on whitespace.
while read command target; do
case "$command" in
banned-phrase) echo "$target" >>bannedphraselist;;
banned-site) echo "$target" >>bannedsitelist;;
"") ;; # blank line
*) echo >&2 "$0: unrecognized config directive '$command'";;
esac
done < dansguardian-config
Just as an example. A smarter implementation would read the list files first, make sure things weren't already banned, etc.
What is the problem with all the solutions that use echo text >> file? You can check with strace that at every such step the file is opened, positioned at the end, written to, and closed. So 1000 runs of echo text >> file mean 1000 open, lseek, write and close calls. The number of open, lseek and close calls can be reduced a lot in the following way:
while read key val; do
case $key in
banned-phrase) echo $val>&2;;
banned-site) echo $val;;
esac
done >bannedsitelist 2>bannedphraselist <dansguardian-config
The stdout and stderr are redirected to files, which stay open for as long as the loop runs. So each file is opened once and closed once, with no need for lseek. File caching is also used more effectively this way, since there are no unnecessary close calls flushing the buffers each time.
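The same open-once idea can be sketched with dedicated file descriptors, which leaves stderr free for real error messages. The descriptor numbers 3 and 4 and the sample config are arbitrary choices for this illustration.

```shell
# Hypothetical working directory and sample config.
work=$(mktemp -d)
cd "$work" || exit 1
printf '%s\n' 'banned-phrase duck' 'banned-site allaboutbirds.org' \
    'banned-phrase horse' > dansguardian-config

# Descriptors 3 and 4 are opened once, when the loop starts,
# and closed once, when it ends.
while read -r key val; do
    case $key in
        banned-phrase) printf '%s\n' "$val" >&3 ;;
        banned-site)   printf '%s\n' "$val" >&4 ;;
        "")            ;;  # blank line
        *)             echo "unrecognized directive: $key" >&2 ;;
    esac
done < dansguardian-config 3> bannedphraselist 4> bannedsitelist

cat bannedphraselist
cat bannedsitelist
```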
while read -r name value
do
if [ "$name" = banned-phrase ]
then
echo "$value" >> bannedphraselist
elif [ "$name" = banned-site ]
then
echo "$value" >> bannedsitelist
fi
done < dansguardian-config
Better to use awk:
awk '$1 ~ /^banned-phrase/{print $2 >> "bannedphraselist"}
$1 ~ /^banned-site/{print $2 >> "bannedsitelist"}' dansguardian-config

Parsing .csv file in bash, not reading final line

I'm trying to parse a csv file I made with Google Spreadsheet. It's very simple for testing purposes, and is basically:
1,2
3,4
5,6
The problem is that the csv doesn't end in a newline character so when I cat the file in BASH, I get
MacBook-Pro:Desktop kkSlider$ cat test.csv
1,2
3,4
5,6MacBook-Pro:Desktop kkSlider$
I just want to read line by line in a BASH script using a while loop that every guide suggests, and my script looks like this:
while IFS=',' read -r last first
do
echo "$last $first"
done < test.csv
The output is:
MacBook-Pro:Desktop kkSlider$ ./test.sh
1 2
3 4
Any ideas on how I could have it read that last line and echo it?
Thanks in advance.
You can force the input to your loop to end with a newline thus:
#!/bin/bash
(cat test.csv ; echo) | while IFS=',' read -r last first
do
echo "$last $first"
done
Unfortunately, this may result in an empty line at the end of your output if the input already has a newline at the end. You can fix that with a little addition:
#!/bin/bash
(cat test.csv ; echo) | while IFS=',' read -r last first
do
if [[ $last != "" ]] ; then
echo "$last $first"
fi
done
Another method relies on the fact that the values are being placed into the variables by the read but they're just not being output because of the while statement:
#!/bin/bash
while IFS=',' read -r last first
do
echo "$last $first"
done <test.csv
if [[ $last != "" ]] ; then
echo "$last $first"
fi
That one works without creating another subshell to modify the input to the while statement.
Of course, I'm assuming here that you want to do more inside the loop than just output the values with a space rather than a comma. If that's all you wanted to do, there are other tools better suited than a bash read loop, such as:
tr "," " " <test.csv
cat file |sed -e '${/^$/!s/$/\n/;}'| while IFS=',' read -r last first; do echo "$last $first"; done
If the last (unterminated) line needs to be processed differently from the rest, @paxdiablo's version with the extra if statement is the way to go; but if it's going to be handled like all the others, it's cleaner to process it in the main loop.
You can roll the "if there was an unterminated last line" into the main loop condition like this:
while IFS=',' read -r last first || [ -n "$last" ]
do
echo "$last $first"
done < test.csv
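One caveat with testing "$last": if the unterminated final line starts with a comma, the first field is empty and the line is still skipped. A sketch that avoids this reads the whole line first and splits it afterwards; the sample data is made up to show that case.

```shell
# Hypothetical CSV with no trailing newline; the last line has an
# empty first field.
csv=$(mktemp)
printf '1,2\n3,4\n,6' > "$csv"

count=0
# read fails at end of file but still fills $line with the partial
# last line, so the -n test lets the loop body run once more.
while IFS= read -r line || [ -n "$line" ]; do
    last=${line%%,*}    # text before the first comma
    first=${line#*,}    # text after the first comma
    echo "$last $first"
    count=$((count + 1))
done < "$csv"

echo "lines processed: $count"
rm -f "$csv"
```

The split assumes every line contains a comma; a line without one would put the whole line in both variables.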

How to extract values from a string on ksh (korn shell)

I have the following input
MyComposite[2.1], partition=default, mode=active, state=on, isDefault=true, deployedTime=2012-05-07T15:35:22.473-07:00
MessageManager[1.0], partition=default, mode=active, state=on, isDefault=true, deployedTime=2012-05-07T15:37:14.137-07:00
SimpleApproval[1.0], partition=default, mode=active, state=on, isDefault=true, deployedTime=2012-05-07T15:28:39.599-07:00
and I have a script that parses the input line by line from a file, but I don't have a clue how to extract individual parameters from each line into local variables so I can do additional processing.
So far I'm trying the following:
#!/bin/ksh
file="output"
compositeName="foo" ci=0
# while loop
while read line
do
# display line or do something on $line
if echo "$line" | egrep -q '\[[0-9]*\.[0-9]*\].*?(mode=active).*?(state=on)'
then
compositeName=$( echo "$line" | egrep '[0-9]*' )
echo "$compositeName"
#echo "$line"
fi
done <"$file"
I'm looking to extract only two values from this string: the first word and the float between the brackets,
i.e.:
name = MyComposite
version = 2.1
any ideas?
I'm not sure if those line numbers are in the file or not. If not, you can do this:
#!/usr/bin/env ksh
while IFS="," read nameVersion line; do
name="${nameVersion%%\[*}"
version="${nameVersion//*\[+([0-9.])\]*/\1}"
print "name=$name version=$version"
done < "$file"
If the line numbers are in the file, change the name assignment in the above script to name="${nameVersion//+([0-9]).+( )+(*)\[*/\3}"
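If the extended pattern syntax above is not available, the two values can also be cut out with the plain prefix and suffix expansions that ksh, bash and POSIX sh share. This is a sketch that assumes the input has no leading line numbers and that mode= always comes before state=, as in the sample; the temporary file stands in for the real input.

```shell
# Hypothetical input file holding two of the sample lines.
input=$(mktemp)
cat > "$input" <<'EOF'
MyComposite[2.1], partition=default, mode=active, state=on, isDefault=true, deployedTime=2012-05-07T15:35:22.473-07:00
MessageManager[1.0], partition=default, mode=active, state=on, isDefault=true, deployedTime=2012-05-07T15:37:14.137-07:00
EOF

while IFS= read -r line; do
    case $line in
        *mode=active*state=on*) ;;   # keep only active entries that are on
        *) continue ;;
    esac
    name=${line%%\[*}       # text before the first [
    rest=${line#*\[}        # text after the first [
    version=${rest%%\]*}    # text before the first ]
    echo "name=$name version=$version"
done < "$input"
rm -f "$input"
```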

Resources