How to extract values from a string on ksh (korn shell) - shell

I have the following input
MyComposite[2.1], partition=default, mode=active, state=on, isDefault=true, deployedTime=2012-05-07T15:35:22.473-07:00
MessageManager[1.0], partition=default, mode=active, state=on, isDefault=true, deployedTime=2012-05-07T15:37:14.137-07:00
SimpleApproval[1.0], partition=default, mode=active, state=on, isDefault=true, deployedTime=2012-05-07T15:28:39.599-07:00
and I have a script that parses the input line by line from a file, but I don't have a clue how to extract individual parameters from each line into local variables so I can do further processing.
So far I'm trying the following:
#!/bin/ksh
file="output"
compositeName="foo"
ci=0
# while loop
while read line
do
# display line or do something on $line
if echo "$line" | egrep -q '\[[0-9]*\.[0-9]*\].*?(mode=active).*?(state=on)'
then
compositeName=$( echo "$line" | egrep '[0-9]*' )
echo "$compositeName"
#echo "$line"
fi
done <"$file"
I'm looking to extract only two values from this string: the first word and the float between the brackets
ie:
name = MyComposite
version = 2.1
any ideas?

I'm not sure if those line numbers are in the file or not. If not, you can do this:
#!/usr/bin/env ksh
while IFS="," read nameVersion line; do
name="${nameVersion%%\[*}"
version="${nameVersion//*\[+([0-9.])\]*/\1}"
print "name=$name version=$version"
done < "$file"
If the line numbers are in the file, change the name assignment in the above script to name="${nameVersion//+([0-9]).+( )+(*)\[*/\3}"
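For shells that lack ksh93's back-reference substitution, the same two values can be pulled out with plain prefix and suffix stripping. What follows is a minimal sketch rather than the answer's own method: the sample lines are shortened from the question, the third line and the `result` variable are made up for illustration, and the `case` filter only mirrors the question's mode=active / state=on check.

```shell
#!/bin/sh
result=""
while IFS= read -r line; do
    case $line in
        *"mode=active"*"state=on"*) ;;   # keep active, switched-on entries
        *) continue ;;                   # skip everything else
    esac
    name=${line%%\[*}       # text before the first '['
    rest=${line#*\[}        # text after the first '['
    version=${rest%%\]*}    # text before the first ']'
    result="$result$name=$version;"      # hypothetical accumulator
    echo "name=$name version=$version"
done <<'EOF'
MyComposite[2.1], partition=default, mode=active, state=on, isDefault=true
MessageManager[1.0], partition=default, mode=active, state=on, isDefault=true
Retired[0.9], partition=default, mode=retired, state=off, isDefault=false
EOF
```

Because the loop reads from a redirection rather than a pipe, `name` and `version` remain set in the calling shell after the loop ends.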

Related

Array variable in command in url

I have a problem with URL formatting in a bash script. In the code below, the request
text="$(lynx --dump https://address/"${array[${i}]}")"
returns HTTP Error 400. The request URL is invalid. I assume something is wrong with the
"${array[${i}]}"
part of the URL, but I can't figure out what the right format is.
#!/bin/bash
saveIFS="$IFS"
IFS=$'\n'
array=($(<words))
IFS="$saveIFS"
elements=${#array[@]}
for (( i=0;i<$elements;i++))
do
text="$(lynx --dump https://address/"${array[${i}]}")"
echo "$text" >> "outputfilename"
done
I also tried:
text="$(lynx --dump https://address/${array[${i}]})"
Try
#!/bin/bash
IFS=$'\n' read -rd '' -a array <words
elements=${#array[@]}
for (( i=0;i<$elements;i++))
do
text="$(lynx --dump https://address/"${array[${i}]}")"
echo "$text" >> "outputfilename"
done
The array variable wasn't being set with array=($(<words))
You can use read or readarray, but this example is with read
Incidentally, putting IFS=$'\n' before read without a command separator ; sets $IFS only for the read command, removing the need to save and re-set $IFS
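Since readarray is mentioned but not shown, here is a small sketch of that variant. It assumes bash 4 or later (readarray is also spelled mapfile), feeds two made-up words through a here-document instead of the words file, and only prints the URLs rather than calling lynx.

```shell
#!/bin/bash
# -t drops the trailing newline from each element
readarray -t array <<'EOF'
first-page
second-page
EOF

elements=${#array[@]}
for (( i = 0; i < elements; i++ )); do
    # https://address/ is the question's placeholder host
    echo "https://address/${array[i]}"
done
```

Like the IFS=$'\n' read form, this leaves $IFS untouched for the rest of the script.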
You don't need an array at all; the following will work in any POSIX-compatible shell, assuming you have one URL component per line:
while IFS= read -r line; do
text=$(lynx --dump https://address/"$line")
echo "$text"
done < words >> outputfilename
My two cents...
I prefer to use printf -v for this (a bash feature), and it can be built as a filter:
catWeb() {
while IFS= read -r word;do
printf -v url "https://address/%s" "$word"
lynx --dump "$url"
done
}
catWeb <words >outputfilename
I was reading a Windows file whose lines end with CR LF, so each address contained a trailing \r character. I can remove it:
array[${i}]=${array[${i}]%$'\r'}
Or I can convert the input file so that lines end with LF only.
The main structure of the working script, reading from a CR LF file, is:
#!/bin/bash
IFS=$'\n' read -rd '' -a array <words
elements=${#array[@]}
for (( i=0;i<$elements;i++))
do
array[${i}]=${array[${i}]%$'\r'}
text="$(lynx --dump https://address/"${array[${i}]}")"
if [ ${#text} -gt 1 ]
then
echo "$text" >> "filename"
else
echo "${array[${i}]}" >> "filename2"
fi
done
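The same carriage-return fix also works without arrays or bash's $'\r' quoting. A sketch, assuming a throwaway temporary file with two made-up CRLF-terminated words; `cr`, `tmp` and `cleaned` are illustrative names:

```shell
#!/bin/sh
cr=$(printf '\r')                      # a literal carriage return
tmp=$(mktemp)
printf 'alpha\r\nbeta\r\n' > "$tmp"    # hypothetical CRLF input

cleaned=""
while IFS= read -r line; do
    line=${line%"$cr"}                 # drop one trailing CR, if present
    cleaned="$cleaned[$line]"
done < "$tmp"
rm -f "$tmp"

printf '%s\n' "$cleaned"
```

The brackets in the output make a leftover \r visible, since it would otherwise be invisible on a terminal.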

envsubst based on string

I have the following two lines in my text file
$MyEnv
someText$MyEnv
I want to use envsubst to replace only the second occurrence of the MyEnv variable. How can I use the string "someText" to distinguish between the first and second occurrences of the variable and substitute the environment variable only there?
So after envsubst < file1 > file2, file2 should contain:
$MyEnv
someTextValueofMyEnv
How is this possible?
The following code will substitute all environment variables on the second line of the input. The requested envsubst command is the only non-builtin that is used.
L=0
while read line; do
L=$((L+1))
if [ $L = 2 ]; then
echo "$line" |envsubst
else
echo "$line"
fi
done < file1 > file2
Start reading with the last line, since it dictates the inputs and outputs: the contents of file1 are read line by line, populating $line on each iteration of the while loop, and everything the loop echoes is redirected into file2.
We have a line counter $L which increments at the beginning of the loop. If we're on line 2, we send the line through envsubst. Otherwise, we just report it.
You also asked how you could use the string "someText" to distinguish between occurrences. I'm not exactly sure what you mean by this, but consider this:
while read line; do
# $line contains the string 'someText$MyEnv'
# (literally: $line does not match itself when removing that string)
if [ "$line" != "${line#*someText\$MyEnv}" ]; then
echo "$line" |envsubst
else
echo "$line"
fi
done < file1 > file2
Note: envsubst will only substitute exported variables. The envsubst command is not portable; it's part of GNU gettext and it is not a part of either the POSIX standard utilities or the Linux Standard Base commands (LSB).
To be fully portable (and fully using sh builtins!), you'd need to use eval, which is unsafe without lots of extra checks.
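If only one known variable has to be replaced, eval can be avoided entirely by treating $MyEnv as literal text. The sketch below assumes exactly that narrower case: it replaces the first literal $MyEnv on lines that begin with someText. The names `needle`, `before`, `after` and `out` are illustrative, and the input is supplied through a here-document instead of file1.

```shell
#!/bin/sh
MyEnv='ValueofMyEnv'
needle='$MyEnv'          # the literal text to look for, not an expansion
out=""
while IFS= read -r line; do
    case $line in
        someText*"$needle"*)
            before=${line%%"$needle"*}   # text before the first match
            after=${line#*"$needle"}     # text after the first match
            line=$before$MyEnv$after
            ;;
    esac
    out="$out$line|"                     # hypothetical accumulator
    printf '%s\n' "$line"
done <<'EOF'
$MyEnv
someText$MyEnv
EOF
```

Quoting "$needle" inside the pattern makes the shell match it character for character, so the $ is never interpreted.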

extract information from a file in unix using shell script

I have the file below, which contains some data:
name:Mark
age:23
salary:100
I want to read only name and age and assign them to variables in a shell script.
How can I achieve this?
With the script below I can read all of the file's data, but not a particular field:
#!/bin/bash
file="/home/to/person.txt"
val=$(cat "$file")
echo $val
please suggest.
Rather than running multiple greps or bash loops, you could just run a single read that reads the output of a single invocation of awk:
read age salary name <<< $(awk -F: '/^age/{a=$2} /^salary/{s=$2} /^name/{n=$2} END{print a,s,n}' file)
Results
echo $age
23
echo $salary
100
echo $name
Mark
If the awk script sees an age, it sets a to the age. If it sees a salary , it sets s to the salary. If it sees a name, it sets n to the name. At the end of the input file, it outputs what it has seen for the read command to read.
Using grep: \K is part of Perl-compatible regex syntax (enabled by -P in GNU grep). The text to its left must still match, but it is dropped from the reported match, so -o prints only what follows.
name=$(grep -oP 'name:\K.*' person.txt)
age=$(grep -oP 'age:\K.*' person.txt)
salary=$(grep -oP 'salary:\K.*' person.txt)
Or use an awk one-liner. This may break if a line contains an extra :.
declare $(awk '{sub(/:/,"=")}1' person.txt )
This gives the following result:
sh-4.1$ echo $name
Mark
sh-4.1$ echo $age
23
sh-4.1$ echo $salary
100
You could try this
if your data is in a file: data.txt
name:vijay
age:23
salary:100
then you could use a script like this
#!/bin/bash
# read will read a line until it hits a record separator i.e. newline, at which
# point it will return true, and store the line in variable $REPLY
while read
do
if [[ $REPLY =~ ^name:.* || $REPLY =~ ^age:.* ]]
then
eval ${REPLY%:*}=${REPLY#*:} # strip suffix and prefix
fi
done < data.txt # read data.txt from STDIN into the while loop
echo $name
echo $age
output
vijay
23
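The eval line above executes whatever text appears in the file, so it is only safe for trusted input. A sketch of an alternative that avoids eval, assuming the same name/age/salary layout: split each line on the first colon with IFS and assign only the keys you expect.

```shell
#!/bin/sh
name=""
age=""
# IFS=: applies to this read only; value receives the rest of the line
while IFS=: read -r key value; do
    case $key in
        name) name=$value ;;
        age)  age=$value ;;
        # any other key (salary here) is ignored
    esac
done <<'EOF'
name:vijay
age:23
salary:100
EOF

echo "name=$name age=$age"
```

Because there are only two variables after read, a value that itself contains a colon stays intact in `value`.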
If you can store the data in JSON or a similar format, it becomes much easier to access complex data.
data.json
{
"name":"vijay",
"salary":"100",
"age": 23
}
then you can use jq to parse json and get data easily
jq -r '.name' data.json
vijay

Parsing .csv file in bash, not reading final line

I'm trying to parse a csv file I made with Google Spreadsheet. It's very simple for testing purposes, and is basically:
1,2
3,4
5,6
The problem is that the csv doesn't end in a newline character so when I cat the file in BASH, I get
MacBook-Pro:Desktop kkSlider$ cat test.csv
1,2
3,4
5,6MacBook-Pro:Desktop kkSlider$
I just want to read line by line in a BASH script using a while loop that every guide suggests, and my script looks like this:
while IFS=',' read -r last first
do
echo "$last $first"
done < test.csv
The output is:
MacBook-Pro:Desktop kkSlider$ ./test.sh
1 2
3 4
Any ideas on how I could have it read that last line and echo it?
Thanks in advance.
You can force the input to your loop to end with a newline thus:
#!/bin/bash
(cat test.csv ; echo) | while IFS=',' read -r last first
do
echo "$last $first"
done
Unfortunately, this may result in an empty line at the end of your output if the input already has a newline at the end. You can fix that with a little addition:
#!/bin/bash
(cat test.csv ; echo) | while IFS=',' read -r last first
do
if [[ $last != "" ]] ; then
echo "$last $first"
fi
done
Another method relies on the fact that the values are being placed into the variables by the read but they're just not being output because of the while statement:
#!/bin/bash
while IFS=',' read -r last first
do
echo "$last $first"
done <test.csv
if [[ $last != "" ]] ; then
echo "$last $first"
fi
That one works without creating another subshell to modify the input to the while statement.
Of course, I'm assuming here that you want to do more inside the loop than just output the values with a space rather than a comma. If that's all you wanted to do, there are other tools better suited than a bash read loop, such as:
tr "," " " <test.csv
cat file |sed -e '${/^$/!s/$/\n/;}'| while IFS=',' read -r last first; do echo "$last $first"; done
If the last (unterminated) line needs to be processed differently from the rest, @paxdiablo's version with the extra if statement is the way to go; but if it's going to be handled like all the others, it's cleaner to process it in the main loop.
You can roll the "if there was an unterminated last line" into the main loop condition like this:
while IFS=',' read -r last first || [ -n "$last" ]
do
echo "$last $first"
done < test.csv
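To see the combined condition at work, the sketch below writes the question's three rows to a temporary file without a final newline and counts how many times the loop body runs. The `tmp`, `rows` and `final` names are illustrative.

```shell
#!/bin/sh
tmp=$(mktemp)
printf '1,2\n3,4\n5,6' > "$tmp"    # deliberately no newline after 5,6

rows=0
final=""
# read fails on the unterminated last line but still fills the variables,
# so the -n test lets the body run one more time
while IFS=',' read -r last first || [ -n "$last" ]; do
    rows=$((rows + 1))
    final="$last $first"
    echo "$last $first"
done < "$tmp"
rm -f "$tmp"

echo "rows processed: $rows"
```

One caveat: if the unterminated last line has an empty first field, `$last` is empty and that line is still skipped.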

Read a config file in BASH without using "source"

I'm attempting to read a config file that is formatted as follows:
USER = username
TARGET = arrows
I realize that if I got rid of the spaces, I could simply source the config file, but for security reasons I'm trying to avoid that. I know there is a way to read the config file line by line. I think the process is something like:
Read lines into an array
Filter out all of the lines that start with #
search for the variable names in the array
After that I'm lost. Any and all help would be greatly appreciated. I've tried something like this with no success:
backup2.config>cat ~/1
grep '^[^#].*' | while read one two;do
echo $two
done
I pulled that from a forum post I found, just not sure how to modify it to fit my needs since I'm so new to shell scripting.
http://www.linuxquestions.org/questions/programming-9/bash-shell-program-read-a-configuration-file-276852/
Would it be possible to automatically assign a variable by looping through both arrays?
for (( i = 0 ; i < ${#VALUE[@]} ; i++ ))
do
"${NAME[i]}"=VALUE[i]
done
echo $USER
Such that calling $USER would output "username"? The above code isn't working but I know the solution is something similar to that.
The following script iterates over each line in your input file (vars in my case) and does a pattern match against =. If the equals sign is found, it uses Parameter Expansion to separate the variable name from the value. It then stores each part in its own array, name and value respectively.
#!/bin/bash
i=0
while read line; do
if [[ "$line" =~ ^[^#]*= ]]; then
name[i]=${line%% =*}
value[i]=${line#*= }
((i++))
fi
done < vars
echo "total array elements: ${#name[@]}"
echo "name[0]: ${name[0]}"
echo "value[0]: ${value[0]}"
echo "name[1]: ${name[1]}"
echo "value[1]: ${value[1]}"
echo "name array: ${name[@]}"
echo "value array: ${value[@]}"
Input
$ cat vars
sdf
USER = username
TARGET = arrows
asdf
as23
Output
$ ./varscript
total array elements: 2
name[0]: USER
value[0]: username
name[1]: TARGET
value[1]: arrows
name array: USER TARGET
value array: username arrows
First, USER is a shell environment variable, so it might be better if you used something else. Using lowercase or mixed case variable names is a way to avoid name collisions.
#!/bin/bash
configfile="/path/to/file"
shopt -s extglob
while IFS='= ' read lhs rhs
do
if [[ $lhs != *( )#* ]]
then
# you can test for variables to accept or other conditions here
declare $lhs=$rhs
fi
done < "$configfile"
This sets the vars in your file to the value associated with it.
echo "Username: $USER, Target: $TARGET"
would output
Username: username, Target: arrows
Another way to do this using keys and values is with an associative array:
Add this line before the while loop:
declare -A settings
Remove the declare line inside the while loop and replace it with:
settings[$lhs]=$rhs
Then:
# set keys
user=USER
target=TARGET
# access values
echo "Username: ${settings[$user]}, Target: ${settings[$target]}"
would output
Username: username, Target: arrows
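Put together, the associative-array variant might look like the sketch below. It assumes bash 4 or later, reads a made-up config from a here-document instead of $configfile, and tests for comments with a plain \#* pattern, because read has already stripped leading spaces from lhs.

```shell
#!/bin/bash
declare -A settings

while IFS='= ' read -r lhs rhs; do
    # skip blank lines and comment lines
    if [[ -z $lhs || $lhs == \#* ]]; then
        continue
    fi
    settings[$lhs]=$rhs
done <<'EOF'
# sample config using the question's keys
USER = username

TARGET = arrows
EOF

echo "Username: ${settings[USER]}, Target: ${settings[TARGET]}"
```

Nothing from the file is ever declared as a shell variable here, so a config entry named USER or PATH cannot overwrite the real one.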
I have a script which only takes a very limited number of settings, and processes them one at a time, so I've adapted SiegeX's answer to whitelist the settings I care about and act on them as it comes to them.
I've also removed the requirement for spaces around the = in favour of ignoring any that exist using the trim function from another answer.
function trim()
{
local var=$1;
var="${var#"${var%%[![:space:]]*}"}"; # remove leading whitespace characters
var="${var%"${var##*[![:space:]]}"}"; # remove trailing whitespace characters
echo -n "$var";
}
while read line; do
if [[ "$line" =~ ^[^#]*= ]]; then
setting_name=$(trim "${line%%=*}");
setting_value=$(trim "${line#*=}");
case "$setting_name" in
max_foos)
prune_foos $setting_value;
;;
max_bars)
prune_bars $setting_value;
;;
*)
echo "Unrecognised setting: $setting_name";
;;
esac;
fi
done <"$config_file";
Thanks SiegeX. I think the later updates you mentioned are not reflected at this URL.
I had to edit the regex to remove the quotes to get it working. With quotes, array returned is empty.
i=0
while read line; do
if [[ "$line" =~ ^[^#]*= ]]; then
name[i]=${line%% =*}
value[i]=${line##*= }
((i++))
fi
done < vars
A still better version is:
i=0
while read line; do
if [[ "$line" =~ ^[^#]*= ]]; then
name[i]=`echo $line | cut -d'=' -f 1`
value[i]=`echo $line | cut -d'=' -f 2`
((i++))
fi
done < vars
The first version has issues if there are no spaces around "=" in the config file. Also, if the value is missing, the name and value end up populated with the same string. The second version has neither problem. Because $line is expanded unquoted, it also drops leading and trailing whitespace from the line as a whole, although spaces directly around = are kept.
This version reads values that contain =. The earlier version keeps only the text between the first and second =.
i=0
while read line; do
if [[ "$line" =~ ^[^#]*= ]]; then
name[i]=`echo $line | cut -d'=' -f 1`
value[i]=`echo $line | cut -d'=' -f 2-`
((i++))
fi
done < vars
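The same split at the first = can be done without spawning cut, using parameter expansion plus a trim helper like the one shown earlier in this thread. A sketch, with a made-up config line and a trim function rewritten here only so the example is self-contained:

```shell
#!/bin/sh
trim() {
    # print $1 with leading and trailing whitespace removed
    s=$1
    s=${s#"${s%%[![:space:]]*}"}
    s=${s%"${s##*[![:space:]]}"}
    printf '%s' "$s"
}

line='  OPTS = --level=3 --mode=fast  '   # hypothetical config line
name=$(trim "${line%%=*}")    # text before the first '='
value=$(trim "${line#*=}")    # text after the first '=', later '=' kept
printf 'name=[%s] value=[%s]\n' "$name" "$value"
```

Unlike the cut versions, this also removes the spaces directly around the = sign.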
