Name (and set) variables in current shell, based on line input data - shell

I have a SQL*Plus output written into a text file in the following format:
3459906| |2|X1|WAS1| Output1
334596| |2|X1|WAS2| Output1
3495792| |1|X1|WAS1| Output1
687954| |1|X1|WAS2| Output1
I need a shell script to fetch the counts at the beginning of each line, based on the text that follows them.
For example, if the text is |2|X1|WAS1|, then 3459906 should be assigned to a variable x1was12, and if the text is |2|X1|WAS2|, then 334596 should be assigned to a variable x1was22.
I tried writing a for loop and if condition to pass on the counts, but was unsuccessful:
export filename1='file1.dat'
while read -r line ; do
if [[ grep -i "*|2|X1|WAS1| Output1*" | wc -l -eq 0 ]] ; then
export xwas12=sed -n ${line}p $filename1 | \
sed 's/[^0-9]*//g' | sed 's/..$//'
elif [[ grep -i "*|2|X1|WAS2| Output1*" | wc -l -eq 0 ]] ; then
export x1was22=sed -n ${line}p $filename1 | \
sed 's/[^0-9]*//g' | sed 's/..$//'
elif [[ grep -i "*|1|X1|WAS1| Output1*" | wc -l -eq 0 ]] ; then
export x1was11=sed -n ${line}p $filename1 | \
sed 's/[^0-9]*//g' | sed 's/..$//'
elif [[ grep -i "*|1|X1|WAS2| Output1*" | wc -l -eq 0 ]]
export x1was21=sed -n ${line}p $filename1 | \
sed 's/[^0-9]*//g' | sed 's/..$//'
fi
done < "$filename1"
echo '$x1was12' > output.txt
echo '$x1was22' >> output.txt
echo '$x1was11' >> output.txt
echo '$x1was21' >> output.txt
What I was trying to do was:
Go to the first line in the file
-> Search for the text and if found then assign the sed output to the variable
Then go to the second line of the file
-> Search for the texts in the if commands and assign the sed output to another variable.
and so on for the remaining lines.

while IFS='|' read -r count _ n x was _; do
# remove spaces from all variables
count=${count// /}; n=${n// /}; x=${x// /}; was=${was// /}
varname="${x}${was}${n}"
printf -v "${varname,,}" %s "$count"
done <<'EOF'
3459906| |2|X1|WAS1| Output1
334596| |2|X1|WAS2| Output1
3495792| |1|X1|WAS1| Output1
687954| |1|X1|WAS2| Output1
EOF
With the above executed:
$ echo "$x1was12"
3459906
Of course, the redirection from a heredoc could be replaced with a redirection from a file as well.
How does this work? Let's break it down:
Every time IFS='|' read -r count _ n x was _ is run, it reads a single line, separating it by |s, putting the first column into count, discarding the second by assigning it to _, reading the third into n, the fourth into x, the fifth into was, and the sixth and all following content into _. This practice is discussed in detail in BashFAQ #1.
count=${count// /} is a parameter expansion which prunes spaces from the variable count, by replacing all such spaces with empty strings. See also BashFAQ #100.
"${varname,,}" is another parameter expansion, this one converting a variable's contents to all-lowercase. (This requires bash 4.0; in prior versions, consider "$(tr '[:upper:]' '[:lower:]' <<<"$varname") as a less-efficient alternative).
printf -v "$varname" %s "value" is a mechanism for doing an indirect assignment to the variable named in the variable varname.
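As a minimal standalone sketch of that indirect assignment (the variable name here is hard-coded for illustration, rather than built from the parsed fields):

```shell
# Compute a variable name at runtime, lowercase it, and assign through it.
varname="X1WAS12"                      # e.g. built from the X1/WAS1/2 fields
printf -v "${varname,,}" %s "3459906"  # ${varname,,} lowercases (bash 4.0+)
echo "$x1was12"                        # prints 3459906
```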

If not for the variable names, the whole thing could be done with two commands:
cut -d '|' -f1 file1.dat | tr -d ' ' > output.txt
The variable names make it more interesting. Two bash methods follow, plus a POSIX method...
The following bash code ought to do what the OP's sample code was
meant to do:
declare $(while IFS='|' read a b c d e f ; do
echo $a 1>&2 ; echo x1${e,,}$c=${a/ /}
done < file1.dat 2> output.txt )
Notes:
The bash shell is needed for ${e,,} (which turns "WAS" into "was"), ${a/ /} (which removes a leading space that might be in $a), and declare.
The while loop parses file1.dat and outputs a bunch of variable assignments. Without the declare this code:
while IFS='|' read a b c d e f ; do
echo x1${e,,}$c=${a/ /} ;
done < file1.dat
Outputs:
x1was12=3459906
x1was22=334596
x1was11=3495792
x1was21=687954
The while loop outputs to two separate streams: stdout (for the declare), and stderr (using the 1>&2 and 2> redirects for
output.txt).
Using bash associative arrays:
declare -A x1was="( $(while IFS='|' read a b c d e f ; do
echo $a 1>&2 ; echo [${e/WAS/}$c]=${a/ /}
done < file1.dat 2> output.txt ) )"
In which case the variable names require brackets:
echo ${x1was[21]}
687954
POSIX shell code (tested using dash):
eval $(while IFS='|' read a b c d e f ; do
echo $a 1>&2; echo x1$(echo $e | tr '[A-Z]' '[a-z]')$c=$(echo $a)
done < file1.dat 2> output.txt )
eval should not be used if there's any doubt about what's in file1.dat. The above code assumes the data in file1.dat is
uniformly dependable.
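To make that warning concrete, here is a sketch (the malicious line is invented) of the difference between a harmless and a hostile input to eval:

```shell
# A well-formed line from file1.dat expands to a harmless assignment:
eval 'x1was12=3459906'
echo "$x1was12"    # prints 3459906

# But eval runs *anything* after a command separator, so a crafted line such as
#   3459906; rm -rf "$HOME"| |2|X1|WAS1| Output1
# would execute the rm. Hence: only eval data you fully trust.
```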

reading lines in a text file with special characters specifically as quoted '<', '>' in bash shell

I have a text file which is the diff output of two grepped files. The text file has lines like below. I need to read the file (loop through the lines in the text file) and, based on the text to the left-hand side of '<' and the right-hand side of '>', do something.
Editing to add details:
LHS of < OR RHS of >
If either of those, I will need to store the content into a variable, get the 1st (ABCDEF) and 3rd (10) fields, and search (grep) for them in one of two other files; if found, print a message and attach those file name(s) to an email DL. All the file names and directories have been stored in separate variables.
How do I do that?
PS: I have basic knowledge of text formatting and bash/shell commands, but am still learning the scripting syntax. Thanks.
ABCDEF,20200101,10 <
PQRSTU,20200106,11 <
LMNOPQ,20200101,12 <
EFGHIJ,20200102,13 <
KLMNOP,20200103,14 <
STUVWX,20200104,15 <
PQRSTU,20200105,16 <
> LMNOPQ,20200101,10
ABCDEF,20200107,17 <
What am I doing wrong now?
while IFS= read -r line; do
if $line =~ ([^[:blank:]]+)[[:blank:]]+\<
then
IFS=, read -r f1 f2 f3 <<< "${BASH_REMATCH[1]}"
#echo "f1=$f1 f2=$f2 f3=$f3"
zgrep "$f1" file1 | grep "with seq $f3" || zgrep "$f1" file2 | grep "with seq $f3"
elif $line =~ \>[[:blank:]]+([^[:blank:]]+)
then
IFS=, read -r g1 g2 g3 <<< "${BASH_REMATCH[1]}"
#echo "g1=$g1 g2=$g2 g3=$g3"
zgrep "$g1" file3 | grep "with seq $g3" || zgrep "$g1" file3 | grep "with seq $g3"
fi
Would you please try something like:
#!/bin/bash
while IFS= read -r line; do
if [[ $line =~ ([^[:blank:]]+)[[:blank:]]+\< || $line =~ \>[[:blank:]]+([^[:blank:]]+) ]]; then
IFS=, read -r f1 f2 f3 <<< "${BASH_REMATCH[1]}"
echo "f1=$f1 f2=$f2 f3=$f3"
# do something here with "$f1", "$f2" and "$f3"
fi
done < file.txt
Output:
f1=ABCDEF f2=20200101 f3=10
f1=PQRSTU f2=20200106 f3=11
f1=LMNOPQ f2=20200101 f3=12
f1=EFGHIJ f2=20200102 f3=13
f1=KLMNOP f2=20200103 f3=14
f1=STUVWX f2=20200104 f3=15
f1=PQRSTU f2=20200105 f3=16
f1=LMNOPQ f2=20200101 f3=10
f1=ABCDEF f2=20200107 f3=17
Please modify the echo "f1=$f1 f2=$f2 f3=$f3" line to your desired
command such as grep.
The regex ([^[:blank:]]+)[[:blank:]]+\< matches a line which contains <
and captures the LHS into the bash variable ${BASH_REMATCH[1]}.
On the other hand, the regex \>[[:blank:]]+([^[:blank:]]+) does the same for
a line which contains >.
The statement IFS=, read -r f1 f2 f3 <<< "${BASH_REMATCH[1]}" splits the bash variable
on , and assigns f1, f2 and f3 to the fields.
Please note that if the input file is very large, a bash solution may not
be efficient in execution time. I used bash just because it is convenient
for passing the variables to your grep command.
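A minimal sketch of those two steps, using one sample line from the question:

```shell
line='ABCDEF,20200101,10 <'
if [[ $line =~ ([^[:blank:]]+)[[:blank:]]+\< ]]; then
    # BASH_REMATCH[1] holds the captured LHS: ABCDEF,20200101,10
    IFS=, read -r f1 f2 f3 <<< "${BASH_REMATCH[1]}"
    echo "f1=$f1 f2=$f2 f3=$f3"    # f1=ABCDEF f2=20200101 f3=10
fi
```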
EDIT
Regarding the updated script in your question, please refer to the following modification:
while IFS= read -r line; do
if [[ $line =~ ([^[:blank:]]+)[[:blank:]]+\< ]]; then
IFS=, read -r f1 f2 f3 <<< "${BASH_REMATCH[1]}"
# echo "f1=$f1 f2=$f2 f3=$f3"
result=$(zgrep "$f1" file1 | grep "with seq $f3" || zgrep "$f1" file2 | grep "with seq $f3")
elif [[ $line =~ \>[[:blank:]]+([^[:blank:]]+) ]]; then
IFS=, read -r g1 g2 g3 <<< "${BASH_REMATCH[1]}"
# echo "g1=$g1 g2=$g2 g3=$g3"
result=$(zgrep "$g1" file3 | grep "with seq $g3" || zgrep "$g1" file3 | grep "with seq $g3")
fi
if [[ -n $result ]]; then
echo "result = $result"
fi
done < file.txt

Inline array substitution

I have file with a few lines:
x 1
y 2
z 3 t
I need to pass each line as paramater to some program:
$ program "x 1" "y 2" "z 3 t"
I know how to do it with two commands:
$ readarray -t a < file
$ program "${a[@]}"
How can i do it with one command? Something like that:
$ program ??? file ???
The (default) options of your readarray command indicate that your file items are separated by newlines.
So in order to achieve what you want in one command, you can take advantage of the special IFS variable to use word splitting w.r.t. newlines (see e.g. this doc) and call your program with a non-quoted command substitution:
IFS=$'\n'; program $(cat file)
As suggested by @CharlesDuffy:
you may want to disable globbing by running beforehand set -f, and if you want to keep these modifications local, you can enclose the whole in a subshell:
( set -f; IFS=$'\n'; program $(cat file) )
to avoid the performance penalty of the parens and of the /bin/cat process, you can write instead:
( set -f; IFS=$'\n'; exec program $(<file) )
where $(<file) is a Bash equivalent to $(cat file) (faster, as it doesn't require forking /bin/cat), and exec consumes the subshell created by the parens.
However, note that the exec trick won't work and should be removed if program is not a real program in the PATH (that is, you'll get exec: program: not found if program is just a function defined in your script).
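The effect of the IFS trick can be checked with set -- standing in for program (the sample data matches the question's file):

```shell
data=$'x 1\ny 2\nz 3 t'      # what $(<file) would expand to
( set -f; IFS=$'\n'
  set -- $data               # same word splitting program would see
  echo "$# args"             # prints: 3 args
  printf '[%s]\n' "$@" )     # [x 1] / [y 2] / [z 3 t]
```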
Passing a set of params should be more organized:
In this example case, I'm looking for a file containing chk_disk_issue=something etc., so I set the values by reading a config file which I pass in as a param.
# -- read specific variables from the config file (if found) --
if [ -f "${file}" ] ;then
while IFS= read -r line ;do
if ! [[ $line = *"#"* ]]; then
var="$(echo $line | cut -d'=' -f1)"
case "$var" in
chk_disk_issue)
chk_disk_issue="$(echo $line | tr -d '[:space:]' | cut -d'=' -f2 | sed 's/[^0-9]*//g')"
;;
chk_mem_issue)
chk_mem_issue="$(echo $line | tr -d '[:space:]' | cut -d'=' -f2 | sed 's/[^0-9]*//g')"
;;
chk_cpu_issue)
chk_cpu_issue="$(echo $line | tr -d '[:space:]' | cut -d'=' -f2 | sed 's/[^0-9]*//g')"
;;
esac
fi
done < "${file}"
fi
If these are not params, then find a way for your script to read them as data inside the script, and pass in the file name.
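The extraction pipeline used above can be checked on a single invented config line:

```shell
line='chk_disk_issue=85%'   # hypothetical config entry
var=$(echo $line | cut -d'=' -f1)
val=$(echo $line | tr -d '[:space:]' | cut -d'=' -f2 | sed 's/[^0-9]*//g')
echo "$var -> $val"         # prints: chk_disk_issue -> 85
```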

How to match 0/1 coded values to a key provided in the same file, and rewrite as a line (instead of a list), in bash

I have an input file, over 1,000,000 lines long which looks something like this:
G A 0|0:2,0:2:3:0,3,32
G A 0|1:2,0:2:3:0,3,32
G C 1|1:0,1:1:3:32,3,0
C G 1|1:0,1:1:3:32,3,0
A G 1|0:0,1:1:3:39,3,0
For my purposes, everything after the first : in the third field is irrelevant (but I left it in as it'll affect the code).
The first field defines the values coded as 0 in the third, and the second field defines the values coded as 1
So, for example:
G A 0|0 = G|G
G A 1|0 = A|G
G A 1|1 = A|A
etc.
I first need to decode the third field, and then convert it from a vertical list to a horizontal list of values, with the values before the | on one line, and the values after on a second line.
So the example at the top would look like this:
HAP0 GGCGG
HAP1 GACGA
I've been working in bash, but any other suggestions are welcome. I have a script which does the job - but it's incredibly slow and long-winded and I'm sure there's a better way.
echo "HAP0 " > output.txt
echo "HAP1 " >> output.txt
while IFS=$'\t' read -a array; do
ref=${array[0]}
alt=${array[1]}
data=${array[2]}
IFS=$':' read -a code <<< $data
IFS=$'|' read -a hap <<< ${code[0]}
if [[ "${hap[0]}" -eq 0 ]]; then
sed -i "1s/$/${ref}/" output.txt
elif [[ "${hap[0]}" -eq 1 ]]; then
sed -i "1s/$/${alt}/" output.txt
fi
if [[ "${hap[1]}" -eq 0 ]]; then
sed -i "2s/$/${ref}/" output.txt
elif [[ "${hap[1]}" -eq 1 ]]; then
sed -i "2s/$/${alt}/" output.txt
fi
done < input.txt
Suggestions?
Instead of running sed in a subshell, use parameter expansion.
#!/bin/bash
printf '%s ' HAP0 > tmp0
printf '%s ' HAP1 > tmp1
while read -a cols ; do
indexes=${cols[2]}
indexes=${indexes%%:*}
idx0=${indexes%|*}
idx1=${indexes#*|}
printf '%s' ${cols[idx0]} >> tmp0
printf '%s' ${cols[idx1]} >> tmp1
done < "$1"
cat tmp0
printf '\n'
cat tmp1
printf '\n'
rm tmp0 tmp1
The script creates two temporary files: one contains the first output line, the other the second line.
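The three parameter expansions in the loop can be verified in isolation on one sample field:

```shell
field='1|0:0,1:1:3:39,3,0'   # third column of the last sample line
indexes=${field%%:*}         # drop everything from the first ':'  -> 1|0
idx0=${indexes%|*}           # part before the '|'                 -> 1
idx1=${indexes#*|}           # part after the '|'                  -> 0
echo "$indexes $idx0 $idx1"  # prints: 1|0 1 0
```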
Or, use Perl for even faster solution:
#!/usr/bin/perl
use warnings;
use strict;
my @haps;
while (<>) {
my @cols = split /[\s|:]+/, $_, 5;
$haps[$_] .= $cols[ $cols[ $_ + 2 ] ] for 0, 1;
}
print "HAP$_ $haps[$_]\n" for 0, 1;

Unix file pattern issue: append changing value of variable pattern to copies of matching line

I have a file with contents:
abc|r=1,f=2,c=2
abc|r=1,f=2,c=2;r=3,f=4,c=8
I want a result like below:
abc|r=1,f=2,c=2|1
abc|r=1,f=2,c=2;r=3,f=4,c=8|1
abc|r=1,f=2,c=2;r=3,f=4,c=8|3
The third column value is the r value. A new line should be inserted for each occurrence of r.
I have tried with:
for i in `cat $xxxx.txt`
do
#echo $i
live=$(echo $i | awk -F " " '{print $1}')
home=$(echo $i | awk -F " " '{print $2}')
echo $live
done
but it is not working properly. I am a beginner with sed/awk and not sure how I can use them. Can someone please help with this?
awk to the rescue!
$ awk -F'[,;|]' '{c=0;
for(i=2;i<=NF;i++)
if(match($i,/^r=/)) a[c++]=substr($i,RSTART+2);
delim=substr($0,length($0))=="|"?"":"|";
for(i=0;i<c;i++) print $0 delim a[i]}' file
abc|r=1,f=2,c=2|1
abc|r=1,f=2,c=2;r=3,f=4,c=8|1
abc|r=1,f=2,c=2;r=3,f=4,c=8|3
Use an inner routine (made up of GNU grep, sed, and tr) to compile a second more elaborate sed command, the output of which needs further cleanup with more sed. Call the input file "foo".
sed -n $(grep -no 'r=[0-9]*' foo | \
sed 's/^[0-9]*/&s#.*#\&/;s/:r=/|/;s/.*/&#p;/' | \
tr -d '\n') foo | \
sed 's/|[0-9|]*|/|/'
Output:
abc|r=1,f=2,c=2|1
abc|r=1,f=2,c=2;r=3,f=4,c=8|1
abc|r=1,f=2,c=2;r=3,f=4,c=8|3
Looking at the inner sed code:
grep -no 'r=[0-9]*' foo | \
sed 's/^[0-9]*/&s#.*#\&/;s/:r=/|/;s/.*/&#p;/' | \
tr -d '\n'
Its purpose is to parse foo on the fly (when foo changes, so will the output), and in this instance come up with:
1s#.*#&|1#p;2s#.*#&|1#p;2s#.*#&|3#p;
Which is almost perfect, but it leaves in old data on the last line:
sed -n '1s#.*#&|1#p;2s#.*#&|1#p;2s#.*#&|3#p;' foo
abc|r=1,f=2,c=2|1
abc|r=1,f=2,c=2;r=3,f=4,c=8|1
abc|r=1,f=2,c=2;r=3,f=4,c=8|1|3
That stale |1 is what the final sed 's/|[0-9|]*|/|/' removes.
Here is a pure bash solution. I wouldn't recommend actually using this, but it might help you understand better how to work with files in bash.
# Iterate over each line, splitting into three fields
# using | as the delimiter. (f3 is only there to make
# sure a trailing | is not included in the value of f2)
while IFS="|" read -r f1 f2 f3; do
# Create an array of variable groups from $f2, using ;
# as the delimiter
IFS=";" read -a groups <<< "$f2"
for group in "${groups[#]}"; do
# Get each variable from the group separately
# by splitting on ,
IFS=, read -a vars <<< "$group"
for var in "${vars[#]}"; do
# Split each assignment on =, create
# the variable for real, and quit once we
# have found r
IFS== read name value <<< "$var"
declare "$name=$value"
[[ $name == r ]] && break
done
# Output the desired line for the current value of r
printf '%s|%s|%s\n' "$f1" "$f2" "$r"
done
done < $xxxx.txt
Changes for ksh:
read -A instead of read -a.
typeset instead of declare.
If <<< is a problem, you can use a here document instead. For example:
IFS=";" read -A groups <<EOF
$f2
EOF

sed DON'T remove extra whitespace

It seems everybody else wants to remove any additional whitespace, however I have the opposite problem.
I have a file, call it some_file.txt that looks like
a b c d
and some more
and I'm reading it line-by-line with sed,
num_lines=$(cat some_file.txt | wc -l)
for i in $(seq 1 $num_lines); do
echo $(sed "${i}q;d" $file)
string=$(sed "${i}q;d" $file)
echo $string
done
I would expect the number of whitespace characters to stay the same; however, the output I get is
a b c d
a b c d
and some more
and some more
So it seems the problem is sed removing the extra whitespace between characters. Is there any way to fix this?
Have a look at this example:
$ echo Hello World
Hello World
$ echo "Hello World"
Hello World
sed is not your problem; your problem is that bash word-splits the unquoted output of sed before passing it to echo, which then joins the words with single spaces.
You just need to surround whatever echo is supposed to print with double quotation marks. So instead of
echo $(sed "${i}q;d" $file)
echo $string
You write
echo "$(sed "${i}q;d" $file)"
echo "$string"
The new script should look like this:
#!/usr/bin/env bash
file=some_file.txt
num_lines=$(cat some_file.txt | wc -l)
for i in $(seq 1 $num_lines); do
echo "$(sed "${i}q;d" $file)"
string=$(sed "${i}q;d" $file)
echo "$string"
done
prints the correct output:
a b c d
a b c d
and some more
and some more
However, if you just want to go through your file line by line, I strongly recommend something like this:
while IFS= read -r line; do
echo "$line"
done < some_file.txt
Question from the comments: What to do if you only want 33 lines starting from line x. One possible solution is this:
#!/usr/bin/env bash
declare -i s=$1
declare -i e=${s}+32
sed -n "${s},${e}p" $file | while IFS= read -r line; do
echo "$line"
done
(Note that I would probably include some validation of $1 in there as well.)
I declare s and e as integer variables, so that bash can do some simple arithmetic on them and calculate the actual last line to print.
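A quick check of that integer-attribute behavior (the values are invented):

```shell
declare -i s=40       # as if $1 were 40
declare -i e=s+32     # with -i, the right-hand side is evaluated arithmetically
echo "$s $e"          # prints: 40 72
```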
