bash script to compare number inside the 2 files - shell

I want to compare two numbers from two different files using a Bash script. The files are tmp$i and tmp$(($i-1)). I have tried the script below, but it is not working:
#!/bin/bash
for i in `seq 1 5`
do
if [ $tmp$i -lt $tmp$(($i-1)) ];then
cat tmp$i >> inf
else
cat tmp$i >> sup
fi
done
Sample data
Tmp1:
0.8856143905954186 0.8186070632371812 0.7624440603372680 0.7153352945456424 0.6762383806114797 0.6405457936981878
Tmp2:
0.5809579333203458 0.5567050091247218 0.5329405222386163 0.5115305043007474 0.4963898045543342 0.4846139486344327

You are not setting $tmp so you end up simply comparing whether i is smaller than i-1 which of course it isn't.
Removing the dollar sign nominally fixes that, but will just compare two strings (for which numeric cardinality isn't well-defined, so in practice, always false), not access the contents of files named like those strings. tmp2 is neither larger nor smaller than tmp1. (Bash can perform lexical comparison, but test ... -lt isn't the tool to do that.)
Try this instead:
if [ $(cat "tmp$i") -lt $(cat "tmp$((i - 1))") ]; then
In response to the observation that you want to do this on decimal numbers, you need a different tool, because Bash only supports integer arithmetic. My approach would be to write a simple Awk script which performs the comparison.
In order to be able to use it as a conditional, it should exit(0) if the condition is true, exit(1) otherwise.
In order to keep the main script readable, I would encapsulate it in a function, like this:
smaller_first_line () {
awk 'NR==1 && FNR==1 { i=$1; next } FNR==1 { exit !(i < $1) }' "$1" "$2"
}
if smaller_first_line "tmp$i" "tmp$((i - 1))"; then
:
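For completeness, here is a minimal sketch of how the whole loop might look with a floating-point comparison. The names first_is_smaller and classify_files are hypothetical, and the sketch assumes that only the first number in each file matters and that a tmp0 file exists for the first comparison; adjust both assumptions to your data.

```shell
#!/bin/bash
# Hypothetical helper (not from the original answer): succeeds when the first
# number in file $1 is strictly smaller than the first number in file $2.
first_is_smaller() {
    local a b
    read -r a _ < "$1"    # first whitespace-separated field of each file
    read -r b _ < "$2"
    # awk performs the floating-point comparison; +0 forces numeric context
    awk -v a="$a" -v b="$b" 'BEGIN { exit !(a + 0 < b + 0) }'
}

# Append each of tmp1..tmp5 to "inf" or "sup", depending on how its first
# number compares with the first number of the preceding file.
classify_files() {
    local i
    for i in 1 2 3 4 5; do
        # skip pairs where either file is missing
        [ -f "tmp$i" ] && [ -f "tmp$((i - 1))" ] || continue
        if first_is_smaller "tmp$i" "tmp$((i - 1))"; then
            cat "tmp$i" >> inf
        else
            cat "tmp$i" >> sup
        fi
    done
}
```

Call classify_files from the directory that holds the tmp files.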

Related

Shell Scripting - how to read text file and create a list from it by add a single digit at the end of each word

So let's say I have a "fruit.txt" file with one word on each line, like
Apple
Banana
Chestnut
How do I read the text file and create a list in the form of
Apple0
Apple1
Apple2
...
Chestnut9
?
Regardless of which shell you write your script in, you will need nested loops: the outer loop iterates over the names in fruit.txt, and the inner loop runs once for each digit to append (to append 0-9, the inner loop runs 10 times).
Since you tagged the question [shell], I assume you are asking for a POSIX shell compatible solution. That can be something as simple as:
#!/bin/sh
fname="${1:-fruit.txt}" ## set filename to read (default fruit.txt)
nvers="${2:-10}" ## take no. of version to create with digits
## validate file is available and non-empty or exit
[ -s "$fname" ] || {
printf "error: %s is not found or empty.\n" "$fname" >&2
exit 1
}
## read each line in fname
while read -r word; do
n=0 ## set counter zero
while [ "$n" -lt "$nvers" ]; do ## loop $nvers many times
printf "%s%d\n" "$word" "$n" ## write word with digit
n=$((n+1)) ## increment counter
done
done < "$fname"
(if you were using bash, you could use a C-style for((...)) loop for the inner loop)
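As a rough sketch of that Bash variant, the inner loop could be written with a C-style for loop. The function name append_digits is hypothetical, and the defaults simply mirror the POSIX script above.

```shell
#!/bin/bash
# append_digits is a hypothetical name; arguments mirror the POSIX script:
#   $1 = file to read (default fruit.txt), $2 = number of versions (default 10)
append_digits() {
    local fname=${1:-fruit.txt} nvers=${2:-10} word n
    while IFS= read -r word; do
        # Bash-only C-style loop replaces the manual counter
        for ((n = 0; n < nvers; n++)); do
            printf '%s%d\n' "$word" "$n"
        done
    done < "$fname"
}
```

For example, append_digits fruit.txt 4 would print Apple0 through Apple3 for the first line, and so on for each following word.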
Example Use/Output
Since there are default values for both fname (fruit.txt) and nvers (10), you simply need to run the script if you want the defaults, e.g.
$ sh append_digits.sh
Apple0
Apple1
Apple2
Apple3
...
Chestnut7
Chestnut8
Chestnut9
To create only 4 versions of each fruit you can simply give the necessary arguments, e.g.
$ sh append_digits.sh fruit.txt 4
Apple0
Apple1
Apple2
Apple3
Banana0
Banana1
Banana2
Banana3
Chestnut0
Chestnut1
Chestnut2
Chestnut3
Using awk
The proper tool for the job in shell is awk where you can simply read each line in the file looping however many times needed to append the digits you want, e.g.
$ awk '{ for(i=0; i<10; i++) print $1 i }' fruit.txt
Apple0
Apple1
Apple2
Apple3
...
Chestnut7
Chestnut8
Chestnut9
If you want to pass-in the number of versions to create you can use the -v option to declare an awk variable on the command line, e.g.
awk -v n=4 '{ for(i=0; i<n; i++) print $1 i }' fruit.txt
This would output versions 0-3 of each fruit. Much easier.

How To Split Up Digits Into Character Array

I'm a bit stuck with something. I have a for loop like this:
#!/bin/bash
for i in {10..15}
do
I want to obtain the last digit of the number, so if i is 12, I want to get 2. I'm having difficulties with the syntax though. I've read that I should convert it into a character array, but when I do something like:
j=${i[@]}
echo $j
I don't get 1 0 1 1 1 2 and so on...I get 10, 11, 12...How do I get the numbers to be split up so I can get the last one of i, when I don't always know how many digits will make up i (ex. it may be 1, or 10, or a 100, etc.)?
Trick is to treat $i like a string.
for i in {10..15}; do j="${i: -1}"; echo $j; done
Of course, you do not need to assign to a variable if you don't want to:
for i in {10..15}; do echo "${i: -1}"; done
The answer that uses Bash parameter expansion is probably the most sensible method.
However, you can also use the double parenthesis construct which allows C-style manipulation of variables in Bash.
for i in {10..15}
do
(( j = i % 10 )) # modulo 10 always gives the ones' digit
echo $j
done
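To compare the approaches side by side, here is a small sketch. The variable names are made up for illustration, and the arithmetic form assumes a non-negative integer. Note that the space in ${i: -1} matters: without it, Bash parses ${i:-1} as "use 1 if i is unset".

```shell
#!/bin/bash
i=12345
last_bash=${i: -1}         # substring expansion; the space avoids the ${i:-1} default-value form
last_posix=${i#"${i%?}"}   # POSIX alternative: remove everything except the final character
last_arith=$(( i % 10 ))   # arithmetic alternative, for non-negative integers only
echo "$last_bash $last_posix $last_arith"   # prints: 5 5 5
```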
This awk command could solve your problem:
awk '{print substr($0,length,1)}' test_file
I'm assuming that the numbers are saved in a file test_file
If you want to use for loop:
for i in $(cat test_file)
do
echo $i |tail -c 2
done

Is there a way to implement a counter in bash but for letters instead of numbers?

I'm working with an existing script which was written a bit messily. Setting up a loop with all of the spaghetti code could make a bigger headache than I want to deal with in the near term. Maybe when I have more time I can clean it up but for now, I'm just looking for a simple fix.
The script deals with virtual disks on a xen server. It reads multipath output and asks if particular LUNs should be formatted in any way based on specific criteria. However, rather than taking that disk path and inserting it, already formatted, into a configuration file, it simply presents every line in the format
'phy:/dev/mapper/UUID,xvd?,w',
UUID, of course, is an actual UUID.
The script actually presents each of the found LUNs in this format expecting the user to copy and paste them into the config file replacing each ? with a letter in sequence. This is tedious at best.
There are several ways to increment a number in bash. Among others:
var=$((var+1))
((var+=1))
((var++))
Is there a way to do the same with characters which doesn't involve looping over the entire alphabet such that I could easily "increment" the disk assignment from xvda to xvdb, etc?
To do an "increment" on a letter, define the function:
incr() { LC_CTYPE=C printf "\\$(printf '%03o' "$(($(printf '%d' "'$1")+1))")"; }
Now, observe:
$ echo $(incr a)
b
$ echo $(incr b)
c
$ echo $(incr c)
d
Because this increments up through ASCII, incr z becomes {.
How it works
The first step is to convert a letter to its ASCII numeric value. For example, a is 97:
$ printf '%d' "'a"
97
The next step is to increment that:
$ echo "$((97+1))"
98
Or:
$ echo "$(($(printf '%d' "'a")+1))"
98
The last step is to convert the newly incremented number back to a letter:
$ LC_CTYPE=C printf "\\$(printf '%03o' "98")"
b
Or:
$ LC_CTYPE=C printf "\\$(printf '%03o' "$(($(printf '%d' "'a")+1))")"
b
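If the jump from z to { is unwanted, the same steps can be combined with modulo arithmetic. This is only a sketch: incr_wrap is a hypothetical name, and it assumes a single lowercase ASCII letter as input.

```shell
#!/bin/bash
# incr_wrap: hypothetical variant of incr that wraps z back to a.
# Assumes $1 is a single lowercase ASCII letter.
incr_wrap() {
    local code
    code=$(printf '%d' "'$1")                 # letter -> ASCII code (a is 97)
    code=$(( (code - 97 + 1) % 26 + 97 ))     # advance one step within a..z
    LC_CTYPE=C printf "\\$(printf '%03o' "$code")\n"   # code -> octal escape -> letter
}
```

With this, incr_wrap a prints b and incr_wrap z prints a.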
Alternative
With bash, we can define an associative array to hold the next character:
$ declare -A Incr; last=a; for next in {b..z}; do Incr[$last]=$next; last=$next; done; Incr[z]=a
Or, if you prefer code spread out over multiple lines:
declare -A Incr
last=a
for next in {b..z}
do
Incr[$last]=$next
last=$next
done
Incr[z]=a
With this array, characters can be incremented via:
$ echo "${Incr[a]}"
b
$ echo "${Incr[b]}"
c
$ echo "${Incr[c]}"
d
In this version, the increment of z loops back to a:
$ echo "${Incr[z]}"
a
How about an array with entries A-Z assigned to indexes 1-26?
IFS=':' read -r -a alpharray <<< ":A:B:C:D:E:F:G:H:I:J:K:L:M:N:O:P:Q:R:S:T:U:V:W:X:Y:Z"
This has 1=A, 2=B, etc. If you want 0=A, 1=B, and so on, remove the first colon.
IFS=':' read -r -a alpharray <<< "A:B:C:D:E:F:G:H:I:J:K:L:M:N:O:P:Q:R:S:T:U:V:W:X:Y:Z"
Then later, where you actually need the letter:
var=$((var+1))
echo "'phy:/dev/mapper/UUID,xvd${alpharray[$var]},w',"
The only problem is that if you end up running past 26 letters, you'll start getting blanks returned from the array.
Use a Bash Range
You can use a Bash sequence expression (a form of brace expansion) to generate a range of letters. For example:
for letter in {a..z}; do
echo "phy:/dev/mapper/UUID,xvd${letter},w"
done
See also Ranges in the Bash Wiki.
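Applied to the original problem, a range can be paired with a list of device IDs so that each one gets the next letter. The sketch below is an assumption about how the script might be adapted: print_disk_lines is a hypothetical helper, and its arguments stand in for the real UUIDs read from the multipath output.

```shell
#!/bin/bash
# print_disk_lines: hypothetical helper that prints one config line per ID,
# assigning xvda, xvdb, ... in order. Arguments stand in for real UUIDs.
print_disk_lines() {
    local letters idx=0 id
    letters=( {a..z} )
    for id in "$@"; do
        if (( idx >= ${#letters[@]} )); then
            echo "error: more than ${#letters[@]} disks" >&2
            return 1
        fi
        printf "'phy:/dev/mapper/%s,xvd%s,w',\n" "$id" "${letters[idx]}"
        idx=$((idx + 1))
    done
}
```

For example, print_disk_lines ID1 ID2 would print a line ending in xvda,w for ID1 and one ending in xvdb,w for ID2.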
Here's a function that will return the next letter in the range a-z. An input of 'z' returns 'a'.
nextl(){
((num=(36#$(printf '%c' $1)-9) % 26+97));
printf '%b\n' '\x'$(printf "%x" $num);
}
It treats the first letter of the input as a base 36 integer, subtracts 9, and returns the character whose ordinal number is 'a' plus that value mod 26.
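The arithmetic can be checked step by step. In this sketch the variable names are illustrative only; 36#a is Bash's base#number notation, which reads a as the base-36 digit 10.

```shell
#!/bin/bash
digit=$(( 36#a ))                        # 'a' as a base-36 digit is 10
next_code=$(( (36#a - 9) % 26 + 97 ))    # (10 - 9) % 26 + 97 = 98, the code for 'b'
wrap_code=$(( (36#z - 9) % 26 + 97 ))    # (35 - 9) % 26 + 97 = 97, wraps to 'a'
printf '%d %d %d\n' "$digit" "$next_code" "$wrap_code"   # prints: 10 98 97
```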
Use Jot
While the Bash range option uses built-ins, you can also use a utility like the BSD jot utility. This is available on macOS by default, but your mileage may vary on Linux systems. For example, you'll need to install athena-jot on Debian.
More Loops
One trick here is to pre-populate a Bash array and then use an index variable to grab your desired output from the array. For example:
letters=( "" $(jot -w %c 26 a) )
for idx in 1 26; do
echo ${letters[$idx]}
done
A Loop-Free Alternative
Note that you don't have to increment the counter in a loop. You can do it other ways, too. Consider the following, which will increment any letter passed to the function without having to prepopulate an array:
increment_var () {
local new_var=$(jot -nw %c 2 "$1" | tail -1)
if [[ "$new_var" == "{" ]]; then
echo "Error: You can't increment past 'z'" >&2
return 1
fi
echo -n "$new_var"
}
var="c"
var=$(increment_var "$var")
echo "$var"
This is probably closer to what the OP wants, but it certainly seems more complex and less elegant than the original loop recommended elsewhere. However, your mileage may vary, and it's good to have options!

Storing multiple columns of data from a file in a variable

I'm trying to read from a file the data that it contains and get 2 important pieces of data from the file and use it in a bash script. A string and then a number for example:
Box 12
Toy 85
Dog 13
Bottle 22
I was thinking I could write a while loop to loop through the file and store the data into a variable. However I need two different variables, one for the number and one for the word. How do I get them separated into two variables?
Example code:
#!/bin/bash
declare -a textarr numarr
while read -r text num;do
textarr+=("$text")
numarr+=("$num")
done <file
echo "${textarr[1]} ${numarr[1]}" # prints: Toy 85
The data is stored in two array variables: textarr and numarr.
You can access a single element by index with ${textarr[$index]}, or all elements at once with ${textarr[@]}
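As a small sketch of walking both arrays in parallel, the loop below iterates over the indices rather than the values. The arrays are filled by hand here, using the sample data from the question, so that the example stands on its own.

```shell
#!/bin/bash
# Arrays filled by hand with the sample data from the question
textarr=(Box Toy Dog Bottle)
numarr=(12 85 13 22)

# "${!textarr[@]}" expands to the indices 0 1 2 3
for idx in "${!textarr[@]}"; do
    printf '%s -> %s\n' "${textarr[idx]}" "${numarr[idx]}"
done
echo "${#textarr[@]} rows"    # ${#array[@]} is the number of elements
```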
To read all the data into a single associative array (in bash 4.0 or newer):
#!/bin/bash
declare -A data=( )
while read -r key value; do
data[$key]=$value
done <file
With that done, you can retrieve a value by key efficiently:
echo "${data[Box]}"
...or iterate over all keys:
for key in "${!data[@]}"; do
value=${data[$key]}
echo "Key $key has value $value"
done
You'll note that read takes multiple names on its argument list. When given more than one argument, it splits fields by IFS, putting columns into their respective variables (with the entire rest of the line going into the last variable named, if more columns exist than variables are named).
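A short sketch of that splitting behavior, using a made-up three-column line:

```shell
#!/bin/bash
# Three columns but only two names: the last name receives the rest of the line
read -r text rest <<< "Bottle 22 green"
echo "text=$text rest=$rest"    # prints: text=Bottle rest=22 green

# A throwaway name such as _ can absorb unwanted trailing columns
read -r text num _ <<< "Bottle 22 green"
echo "text=$text num=$num"      # prints: text=Bottle num=22
```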
Here is my own solution, offered for discussion; I am not sure whether it is a good one. A while read loop that receives its input from a pipe runs in a subshell, so variables set inside it are not visible after the loop ends (redirecting the file into the loop with done < file avoids this). Here is some example code that you can modify to suit your own needs. If you have more columns of data, a slight adjustment is needed.
#!/bin/sh
res=$(awk 'BEGIN{OFS=" "}{print $2, $3 }' mytabularfile.tab)
n=0
for x in $res; do
row=$(expr $n / 2)
col=$(expr $n % 2)
#echo "row: $row column: $col value: $x"
if [ $col -eq 0 ]; then
if [ $n -gt 0 ]; then
echo "row: $row "
echo col1=$col1 col2=$col2
fi
col1=$x
else
col2=$x
fi
n=$(expr $n + 1)
done
row=$(expr $row + 1)
echo "last row: $row col1=$col1 col2=$col2"
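Whether a while read loop can update outer variables depends on how it gets its input. The sketch below, with illustrative variable names and Bash's default settings (lastpipe off), contrasts a piped loop with a redirected one.

```shell
#!/bin/bash
count_piped=0
printf 'Box 12\nToy 85\n' | while read -r text num; do
    count_piped=$((count_piped + 1))      # runs in a subshell; the change is lost
done

count_redirected=0
while read -r text num; do
    count_redirected=$((count_redirected + 1))   # runs in the current shell
done <<'EOF'
Box 12
Toy 85
EOF

echo "$count_piped $count_redirected"    # prints: 0 2
```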

Bash script that analyzes report files

I have the following bash script which I will use to analyze all report files in the current directory:
#!/bin/bash
# methods
analyzeStructuralErrors()
{
# do something with $1
}
# main
reportFiles=`find $PWD -name "*_report*.txt"`;
for f in $reportFiles
do
echo "Processing $f"
analyzeStructuralErrors $f
done
My report files are formatted as such:
Error Code for Issue X - Description Text - Number of errors.
col1_name,col2_name,col3_name,col4_name,col5_name,col6_name
1143-1-1411-247-1-72953-1
1143-2-1411-247-436-72953-1
2211-1-1888-204-442-22222-1
Error Code for Issue Y - Description Text - Number of errors.
col1_name,col2_name,col3_name,col4_name,col5_name,col6_name
Other data
.
.
.
I'm looking for a way to go through each file and aggregate the report data. In the above example, we have two unique issues of type X, which I would like to handle in analyzeStructural. Other types of issues can be ignored in this routine. Can anyone offer advice on how to do this? I want to read each line until I hit the next error basically, and put that data into some kind of data structure.
Below is a working awk implementation that uses its pseudo-multidimensional arrays. I've included sample output to show you how it looks. I took the liberty of adding a 'Count' column to denote how many times a certain "Issue" was hit for a given Error Code.
#!/bin/bash
awk '
/Error Code for Issue/ {
errCode[currCode=$5]=$5
}
/^ *[0-9-]+$/ {
split($0, tmpArr, "-")
error[errCode[currCode],tmpArr[1]]++
}
END {
for (code in errCode) {
printf("Error Code: %s\n", code)
for (item in error) {
split(item, subscr, SUBSEP)
if (subscr[1] == code) {
printf("\tIssue: %s\tCount: %s\n", subscr[2], error[item])
}
}
}
}
' *_report*.txt
Output
$ ./report.awk
Error Code: B
Issue: 1212 Count: 3
Error Code: X
Issue: 2211 Count: 1
Issue: 1143 Count: 2
Error Code: Y
Issue: 2961 Count: 1
Issue: 6666 Count: 1
Issue: 5555 Count: 2
Issue: 5911 Count: 1
Issue: 4949 Count: 1
Error Code: Z
Issue: 2222 Count: 1
Issue: 1111 Count: 1
Issue: 2323 Count: 2
Issue: 3333 Count: 1
Issue: 1212 Count: 1
As suggested by Dave Jarvis, awk:
handles this better than bash
is fairly easy to learn
is likely available wherever bash is available
I've never had to look farther than The AWK Manual.
It would make things easier if you used a consistent field separator for both the list of column names and the data. Perhaps you could do some pre-processing in a bash script using sed before feeding to awk. Anyway, take a look at multi-dimensional arrays and reading multiple lines in the manual.
Bash has one-dimensional arrays that are indexed by integers. Bash 4 adds associative arrays. That's it for data structures. AWK has one dimensional associative arrays and fakes its way through two dimensional arrays. If you need some kind of data structure more advanced than that, you'll need to use Python, for example, or some other language.
That said, here's a rough outline of how you might parse the data you've shown.
#!/bin/bash
# methods
analyzeStructuralErrors()
{
local f=$1
local Xpat="Error Code for Issue X"
local notXpat="Error Code for Issue [^X]"
# flag is true while we are inside an "Issue X" section
local flag=false line issue field
local -a columns issues array
while read -r line
do
if [[ $line =~ $Xpat ]]
then
flag=true
elif [[ $line =~ $notXpat ]]
then
flag=false
elif $flag && [[ $line =~ , ]]
then
# columns could be overwritten if there is more than one X section
IFS=, read -ra columns <<< "$line"
elif $flag && [[ $line =~ - ]]
then
issues+=("$line")
else
echo "unrecognized data line"
echo "$line"
fi
done < "$f"
for issue in "${issues[@]}"
do
IFS=- read -ra array <<< "$issue"
# do something with ${array[0]}, ${array[1]}, etc.
# or iterate
for field in "${array[@]}"
do
: # do something with $field
done
done
}
# main
find . -name "*_report*.txt" | while read -r f
do
echo "Processing $f"
analyzeStructuralErrors "$f"
done
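If only the Issue X section matters, the awk and Bash approaches can be combined. The following is a sketch under assumptions: count_issue_x is a hypothetical helper, the section header is assumed to start the line exactly as in the sample, and rows are grouped by their first dash-separated field.

```shell
#!/bin/bash
# count_issue_x: hypothetical helper that counts the data rows in the
# "Issue X" section of one report file, grouped by the first field.
count_issue_x() {
    awk -F- '
        # every section header switches the flag on (Issue X) or off (anything else)
        /^Error Code for Issue/ { in_x = ($0 ~ /^Error Code for Issue X /); next }
        # data rows look like 1143-1-1411-..., optionally padded with spaces
        in_x && /^ *[0-9]+(-[0-9]+)+ *$/ {
            key = $1
            gsub(/ /, "", key)
            count[key]++
        }
        END { for (k in count) print k, count[k] }
    ' "$1" | sort
}
```

For the sample report in the question, this would print 1143 2 and 2211 1, one pair per line.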

Resources