Bash while loop to compare two files and print line number

I have file1:
A
B
C
D
I have file2:
B
C
I want to use a while read loop to go through both files, compare them, and print the line number in file1 of any matching lines.
COUNT=0
while read line
do
    flag = 0
    while read line2
    do
        COUNT=$(( $COUNT + 1 ))
        if ( "$line" = "$line2" )
        then
            flag = 1
        fi
    done < file1
    if ( flag -eq 1 )
    then
        echo $COUNT > file3
    fi
done < file2
However, I get an error: B: command not found
Could someone please let me know where I have gone wrong? Thanks.

There are quite a lot of errors in this code, but to answer the question: the reason you are getting B: command not found is that ( ... ) runs its contents in a subshell, so the shell tries to execute the value of $line (here, B) as a command. The test command in bash is [ ] (or [[ ]]), not ( ).
other errors include:
COUNT=0
while read line
do
    flag=0                           # no space between flag and =
    while read line2
    do
        COUNT=$(( $COUNT + 1 ))
        echo $line
        echo $line2
        if [ "_$line" = "_$line2" ]
        then
            flag=1                   # no space again
        fi
    done < file1
    if [ $flag -eq 1 ]               # use $flag rather than flag
    then
        echo $COUNT > file3
    fi
done < file2

You could also achieve what you are looking for using grep -c like this:
#!/bin/bash
# Clean file3
>file3
while read line; do
    COUNT=$(grep -c "^$line$" file2)
    if [ $COUNT -ge 1 ]; then
        echo $COUNT >> file3
    fi
done < file1
grep -c prints the number of lines in the file that match the expression.
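Since the original goal was to print the line numbers in file1 of matching lines, grep can also do that directly; a minimal sketch, using the sample data from the question:

```shell
# Build the sample files from the question.
printf 'A\nB\nC\nD\n' > file1
printf 'B\nC\n'       > file2
# -n prefixes each match with its line number, -x matches whole lines only,
# -F treats patterns as fixed strings, -f reads the patterns from file2.
grep -nxFf file2 file1 | cut -d: -f1 > file3
cat file3    # prints 2 and 3
```

No loop is needed; grep compares every line of file1 against all patterns in file2 in one pass.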

Related

Issue with program check if the number is divisible by 2 with no remainder BASH

I tried to write a program to check whether the count is divisible by 2 without a remainder.
Here is my program:
count=$((count+0))
while read line; do
    if [ $count%2==0 ]; then
        printf "%x\n" "$line" >> file2.txt
    else
        printf "%x\n" "$line" >> file1.txt
    fi
    count=$((count+1))
done < merge.bmp
The program doesn't work: it always takes the true branch.
In the shell, the [ command does different things depending on how many arguments you give it. See https://www.gnu.org/software/bash/manual/bashref.html#index-test
With this:
[ $count%2==0 ]
you give [ a single argument (not counting the trailing ]), and in that case, if the argument is not empty then the exit status is success (i.e. "true"). This is equivalent to [ -n "${count}%2==0" ]
You want
if [ "$(( $count % 2 ))" -eq 0 ]; then
or, if you're using bash
if (( count % 2 == 0 )); then
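To see the difference concretely, here is a small runnable demonstration of the point above:

```shell
count=3
# A single non-empty word between [ and ] always tests true:
[ $count%2==0 ] && result1="true"
# Arithmetic evaluation tests the actual remainder:
(( count % 2 == 0 )) && result2="even" || result2="odd"
echo "$result1 / $result2"    # true / odd
```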
Some more "exotic" way to do this:
count=0
files=(file1 file2 file3)
num=${#files[@]}
while IFS= read -r line; do
    printf '%s\n' "$line" >> "${files[count++ % num]}"
done < input_file
This will put 1st line to file1, 2nd line to file2, 3rd line to file3, 4th line to file1 and so on.
awk to the rescue!
what you're trying to do is a one-liner
$ seq 10 | awk '{print > (NR%2?"file1":"file2")}'
==> file1 <==
1
3
5
7
9
==> file2 <==
2
4
6
8
10
try
count=$((count+0))
while read line; do
if [ $(($count % 2)) == 0 ]; then
printf "%x\n" "$line" >> file2.txt
else
printf "%x\n" "$line" >> file1.txt
fi
count=$((count+1))
done < merge.bmp
You have to use the $(( )) around a mod operator as well.
How to use mod operator in bash?
This will print "even number":
count=2
if [ $(($count % 2)) == 0 ]; then
    printf "even number"
else
    printf "odd number"
fi
This will print "odd number":
count=3
if [ $(($count % 2)) == 0 ]; then
    printf "even number"
else
    printf "odd number"
fi

Inputs within double `while` loop are not happening

while read line
do
    echo "$line"
    i=0
    rm -rf b.txt
    while [[ $i -lt $line ]]
    do
        i=`expr $i + 1`
        echo "$i " >> b.txt
    done
    a=`cat b.txt`
    for i in $a
    do
        echo "Hari $i \c"
        read input
    done
done < 5.txt
Say 5.txt has the value:
2
3
The script should pause for input inside the for loop, but instead it runs straight through and exits. Can you please help me with this?
Assuming 5.txt consists of:
2
3
The script outputs:
2
Hari 1 \c
Hari 2 \c
...and quits.
The script won't prompt for input after outputting Hari 1 \c because all of the input is coming from 5.txt. On the first pass, $input is set to 3 (the 2nd line of 5.txt). On the second pass, read hits EOF and gives up, much like how this outputs nothing:
read -p "Enter a number" n < /dev/null
This would work:
for i in `cat 5.txt` ; do
    echo $i
    for f in `seq $i` ; do
        read -p "Hari $f \c: " input
    done
done
Note also that $input is never used for anything.
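The usual fix for this class of problem is to read the control file on a different file descriptor, so that a plain read still sees standard input. A sketch (the here-string stands in for keyboard input so the example is self-contained):

```shell
printf '2\n3\n' > 5.txt
out=$(
    while read -u 3 line; do            # fd 3 carries the control file
        for f in $(seq "$line"); do
            read -r input               # reads stdin, not 5.txt
            echo "Hari $f got: $input"
        done
    done 3< 5.txt <<< $'a\nb\nc\nd\ne'
)
echo "$out"
```

Interactively, you would drop the here-string (or use read input < /dev/tty) so that read waits for the user.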

Need to separate information in a text file using grep or awk or sed

I've a text file fruits.txt with the information:
15 Apples 0
155 Bananas 0
250 Mangoes 0
555 Oranges 0
where the first column (15, 155, 250, 555) is the number of fruits (count), the second column (Apples, Bananas, Mangoes, Oranges) is the fruit name, and the third column (0, 0, 0, 0) is the type (or some random thing).
I need to extract the content from fruits.txt into other separate files based on the count of number of fruits in first column.
For example,
if the count is between 1 and 100, it should be stored in a file a.txt
Similarly, between 101 and 200 in b.txt, between 201 and 300 in c.txt, and
between 500 and 600 in d.txt
Desired output:
a.txt should have the following as its content:
15 Apples 0
b.txt as:
155 Bananas 0
c.txt as:
250 Mangoes 0
d.txt as:
555 Oranges 0
Any ideas to get the output using sed or awk or grep?
Awk would work well here:
awk '$1>=1   && $1<=100 {print > "a.txt"}
     $1>=101 && $1<=200 {print > "b.txt"}
     $1>=201 && $1<=300 {print > "c.txt"}
     $1>=500 && $1<=600 {print > "d.txt"}' fruits.txt
This works by specifying where to print the line inside each {} block, and adding a condition in front of each block to limit which records hit the block. For the first file a.txt we use the condition $1>=1 && $1<=100, which says "Test the first field to see if it's between 1 and 100". Then we just repeat for your remaining 3 conditions.
In the end, it's a one-liner that creates 4 files based on your conditions.
This solution uses only Bash. It does basic error handling.
# Initialize (truncate) output files
for outfile in a.txt b.txt c.txt d.txt ; do
    : >"$outfile"
done
while IFS= read -r line || [[ -n $line ]] ; do
    read -r count rest <<<"$line"
    if (( count >= 1 && count <= 100 )) ; then
        printf '%s\n' "$line" >> a.txt
    elif (( count >= 101 && count <= 200 )) ; then
        printf '%s\n' "$line" >> b.txt
    elif (( count >= 201 && count <= 300 )) ; then
        printf '%s\n' "$line" >> c.txt
    elif (( count >= 500 && count <= 600 )) ; then
        printf '%s\n' "$line" >> d.txt
    else
        echo "ERROR - Invalid count in '$line'" >&2
    fi
done < fruits.txt
It's hard to do math in sed and grep.
So that leaves awk. (I'd rather use perl.)
And the requirements are weird.
I'd do something like this:
awk '{ if (0 < $1 && 101 > $1) { print $0 > "a.txt" } }
{ if (100 < $1 && 201 > $1) { print $0 > "b.txt" } }
{ if (200 < $1 && 301 > $1) { print $0 > "c.txt" } }
{ if (499 < $1 && 601 > $1) { print $0 > "d.txt" } }' fruits.txt

How to compare numbers in two files and save the differences into one of files using bash?

$> cat file1.txt
15,20,8,
$> cat file2.txt
10,20,30,
There is only one line in each file. I wanted to compare the comma-separated numbers in the two files and save the difference right next to the old value.
So, using file1.txt as a base, after comparing to file2.txt, I would expect to see:
15(+5),20,8(-22),
Would it be possible?
bash is not the best tool for this, but it is still doable; something like this:
AA="15,20,8"
BB="10,20,30"
IFS=","
declare -a A=($AA)
declare -a B=($BB)
for ((i=0; i<3; i++)); do
    if [ "${A[$i]}" -eq "${B[$i]}" ]; then
        printf '%s,' "${A[$i]}"
    else
        printf '%s(%+d),' "${A[$i]}" "$(( A[$i] - B[$i] ))"
    fi
done
echo
#!/bin/bash
#
# progname: diffcalc
# syntax: diffcalc file1.txt file2.txt
#
# last element in file1.txt must always be a comma
read LINE1 < "$1"
read LINE2 < "$2"
while [ "$LINE1" ]                            # while LINE1 is not empty
do
    DIFF=$(( ${LINE1%%,*} - ${LINE2%%,*} ))   # diff between first elements
    [ $DIFF -gt 0 ] && OUT="(+$DIFF)"
    [ $DIFF -eq 0 ] && OUT=""
    [ $DIFF -lt 0 ] && OUT="($DIFF)"
    RESULT="$RESULT${LINE1%%,*}$OUT,"         # append element and (diff),
    LINE1=${LINE1#*,} ; LINE2=${LINE2#*,}     # cut the first elements
done
echo "$RESULT" > "$1"                         # write outcome to FILE1
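For comparison, here is a hedged awk sketch of the same idea: split both lines on commas, then print each file1 value with the signed difference appended when it is non-zero.

```shell
printf '15,20,8,\n' > file1.txt
printf '10,20,30,\n' > file2.txt
result=$(awk -F, '
    # First input line (from file1.txt): remember its fields.
    NR == 1 { n = NF; for (i = 1; i <= n; i++) a[i] = $i; next }
    # Second input line (from file2.txt): compute and print the diffs.
    { out = ""
      for (i = 1; i < n; i++) {
          d = a[i] - $i
          out = out a[i] (d == 0 ? "" : sprintf("(%+d)", d)) ","
      }
      print out
    }' file1.txt file2.txt)
echo "$result"    # 15(+5),20,8(-22),
```

The %+d format string is what produces the explicit + sign on positive differences.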

BASH while loop to check lines in file displays too many times

I am writing a script where I want to take each line from a file and check for a match in another file.
If I find a match I want to say that I found a match and if not, say that I did not find a match.
The 2 files contain md5 hashes. The old file is the original and the new file is to check if there have been any changes since the original file.
original file: chksum
new file: chksum1
#!/bin/bash
while read e; do
    while read f; do
        if [[ $e = $f ]]
        then
            echo $e "is the same"
        else
            if [[ $e != $f ]]
            then
                echo $e "has been changed"
            fi
        fi
    done < chksum1
done < chksum
My issue is that for the files that have been changed, I get an echo every time the inner loop makes a comparison; I only want it to display the file once and say that it was not found.
Hope this is clear.
you could use the same script but add a flag variable to remember whether a match was found.
#!/bin/bash
while read e; do
    rem=0
    while read f; do
        if [[ $e = $f ]]
        then
            rem=1
        fi
    done < chksum1
    if [[ $rem -eq 1 ]]
    then
        echo $e "is the same"
    else
        echo $e "has been changed"
    fi
done < chksum
This should work correctly
You were real close. This will work:
while read e; do
    found=0
    while read f; do
        if [[ $e = $f ]]
        then
            # echo $e "is the same"
            found=1
            break
        fi
    done < chksum1
    if [ $found -ne 0 ]
    then
        echo "$e is the same"
    else
        echo "$e has been changed"
    fi
done < chksum
A somewhat simpler version that avoids reading the same file multiple times (bash 4.0 and above). I assume the files contain unique file names and are in the output format of the md5sum command.
#!/bin/bash
declare -A hash
while read md5 file; do hash[$file]=$md5; done < chksum
while read md5 file; do
    [ -z "${hash[$file]}" ] && echo "$file new file" && continue
    [ "${hash[$file]}" == "$md5" ] && echo "$file is same" && continue
    echo "$file has been changed"
done < chksum1
This script reads the first file into an associative array, called hash. The index is the name of the file, and the value is the MD5 checksum. The second loop reads the second checksum file: if the file name is not in the hash, it prints file new file; if it is in the hash and the values are equal, it is the same file; if they are not equal, it writes file has been changed.
Input files:
$ cat chksum
eed0fc0313f790cec0695914f1847bca ./a.txt
9ee9e1fffbb3c16357bf80c6f7a27574 ./b.txt
a91a408e113adce865cba3c580add827 ./c.txt
$ cat chksum1
eed0fc0313f790cec0695914f1847bca ./a.txt
8ee9e1fffbb3c16357bf80c6f7a27574 ./b.txt
a91a408e113adce865cba3c580add827 ./d.txt
Output:
./a.txt is same
./b.txt has been changed
./d.txt new file
EXTENDED VERSION
Also detect deleted files.
#!/bin/bash
declare -A hash
while read md5 file; do hash[$file]=$md5; done < chksum
while read md5 file; do
    [ -z "${hash[$file]}" ] && echo "$file new file" && continue
    if [ "${hash[$file]}" == "$md5" ]; then echo "$file is same"
    else echo "$file has been changed"
    fi
    unset hash[$file]
done < chksum1
for file in "${!hash[@]}"; do echo "$file deleted file"; done
Output:
./a.txt is same
./b.txt has been changed
./d.txt new file
./c.txt deleted file
I'd like to suggest an alternate solution: how about not reading line by line, but using sort and uniq -c to see whether there are differences? There is no need for a loop where a simple pipe can do the job.
In this case you want all the lines that have changed in file chksum1, so
sort chksum chksum1 chksum1 | uniq -c | egrep '^\s+2\s' | sed 's%\s\+2\s%%'
This also reads chksum1 only 2 times, as compared to the loop based example, which reads it once per line of chksum.
Reusing the input files from one of the other answers:
samveen@precise:~/so$ cat chksum
eed0fc0313f790cec0695914f1847bca ./a.txt
9ee9e1fffbb3c16357bf80c6f7a27574 ./b.txt
a91a408e113adce865cba3c580add827 ./c.txt
samveen@precise:~/so$ cat chksum1
eed0fc0313f790cec0695914f1847bca ./a.txt
8ee9e1fffbb3c16357bf80c6f7a27574 ./b.txt
a91a408e113adce865cba3c580add827 ./d.txt
samveen@precise:~/so$ sort chksum chksum1 chksum1 | uniq -c | egrep '^\s+2\s' | sed 's%\s\+2\s%%'
8ee9e1fffbb3c16357bf80c6f7a27574 ./b.txt
a91a408e113adce865cba3c580add827 ./d.txt
Another possible solution is (as suggested in the question's comments) to use diff in conjunction with sort:
diff <(sort chksum) <(sort chksum1) |grep '^>'
The output:
samveen@precise:~/so$ diff <(sort chksum) <(sort chksum1) | grep '^>'
> 8ee9e1fffbb3c16357bf80c6f7a27574 ./b.txt
> a91a408e113adce865cba3c580add827 ./d.txt
A simple check of whether the two checksum files differ at all:
diff -q chksum1 chksum
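Another non-loop option worth sketching is comm(1): with both checksum files sorted, comm -13 prints only the lines unique to the second file, i.e. the new or changed entries (sample files reproduced from the answer above):

```shell
cat > chksum <<'EOF'
eed0fc0313f790cec0695914f1847bca ./a.txt
9ee9e1fffbb3c16357bf80c6f7a27574 ./b.txt
a91a408e113adce865cba3c580add827 ./c.txt
EOF
cat > chksum1 <<'EOF'
eed0fc0313f790cec0695914f1847bca ./a.txt
8ee9e1fffbb3c16357bf80c6f7a27574 ./b.txt
a91a408e113adce865cba3c580add827 ./d.txt
EOF
# -1 drops lines unique to chksum, -3 drops lines common to both files
changed=$(comm -13 <(sort chksum) <(sort chksum1))
echo "$changed"
```

The process substitution <(sort ...) is a bashism; in a plain POSIX shell you would sort to temporary files first.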
What about using the grep command? Every line read from chksum serves as a search pattern in chksum1. If grep finds a match, $?, which holds grep's return value, is 0; otherwise, it is 1.
while read e; do
    grep -q "$e" chksum1
    if [ $? -eq 0 ]; then
        echo $e "is the same"
    else
        echo $e "has been changed"
    fi
done < chksum