compare two results and print the missing part if exist

Maybe I'm asking a trivial question, but I can't find the right way to do this.
I'm SSHing into several servers and comparing the expected NFS mounts from /etc/fstab with the existing mounts from /proc/mounts.
VAR1=$(awk '!/^#/ && $3 == "nfs" {print $2}' /etc/fstab)
VAR2=$(awk '!/^#/ && $3 ~ /nfs[34]/ && $1 !~ /gfs/ {print $2}' /proc/mounts)
For example, from /etc/fstab :
/data1
/data2
/data3
And from /proc/mounts :
/data1
/data2
If all expected mounts are present, I have to print that everything is OK; if some are missing, I have to print them and remount them.
For the comparison I tried:
awk 'FNR==NR {a[$1]; next} $1 in a' file1 file2
but that does not work with $VAR1/$VAR2 in place of the files.
Nested loops did not work either.
Thanks in advance.

You can get the missing mounts like this:
comm -23 <(printf '%s\n' "$VAR1" | sort) <(printf '%s\n' "$VAR2" | sort)
This prints the lines which are only in $VAR1. comm expects both inputs to be sorted, hence the sort on each side.
Or as a one-liner:
comm -23 <(awk '!/^#/ && $3 == "nfs" {print $2}' /etc/fstab | sort) <(awk '!/^#/ && $3 ~ /nfs[34]/ && $1 !~ /gfs/ {print $2}' /proc/mounts | sort)
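The question also asks for an "all OK" report or a list of what is missing. A minimal sketch of that flow follows, using the sample mount points from the question in place of the awk output. It swaps comm for grep -Fxv, which is an alternative that needs no sorting; the variable names expected, mounted, and missing are illustrative only, and the remount step is left as a comment because it needs root.

```shell
# Sample data from the question; on a real host these would come from
# the two awk commands run against /etc/fstab and /proc/mounts.
expected='/data1
/data2
/data3'
mounted='/data1
/data2'

# -F: fixed strings, -x: whole-line match, -v: invert the match.
# Each line of $mounted is one pattern, so only the expected entries
# that equal no mounted entry are kept.
missing=$(printf '%s\n' "$expected" | grep -Fxv -e "$mounted")

if [ -z "$missing" ]; then
    echo "all OK"
else
    printf '%s\n' "$missing" | while IFS= read -r m; do
        echo "missing: $m"
        # Assumption: remounting would happen here, e.g. mount "$m"
    done
fi
```

With the sample data, only /data3 is reported as missing.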

You can get the result from $VAR1 and $VAR2 this way:
for i in $VAR1 $VAR2; do echo "$i"; done \
| sort | uniq -c | awk '$1 == 1 {print $2}'
Basically, we're looking for the directories that are mentioned only once, so this also lists anything that is in /proc/mounts but not in /etc/fstab. There may be a more elegant way to do this, but that should get you what you want.

Related

Awk insert a command output into a variable

I am trying to insert the output below into the variable x. The output is a string. I have done this before.
k="psz"
When I do this, it works and I get the expected output from echo $x:
x=$( awk -v a="k" -F '[:,]' '{ if($1 == "psz") print $5 }' /etc/passwd )
But when I try the one below, it doesn't work:
x=$( awk -v a="k" -F '[:,]' '{ if($1 == a) print $5 }' /etc/passwd )
echo $x gives me a blank line.

You are setting a with the string k and not the value of variable $k. If you set it right, the code will work fine. Look:
k='accdias'
x=$(awk -va=$k 'BEGIN{FS=":"} $1==a {print $5}' /etc/passwd)
echo $x
Antonio Dias
I'm editing this to show another way of passing variable values to your awk program without using -v:
k='accdias'
x=$(awk 'BEGIN{FS=":"} $1==ARGV[2] {print $5}' /etc/passwd $k)
echo $x
Antonio Dias
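In the code above, ARGV[0] is set to awk, ARGV[1] to /etc/passwd, and ARGV[2] to the value of $k, which is accdias in this example.
Edits from Ed Morton (see comments below), which quote the shell variable and, in the second version, remove the extra argument from ARGV so that awk does not try to open it as an input file: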
k='accdias'
x=$(awk -v a="$k" 'BEGIN{FS=":"} $1==a {print $5}' /etc/passwd)
echo "$x"
Antonio Dias
k='accdias'
x=$(awk 'BEGIN{FS=":"; a=ARGV[2]; ARGV[2]=""; ARGC--} $1==a {print $5}' /etc/passwd "$k")
echo "$x"
Antonio Dias
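To see the difference between a="k" and a="$k" without touching /etc/passwd, here is a small sketch that runs both forms against a made-up passwd-style sample. The two sample records and the names wrong and right are hypothetical, chosen only for illustration.

```shell
# Hypothetical passwd-style records; the real command reads /etc/passwd.
sample='root:x:0:0:Super User:/root:/bin/sh
psz:x:1000:1000:Pat Sz,,,:/home/psz:/bin/bash'

k="psz"

# a="k" passes the literal letter k, so no login name matches.
wrong=$(printf '%s\n' "$sample" | awk -v a="k" -F '[:,]' '$1 == a {print $5}')

# a="$k" passes the value of the shell variable k.
right=$(printf '%s\n' "$sample" | awk -v a="$k" -F '[:,]' '$1 == a {print $5}')

printf 'wrong=[%s] right=[%s]\n' "$wrong" "$right"
```

The first result is empty, which is the blank line described in the question; the second holds the full-name field of the matching record.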

How to avoid generating intermediate files in bash script

I would like to know if it is possible to change the following script, such that "intermediate.tmp" is not generated as output:
To call the script on the command line:
./script.sh file1 file2
script.sh:
#!/bin/bash
FILE_1=$1
FILE_2=$2
awk '{print $1,$2}' $FILE_1 > intermediate.tmp
awk 'NR==FNR {h[$1] = $0; next} {print $0,h[$1]}' intermediate.tmp $FILE_2 > output.file
The awk scripts are not really important per se. I just want to know how to "feed" intermediate.tmp into the second awk command without generating an intermediate.tmp output file in addition to the desired output.file.
Thanks.

awk 'NR==FNR {h[$1] = $1 OFS $2; next} {print $0,h[$1]}' "$FILE_1" "$FILE_2" > output.file
or less sensibly:
awk '{print $1,$2}' "$FILE_1" |
awk 'NR==FNR {h[$1] = $0; next} {print $0,h[$1]}' - "$FILE_2" > output.file
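A quick way to check the single-awk version is to run it on two tiny inputs. The sketch below creates throwaway sample files in a temporary directory (their contents are invented for illustration) and captures the joined output in a variable instead of output.file.

```shell
# Throwaway sample inputs; the contents are made up for illustration.
dir=$(mktemp -d)
printf 'a 1 x\nb 2 y\n' > "$dir/file1"
printf 'a foo\nb bar\nc baz\n' > "$dir/file2"

# First file: remember columns 1 and 2 under the key in column 1.
# Second file: print each line followed by the remembered columns.
joined=$(awk 'NR==FNR {h[$1] = $1 OFS $2; next} {print $0, h[$1]}' \
    "$dir/file1" "$dir/file2")

printf '%s\n' "$joined"
rm -r "$dir"
```

Lines of the second file whose key is absent from the first (c here) are still printed, just with nothing appended after the separator.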

Bash Grep Takes 3 Days To Run. Anyway to Enhance it?

I have a script like this and would like some suggestions for speeding it up.
cd /home/output/
cat R*op.txt > R.total.op.txt
awk '{if( (length($8)>9) || ($8 ~ /^AAA/) ) {print $0}}' R.total.op.txt > temp && mv temp R.total.op.txt
cat S*op.txt > S.total.op.txt
awk '{if( (length($8)>9) || ($8 ~ /^AAA/) ) {print $0}}' S.total.op.txt > temp && mv temp S.total.op.txt
cat R.total.op.txt S.total.op.txt | awk '{print $4}' | sort -k1,1 | awk '!x[$1]++' > genes.txt
rm *total.op.txt
head genes.txt
cd /home/output/
for j in R1_with-genename R2_with-genename S1_with-genename S2_with-genename
do
**for i in `cat genes.txt`; do cat $j'.op.txt' | grep -w $i >> $j'_'$i'_gene.txt'**;done
done
ls -m1 *gene.txt | wc -l
find . -size 0 -delete
ls -m1 *gene.txt | wc -l
rm genes.txt
cd /home/output/
for i in `ls *gene.txt`
do
paste <(awk '{print $4"\t"$8"\t"$9"\t"$13}' $i | awk '!x[$1]++' | awk '{print $1}') <(awk '{print $4"\t"$8"\t"$9}' $i | awk '{if( (length($2)>9) || ($2 ~ /^AAA/) ) {print $0}}' | sort -k2,2 | awk '{ sum += $3 } END { if (NR > 0) print sum / NR }') <(awk '{print $4"\t"$8"\t"$9}' $i| awk '{if( (length($2)>9) || ($2 ~ /^AAA/) ) {print $0}}' | sort -k2,2 | wc -l) <(awk '{print $4"\t"$8"\t"$9"\t"$13}' $i | awk '{if( (length($2)>9) || ($2 ~ /^AAA/) ) {print $0}}' | sort -k2,2 | grep -v ":::" | wc -l) > $i'_stats.txt'
done
rm *gene.txt
cd /home/output/
for j in R1_with-genename R2_with-genename S1_with-genename S2_with-genename
do
cat $j*stats.txt > $j'.final.txt'
done
rm *stats.txt
cd /home/output/
for i in `ls *final.txt`
do
sed "1iGene_Name\tMean1\tCalculated\tbases" $i > temp && mv temp $i
done
head *final.txt
The very first for loop (marked with asterisks), the one with cat genes.txt, is the grep loop that takes 3 days to finish. Can someone please advise on any enhancements to the command, and on whether this entire script can be made into a single command? Thanks in advance.

Try replacing the nested loops with a single awk. This uses \y, the GNU awk word-boundary operator (in awk, \b means backspace):
awk 'FNR == NR {words[$0] = "\\y" $0 "\\y"; next}
{ for (i in words) if ($0 ~ words[i]) {
    fn = FILENAME "_" i "_gene.txt"
    print >> fn
    close(fn)
  }
}' genes.txt {R,S}{1,2}_with-genename.op.txt

I suggest creating a sed script:
# name script
SEDSCRIPT=split.sed
# Make sure it is empty
echo "" > ${SEDSCRIPT}
# Loop through all the words in genes.txt and
# create sed command that will write that line to a file
for word in `cat genes.txt`; do
echo "/${word}/w ${word}.txt" >> ${SEDSCRIPT}
done
basenames="R1_with-genename R2_with-genename S1_with-genename S2_with-genename"
# Loop over input files
for name in ${basenames}; do
# Run sed script against file
sed -n -f ${SEDSCRIPT} ${name}.op.txt
# Move the temporary files created by sed to their permanent names
for word in `cat genes.txt`; do
mv ${word}.txt ${name}_${word}_gene.txt
done
done

awk working with intervals

I have this file
goodtime 20:30 21:40
badtime 19:52 24:00
When I enter, for example, 21:00 and 21:15, I should get goodtime.
So here's my script
#!/bin/sh
last > duom.txt
grep -F 'stud.if.ktu.lt' duom.txt > ktu.txt
echo "Enter the time interval "
read h
read min
read h2
read min2
awk '{if ($2 ~ /$h.$m/ && $3 ~ /$h2.$min2/) print $1}' data.txt
But I don't get any results.

The problem with this:
awk '{if ($2 ~ /$h.$m/ && $3 ~ /$h2.$min2/) print $1}' data.txt
Is that you're trying to use shell variables in a single quoted string. You need to pass the shell variables into awk with its -v option:
awk -v patt1="$h.$min" -v patt2="$h2.$min2" '
$2 ~ patt1 && $3 ~ patt2 {print $1}
' data.txt
But, given your sample input, this will not match anything.
Until your requirements are clarified, I can't help with the logic.

Having trouble with awk

I am trying to assign a variable to an awk statement. I am getting an error. Here is the code:
for i in `checksums.txt` do
md=`echo $i|awk -F'|' '{print $1}'`
file=`echo $i|awk -F'|' '{print $2}'`
done
Thanks

for i in `checksums.txt` do
This will try to execute checksums.txt, which is very probably not what you want. If you want the contents of that file do:
for i in $(<checksums.txt) ; do
md=$(echo $i|awk -F'|' '{print $1}')
file=$(echo $i|awk -F'|' '{print $2}')
# ...
done
(This is not optimal, and will not do what you want if the file has lines with spaces in them, but at least it should get you started.)

You don't need external programs for this:
while IFS=\| read m f; do
printf 'md is %s, filename is %s\n' "$m" "$f"
done < checksums.txt
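As a sketch of how this loop behaves on file names that contain spaces, here it is fed from a variable holding two made-up lines in the md|filename layout. The sample lines and the variable names are hypothetical, and -r is added so that backslashes in names are kept literally.

```shell
# Hypothetical sample lines in the md|filename layout.
sample='d41d8cd98f00b204e9800998ecf8427e|empty file.txt
9e107d9d372bb6826bd81d3542a419d6|fox.txt'

# IFS='|' splits each line only at the pipe character, so the space
# in the first file name stays inside $file.
out=$(printf '%s\n' "$sample" | while IFS='|' read -r md file; do
    printf 'md is %s, filename is %s\n' "$md" "$file"
done)

printf '%s\n' "$out"
```

Because the split happens only at the pipe, this avoids the word-splitting problem of the for loop over the file contents.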
Edited as per new requirement.
Given the file is already sorted, you could use uniq (assuming GNU uniq and md hash length of 33 characters):
uniq -Dw33 checksums.txt
If GNU uniq is not available, you can use awk
(this version doesn't require a sorted input):
awk 'END {
for (M in m)
if (m[M] > 1)
print M, "==>", f[M]
}
{
m[$1]++
f[$1] = f[$1] ? f[$1] FS $2 : $2
}' checksums.txt

while read line
do
set -- `echo $line | tr '|' ' '`
echo md is $1, file is $2
done < checksums.txt
