Log the output of diff command to separate files in linux - bash

I have two CSV files in two different directories, and I am running diff on them like this:
diff -b -r -w <dir-one>/AFB.csv <dir-two>/AFB.csv
I am getting the output as expected:
14c14
< image_collapse,,collapse,,,,,batchcriteria^M
---
> image_collapse1,,collapse1,,,,,batchcriteria^M
16a17
> image_refresh,,refresh,,,,,batchcriteria^M
My requirement is that the lines which have changed should go to a changed.log file, and the lines that have been appended should go to append.log.
The output clearly shows that "c" in 14c14 means the line has changed, and "a" in 16a17 means a line has been appended. But how do I log them in different log files?

Edit: Same as original answer below but avoiding options not supported by diff on HP-UX. Use something like:
diff -b -r -w /tmp/one.txt /tmp/two.txt \
| sed -n -e '/^[0-9,]*c[0-9,]*$/ {s/[^c]*c\(.*\)/\1 p/;p;}' \
| sed -n -f - /tmp/two.txt > /tmp/changed.txt
diff -b -r -w /tmp/one.txt /tmp/two.txt \
| sed -n -e '/^[0-9,]*a[0-9,]*$/ {s/[^a]*a\(.*\)/\1 p/;p;}' \
| sed -n -f - /tmp/two.txt > /tmp/new.txt
This converts the line numbers output from diff to sed print (p) commands for added (a) and changed (c) line ranges. The resulting sed scripts are applied to the second file to print just the desired lines. (I hope HP-UX sed supports the -f - for taking script from standard input.)
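To make the generated sed scripts visible, here is a minimal sketch on two throwaway files. The helper name lines_of_type, the file names and the sample contents are made up for illustration, and the generated script is written to a temporary file rather than read with -f -, in case the local sed does not accept a script on standard input.

```shell
#!/bin/sh
# Sketch: turn diff hunk headers into a sed script that prints lines of the new file.
# lines_of_type is a hypothetical helper; KIND is "c" (changed) or "a" (appended).
lines_of_type() {
    kind=$1 old=$2 new=$3
    script=$(mktemp)
    # Keep only hunk headers such as 14c14 or 16a17,18 and turn the part
    # after the letter into a sed print command, e.g. "14p" or "17,18p".
    diff "$old" "$new" \
        | sed -n "s/^[0-9,]*${kind}\([0-9,]*\)\$/\1p/p" > "$script"
    # Apply the generated script to the second file.
    sed -n -f "$script" "$new"
    rm -f "$script"
}

demo=$(mktemp -d)
printf '%s\n' one two three four      > "$demo/old.txt"
printf '%s\n' one TWO three four five > "$demo/new.txt"

lines_of_type c "$demo/old.txt" "$demo/new.txt"   # changed lines of new.txt
lines_of_type a "$demo/old.txt" "$demo/new.txt"   # appended lines of new.txt
```

Anchoring the pattern to whole hunk headers matters here, because the data lines in diff output can contain the letters c and a as well.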
There seems to be a solution which does not require interpreting line numbers from the output of diff. diff supports --side-by-side formatting (-y) which includes a gutter marking old, new, and changed lines with <, >, and | respectively. You can reduce this side-by-side format to just the markers by using --width=1 (or -W1). If you take the changed and new markers (grep -v) and prefix the lines of the second file with it (paste) then you can filter (grep) by prefixed markers and throw away (cut) the markers. You can do this for both new and changed files.
Here is a self-contained "script" as an example:
# create two example files (one character per line)
echo abcdefghijklmnopqrstuvwxyz | grep -o . > /tmp/one.txt
echo abcDeFghiJKlmnopPqrsStuvVVwxyzZZZ | grep -o . > /tmp/two.txt
# diff side-by-side to get markers and apply to new file
diff -b -r -w -y -W1 /tmp/one.txt /tmp/two.txt \
| fgrep -v '<' | paste - /tmp/two.txt \
| grep -e '^|' | cut -c3- > /tmp/changed.txt
diff -b -r -w -y -W1 /tmp/one.txt /tmp/two.txt \
| fgrep -v '<' | paste - /tmp/two.txt \
| grep -e '^>' | cut -c3- > /tmp/new.txt
# dump result
cat /tmp/changed.txt
echo ---
cat /tmp/new.txt
Its output is
D
F
J
K
---
P
S
V
V
Z
Z
Z
I hope this helps you solve your problem.

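A rougher split can be done with grep, as follows. Note that lines starting with > come from the second file (appended lines as well as the new side of changed lines), and lines starting with < come from the first file (the old side of changed or deleted lines), so this separates by file rather than strictly by change type.
diff -b -r -w <dir-one>/AFB.csv <dir-two>/AFB.csv | grep '^>' >> append.log
diff -b -r -w <dir-one>/AFB.csv <dir-two>/AFB.csv | grep '^<' >> changed.log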
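For a split by change type, one option is to let awk remember which kind of hunk it is in and route the > lines accordingly. This is only a sketch: split_diff is a hypothetical helper, the demo files are invented, and it assumes diff's default ("normal") output format as shown in the question.

```shell
#!/bin/sh
# split_diff OLD NEW CHANGED_LOG APPEND_LOG  (hypothetical helper)
split_diff() {
    : > "$3"; : > "$4"          # start with empty logs
    diff "$1" "$2" | awk -v changed="$3" -v appended="$4" '
        # Hunk header such as 14c14 or 16a17,18: remember its letter.
        /^[0-9]+(,[0-9]+)?[acd][0-9]+(,[0-9]+)?$/ {
            kind = $0
            gsub(/[0-9,]/, "", kind)
            next
        }
        # Lines from the second file, routed by the current hunk type.
        /^>/ && kind == "c" { print substr($0, 3) > changed }
        /^>/ && kind == "a" { print substr($0, 3) > appended }
    '
}

demo=$(mktemp -d)
printf '%s\n' one two three four      > "$demo/old.txt"
printf '%s\n' one TWO three four five > "$demo/new.txt"
split_diff "$demo/old.txt" "$demo/new.txt" "$demo/changed.log" "$demo/append.log"
cat "$demo/changed.log"
echo ---
cat "$demo/append.log"
```

The same diff flags as in the question (-b -w) could be added to the diff call inside the helper; they are left out here to keep the sketch short.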

Related

BASH Finding palindromes in a .txt file

I have been given a .txt file in which we have to find all the palindromes in the text (they must have at least 3 letters and can't consist of a single repeated letter, e.g. AAA).
The output should show the number of occurrences in the first column and the word in the second, e.g.
123 kayak
3 bob
1 dad
#!/bin/bash
tmp=$(mktemp)
awk '{for(x=1;$x;++x)print $x}' "${1}" | tr -d '[:punct:]' | tr -s '[:space:]' | sed -e 's/#//g' -e 's/[0-9]*//g' | sed -r '/^.{,2}$/d' | sort | uniq -c -i > tmp1
This outputs the file as it should, ignoring case, words shorter than 3 letters, punctuation and digits.
However, I am now stumped on how to pull the palindromes out of this. I thought a temp file might be the way, but I don't know where to take it.
Any help or guidance is much appreciated.
# modify this to your needs; it should take your input on stdin, and return one word per
# line on stdout, in the same order if called more than once with the same input.
preprocess() {
    # strip punctuation and digits, split into one word per line, and only then
    # drop words made of a single repeated letter (the filter works per line)
    tr -d '[:punct:][:digit:]' \
    | tr -s '[:space:]' '\n' \
    | sed -E -e '/^(.)\1+$/d'
}
paste <(preprocess <"$1") <(preprocess <"$1" | rev) \
| awk '$1 == $2 && (length($1) >= 3) { print $1 }' \
| sort | uniq -c
The critical thing here is to paste together your input file with a stream that has each line from that input file reversed. This gives you two separate columns you can compare.
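As a self-contained sketch of the same paste-plus-rev idea, the following variant also folds everything to lower case, since the question wanted case to be ignored. The helper name palindromes and the sample sentence are invented for illustration, and a temporary file replaces the two process substitutions so the sketch also runs in a plain sh.

```shell
#!/bin/sh
# palindromes (hypothetical helper): reads text on stdin, prints "count word".
palindromes() {
    words=$(mktemp)
    tr -d '[:punct:][:digit:]' \
        | tr '[:upper:]' '[:lower:]' \
        | tr -s '[:space:]' '\n' \
        | grep -E '^.{3,}$' \
        | grep -v '^\(.\)\1*$' > "$words"    # drop words like "aaa"
    # Column 1: each word; column 2: the same word reversed.
    rev "$words" | paste "$words" - \
        | awk 'NF == 2 && $1 == $2 { print $1 }' \
        | sort | uniq -c
    rm -f "$words"
}

printf 'Bob saw a kayak, then Dad and bob. AAA 12321\n' | palindromes
```

For this sample input the counts come out as 2 for bob and 1 each for dad and kayak; "AAA" is filtered out as a single repeated letter and "12321" disappears with the digits.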

Need help escaping from awk quotations in bash script

I have an alias in my bashrc file that outputs current folder contents and system available storage, updated continuously by the watch function.
alias wtch='watch -n 0 -t "du -sch * -B 1000000 2>/dev/null | sort -h && df -h -B 1000000| head -2 | awk '{print \$4}'"'
The string worked fine until I put in the awk part. I know I need to escape the single quotation marks, while still staying in the double quotation marks and the $4 but I haven't been able to get it to work. What am I doing wrong?
This is the error I get
-bash: alias: $4}": not found
Since the quoting for the alias is making it tough, you could just make it a function instead:
wtch() {
    # \$4 is escaped so that awk, not the calling shell, sees $4 inside the double quotes
    watch -n 0 -t "du -sch * -B 1000000 2>/dev/null | sort -h && df -h -B 1000000| head -2 | awk '{print \$4}'"
}
This is a lot like issue 2 in the BashFAQ/050
Also, a minor thing, but you can skip the head process at the end and have awk do it, exiting after the second row, like this:
wtch() {
    watch -n 0 -t "du -sch * -B 1000000 2>/dev/null | sort -h && df -h -B 1000000 | awk '{print \$4} NR >= 2 {exit}'"
}
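The quoting problem from the original alias can be reproduced without watch. In this sketch, run_like_watch is a hypothetical stand-in that hands a single string to sh -c (which is also how watch runs its command), and the sample input is made up.

```shell
#!/bin/sh
# run_like_watch is a hypothetical stand-in for watch: it takes ONE string
# and passes it to "sh -c".
run_like_watch() {
    sh -c "$1"
}

# Double-quoted string: \$4 keeps the calling shell from expanding $4,
# so awk receives the literal text {print $4}.
run_like_watch "printf 'a b c d\n' | awk '{print \$4}'"

# Single-quoted string: each embedded single quote is written as '\''
# (close the quote, add an escaped quote, reopen the quote).
run_like_watch 'printf '\''a b c d\n'\'' | awk '\''{print $4}'\'''
```

Both calls print d. The original alias failed because the single quote in front of {print ended the alias's own single-quoted string, so the remainder was parsed as a separate word.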
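In this case you can use cut instead of awk. Because df pads its columns with runs of spaces, squeeze them with tr -s first so that the fourth space-delimited field matches awk's $4.
alias wtch="watch -n 0 -t 'du -sch * -B 1000000 2>/dev/null | sort -h && df -h -B 1000000 | head -2 | tr -s \" \" | cut -d\ -f4'"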
Explaining cut:
-d option defines a delimiter
-d\ means that my delimiter is space
-f selects a column
-f4 gives you the fourth column
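The difference between cut's single-character delimiter and awk's whitespace splitting shows up on padded columns. The header line below is only an illustrative sample in the style of df output, not captured from a real system.

```shell
#!/bin/sh
# Illustrative header line in the style of df output (columns padded with spaces).
line='Filesystem     1MB-blocks  Used Available Use% Mounted on'

# cut treats every single space as a delimiter, so the padding produces
# empty fields and field 4 is empty here.
printf '%s\n' "$line" | cut -d' ' -f4

# Squeezing repeated spaces first makes field 4 line up with awk's $4.
printf '%s\n' "$line" | tr -s ' ' | cut -d' ' -f4

# awk splits on runs of whitespace by default.
printf '%s\n' "$line" | awk '{print $4}'
```

The first command prints an empty line, while the other two print Available.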

How to show merged differences between two files?

How can I get only the differing characters between two files?
For example,
file1:
aaa;bbb;ccc
123;456;789
a1a;b1b;c1c
file2:
aAa;bbb;ccc
123;406;789
a1a;b1b;c5c
After diff I should get only this string of difference from the second file: A05
diff -y --suppress-common-lines <(fold -w 1 file1) <(fold -w 1 file2) |
sed 's/.*\(.\)$/\1/' | paste -s -d '' -
This uses process substitution with fold to turn each file into a column of characters that's one character wide and then compares them with diff.
The -y option prints lines next to each other, and --suppress-common-lines skips lines that are the same between both files. Until here, the output looks like this:
$ diff -y --suppress-common-lines <(fold -w 1 file1) <(fold -w 1 file2)
a | A
5 | 0
1 | 5
We're only interested in the last character of each line. We use sed to discard the rest:
$ diff -y --suppress-common-lines <(fold -w 1 file1) <(fold -w 1 file2) |
> sed 's/.*\(.\)$/\1/'
A
0
5
To get these into a single line, we pipe to paste with the -s option (serial) and the empty string as the delimiter (-d ''). The dash tells paste to read from standard in.
$ diff -y --suppress-common-lines <(fold -w 1 file1) <(fold -w 1 file2) |
> sed 's/.*\(.\)$/\1/' | paste -s -d '' -
A05
An alternative, if you have the GNU diffutils at your disposal, is cmp:
$ cmp -lb file1 file2 | awk '{print $5}' | tr -d '\n'
A05
cmp compares files byte by byte. The -l option ("verbose") makes it print all the differences, not just the first one; the -b option makes it add the ASCII interpretation of the differing bytes:
$ cmp -lb file1 file2
2 141 a 101 A
18 65 5 60 0
34 61 1 65 5
The awk command reduces this output to the fifth column, and tr removes the newlines.
For the example given, you could compare the files character by character and, where there is a difference, print the character from the second file. Here's one way to do that:
paste <(fold -w1 file1) <(fold -w1 file2) | \
while read -r c1 c2; do [[ $c1 = "$c2" ]] || printf '%s' "$c2"; done
For the given example, this will print A05.
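If the GNU-only -b option of cmp is not available, the octal values printed by plain cmp -l can be turned back into characters with printf. This is a sketch under assumptions: diff_chars is a hypothetical helper, it works on bytes (so multibyte characters would be split), and it only compares bytes at the same offset.

```shell
#!/bin/sh
# diff_chars FILE1 FILE2 (hypothetical helper): print the bytes of FILE2
# that differ from FILE1, using only "cmp -l".
diff_chars() {
    # cmp -l prints: byte-offset  octal-value-in-FILE1  octal-value-in-FILE2
    cmp -l "$1" "$2" | while read -r offset old new; do
        printf "\\$new"     # e.g. \101 is the octal escape for "A"
    done
    printf '\n'
}

demo=$(mktemp -d)
printf '%s\n' 'aaa;bbb;ccc' '123;456;789' 'a1a;b1b;c1c' > "$demo/file1"
printf '%s\n' 'aAa;bbb;ccc' '123;406;789' 'a1a;b1b;c5c' > "$demo/file2"
diff_chars "$demo/file1" "$demo/file2"
```

With the two sample files from the question this prints A05.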

echo -e cat: argument line too long

I have a bash script that merges a huge list of text files and filters the result. However, I run into an 'argument line too long' error because of the huge list.
echo -e "`cat $dir/*.txt`" | sed '/^$/d' | grep -v "\-\-\-" | sed '/</d' | tr -d \' | tr -d '\\\/<>(){}!?~;.:+`*-_ͱ' | tr -s ' ' | sed 's/^[ \t]*//' | sort -us -o $output
I have seen some similar answers here, and I know I could fix it by using find and cat on the files first. However, I would like to know the best way to run a one-liner using echo -e and cat without breaking the code, while avoiding the 'argument line too long' error. Thanks.
First, with respect to the most immediate problem: Using find ... -exec cat -- {} + or find ... -print0 | xargs -0 cat -- will prevent more arguments from being put on the command line to cat than it can handle.
The more portable (POSIX-specified) alternative to echo -e is printf '%b\n'; this is available even in configurations of bash where echo -e prints -e on output (as when the xpg_echo and posix flags are set).
However, if you use read without the -r argument, the backslashes in your input string are removed, so neither echo -e nor printf %b will be able to process them later.
Fixing this can look like:
while IFS= read -r line; do
printf '%b\n' "$line"
done \
< <(find "$dir" -name '*.txt' -exec cat -- '{}' +) \
| sed [...]
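A self-contained version of that loop, as a sketch: merge_txt is a hypothetical helper, the two demo files are invented, and a plain pipe replaces the process substitution so it also runs in sh. The filtering stages from the question are left out.

```shell
#!/bin/sh
# merge_txt DIR (hypothetical helper): concatenate every *.txt under DIR and
# expand backslash escapes in each line, without putting all file names or
# the file contents on a single command line.
merge_txt() {
    find "$1" -name '*.txt' -exec cat -- {} + \
        | while IFS= read -r line; do
              printf '%b\n' "$line"
          done
}

demo=$(mktemp -d)
printf '%s\n' 'tab\there' > "$demo/a.txt"   # contains a literal backslash-t
printf '%s\n' 'plain'     > "$demo/b.txt"
merge_txt "$demo" | sort
```

The line from a.txt comes out with a real tab character in place of \t, which is the part of echo -e's behavior that printf '%b\n' reproduces.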
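If the escape processing done by echo -e is not actually needed, the filters can read the files directly. The -h option keeps grep from prefixing each line with its file name; note that the glob is still expanded onto a command line here:
grep -hv '^$' "$dir"/*.txt | grep -v "\-\-\-" | sed '/</d' | tr -d \' \
| tr -d '\\\/<>(){}!?~;.:+`*-_ͱ' | tr -s ' ' | sed 's/^[ \t]*//' \
| sort -us -o "$output"
If you think about it some more you can probably get rid of a lot more stuff and turn it into a single sed and sort, roughly:
sed -e '/^$/d' -e '/---/d' -e '/</d' -e 's|['\''\\/<>(){}!?~;.:+`*_ͱ-]||g' \
-e 's/  */ /g' -e 's/^[ \t]*//' "$dir"/*.txt | sort -us -o "$output"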

BASH:How do i make output like in watch command

My Linux 'watch' command is quite old and doesn't support the '--color' option. How can I get the same kind of output? In my script the loop prints each result after the previous one (of course), but I need it to replace the previous output. Are there any tricks with terminal output?
#!/bin/bash
while true
do
/usr/sbin/asterisk -rx "show queue My_Compain" \
| grep Agent \
| grep -v \(Unavailable\) \
| sort -t"(" -k 2 \
| GREP_COLOR='01;31' egrep -i --color=always '^.*[0-9] \(Not in use.*$|$' \
| GREP_COLOR='01;36' egrep -i --color=always '^.*\(Busy*$|$'
sleep 2
done
You can use clear to clear the screen before dumping your output to give the appearance of in-place updates.
To reduce blinking, you can use the age old technique of double buffering:
#!/bin/bash
while true
do
buffer=$(
clear
/usr/sbin/asterisk -rx "show queue My_Compain" \
| grep Agent \
| grep -v \(Unavailable\) \
| sort -t"(" -k 2 \
| GREP_COLOR='01;31' egrep -i --color=always '^.*[0-9] \(Not in use.*$|$' \
| GREP_COLOR='01;36' egrep -i --color=always '^.*\(Busy*$|$'
)
echo "$buffer"
sleep 2
done
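The same buffering idea can be wrapped in a small reusable function. This is a sketch under assumptions: watch_like is a hypothetical helper, it uses the common ANSI escape sequences for "cursor home" and "clear screen" instead of calling clear, and it takes a repeat count (which the loop above does not have) so that it terminates.

```shell
#!/bin/sh
# watch_like INTERVAL COUNT COMMAND...  (hypothetical helper)
watch_like() {
    interval=$1 count=$2
    shift 2
    i=0
    while [ "$i" -lt "$count" ]; do
        frame=$("$@" 2>&1)                      # build the whole frame off-screen
        printf '\033[H\033[2J%s\n' "$frame"     # home cursor, clear screen, draw
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then sleep "$interval"; fi
    done
}

# Redraw the output of date twice with no delay between frames.
watch_like 0 2 date
```

Because the command's output is collected before anything is written, the screen is cleared and redrawn in one step, and color escape codes produced by the command (for example from grep --color=always) pass through unchanged.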
