linux shell diff two files to get new lines - shell

I have two files and I want to get the new lines by comparing them. I know I can use 'diff newfile oldfile' to get the new lines, but the output will include "<" markers and diff information which I don't want.
For example, I have an oldfile:
a
b
c
and a newfile
a
b
c
d
e
f
the result of 'diff newfile oldfile' will be
4,6d3
< d
< e
< f
but the result I want to have is
d
e
f
So how can I get this output? I have searched through many diff options but don't have any ideas.
Thank you in advance.

Similar to this question, you can use comm for this purpose:
comm -13 file1 file2
will print only the lines of file2 that don't exist in file1.
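One caveat worth noting: comm requires both inputs to be sorted. If they aren't, you can sort them on the fly with process substitution (bash/zsh). A minimal sketch, using throwaway file names for illustration:

```shell
# Recreate the sample files from the question (names are illustrative).
printf '%s\n' a b c        > oldfile
printf '%s\n' a b c d e f  > newfile

# -1 suppresses lines unique to oldfile, -3 suppresses common lines,
# leaving only lines that appear in newfile but not in oldfile.
comm -13 <(sort oldfile) <(sort newfile)
```

Here the sample files are already sorted, so plain `comm -13 oldfile newfile` would work too.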

Native diff solution:
diff --changed-group-format='%<' --unchanged-group-format='' new.txt old.txt
The output:
d
e
f

You could also use awk:
$ awk 'NR==FNR{a[$0];next} !($0 in a)' oldfile newfile
d
e
f
or grep if the files are not that big (mind the partial matches):
$ grep -v -f oldfile newfile
d
e
f
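To address the partial-match caveat: grep -f treats each line of oldfile as a pattern that can match anywhere in a line. Adding -F (fixed strings) and -x (whole-line match) makes it compare complete lines literally; a small sketch with made-up file names:

```shell
# Recreate the sample files (names are illustrative).
printf '%s\n' a b c        > oldfile
printf '%s\n' a b c d e f  > newfile

# -v invert the match, -F fixed strings (no regex), -x whole lines only,
# -f read the patterns from oldfile
grep -vFx -f oldfile newfile
```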
or join (input files need to be sorted):
$ join -v 2 oldfile newfile
d
e
f

Related

deleting a row from a csv file using sed and saving the file without the deleted row

I have a csv file from which I want to delete the second row using the unix sed command.
The file.csv is represented below
a
b
c
d
so it becomes newfile.csv
a
c
d
Based on my search for solutions, the simplest way to do this seems to be the following sed command:
sed '2d' file.csv > newfile.csv
Yet, the resulting newfile.csv contains only
d
rather than the expected
a
c
d
I am using iTerm2 on macOS Mojave
In those cases, awk is useful, too:
$ awk 'NR != 2' file.csv
a
c
d
Here awk prints every row except row number 2.
If it is easier to understand:
$ cat file.csv | awk 'NR != 2'
a
c
d
Possibly the formatting (line endings) of your file is wrong; try this:
dos2unix file.csv
then one of these
sed '2d' file.csv
awk 'NR!=2' file.csv
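Since the asker is on macOS, one more aside (not from the original thread): BSD sed's -i option requires an explicit backup suffix, unlike GNU sed, so in-place editing is written differently on the two platforms. A portable sketch:

```shell
printf '%s\n' a b c d > file.csv

# GNU sed (Linux):  sed -i '2d' file.csv
# BSD sed (macOS):  sed -i '' '2d' file.csv
# Portable on both: write to a temp file, then move it into place.
sed '2d' file.csv > file.csv.tmp && mv file.csv.tmp file.csv

cat file.csv
```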

Concatenation of two columns from the same file

From a text file
file
a d
b e
c f
how can the tab-delimited columns be concatenated into one column?
a
b
c
d
e
f
Currently I use awk to output the columns to two files that I then concatenate using cat. But there must be a better one-line command?
For a generalized approach:
$ f() { awk '{print $'$1'}' file; }; f 1; f 2
a
b
c
d
e
f
If the file is tab-delimited, perhaps simply with cut (the inverse operation of paste):
$ cut -f1 file.t; cut -f2 file.t
This simple awk command should do the job:
awk '{print $1; s=s $2 ORS} END{printf "%s", s}' file
a
b
c
d
e
f
You can use process substitution; that would eliminate the need to create a file for each column.
$ cat file
a d
b e
c f
$ cat <(awk '{print $1}' file) <(awk '{print $2}' file)
a
b
c
d
e
f
$
OR
As per the comment, you can just combine multiple commands and redirect their output to a different file like this:
$ cat file
a d
b e
c f
$ (awk '{print $1}' file; awk '{print $2}' file) > output
$ cat output
a
b
c
d
e
f
$
Try this: without reading the file twice and without calling any other external commands, a single awk will do. This assumes your Input_file is just like the sample shown.
awk '{VAL1=VAL1?VAL1 ORS $1:$1;VAL2=VAL2?VAL2 ORS $2:$2} END{print VAL1 ORS VAL2}' Input_file
Explanation: create a variable named VAL1 that accumulates each line's $1 value, and a variable VAL2 that accumulates each line's $2 value. In the END section, print VAL1 followed by VAL2.
You can combine bash commands with ; to get a single stream:
$ awk '{print $1}' file; awk '{print $2}' file
a
b
c
d
e
f
Use command substitution if you want to capture that combined output as if it were a single file:
$ txt=$(awk '{print $1}' file; awk '{print $2}' file)
$ echo "$txt"
a
b
c
d
e
f
Or for a Bash while loop:
$ while read -r line; do echo "line: $line"; done < <(awk '{print $1}' file; awk '{print $2}' file)
line: a
line: b
line: c
line: d
line: e
line: f
If you're using Notepad++, you could replace all tab characters with the newline sequence "\r\n".
another approach:
for i in $(seq 1 2); do
awk '{print $'$i'}' file
done
output:
a
b
c
d
e
f

awk find and replace variable file2 into file1 when matched

I tried a couple of answers from similar questions but am not quite getting correct results. I'm trying to look up each value from the first file in the second file and replace it with the mapped value if there is one, otherwise keep the original...
File1.txt
a
2
c
4
e
f
File2.txt
2 b
4 d
Wanted Output.txt
a
b
c
d
e
f
So far what I have seems to sort of work, but anywhere a replacement should happen I'm getting a blank row instead of the new value...
Current Output.txt
a
c
e
f
Current code:
awk -F'\t' 'NR==FNR{a[$1]=$2;next} {print (($1 in a) ? a[$1] : $1)}' file2.txt file1.txt > output.txt
I also tried this and got the same results:
awk -F'\t' 'NR==FNR{a[$1]=$2;next} {$1 = a[$1]}1' file2.txt file1.txt > output.txt
Sorry, I first wrote it incorrectly; I've fixed the key/value issue.
I did try what you suggested, but the replaced values are still missing in output.txt:
awk -F'\t' 'NR==FNR{a[$1]=$2;next} $1 in a{$1 = a[$1]}1' file2.txt file1.txt > output.txt
Your key/value pair is not right... $1 is the key, $2 is the value:
$ awk -F'\t' 'NR==FNR{a[$1]=$2;next} $1 in a{$1=a[$1]}1' file.2 file.1
a
b
c
d
e
f
Try the solution below:
awk 'NR==FNR{a[$1]=$NF;next} {print (a[$NF]?a[$NF]:$1)}' file2.txt file1.txt
a
b
c
d
e
f
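As a hedged guess at the blank-row symptom: if the files are actually space-separated rather than tab-separated, forcing -F'\t' makes each whole line a single field, so the lookups and assignments misbehave. Letting awk split on its default whitespace avoids that; a self-contained sketch of the same mapping (file contents recreated from the question):

```shell
# Recreate the sample inputs (space-separated).
printf '%s\n' a 2 c 4 e f > File1.txt
printf '2 b\n4 d\n'       > File2.txt

# First pass (NR==FNR): build a key->value map from File2.txt.
# Second pass: replace $1 with its mapped value when a mapping exists;
# the trailing 1 prints every (possibly modified) line.
awk 'NR==FNR { map[$1] = $2; next } $1 in map { $1 = map[$1] } 1' File2.txt File1.txt
```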

bash - printing two consecutive lines if a pattern is found

I need to search for a string in a file and print the matching lines together with their next lines in another file.
Example:
input file:
>3456/1
A
>1234/2
B
>5678/1
C
>8976/2
D
search for: /2
output:
>1234/2
B
>8976/2
D
Using grep:
$ grep -A1 '/2' file
>1234/2
B
--
>8976/2
D
From the man page:
-A num, --after-context=num
Print num lines of trailing context after each match.
You can remove the -- group separators by piping the output to grep -v '^--$', or if you have GNU grep then you can simply do:
$ grep --no-group-separator -A1 '/2' file
>1234/2
B
>8976/2
D
You can re-direct the output of this command to another file.
Using GNU sed
sed -n '/\/2/,+1p' file
Example:
$ sed -n '/\/2/,+1p' file
>1234/2
B
>8976/2
D
Use grep -A
See the man page:
-A num, --after-context=num
Print num lines of trailing context after each match. See also the -B and -C options.
-B num, --before-context=num
Print num lines of leading context before each match. See also the -A and -C options.
-C[num, --context=num]
Print num lines of leading and trailing context surrounding each match. The default is 2 and is equivalent to -A 2 -B 2. Note: no whitespace may be given between the option and its argument.
Here is an example:
% grep -A2 /2 input
>1234/2
B
>5678/1
--
>8976/2
D
grep is the correct tool here, but using awk you would get:
awk '/\/2/ {print $0;getline;print $0}' file
>1234/2
B
>8976/2
D
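If you'd rather avoid getline (which silently does nothing at end of file and is easy to misuse), an alternative awk idiom is to set a flag on a match and print the following record on the next cycle; a sketch using the sample input:

```shell
# Recreate the sample input file.
printf '>3456/1\nA\n>1234/2\nB\n>5678/1\nC\n>8976/2\nD\n' > file

# p is set when a line matches /2; the very next record is then
# printed and the flag is cleared.
awk 'p { print; p=0 } /\/2/ { print; p=1 }' file
```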
PS: you should have been able to find this yourself using Google; this has been asked many times.

how can I send a parameter to awk using a shell script

I have this file
myfile
a b c d e 1
b c s d e 1
a b d e f 2
d f g h j 2
awk 'if $6==$variable {print #0}' myfile
How can I use this code in a shell script that gets $variable as a parameter from the user on the command line?
You can use awk's -v flag, and since awk prints matching lines by default, you can try for example:
variable=1
awk -v var="$variable" '$6 == var' file.txt
Results:
a b c d e 1
b c s d e 1
EDIT:
The command is essentially the same, wrapped up in shell. You can use it in a shell script with multiple arguments like this: script.sh 2 j
Contents of script.sh:
command=$(awk -v var_one="$1" -v var_two="$2" '$6 == var_one && $5 == var_two' file.txt)
echo -e "$command"
Results:
d f g h j 2
This is question 24 in the comp.unix.shell FAQ (http://cfajohnson.com/shell/cus-faq-2.html#Q24), but the most commonly used alternatives, with the most common reasons to pick between the two, are:
-v var=value '<script>' file1 file2
if you want the variable to be populated in the BEGIN section
or:
'<script>' file1 var=value file2
if you do not want the variable to be populated in the BEGIN section and/or need to change the variable's value between files
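The difference can be seen with a small experiment (file names here are made up for the demo): with -v the variable is visible inside BEGIN, while a file-position assignment takes effect only once awk reaches that point in its argument list.

```shell
printf 'x\n' > f1
printf 'y\n' > f2

# -v: the variable is already set when BEGIN runs.
awk -v v=1 'BEGIN { print "begin v=" v }'

# File-position assignment: BEGIN and f1 see v empty; only f2,
# which comes after v=2 in the argument list, sees the value.
awk 'BEGIN { print "begin v=" v } FNR==1 { print FILENAME " v=" v }' f1 v=2 f2
```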
