Why does `cat file1 file2 > file2` fall into an endless loop? - shell

When executing the command cat file1 file2 > file2 in a terminal on Mac OS X 10.11, it falls into an endless loop. I expected the content of file1 to be prepended to file2, with the result written back to file2.
Why is that happening?
Update:
As Benjamin W. mentioned, I think the reason is that the input file and the output file are the same file. But why does cat file1 file2 > file1 not hang in an endless loop?
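The mechanics, briefly: the shell truncates file2 via > before cat starts, so cat copies file1 into file2 and then begins reading file2 itself. Because cat keeps appending what it just read, the file grows as fast as it is consumed and EOF is never reached. With > file1, it is file1 that is truncated first, so cat hits EOF on it immediately, copies file2 across, and exits (destroying file1's old content). A minimal sketch of a safe way to prepend, assuming a scratch file named tmpfile:
# Build the combined file somewhere else first, then replace
# file2 only if the cat succeeded.
cat file1 file2 > tmpfile && mv tmpfile file2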

Related

How to compare entries in one file to two files?

I have a file (named file1) which consists of names and their IPs. It looks something like this:
VM1 2.33.4.22
VM2 88.43.21.34
VM3 120.3.45.66
VM4 99.100.34.5
VM5 111.3.4.66
and I have two files (file2 and file3) which consist solely of IPs.
File 2 consists of:
120.3.45.66
88.43.21.34
File 3 consists of:
99.100.34.5
I want to compare file1 to file2 and file3 and get the names and IPs that are not present in file2 or file3. So the output would be:
VM1 2.33.4.22
VM5 111.3.4.66
How can I get the desired output?
sed 's/\./\\./g; s/.*/ &$/' file2 file3 | grep -vf - file1
Use sed to turn the entries in files 2 and 3 into appropriate regexes.
Pipe this regex list to grep, with -f - to get the pattern list from standard input, and -v to print the non-matching lines of file1.
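To see what grep receives, you can run just the sed stage on the sample files; each IP becomes an anchored pattern with escaped dots and a leading space, so it can only match a whole IP field at the end of a line of file1:
$ sed 's/\./\\./g; s/.*/ &$/' file2 file3
 120\.3\.45\.66$
 88\.43\.21\.34$
 99\.100\.34\.5$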
You can write a shell script that will do it for you.
#!/bin/sh
# Merge the two exclusion lists, then print the lines of the
# third file that do not match anything in the merged list.
cat "$1" "$2" > mergedFile.txt
grep -v -f mergedFile.txt "$3"
You can run the script by using the following command
sh check.sh file2 file3 file1
awk 'NR==FNR { out[$1]=1; next} !out[$2]' <(/bin/cat file2 file3) file1
This takes basically the same approach as the sed solution, using awk instead: build a lookup table of IPs from file2 and file3, then print only the lines of file1 whose IP is not in it.
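The same program spelled out with comments (behavior is unchanged; note that <(...) is a bash process substitution, so this needs bash rather than plain sh):
awk '
    NR==FNR { out[$1]=1; next }   # first input (merged file2+file3): remember each IP
    !out[$2]                      # then file1: print lines whose IP field was never seen
' <(cat file2 file3) file1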

Difference between two files without sorting

I have the files file1 and file2, where file2 is a subset of file1. That means, if I iterate over file1, there are some lines that are in file2, and some that aren't, but there is no line in file2 that is not in file1. There may be several lines with the same content in a file. Now I want to get the difference between them, that is, all lines of file1 that aren't in file2.
According to this well-received answer,
diff(1) isn't the answer, comm(1) is.
(For whatever reason)
But as I understand it, for comm the files need to be sorted first. The problem: both files are ordered (not sorted!), and this order needs to be kept. So what I really want is to iterate over file1 and check for every line whether it is also in file2. If not, write it to file3. If the same content occurs more than once, it should be kept more than once!
Is there any way to do this with the command line?
Try this with GNU grep:
grep -vFf file2 file1 > file3
Update: add -x so that only whole-line matches are filtered out:
grep -vxFf file2 file1 > file3
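Flag summary: -F treats the patterns as fixed strings, -f file2 reads them from file2, -v inverts the match, and -x requires the whole line to match. Without -x, a line of file2 that is a substring of a line of file1 would wrongly remove it; a quick illustration with made-up contents:
$ printf 'foobar\nbaz\n' > file1
$ printf 'foo\n' > file2
$ grep -vFf file2 file1     # "foo" matches inside "foobar"
baz
$ grep -vxFf file2 file1    # -x: only exact whole-line matches are removed
foobar
baz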
If the reason you do not want to sort is to avoid temporary files, process substitution makes that unnecessary:
diff <(sort file1) <(sort file2)
# or
comm <(sort file1) <(sort file2)
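With comm you will usually also want to suppress the columns you are not interested in; comm -23 prints only the lines unique to the first file (the sort still destroys the original order, which the question asks to preserve):
# Lines that appear in file1 but not in file2:
comm -23 <(sort file1) <(sort file2)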
Edit: Using https://stackoverflow.com/a/4544925/3220113 I found another alternative (for text files with short lines):
diff -a --suppress-common-lines -y file2 file1 | sed 's/\s*>.//'

Why does awk behave differently in a terminal vs. in a perl script?

I want to get the first occurrence of a file in a directory matching some pattern. This is the command I'm using:
ls my/file/path/pattern* | awk '{print $1}'
Should my directory contain the files pattern1, pattern2, pattern3, etc., this command will only return pattern1. This works as expected in the terminal window.
It fails in a perl script. My commands are
push @arr, `ls path/to/my/dir/pattern* | awk \'{print \$1}\'`;
print @arr;
The output here is
pattern1
pattern2
pattern3
I expect the output to only be pattern1. Why is the entirety of ls's output being dumped into the array?
Edit: I have found a workaround doing
my $temp = (`ls path/to/my/dir/pattern* | awk \'{print \$1}\'`)[0];
But I still am curious why the other way won't work.
Edit: There are a couple of commenters saying my terminal command doesn't work as I describe, so here's a screenshot. I know awk returns the first column of each line, but it should work as I described if only a few files match.
Solved it! When I execute ls in the terminal window, the results look like this:
file1 file2 file3
file4 file5 file6
but the perl command
my $test = `ls`;
stores the results like this
file1
file2
file3
file4
file5
file6
Since awk '{print $1}' prints the first column of each row, running the command
ls file* | awk '{print $1}'
from perl returns all matches.
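You can reproduce this without perl: ls only arranges names into columns when its standard output is a terminal; when stdout is a pipe (which is what the backticks give it), it prints one name per line, for example:
$ ls
file1 file2 file3
$ ls | cat    # stdout is a pipe, not a tty: one name per line
file1
file2
file3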

UNIX - Simple merging of two files as in the input

Input File1:
HELLO
HOW
Input File2:
ARE
YOU
output file should be
HELLO
HOW
ARE
YOU
My input files will be in one folder, and my script has to fetch the input files from that folder and merge all the files in the order given above.
Thanks
You can simply use cat as shown below:
cat file1 file2
or, to concatenate all files in a folder (assuming there are not too many):
cat folder/*
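With the question's two sample files, the first form produces exactly the expected output:
$ cat file1 file2
HELLO
HOW
ARE
YOU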
You can also use sed with an empty script, which simply prints every input line:
sed '' file1 file2
Hope this works fine.
cat:
cat file1 file2 >output
perl:
perl -plne '' file1 file2 >output
awk:
awk '1' file1 file2 >output

Unix: One-line bash command to merge 3 files together, extracting only the first line of each

I am having a hard time with my syntax here:
I have 3 files with various content, file1 file2 file3 (100+ lines). I am trying to merge them together, but only the first line of each file should be merged. The point is to do it using one line of bash code:
sed -n 1p file1 file2 file3 returns only the first line of file1
You might want to try
head -n1 -q file1 file2 file3
(with GNU head, -q suppresses the per-file headers that are otherwise printed when multiple files are given).
It's not clear whether by merge you mean concatenate or join.
In awk by joining (each first line in the files printed side by side):
$ awk 'FNR==1{printf "%s ",$0}' file1 file2 file3
1 2 3
In awk by concatenating (each first line in the files printed one after another):
$ awk 'FNR==1' file1 file2 file3
1
2
3
I suggest you use head as explained in themel's answer. However, if you insist on using sed, you cannot simply pass all the files to it, since they are implicitly concatenated and you lose the information about where each file's first line is. So, if you really want to do it in sed, you need bash to help you out:
for f in file1 file2 file3; do sed -n 1p "$f"; done
You can avoid calling external processes by using the read built-in command:
for f in file1 file2 file3; do read -r l < "$f"; echo "$l"; done > merged.txt
(-r stops read from mangling backslashes, and quoting "$f" keeps the redirection safe for file names with spaces).
