How to get rid of duplicates? [duplicate] - bash

This question already has answers here:
Remove duplicate entries in a Bash script [duplicate]
(4 answers)
Closed 8 years ago.
Hi, I am writing a bash script that reads the contents of the files in the current directory whose names contain the word "contact", sorts all the data in those files alphabetically,
and writes the result to a file called "out.txt". I was wondering if there is any way to get rid of duplicate content. Any help would be appreciated.
The code I have written so far:
#!/bin/bash
cat $(ls | grep contact) > out.txt
sort out.txt -o out.txt

sort has option -u (long option: --unique) to output only unique lines:
sort -u out.txt -o out.txt
EDIT: (Thanks to tripleee)
Your script, as it stands, has the problem of parsing ls output, which breaks on unusual filenames.
This is a better substitute for what you are trying to do:
sort -u *contact* >out.txt
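As a quick check of the one-liner, with two hypothetical demo files whose names contain "contact" (the file names and contents here are invented purely for illustration):

```shell
# Hypothetical demo files; any names containing "contact" would do
printf 'bob\nalice\nbob\n' > contact_a.txt
printf 'carol\nalice\n' > contact_b.txt

# Merge, sort, and de-duplicate in one step
sort -u *contact* > out.txt
cat out.txt
```

This prints alice, bob, carol, one per line: the duplicate "bob" and "alice" entries are collapsed. Note that out.txt itself does not match the *contact* glob, so it is not accidentally read as input.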

Or use the uniq command (easier to remember than flags):
#!/bin/bash
cat $(ls | grep contact) | sort | uniq > out.txt
or use the -u flag for sort, like this:
#!/bin/bash
cat $(ls | grep contact) | sort -u > out.txt

uniq may do what you need. It copies lines from input to output, omitting a line if it is identical to the line it just output, which is why the input is usually sorted first.

Take a look at the uniq command, and pipe the output through it after sorting.

XARGS with for loop pr

Hi, I am working in the bash shell with a file of file names that lists multiple files for the same sample on different lines:
file.txt
Filename1_1 SampleName1
Filename1_2 SampleName1
Filename2_1 SampleName2
Filename2_2 SampleName2
I am trying to use xargs with a for loop to pass pairs of filenames as arguments (i.e. print Filename1_1 Filename1_2), which would be the effect of:
cat file.txt | xargs bash -c 'echo ${0} ${2}'
Since it is quite a long file I cannot run this repeatedly, and I thought using a for loop would help, but it isn't producing the output I expected.
Here is what I thought would be a simple way to do it:
for (( i = 0,j=2; i<=63; i= i+4,j=j+4 ))
do
cat file.txt | xargs bash -c 'echo ${i} ${j}'
done
However, running this loops through and just prints a bunch of blank lines.
Does anyone have an idea how to get this to work the way I want?
I am looking for output like the following, so that I can pass each line on to another function:
Filename1_1 Filename1_2
Filename2_1 Filename2_2
Filename3_1 Filename3_2
Filename4_1 Filename4_2
Just use -n to cap the number of arguments per invocation: with -n4, each call gets four tokens (two input lines), and the two filenames are the first and third arguments. The trailing _ fills $0 so the real arguments start at $1.
<file.txt xargs -n4 bash -c 'echo "$1" "$3"' _
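A runnable sketch that produces exactly the requested pairs, using -n4 so each invocation receives four tokens (two input lines) and printing the first and third arguments:

```shell
# Recreate the sample file.txt from the question
printf '%s\n' 'Filename1_1 SampleName1' 'Filename1_2 SampleName1' \
              'Filename2_1 SampleName2' 'Filename2_2 SampleName2' > file.txt

# Four whitespace-separated tokens per invocation = two input lines;
# the _ placeholder becomes $0, so $1 and $3 are the two filenames
<file.txt xargs -n4 bash -c 'echo "$1" "$3"' _
```

An alternative with the same effect is `cut -d' ' -f1 file.txt | xargs -n2`, which first strips the sample-name column and then echoes the remaining filenames two per line.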

In bash, is there a way to redirect output to a file open for reading? [duplicate]

This question already has answers here:
How can I use a file in a command and redirect output to the same file without truncating it?
(14 answers)
Closed 3 years ago.
If I try to redirect output for a command into a file that is open for reading within the command, I get an empty file.
For example, suppose I have a file named tmp.txt:
ABC
123
Now if I do this:
$ grep --color=auto A tmp.txt > out.txt
$ cat out.txt
ABC
But if I do this:
$ grep --color=auto A tmp.txt > tmp.txt
$ cat tmp.txt
$
I get no output.
I'd like to be able to redirect to a file that I am reading within the same command.
Okay, so I have my answer and would like to share it with you all.
You simply have to use a pipe with tee.
$ grep --color=auto A tmp.txt | tee tmp.txt
ABC
$ cat tmp.txt
ABC
Perhaps someone who understands pipes well can explain why.
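A caution on the tee version: tee truncates tmp.txt as soon as it starts, so this is a race that only appears to work because grep usually finishes reading the small file first. A race-free sketch, writing to a scratch file and renaming it over the original (the `.new` suffix is an arbitrary choice):

```shell
printf 'ABC\n123\n' > tmp.txt

# Write to a scratch file first, then replace the original only on success
grep A tmp.txt > tmp.txt.new && mv tmp.txt.new tmp.txt
cat tmp.txt
```

The moreutils tool sponge does the same buffering for you: `grep A tmp.txt | sponge tmp.txt` soaks up all input before opening the output file.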

Unexpected or empty output from tee command [duplicate]

This question already has answers here:
Why does reading and writing to the same file in a pipeline produce unreliable results?
(2 answers)
Closed 3 years ago.
echo "hello" | tee test.txt
cat test.txt
sudo sed -e "s|abc|def|g" test.txt | tee test.txt
cat test.txt
Output:
The output of the 2nd command and the last command are different, whereas the command run is the same (cat test.txt).
Question:
The following line in the above script produces output, but why is it not redirected to the output file?
sudo sed -e "s|abc|def|g" test.txt
sudo sed -e "s|abc|def|g" test.txt | tee test.txt
Reading from and writing to test.txt in the same command line is error-prone. sed is trying to read from the file at the same time that tee wants to truncate it and write to it.
You can use sed -i to modify a file in place. There's no need for tee. (There's also no need for sudo. You made the file, no reason to ask for root access to read it.)
sed -e "s|abc|def|g" -i test.txt
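A minimal demonstration of the in-place edit (GNU sed shown here; on BSD/macOS sed the -i flag requires a backup-suffix argument, e.g. `sed -i ''`):

```shell
printf 'abc xyz\n' > test.txt

# Edit the file in place; sed manages the temporary file internally
sed -i 's|abc|def|g' test.txt
cat test.txt
```

This prints "def xyz" and leaves the substituted text in test.txt, with no tee and no race.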
You shouldn't use the same file for both input and output.
tee test.txt empties the output file when it starts up. If this happens before sed reads the file, sed sees an empty file. Since you're running sed through sudo, sed takes longer to start, which makes this outcome very likely.

How can I delete empty line from my ouput by grep? [duplicate]

This question already has answers here:
Remove empty lines in a text file via grep
(11 answers)
Closed 4 years ago.
Is there a way to remove empty lines with cat myfile | grep -w #something ?
I am looking for a simple way to remove empty lines from my output, along the lines of the command above.
This really belongs on the Code Golf Stack Exchange, because it's not how anyone would ever write a script. However, you can do it like this:
cat myfile | grep -w '.*..*'
It's equivalent to the more canonical grep ., but adds explicit .*s on either side so that it always matches the complete line, thereby satisfying the word-boundary conditions imposed by -w.
You can pipe your output to awk to easily remove empty lines
cat myfile | grep -w #something | awk NF
EDIT: so... you just want cat myfile | awk NF?
if you have to use grep, you can do grep -v '^[[:blank:]]*$' myfile (note that the pattern must come before the filename, or grep will treat myfile as the pattern)
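A quick comparison of the suggestions above, on a sample file that has both a truly empty line and a whitespace-only line (note that `awk NF` and the `[[:blank:]]` pattern drop both kinds, while `grep .` keeps whitespace-only lines):

```shell
# Sample file: an empty line after alpha, a space-only line after beta
printf 'alpha\n\nbeta\n \ngamma\n' > myfile

grep -v '^[[:blank:]]*$' myfile   # drops empty and whitespace-only lines
awk NF myfile                     # same result: NF is 0 on blank lines
```

Both commands print alpha, beta, gamma, one per line.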

bash redirection to files not working [duplicate]

This question already has answers here:
Why doesn't "sort file1 > file1" work?
(7 answers)
Closed 7 years ago.
I have these two files, yolo.txt and bar.txt:
yolo.txt:
a
b
c
bar.txt:
c
I have the following command, which gets me the desired output:
$ cat yolo.txt bar.txt | sort | uniq -u | sponge
a
b
But when I add the redirection (>) statement, the output changes:
$ cat yolo.txt bar.txt | sort | uniq -u | sponge > yolo.txt && cat yolo.txt
c
I expected the output to remain the same, and I am quite confused. Please help me.
The > yolo.txt shell redirect happens before any of the commands run. In particular, the shell opens yolo.txt for writing and truncates it before executing cat yolo.txt bar.txt. So by the time cat opens yolo.txt, yolo.txt is empty. Therefore the c line in bar.txt is unique, so uniq -u passes it through.
I guess you wanted to use sponge to avoid this problem, since that's what sponge is for. But you used it incorrectly. This is the correct usage:
cat yolo.txt bar.txt | sort | uniq -u | sponge yolo.txt && cat yolo.txt
Note that I just pass the output filename to sponge as a command-line argument, instead of using a shell redirect.
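If moreutils' sponge is not available, a temporary file gives the same effect as the corrected command (the `yolo.tmp` name is an arbitrary choice):

```shell
printf 'a\nb\nc\n' > yolo.txt
printf 'c\n' > bar.txt

# Write to a temp file first, then replace the input only on success,
# so yolo.txt is never truncated while it is still being read
sort yolo.txt bar.txt | uniq -u > yolo.tmp && mv yolo.tmp yolo.txt
cat yolo.txt
```

This prints the expected a and b: the truncation happens on yolo.tmp, not on the file the pipeline is reading.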
