Extract lines from a list that has a double-repeated character - shell

I have a text file.
I need to extract the lines in which a given character appears twice.
For example, I have
cat-dog-eat
men-boy
I need to extract the lines that contain - twice,
and the desired output is:
cat-dog-eat

Given that:
kent$ cat file
cat-dog-eat
men-boy
a-b-c-d-e
To get lines that have exactly two -s:
awk -F'-' 'NF==3' file
cat-dog-eat
To get lines that have at least two -s:
awk -F'-' 'NF>2' file
cat-dog-eat
a-b-c-d-e
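The field-count test works because splitting a line on - yields one more field than there are hyphens. A minimal sketch of that idea, assuming a hypothetical helper named lines_with_dashes and a throwaway copy of the sample file, might look like this:

```shell
#!/bin/sh
# Hypothetical helper: print the lines of FILE that contain exactly N hyphens.
# Splitting on "-" gives N+1 fields for a line with N hyphens.
lines_with_dashes() {
    awk -F'-' -v n="$1" 'NF == n + 1' "$2"
}

# Recreate the sample file in a temporary directory (illustration only).
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf '%s\n' 'cat-dog-eat' 'men-boy' 'a-b-c-d-e' > "$tmp/file"

lines_with_dashes 2 "$tmp/file"     # exactly two hyphens
grep -e '-.*-' "$tmp/file"          # at least two hyphens, with grep instead
```

The grep variant needs -e because the pattern itself starts with a hyphen and would otherwise be read as an option.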

Related

File grep for specific column

How can I search for each line of a first file in a specific column of a second, comma-separated file, so that the whole line of the first file matches the whole column of the second file?
grep -Ff file1 file2 searches the entire line of the second file, but I want to search only a specific column.
Eg.
file1.txt
20
300
file2.txt
200,10
220,2
300,5
I want the result to only match 300,5 and not the first 2 rows.
$ awk -F, 'NR==FNR{a[$1]; next} $1 in a' file{1,2}
There are already many answers on this site that explain how this works; please refer to them.
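Since the answer defers the explanation, here is a commented sketch of the same two-file idiom. The sample files are recreated in a temporary directory (an assumption for illustration), and they are named file1.txt and file2.txt as in the question rather than through the file{1,2} brace expansion.

```shell
#!/bin/sh
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf '%s\n' 20 300 > "$tmp/file1.txt"
printf '%s\n' '200,10' '220,2' '300,5' > "$tmp/file2.txt"

# NR == FNR holds only while awk reads the first file: remember each key.
# For the second file, print a line only if its whole first column is a key.
result=$(awk -F, 'NR == FNR { keys[$1]; next } $1 in keys' \
    "$tmp/file1.txt" "$tmp/file2.txt")
printf '%s\n' "$result"
```

Because the lookup is an exact array-key test, 20 does not match 200 or 220; only 300,5 is printed.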

replace different text in different lines using sed

I need to do the following:
I have two files; the first one contains only the numbers of the lines that are going to be modified:
1
2
3
and the second contains the replacement text to be written into the original file (final_output.txt):
13e
19f
16a
the original file is
wire1: 0x'd318
wire2: 0x'd415
wire3: 0x'd362
I want to get the following:
wire1: 0x13e
wire2: 0x19f
wire3: 0x16a
This is only a part of final_output.txt; the file can contain 100 lines or more. I intend to do it with a for loop, but I don't know how to implement it.
awk to the rescue!
Assuming everything from the single quote onward is what gets replaced:
$ awk -v q="'" 'NR==FNR {a[$1]=$2;next}
FNR in a {sub(q".*",a[FNR])}1' <(paste index rep) file
index is the index file, rep is the replacement file, and file is the original data file.
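As a self-contained sketch of the same approach, the example below recreates the files in a temporary directory and feeds the paste output to awk on standard input ("-") instead of using a process substitution. To show that unlisted lines are left alone, it assumes an index file that names only lines 1 and 3; that variation is not from the question.

```shell
#!/bin/sh
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf '%s\n' 1 3 > "$tmp/index"          # line numbers to modify (assumed)
printf '%s\n' 13e 16a > "$tmp/rep"        # replacement text, same order
printf '%s\n' "wire1: 0x'd318" "wire2: 0x'd415" "wire3: 0x'd362" > "$tmp/file"

# paste joins the two lists into "line-number replacement" pairs.
result=$(paste "$tmp/index" "$tmp/rep" |
    awk -v q="'" '
        NR == FNR { new[$1] = $2; next }       # first input: store the pairs
        FNR in new { sub(q ".*", new[FNR]) }   # listed line: replace the tail
        1                                      # print every line
    ' - "$tmp/file")
printf '%s\n' "$result"
```

The single quote is passed in through -v q because a literal quote cannot appear inside the single-quoted awk program.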
Another solution, where file1 contains the line numbers, file2 contains the replacement text, and final_output.txt contains your original text:
for ((i=1;i<=$(wc -l < file1);i++)); do sed -i "$(sed -n "${i}p" file1)s#$(sed -n "$(sed -n "${i}p" file1)p" final_output.txt | grep -oP "'.*")#$(sed -n "${i}p" file2)#g" final_output.txt; done
Output
darby@Debian:~/Scrivania$ cat final_output.txt
wire1: 0x13e
wire2: 0x19f
wire3: 0x16a

grep matching specific position in lines using words from other file

I have 2 files.
file1:
12342015010198765hello
12342015010188765hello
12342015010178765hello
Each line contains fields at fixed positions; for example, positions 13-17 hold the account_id.
file2:
98765
88765
which contains a list of account_ids.
In Korn Shell, I want to print the lines from file1 whose positions 13-17 match one of the account_ids in file2.
I can't do
grep -f file2 file1
because account_id in file2 can match other fields at other positions.
I have tried using this pattern in file2:
^.{12}98765.*
but it did not work.
Using awk
$ awk 'NR==FNR{a[$1]=1;next;} substr($0,13,5) in a' file2 file1
12342015010198765hello
12342015010188765hello
How it works
NR==FNR{a[$1]=1;next;}
FNR is the number of lines read so far from the current file and NR is the total number of lines read so far. Thus, if FNR==NR, we are reading the first file which is file2.
Each ID in file2 is saved as a key in array a. Then, we skip the rest of the commands and jump to the next line.
substr($0,13,5) in a
If we reach this command, we are working on the second file, file1.
This condition is true if the 5 character long substring that starts at position 13 is in array a. If the condition is true, then awk performs the default action which is to print the line.
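The explanation above can be turned into a small reusable sketch. The helper name match_fixed_field and its argument order are assumptions made for illustration; the start position and length are passed in with -v instead of being hard-coded.

```shell
#!/bin/sh
# Hypothetical helper: print the lines of DATA whose LEN characters starting
# at position START equal one of the IDs listed in IDS.
# Usage: match_fixed_field START LEN IDS DATA
match_fixed_field() {
    awk -v start="$1" -v len="$2" '
        NR == FNR { ids[$1] = 1; next }     # first file: collect the IDs
        substr($0, start, len) in ids       # second file: test the fixed field
    ' "$3" "$4"
}

tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf '%s\n' 12342015010198765hello 12342015010188765hello \
    12342015010178765hello > "$tmp/file1"
printf '%s\n' 98765 88765 > "$tmp/file2"

match_fixed_field 13 5 "$tmp/file2" "$tmp/file1"
```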
Using grep
You mentioned trying
grep '^.{12}98765.*' file1
That uses extended regex syntax which means that -E is required. Also, there is no value in matching .* at the end: it will always match. Thus, try:
$ grep -E '^.{12}98765' file1
12342015010198765hello
To get both lines:
$ grep -E '^.{12}[89]8765' file1
12342015010198765hello
12342015010188765hello
This works because [89]8765 just happens to match the IDs of interest in file2. The awk solution, of course, provides more flexibility in what IDs to match.
Using sed with extended regex:
sed -r 's#.*#/^.{12}&/p#' file2 |sed -nr -f- file1
Using basic regex:
sed 's#.*#/^.\\{12\\}&/p#' file2 |sed -n -f- file1
Explanation:
sed -r 's#.*#/^.{12}&/p#' file2
will generate an output:
/^.{12}98765/p
/^.{12}88765/p
which is then used as a sed script for the next sed after pipe, which outputs:
12342015010198765hello
12342015010188765hello
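To make the two stages visible, the sketch below writes the generated sed script to a temporary file instead of piping it, and prints it before use. It assumes a sed that accepts -E for extended regex (current GNU and BSD versions do); the answer above uses the GNU-specific -r for the same purpose.

```shell
#!/bin/sh
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf '%s\n' 12342015010198765hello 12342015010188765hello \
    12342015010178765hello > "$tmp/file1"
printf '%s\n' 98765 88765 > "$tmp/file2"

# Stage 1: turn every ID into a sed command of the form /^.{12}ID/p
sed -E 's#.*#/^.{12}&/p#' "$tmp/file2" > "$tmp/script.sed"
cat "$tmp/script.sed"

# Stage 2: run the generated script; -n suppresses the default output,
# so only lines selected by a /.../p command are printed.
result=$(sed -n -E -f "$tmp/script.sed" "$tmp/file1")
printf '%s\n' "$result"
```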
Using Grep
The most convenient approach is to put each alternative on a separate line of the pattern file.
You can look at this question:
grep multiple patterns single file argument list too long
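One way to apply that suggestion to the fixed-position problem is to generate the pattern file from the ID list, one anchored pattern per line. This is a sketch only; the file name patterns is an assumption, not something from the question.

```shell
#!/bin/sh
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf '%s\n' 12342015010198765hello 12342015010188765hello \
    12342015010178765hello > "$tmp/file1"
printf '%s\n' 98765 88765 > "$tmp/file2"

# Prefix every ID with ^.{12} so it can only match at positions 13-17.
sed 's/^/^.{12}/' "$tmp/file2" > "$tmp/patterns"

# -E enables the {12} interval syntax; -f reads one pattern per line.
result=$(grep -E -f "$tmp/patterns" "$tmp/file1")
printf '%s\n' "$result"
```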

Diff command for two files and output to third

I just have a small problem comparing two files with the diff command in a shell script. Say I have two ASCII files, file1.txt and file2.txt, with contents:
file1.txt
blah/blah2/content.fits/
blah3/blah4/content2.fits/
blah5/blah6/content3.fits/
blah7/blah8/content4.fits/
file2.txt
content.fits
content2.fits
I would now like to compare the two files based on the .fits file names and write the result to an ASCII file, keeping the formatting of file1.txt. That is, in this particular example the output file should contain:
blah5/blah6/content3.fits/
blah7/blah8/content4.fits/
any ideas?
You can use this awk to get that output:
awk -F/ 'FNR==NR {a[$1];next} !($(NF-1) in a)' file2.txt file1.txt
blah5/blah6/content3.fits/
blah7/blah8/content4.fits/
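A commented sketch of the same command follows, with the result redirected to a third file as the question asks. The output name output.txt and the temporary directory are assumptions for illustration.

```shell
#!/bin/sh
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf '%s\n' 'blah/blah2/content.fits/' 'blah3/blah4/content2.fits/' \
    'blah5/blah6/content3.fits/' 'blah7/blah8/content4.fits/' > "$tmp/file1.txt"
printf '%s\n' content.fits content2.fits > "$tmp/file2.txt"

# -F/ splits on slashes. Each line of file1.txt ends with "/", so its last
# field is empty and the file name is the next-to-last field, $(NF-1).
# First file (file2.txt): remember the names. Second file: print the lines
# whose name was not seen.
awk -F/ 'FNR == NR { seen[$1]; next } !($(NF-1) in seen)' \
    "$tmp/file2.txt" "$tmp/file1.txt" > "$tmp/output.txt"
cat "$tmp/output.txt"
```

Note that $(NF-1) relies on the trailing slash; for paths without one, $NF would be the field to test.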

Compare two files,delete a line if matches found

I want to compare two files.
If the values from a line of file2 match the first two columns of a line in file1, I need to delete that whole line from file1 and write the result to the output, as shown below.
file1 contains:
1,aplle,melle,cyborg
2,bplle,less,vgm
3,minipl,vicy,bgm
4,tag,mob,calic
6,Centurion,sa,hh
file2 contains:
2,bplle
4,tag
5,Centurion
And the output must contain:
1,aplle,melle,cyborg
3,minipl,vicy,bgm
6,Centurion,sa,hh
Is it possible to achieve this with awk?
This awk should work:
awk -F, 'FNR==NR{a[$1,$2];next} !(($1,$2) in a)' file2 file1
1,aplle,melle,cyborg
3,minipl,vicy,bgm
6,Centurion,sa,hh
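This would also work for the sample data, although it matches the file2 values anywhere on the line rather than only in the first two columns: grep -Fwvf file2 file1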
-F  Interpret PATTERN as a list of fixed strings.
-w  Select only those lines containing matches that form whole words.
-v  Invert the sense of matching, to select non-matching lines.
-f FILE  Obtain patterns from FILE, one per line.
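The awk and grep answers agree on the sample data but are not strictly equivalent: awk compares only the first two columns, while grep looks for the file2 text anywhere on the line. The sketch below makes the difference visible by adding one hypothetical line, 7,x,2,bplle, to file1; that line is an assumption and not part of the question.

```shell
#!/bin/sh
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf '%s\n' '1,aplle,melle,cyborg' '2,bplle,less,vgm' '3,minipl,vicy,bgm' \
    '4,tag,mob,calic' '6,Centurion,sa,hh' '7,x,2,bplle' > "$tmp/file1"
printf '%s\n' '2,bplle' '4,tag' '5,Centurion' > "$tmp/file2"

# awk: build a two-part key from columns 1 and 2 of file2, then print the
# file1 lines whose first two columns do not form a stored key.
awk_result=$(awk -F, 'FNR == NR { drop[$1, $2]; next } !(($1, $2) in drop)' \
    "$tmp/file2" "$tmp/file1")

# grep: drop every line that contains a file2 entry as a whole word, wherever
# it occurs on the line.
grep_result=$(grep -Fwvf "$tmp/file2" "$tmp/file1")

printf 'awk:\n%s\n' "$awk_result"
printf 'grep:\n%s\n' "$grep_result"
```

Here awk keeps the added line because its first two columns are 7 and x, whereas grep removes it because 2,bplle appears later on the line.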

Resources