replace different text in different lines using sed - bash

I need to do the following:
I have two files, the first one contains only the lines that are going to be modified:
1
2
3
and the second contains the text that is going to be replaced in original file (final_output.txt)
13e
19f
16a
the original file is
wire1: 0x'd318
wire2: 0x'd415
wire3: 0x'd362
I want to get the following:
wire1: 0x13e
wire2: 0x19f
wire3: 0x16a
This is only a part of final_output.txt, because the file can contain at least 100 lines, and I pretend to do it using for, but I don't know how to implement it

awk to the rescue!
assuming the part after the single quote will be replaced.
$ awk -v q="'" 'NR==FNR {a[$1]=$2;next}
FNR in a {sub(q".*",a[FNR])}1' <(paste index rep) file
index is the index file, rep is the replacement file, and file is the original data file.

Another solution where file1 contains only the lines, file2 contains the text that is going to be replaced in original file and final_output.txt contains your original text.
for ((i=1;i<=$(wc -l < file1);i++)); do sed -i "$(sed -n "${i}p" file1)s#$(sed -n "$(sed -n "${i}p" file1)p" final_output.txt | grep -oP "'.*")#$(sed -n "${i}p" file2)#g" final_output.txt; done
Output
darby#Debian:~/Scrivania$ cat final_output.txt
wire1: 0x13e
wire2: 0x19f
wire3: 0x16a
darby#Debian:~/Scrivania$

Related

How to ignore headers when merging single column of multiple CSV files?

I need to merge a single column from multiple CSV files whilst disregarding the headers.
file 1:
id,backer_uid,fname,lname
123,uj2uj2,JOHN,SMITH
file 2:
id,backer_uid,fname,lname
124,uj2uh3,BRIAN,DOOLEY
Output:
JOHN
BRIAN
Currently, I am using:
/*Merge 3rd column from all csv files*/
awk -F "\"*,\"*" '{print $3}’ *.csv >merged.csv
But how do I ignore the headers?
You can do it with awk, nearly as you have already done, by adding a condition on the FNR (the record number per file):
awk -F, 'FNR > 1 {print $3}' *.csv > merged.csv
Use tail and cut:
tail -q -n +2 *.csv | cut -f3 -d, > merged.csv
tail -n +2 prints all lines of files starting from line number 2
-q suppresses printing of file names
cut -f3 -d, extracts the third field, treating , as the delimiter
try: If you have to read only 2 files.
awk -F, 'FNR>1{print $(NF-1)}' file[12]
Here I am making field separator as comma and then checking if line number is greater than 1 then printing the second last field. Point to be noted here is file[12] will only read files named file1 and file2, if you have more than that files use file* then.

Bash - reading two files and searching within files

I have two files, file1 and file2. I want to reach each line from file1, and then search if any of the lines in file2 is present in file1. I am using the following bash script, but it does not seem to be working. What should I change? (I am new to bash scripting).
#!/bin/bash
while read line1
do
echo $line1
while read line2
do
if grep -Fxq "line2" "$1"
then
echo "found"
fi
done < "$2"
done < "$1"
Note: Both files are text files.
Use grep -f
grep -f file_with_search_words file_with_content
Note however that if file_with_search_words contains blank lines everything will be matched. But that can be easily avoided with:
grep -f <(sed '/^$/d' file_with_search_words) file_with_content
From the man page:
-f FILE, --file=FILE
Obtain patterns from FILE, one per line. If this option is used
multiple times or is combined with the -e (--regexp) option, search
for all patterns given. The empty file contains zero patterns, and
therefore matches nothing.
You may use the command "comm", it compare two sorted files line-by-line
This command show the common lines in file1 and file2
comm -12 file1 file2
The only problem with this command is that you have to sort the files before, like this:
sort file1 > file1sorted
http://www.computerhope.com/unix/ucomm.htm
File 1
Line 1
Line 3
Line 6
Line 9
File 2
Line 3
Line 6
awk 'NR==FNR{con[$0];next} $0 in con{print $0}' file1 file2
will give you
Line 3
Line 6
that is the content in file 2 which is present in file1.
If you wish to ignore the spaces you can achieve with the below one.
awk 'NR==FNR{con[$0];next} !/^$/{$0 in con;print $0}' file1 file2

grep matching specific position in lines using words from other file

I have 2 file
file1:
12342015010198765hello
12342015010188765hello
12342015010178765hello
whose each line contains fields at fixed positions, for example, position 13 - 17 is for account_id
file2:
98765
88765
which contains a list of account_ids.
In Korn Shell, I want to print lines from file1 whose position 13 - 17 match one of account_id in file2.
I can't do
grep -f file2 file1
because account_id in file2 can match other fields at other positions.
I have tried using pattern in file2:
^.{12}98765.*
but did not work.
Using awk
$ awk 'NR==FNR{a[$1]=1;next;} substr($0,13,5) in a' file2 file1
12342015010198765hello
12342015010188765hello
How it works
NR==FNR{a[$1]=1;next;}
FNR is the number of lines read so far from the current file and NR is the total number of lines read so far. Thus, if FNR==NR, we are reading the first file which is file2.
Each ID in in file2 is saved in array a. Then, we skip the rest of the commands and jump to the next line.
substr($0,13,5) in a
If we reach this command, we are working on the second file, file1.
This condition is true if the 5 character long substring that starts at position 13 is in array a. If the condition is true, then awk performs the default action which is to print the line.
Using grep
You mentioned trying
grep '^.{12}98765.*' file2
That uses extended regex syntax which means that -E is required. Also, there is no value in matching .* at the end: it will always match. Thus, try:
$ grep -E '^.{12}98765' file1
12342015010198765hello
To get both lines:
$ grep -E '^.{12}[89]8765' file1
12342015010198765hello
12342015010188765hello
This works because [89]8765 just happens to match the IDs of interest in file2. The awk solution, of course, provides more flexibility in what IDs to match.
Using sed with extended regex:
sed -r 's#.*#/^.{12}&/p#' file2 |sed -nr -f- file1
Using Basic regex:
sed 's#.*#/^.\\{12\\}&/p#' file1 |sed -n -f- file
Explanation:
sed -r 's#.*#/^.{12}&/p#' file2
will generate an output:
/.{12}98765/p
/.{12}88765/p
which is then used as a sed script for the next sed after pipe, which outputs:
12342015010198765hello
12342015010188765hello
Using Grep
The most convenient is to put each alternative in a separate line of the file.
You can look at this question:
grep multiple patterns single file argument list too long

How to fill empty lines from one file with corresponding lines from another file, in BASH?

I have two files, file1.txt and file2.txt. Each has an identical number of lines, but some of the lines in file1.txt are empty. This is easiest to see when the content of the two files is displayed in parallel:
file1.txt file2.txt
cat bear
fish eagle
spider leopard
snail
catfish rainbow trout
snake
koala
rabbit fish
I need to assemble these files together, such that the empty lines in file1.txt are filled with the data found in the lines (of the same line number) from file2.txt. The result in file3.txt would look like this:
cat
fish
spider
snail
catfish
snake
koala
rabbit
The best I can do so far, is create a while read -r line loop, create a counter that counts how many times the while loop has looped, then use an if-conditional to check if $line is empty, then use cut to obtain the line number from file2.txt according to the number on the counter. This method seems really inefficient.
Sometimes file2.txt might contain some empty lines. If file1.txt has an empty line and file2.txt also has an empty line in the same place, the result is an empty line in file3.txt.
How can I fill the empty lines in one file with corresponding lines from another file?
paste file1.txt file2.txt | awk -F '\t' '$1 { print $1 ; next } { print $2 }'
Here is the way to handle these files with awk:
awk 'FNR==NR {a[NR]=$0;next} {print (NF?$0:a[FNR])}' file2 file1
cat
fish
spider
snail
catfish
snake
koala
rabbit
First it store every data of the file2 in array a using record number as index
Then it prints file1, bit it thest if file1 contains data for each record
If there is data for this record, then use it, if not get one from file2
One with getline (harmless in this case) :
awk '{getline p<f; print NF?$0:p; p=x}' f=file2 file1
Just for fun:
paste file1.txt file2.txt | sed -E 's/^ //g' | cut -f1
This deletes tabs that are at the beginning of a line (those missing from file1) and then takes the first column.
(For OSX, \t doesn't work in sed, so to get the TAB character, you type ctrl-V then Tab)
a solution without awk :
paste -d"#" file1 file2 | sed 's/^#\(.*\)/\1/' | cut -d"#" -f1
Here is a Bash only solution.
for i in 1 2; do
while read line; do
if [ $i -eq 1 ]; then
arr1+=("$line")
else
arr2+=("$line")
fi
done < file${i}.txt
done
for r in ${!arr1[#]}; do
if [[ -n ${arr1[$r]} ]]; then
echo ${arr1[$r]}
else
echo ${arr2[$r]}
fi
done > file3.txt

"while read LINE do" and grep problems

I have two files.
file1.txt:
Afghans
Africans
Alaskans
...
where file2.txt contains the output from a wget on a webpage, so it's a big sloppy mess, but does contain many of the words from the first list.
Bashscript:
cat file1.txt | while read LINE; do grep $LINE file2.txt; done
This did not work as expected. I wondered why, so I echoed out the $LINE variable inside the loop and added a sleep 1, so i could see what was happening:
cat file1.txt | while read LINE; do echo $LINE; sleep 1; grep $LINE file2.txt; done
The output looks in terminal looks something like this:
Afghans
Africans
Alaskans
Albanians
Americans
grep: Chinese: No such file or directory
: No such file or directory
Arabians
Arabs
Arabs/East Indians
: No such file or directory
Argentinans
Armenians
Asian
Asian Indians
: No such file or directory
file2.txt: Asian Naruto
...
So you can see it did finally find the word "Asian". But why does it say:
No such file or directory
?
Is there something weird going on or am I missing something here?
What about
grep -f file1.txt file2.txt
#OP, First, use dos2unix as advised. Then use awk
awk 'FNR==NR{a[$1];next}{ for(i=1;i<=NF;i++){ if($i in a) {print $i} } } ' file1 file2_wget
Note: using while loop and grep inside the loop is not efficient, since for every iteration, you need to invoke grep on the file2.
#OP, crude explanation:
For meaning of FNR and NR, please refer to gawk manual. FNR==NR{a[1];next} means getting the contents of file1 into array a. when FNR is not equal to NR (which means reading the 2nd file now), it will check if each word in the file is in array a. If it is, print out. (the for loop is used to iterate each word)
Use more quotes and use less cat
while IFS= read -r LINE; do
grep "$LINE" file2.txt
done < file1.txt
As well as the quoting issue, the file you've downloaded contains CRLF line endings which are throwing read off. Use dos2unix to convert file1.txt before iterating over it.
Although usng awk is faster, grep produces a lot more details with less effort. So, after issuing dos2unix use:
grep -F -i -n -f <file_containing_pattern> <file_containing_data_blob>
You will have all the matches + line numbers (case insensitive)
At minimum this will suffice to find all the words from file_containing_pattern:
grep -F -f <file_containing_pattern> <file_containing_data_blob>

Resources