Extracting multiple lines of data between two delimiters - bash

I have a log file containing multiple lines of data. I need to extract all the lines between the delimiters and save them to an output file.
input.log
Some data
<delim_begin>ABC<delim_end>
some data
<delim_begin>DEF<delim_end>
some data
The output.log file should look like
ABC
DEF
I tried this code, but it does not work; it prints all the content of input.log (without -n and the p flag, sed prints every input line, so lines without delimiters pass through unchanged):
sed 's/<delim_begin>\(.*\)<delim_end>/\1/g' input.log > output.log

Using awk, you can do it with a custom field separator:
awk -F '<(delim_begin|delim_end)>' 'NF>2{print $2}' file
ABC
DEF
Using grep -P (PCRE):
grep -oP '(?<=<delim_begin>).*(?=<delim_end>)' file
ABC
DEF

sed alternative
$ sed -nr 's/<delim_begin>(.*)<delim_end>/\1/p' file
ABC
DEF
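This relies on each delimited value sitting on its own line, as in the sample. If other text can surround the delimiters on the same line, a hedged variant (assuming at most one delimited value per line) also strips the surrounding text:
$ sed -nr 's/.*<delim_begin>(.*)<delim_end>.*/\1/p' file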

This should do it (without a guard it will also print an empty line for each input line that contains no delimiters):
awk -F '<(delim_begin|delim_end)>' '{print $2}' file

You can use this command:
grep "<delim_begin>.*<delim_end>" file | sed -e 's/<delim_begin>//' -e 's/<delim_end>//' > output.log

Related

Combine multiple text files (row-wise) into columns

I have multiple text files that I want to merge columnwise.
For example:
File 1
0.698501 -0.0747351 0.122993 -2.13516
File 2
-5.27203 -3.5916 -0.871368 1.53945
I want the output file to be like:
0.698501, -5.27203
-0.0747351, -3.5916
0.122993, -0.871368
-2.13516, 1.53945
Is there a one-line bash command that can accomplish this?
I'd appreciate any help.
---Lyndz
With awk:
awk '{if(NR==1) {n=split($0,a1," ")} else {split($0,a2," ")}} END{for(i=1;i<=n;i++) print a1[i] ", " a2[i]}' file1 file2
Output:
0.698501, -5.27203
-0.0747351, -3.5916
0.122993, -0.871368
-2.13516, 1.53945
paste <(cat file1 | sed -E 's/ +/&,\n/g') <(cat file2 | sed -E 's/ +/&\n/g') | column -s $',' -t | sed -E 's/\s+/, /g' | sed -E 's/, $//g'
It got a bit complicated, but I guess it can also be done in a simpler way.
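For instance, a rough sketch along the same lines, assuming each input file is a single space-separated row as in the example: transpose each file with tr and join the columns with paste:
paste -d, <(tr ' ' '\n' < file1) <(tr ' ' '\n' < file2) | sed 's/,/, /'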
P.S.: Please look up the man pages of each command to see what they do.

Match multiple patterns with grep and print only the matched patterns

I have a file that looks like
..<long-text>..."field1":"some-value"...<long-text>...."field2":"some-value"...
..<long-text>..."field1":"some-value"...<long-text>...."field2":"some-value"...
..<long-text>..."field1":"some-value"...<long-text>...."field2":"some-value"...
I want to extract field1 and field2 from each line of the file in bash. I want field1 and field2 to appear on the same line for each input line. So the output should look like:
"field1":"some-value" "field2":"some-value"
"field1":"some-value" "field2":"some-value"
"field1":"some-value" "field2":"some-value"
I wrote a grep expression like:
grep -E '"field1":"[a-z]*".*"field2":"[a-z]*"' -o
But because of the .* in between, it produces all the text between those two expressions. I also tried
grep -E '"field1":"[a-z]*"|"field2":"[a-z]*"' -o
But this outputs each field1 and each field2 on its own separate line.
How do I get the expected output?
You can use grep with awk to format the result:
grep -oE '"(field1|field2)":"[^"]*"' file | awk 'NR%2{p=$0; next} {print p, $0}'
"field1":"some-value" "field2":"some-value"
"field1":"some-value" "field2":"some-value"
"field1":"some-value" "field2":"some-value"
Use sed:
echo abcdef | sed 's/\(.\).*\(.\)/\1\2/'
# yields: af
For your situation:
sed 's/.*\("field1":"[a-z]*"\).*\("field2":"[a-z]*"\).*/\1 \2/' yourfile
If some lines don't match at all, then do your grep first, e.g.:
grep -Eo '"field1":"[a-z]*".*"field2":"[a-z]*"' yourfile |
sed 's/.*\("field1":"[a-z]*"\).*\("field2":"[a-z]*"\).*/\1 \2/'

Find unique words

Suppose there is a file.txt in which the text below is written:
ABC/xyz
ABC/xyz/rst
EFG/ghi
I need to write a shell script that extracts the unique words that appear before the first /.
So as output, I want ABC and EFG to be written to one file.
You can extract the first word with cut (slash as delimiter), then pipe to sort with the -u (for "unique") option:
$ cut -d '/' -f 1 file.txt | sort -u
ABC
EFG
To get the output into a file, just redirect by appending > filename to the command. (Or pipe to tee filename to see the output and get it in a file.)
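For example, the tee variant could look like this (output.txt is just an illustrative name):
$ cut -d '/' -f 1 file.txt | sort -u | tee output.txt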
Try this:
cat file.txt | tr -s "/" ' ' | awk -F " " '{print $1}' | sort | uniq > outfile.txt
Another interesting variation:
awk -F'/' '{print $1 | "sort -u"}' file.txt > outfile.txt
Not that it matters here, but being able to pipe and redirect within awk can be very handy.
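As a rough illustration of that point (the output file names here are made up), you can redirect one stream to a file and pipe another through an external command, all inside the awk program:
awk -F'/' '{print $0 > "raw_copy.txt"; print $1 | "sort -u > firsts.txt"}' file.txt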
Another easy way (note that uniq only removes adjacent duplicates, so this relies on identical first words being on consecutive lines; otherwise sort first):
cut -d"/" -f1 file.txt | uniq > out.txt
You can use a mix of cut and sort like so:
cut -d '/' -f 1 file.txt | sort -u > newfile.txt
The cut part grabs everything before the first slash /, sort -u then sorts the result and removes any duplicate strings you might have, and the redirect writes the output into newfile.txt.

How to get a word from a text file in BASH

I want to get only one word from this txt file: http://pastebin.com/jFDu0Le5. The word is in the last row: WER: 45.67% Correct: 65.87% Acc: 54.33%
I want to get only the value 45.67 and save it to the file value.txt. I want to create a BASH script to get this value. Can you give me an example of how to do it? I am new to Bash and I need it for school. The whole .txt file is saved on my server as file.txt.
Try this:
grep WER file.txt | awk '{print $2}' | uniq | sed -e 's/%//' > value.txt
Note that this will overwrite value.txt each time you run the command.
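If you instead want to collect values across runs, appending with >> is the usual approach:
grep WER file.txt | awk '{print $2}' | uniq | sed -e 's/%//' >> value.txt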
You want grep "WER:" value.txt | cut -???
I have ??? because I do not know the structure of the file. Tab delimited? Fixed Width?
Do man cut an you can get the arguments you need.
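For example, if the fields are simply space-separated, as they appear in the sample line, one hedged possibility (not tested against your actual file) is:
grep "WER:" file.txt | cut -d' ' -f2 | tr -d '%' > value.txt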
There are many ways and tools to do the task:
sed
tac file.txt | sed -n '/^WER: /{s///;s/%.*//;p;q}' > value.txt
awk
tac file.txt | awk -F'[ %]' '/^WER:/{print $2;exit}' > value.txt
bash
while read a b c
do
    if [ "$a" = "WER:" ]
    then
        b=${b%\%*}
        echo "${b#* }"
        break
    fi
done < <(tac file.txt) > value.txt
If the format is as you said, then this also works:
awk -F'[: %]' '/^WER/{print $3}' file.txt > value.txt
Explanation
-F specifies the field separator as one of [: %]
/<PATTERN>/ {<ACTION>} refers to: if a line matches some PATTERN, then do some ACTION
In my case,
the PATTERN is: the line starts with (^) the string WER
the ACTION is: print field $3 (as split by the -F field separators)
> sends the output to value.txt

Replacing strings in a configuration file with shell scripting

I have a configuration file with fields separated by semicolons ;. Something like:
user@raspberrypi /home/pi $ cat file
string11;string12;string13;
string21;string22;string23;
string31;string32;string33;
I can get the strings I need with awk:
user@raspberrypi /home/pi $ cat file | grep 21 | awk -F ";" '{print $2}'
string22
And I'd like to change string22 to hello_world via a script.
Any idea how to do it? I think it should be with sed but I have no idea how.
I prefer perl over sed. Here is a one-liner that modifies the file in place.
perl -i -F';' -lane '
BEGIN { $" = q|;| }
if ( m/21/ ) { $F[1] = q|hello_world| };
print qq|@F|
' infile
Use -i.bak instead of -i to create a backup file with .bak as suffix.
It yields:
string11;string12;string13
string21;hello_world;string23
string31;string32;string33
First, drop the useless use of cat and grep, so:
$ cat file | grep 21 | awk -F';' '{print $2}'
Becomes:
$ awk -F';' '/21/{print $2}' file
To change this value you would do:
$ awk '/21/{$2="hello_world"}1' FS=';' OFS=';' file
To store the changes back to the file:
$ awk '/21/{$2="hello_world"}1' FS=';' OFS=';' file > tmp && mv tmp file
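If you have GNU awk 4.1 or newer, its inplace extension is a hedged alternative to the temp-file step (this assumes gawk specifically, not a POSIX awk):
$ gawk -i inplace -F';' -v OFS=';' '/21/{$2="hello_world"}1' file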
However if all you want to do is replace string22 with hello_world I would suggest using sed instead:
$ sed 's/string22;/hello_world;/g' file
With sed you can use the -i option to store the changes back to the file:
$ sed -i 's/string22;/hello_world;/g' file
Even though we can do this easily in awk as Sudo suggested, I prefer perl since it does in-place replacement.
perl -pe 's/(^[^\;]*;)[^\;]*(;.*)/$1hello_world$2/g if(/21/)' your_file
For in-place editing just add an i:
perl -pi -e 's/(^[^\;]*;)[^\;]*(;.*)/$1hello_world$2/g if(/21/)' your_file
Tested below:
> perl -pe 's/(^[^\;]*;)[^\;]*(;.*)/$1"hello_world"$2/g if(/21/)' temp
string11;string12;string13;
string21;"hello_world";string23;
string31;string32;string33;
> perl -pe 's/(^[^\;]*;)[^\;]*(;.*)/$1hello_world$2/g if(/21/)' temp
string11;string12;string13;
string21;hello_world;string23;
string31;string32;string33;
>
