Extracting multiple lines of data between two delimiters - bash

I have a log file containing multiple lines of data. I need to extract all the lines between the delimiters and save them to an output file.
input.log
Some data
<delim_begin>ABC<delim_end>
some data
<delim_begin>DEF<delim_end>
some data
The output.log file should look like
ABC
DEF
I tried this code, but it does not work; it prints all the content of input.log (without -n and the p flag, sed prints every input line, so lines without delimiters pass through unchanged):
sed 's/<delim_begin>\(.*\)<delim_end>/\1/g' input.log > output.log

Using awk, you can do it with a custom field separator:
awk -F '<(delim_begin|delim_end)>' 'NF>2{print $2}' file
ABC
DEF
Using grep -P (PCRE):
grep -oP '(?<=<delim_begin>).*(?=<delim_end>)' file
ABC
DEF

sed alternative
$ sed -nr 's/<delim_begin>(.*)<delim_end>/\1/p' file
ABC
DEF
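This relies on each delimited value sitting on its own line, as in the sample. If other text can surround the delimiters on the same line, a hedged variant (assuming at most one delimited value per line) also strips the surrounding text:
$ sed -nr 's/.*<delim_begin>(.*)<delim_end>.*/\1/p' file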

This should do it (without a guard it will also print an empty line for each input line that contains no delimiters):
awk -F '<(delim_begin|delim_end)>' '{print $2}' file

You can use this command:
grep "<delim_begin>.*<delim_end>" file | sed -e 's/<delim_begin>//' -e 's/<delim_end>//' > output.log

Related

Combine multiple text files (row-wise) into columns

I have multiple text files that I want to merge columnwise.
For example:
File 1
0.698501 -0.0747351 0.122993 -2.13516
File 2
-5.27203 -3.5916 -0.871368 1.53945
I want the output file to be like:
0.698501, -5.27203
-0.0747351, -3.5916
0.122993, -0.871368
-2.13516, 1.53945
Is there a one-line bash command that can accomplish this?
I'd appreciate any help.
---Lyndz
With awk:
awk '{if(NR==1) {n=split($0,a1," ")} else {split($0,a2," ")}} END{for(i=1;i<=n;i++) print a1[i] ", " a2[i]}' file1 file2
Output:
0.698501, -5.27203
-0.0747351, -3.5916
0.122993, -0.871368
-2.13516, 1.53945
paste <(cat file1 | sed -E 's/ +/&,\n/g') <(cat file2 | sed -E 's/ +/&\n/g') | column -s $',' -t | sed -E 's/\s+/, /g' | sed -E 's/, $//g'
It got a bit complicated, but I guess it can also be done in a simpler way.
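For instance, a rough sketch along the same lines, assuming each input file is a single space-separated row as in the example: transpose each file with tr and join the columns with paste:
paste -d, <(tr ' ' '\n' < file1) <(tr ' ' '\n' < file2) | sed 's/,/, /'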
P.S.: Please look up the man pages of each command to see what they do.

Match multiple patterns with grep and print only the matched patterns

I have a file that looks like
..<long-text>..."field1":"some-value"...<long-text>...."field2":"some-value"...
..<long-text>..."field1":"some-value"...<long-text>...."field2":"some-value"...
..<long-text>..."field1":"some-value"...<long-text>...."field2":"some-value"...
I want to extract field1 and field2 from each line of the file in bash. I want field1 and field2 to appear on the same line for each input line. So the output should look like:
"field1":"some-value" "field2":"some-value"
"field1":"some-value" "field2":"some-value"
"field1":"some-value" "field2":"some-value"
I wrote a grep expression like:
grep -E '"field1":"[a-z]*".*"field2":"[a-z]*"' -o
But because of the .* in between, it produces all the text between those two expressions. I also tried
grep -E '"field1":"[a-z]*"|"field2":"[a-z]*"' -o
But this outputs each field1 and each field2 on its own separate line.
How do I get the expected output?
You can use grep with awk to format the result:
grep -oE '"(field1|field2)":"[^"]*"' file | awk 'NR%2{p=$0; next} {print p, $0}'
"field1":"some-value" "field2":"some-value"
"field1":"some-value" "field2":"some-value"
"field1":"some-value" "field2":"some-value"
Use sed:
echo abcdef | sed 's/\(.\).*\(.\)/\1\2/'
# yields: af
For your situation:
sed 's/.*\("field1":"[a-z]*"\).*\("field2":"[a-z]*"\).*/\1 \2/' yourfile
If some lines don't match at all, then do your grep first, e.g.:
grep -Eo '"field1":"[a-z]*".*"field2":"[a-z]*"' yourfile |
sed 's/.*\("field1":"[a-z]*"\).*\("field2":"[a-z]*"\).*/\1 \2/'

Find unique words

Suppose there is a file.txt in which the text below is written:
ABC/xyz
ABC/xyz/rst
EFG/ghi
I need to write a shell script that extracts the unique words that appear before the first /.
So as output, I want ABC and EFG to be written to one file.
You can extract the first word with cut (slash as delimiter), then pipe to sort with the -u (for "unique") option:
$ cut -d '/' -f 1 file.txt | sort -u
ABC
EFG
To get the output into a file, just redirect by appending > filename to the command. (Or pipe to tee filename to see the output and get it in a file.)
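For example, the tee variant could look like this (output.txt is just an illustrative name):
$ cut -d '/' -f 1 file.txt | sort -u | tee output.txt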
Try this:
cat file.txt | tr -s "/" ' ' | awk -F " " '{print $1}' | sort | uniq > outfile.txt
Another interesting variation:
awk -F'/' '{print $1 | "sort -u"}' file.txt > outfile.txt
Not that it matters here, but being able to pipe and redirect within awk can be very handy.
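As a rough illustration of that point (the output file names here are made up), you can redirect one stream to a file and pipe another through an external command, all inside the awk program:
awk -F'/' '{print $0 > "raw_copy.txt"; print $1 | "sort -u > firsts.txt"}' file.txt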
Another easy way (note that uniq only removes adjacent duplicates, so this relies on identical first words being on consecutive lines; otherwise sort first):
cut -d"/" -f1 file.txt | uniq > out.txt
You can use a mix of cut and sort like so:
cut -d '/' -f 1 file.txt | sort -u > newfile.txt
The cut part grabs everything before the first slash /, sort -u then sorts the result and removes any duplicate strings you might have, and the redirect writes the output into newfile.txt.

How to get a word from a text file in BASH

I want to get only one word from this txt file: http://pastebin.com/jFDu0Le5. The word is in the last row: WER: 45.67% Correct: 65.87% Acc: 54.33%
I want to get only the value 45.67 and save it to the file value.txt. I want to create a BASH script to get this value. Can you give me an example of how to do it? I am new to Bash and I need it for school. The whole .txt file is saved on my server as file.txt.
Try this:
grep WER file.txt | awk '{print $2}' | uniq | sed -e 's/%//' > value.txt
Note that this will overwrite value.txt each time you run the command.
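If you instead want to collect values across runs, appending with >> is the usual approach:
grep WER file.txt | awk '{print $2}' | uniq | sed -e 's/%//' >> value.txt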
You want grep "WER:" value.txt | cut -???
I have ??? because I do not know the structure of the file. Tab delimited? Fixed Width?
Do man cut an you can get the arguments you need.
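For example, if the fields are simply space-separated, as they appear in the sample line, one hedged possibility (not tested against your actual file) is:
grep "WER:" file.txt | cut -d' ' -f2 | tr -d '%' > value.txt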
There are many ways and tools to do the task:
sed
tac file.txt | sed -n '/^WER: /{s///;s/%.*//;p;q}' > value.txt
awk
tac file.txt | awk -F'[ %]' '/^WER:/{print $2;exit}' > value.txt
bash
while read a b c
do
    if [ "$a" = "WER:" ]
    then
        b=${b%\%*}
        echo "${b#* }"
        break
    fi
done < <(tac file.txt) > value.txt
If the format is as you said, then this also works:
awk -F'[: %]' '/^WER/{print $3}' file.txt > value.txt
Explanation
-F specifies the field separator as one of [: %]
/<PATTERN>/ {<ACTION>} refers to: if a line matches some PATTERN, then do some ACTION
In my case,
the PATTERN is: the line starts with (^) the string WER
the ACTION is: print field $3 (as split by the -F field separators)
> sends the output to value.txt

Replacing strings in a configuration file with shell scripting

I have a configuration file with fields separated by semicolons ;. Something like:
user@raspberrypi /home/pi $ cat file
string11;string12;string13;
string21;string22;string23;
string31;string32;string33;
I can get the strings I need with awk:
user@raspberrypi /home/pi $ cat file | grep 21 | awk -F ";" '{print $2}'
string22
And I'd like to change string22 to hello_world via a script.
Any idea how to do it? I think it should be with sed but I have no idea how.
I prefer perl over sed. Here is a one-liner that modifies the file in place.
perl -i -F';' -lane '
BEGIN { $" = q|;| }
if ( m/21/ ) { $F[1] = q|hello_world| };
print qq|@F|
' infile
Use -i.bak instead of -i to create a backup file with .bak as suffix.
It yields:
string11;string12;string13
string21;hello_world;string23
string31;string32;string33
First, drop the useless use of cat and grep, so:
$ cat file | grep 21 | awk -F';' '{print $2}'
Becomes:
$ awk -F';' '/21/{print $2}' file
To change this value you would do:
$ awk '/21/{$2="hello_world"}1' FS=';' OFS=';' file
To store the changes back to the file:
$ awk '/21/{$2="hello_world"}1' FS=';' OFS=';' file > tmp && mv tmp file
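If you have GNU awk 4.1 or newer, its inplace extension is a hedged alternative to the temp-file step (this assumes gawk specifically, not a POSIX awk):
$ gawk -i inplace -F';' -v OFS=';' '/21/{$2="hello_world"}1' file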
However if all you want to do is replace string22 with hello_world I would suggest using sed instead:
$ sed 's/string22;/hello_world;/g' file
With sed you can use the -i option to store the changes back to the file:
$ sed -i 's/string22;/hello_world;/g' file
Even though we can do this easily in awk as Sudo suggested, I prefer perl since it does in-place replacement.
perl -pe 's/(^[^\;]*;)[^\;]*(;.*)/$1hello_world$2/g if(/21/)' your_file
For in-place editing just add an i:
perl -pi -e 's/(^[^\;]*;)[^\;]*(;.*)/$1hello_world$2/g if(/21/)' your_file
Tested below:
> perl -pe 's/(^[^\;]*;)[^\;]*(;.*)/$1"hello_world"$2/g if(/21/)' temp
string11;string12;string13;
string21;"hello_world";string23;
string31;string32;string33;
> perl -pe 's/(^[^\;]*;)[^\;]*(;.*)/$1hello_world$2/g if(/21/)' temp
string11;string12;string13;
string21;hello_world;string23;
string31;string32;string33;
>
