How to grep -o without the -o - bash

I've got BusyBox v1.01 providing my commands, so its grep does not include -o. How can I get grep -o behavior without the ... -o?

awk solution:
awk '/PATTERN/{match($0,/PATTERN/);print substr($0,RSTART,RLENGTH)}' inputFile
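The one-liner above prints only the first match on each line, whereas grep -o prints every match. A minimal sketch that loops over all matches, assuming your awk provides match(), RSTART and RLENGTH (the function name grep_o, the pattern [0-9]+ and the sample input are made up for illustration):

```shell
# Emulate `grep -o` with awk: print every match, one per output line.
# grep_o is a hypothetical helper name; it reads from standard input.
grep_o() {
    awk -v re="$1" '{
        line = $0
        while (match(line, re)) {
            if (RLENGTH == 0) break              # stop on empty matches to avoid an endless loop
            print substr(line, RSTART, RLENGTH)  # the matched text itself
            line = substr(line, RSTART + RLENGTH) # continue after the match
        }
    }'
}

out=$(printf 'a1b22c333\nno digits\nx4\n' | grep_o '[0-9]+')
printf '%s\n' "$out"
```

Because the line is shortened after each match, patterns anchored with ^ may behave differently from grep -o here.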

If you have sed you can use simple regex. (see linuxquestions.org)
sed -n 's/.*\(PATTERN\).*/\1/p' FILE
So to find only the text StackOverflow in a file file.txt you'd write
sed -n 's/.*\(StackOverflow\).*/\1/p' file.txt
Remember that the pattern in the sed command is a regular expression, so any regex metacharacters in your pattern need to be escaped. Also note that, because the leading .* is greedy, this prints only the last match on each line.
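To illustrate the escaping point, here is a small sketch with made-up input (the v=1.5 lines are an assumption, not from the question). An unescaped dot matches any character, while an escaped one matches only a literal dot:

```shell
# Unescaped: "." matches any character, so "1x5" is extracted as well.
unescaped=$(printf 'v=1.5\nv=1x5\n' | sed -n 's/.*\(1.5\).*/\1/p')

# Escaped: "\." matches only a literal dot.
escaped=$(printf 'v=1.5\nv=1x5\n' | sed -n 's/.*\(1\.5\).*/\1/p')

printf 'unescaped:\n%s\nescaped:\n%s\n' "$unescaped" "$escaped"
```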

You could use Perl instead:
perl -lne 'print $1 while /(pattern)/g' FILE


How to convert multiple parameters URLs into single parameter URLs in bash

$ cat urls.txt
http://example.com/test/test/test?apple=&bat=&cat=&dog=
https://test.com/test/test/test?aa=&bb=&cc=
http://target.com/test/test?hmm=
I want output like the following. How can I do that in bash with a single-line command?
$ cat urls.txt
http://example.com/test/test/test?apple=
http://example.com/test/test/test?bat=
http://example.com/test/test/test?cat=
http://example.com/test/test/test?dog=
https://test.com/test/test/test?aa=
https://test.com/test/test/test?bb=
https://test.com/test/test/test?cc=
http://target.com/test/test?hmm=
With GNU awk:
$ awk -F'?|=&|=' '{for(i=2;i<NF;i++) print $1 "?" $i "="}' urls.txt
http://example.com/test/test/test?apple=
http://example.com/test/test/test?bat=
http://example.com/test/test/test?cat=
http://example.com/test/test/test?dog=
https://test.com/test/test/test?aa=
https://test.com/test/test/test?bb=
https://test.com/test/test/test?cc=
http://target.com/test/test?hmm=
I tried to use sed, but it got complex. With Perl, the following works well:
perl -pe 'if(/(.*\?)/){$url=$1;s#&#\n$url#g;}' urls.txt
With GNU awk using gensub():
awk '{print gensub(/^(https?:)(.*)(\?[[:alpha:]]+=)(.*)/,"\\1\\2\\3","g")}' file
http://example.com/test/test/test?apple=
https://test.com/test/test/test?aa=
http://target.com/test/test?hmm=
gensub() lets the replacement text refer to components of the regexp, which are marked with parentheses (four here). We print only three of them, "\\1\\2\\3", so only the first parameter of each URL is kept.
This might work for you (GNU sed):
sed -E 's/(([^?]+\?)[^=]+=)&/\1\n\2/;P;D' file
Replace each & by a newline and the substring before the first parameter, print/delete the first line and repeat.
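If GNU awk or GNU sed is not available, the same splitting can probably be done with plain awk string functions. This is only a sketch, assuming each URL contains at most one ? and that parameters are separated by & (split_params is a hypothetical helper name):

```shell
# Print one line per query parameter, repeating the part of the URL before it.
split_params() {
    awk '{
        q = index($0, "?")
        if (q == 0) { print; next }            # no query string: pass the line through
        base = substr($0, 1, q)                # everything up to and including "?"
        n = split(substr($0, q + 1), params, "&")
        for (i = 1; i <= n; i++) print base params[i]
    }'
}

out=$(printf '%s\n' \
    'http://example.com/test/test/test?apple=&bat=' \
    'http://target.com/test/test?hmm=' | split_params)
printf '%s\n' "$out"
```

In practice you would call it as split_params < urls.txt.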

Grep multiple strings from text file

Okay, so I have a text file containing multiple strings, for example:
Hello123
Halo123
Gracias
Thank you
...
I want grep to use these strings to find lines containing matching strings/keywords in other files within a directory.
An example of the text files being grepped:
123-example-Halo123
321-example-Gracias-com-no
321-example-match
So in this instance the output should be:
123-example-Halo123
321-example-Gracias-com-no
With GNU grep:
grep -f file1 file2
-f FILE: Obtain patterns from FILE, one per line.
Output:
123-example-Halo123
321-example-Gracias-com-no
You should probably look at the manpage for grep to get a better understanding of what options are supported by the grep utility. However, there are a number of ways to achieve what you're trying to accomplish. Here's one approach:
grep -e "Hello123" -e "Halo123" -e "Gracias" -e "Thank you" list_of_files_to_search
However, since your search strings are already in a separate file, you would probably want to use this approach:
grep -f patternFile list_of_files_to_search
I can think of two possible solutions for your question:
Use multiple regular expressions - a regular expression for each word you want to find, for example:
grep -e Hello123 -e Halo123 file_to_search.txt
Use a single regular expression with an "or" operator. Using Perl regular expressions, it will look like the following:
grep -P "Hello123|Halo123" file_to_search.txt
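The second option does not strictly need -P; alternation with | is already part of extended regular expressions, which grep enables with -E. A small sketch with made-up input lines:

```shell
# -E enables extended regular expressions, where | means "or".
out=$(printf '%s\n' '123-example-Halo123' '321-example-match' 'Hello123-x' \
    | grep -E 'Hello123|Halo123')
printf '%s\n' "$out"
```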
EDIT:
As you mentioned in your comment, you want to use a list of words to find from a file and search in a full directory.
You can manipulate the words-to-find file to look like -e flags concatenation:
cat words_to_find.txt | sed 's/^/-e "/;s/$/"/' | tr '\n' ' '
This will return something like -e "Hello123" -e "Halo123" -e "Gracias" -e "Thank you", which you can then pass to grep using xargs (this relies on GNU grep accepting options after the file names):
cat words_to_find.txt | sed 's/^/-e "/;s/$/"/' | tr '\n' ' ' | xargs grep dir_to_search/*
As you can see, the last command also searches in all of the files in the directory.
SECOND EDIT: as PesaThe mentioned, the following command does this in a much simpler and more elegant way:
grep -f words_to_find.txt dir_to_search/*
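Since the word list holds literal strings rather than regular expressions, adding -F (fixed strings) may be a safer variant, because characters such as . or * in a keyword are then not interpreted. A self-contained sketch that builds throwaway files in a temporary directory (the file names a.txt and b.txt are assumptions for the demo):

```shell
# Build a small sandbox: a word list and a directory with two files to search.
dir=$(mktemp -d)
printf '%s\n' 'Hello123' 'Halo123' 'Gracias' 'Thank you' > "$dir/words_to_find.txt"
mkdir "$dir/dir_to_search"
printf '%s\n' '123-example-Halo123' '321-example-match' > "$dir/dir_to_search/a.txt"
printf '%s\n' '321-example-Gracias-com-no' > "$dir/dir_to_search/b.txt"

# -f reads one pattern per line; -F treats each pattern as a literal string.
out=$(cd "$dir" && grep -F -f words_to_find.txt dir_to_search/*)
printf '%s\n' "$out"

rm -rf "$dir"
```

When several files are searched, grep prefixes each matching line with the file name.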

Read file until match pattern

I'm reading a file and would like to extract its content up to the point where a pattern matches.
The file is here: https://ufile.io/182kx
I would like to get the JSON that follows lastActiveTimes: up to ,chatNotif:0
The result should be:
{"707514313":1505610703,"1568212945":1505638160,"732898933":1505638352,"100009336847960":1505635266,"721251435":1505570865,"718844397":1505623246,"1461941075":1505501435,"100004389551456":1505637706,"1211838231":1505582601,"1040249145":1505636186,"1242203773":1505628782,"517814298":1505567030,"807572767":1505638353,"738307936":1505638009,"683874946":1505598251,"822469152":1505636589,"727476234":1505627000,"781209703":1505631577,"1058918804":1505629365,"539657070":1505629599,"1506662943":1505606109,"538279690":1505575467,"1122078957":1505633239,"1426504238":1505614371,"1760126206":1505637897,"100009494169236":1505633218,"100000193088625":1505633785,"628050112":1505599301,"692803720":1505602132,"100000982526361":1505611187,"1567918281":1505549275,"562061542":1505633121,"680188549":1505637979,"201400626":1505510516,"709905371":1505635235,"100000921265645":1505637511,"100002576634271":1505633420,"100001152648289":1505638358,"1580474418":1505583268,"1093906498":1505635647,"1568491642":1505613600,"1759941492":1505592915,"1021502749":1505621933,"100001091369712":1505593740,"1201111516":1505631603,"511729394":1505637150,"1228064980":1505627119,"1484357891":1505632720,"773982263":1505636776,"610763631":1505581711,"581839860":1505636663,"100001509228647":1505550106,"100001496847848":1505520708,"553024640":1505631903,"1657607627":1505460838,"100008134920032":1505636261,"518105631":1505610763,"100000167522595":1505559871,"604094302":1505591423,"831534764":1505498705,"716402163":1505625063,"100005862197805":1505615273,"779160397":1505625381,"683029723":1505602056,"1105801871":1505638150,"1007323327":1505618323,"500432034":1505617899,"1019441248":1505593648,"1321064988":1505549642,"600465009":1505557526,"734790522":1505614982,"1139898038":1505597330,"762749332":1505595541,"100006926654236":1505637009,"100007887856728":1505580453,"1073032118":1505602788,"575893114":1505630287,"1463373342":1505609305}
I tried sed with:
sed -n '/lastActiveTimes:/,/chatNotif/p' home.html | sed '1s/.*lastActiveTimes://; $s/chatNotif.*//' > end.json
But it did not work.
If you do not mind using Perl, you can try:
perl -lne 'print $& if /(?<=lastActiveTimes:).*?(?=,chatNotif)/g' home.txt
It prints everything between the two assertions lastActiveTimes: and ,chatNotif.
or
ack -o '(?<=lastActiveTimes:).*?(?=,chatNotif)' home.txt
With GNU grep and Perl regular expression (-P):
grep -Poz '(?<=lastActiveTimes:).*(\n.*)*(?=,chatNotif)' file
Output:
{"707514313":1505610703,"1568212945":1505639008,"732898933":1505641310,"100009336847960":1505641325,"721251435":1505570865,"718844397":1505623246,"1461941075":1505501435,"100004389551456":1505637706,"1211838231":1505582601,"1040249145":1505639741,"1242203773":1505628782,"517814298":1505567030,"807572767":1505638510,"738307936":1505641007,"683874946":1505598251,"822469152":1505636589,"727476234":1505627000,"781209703":1505631577,"1058918804":1505629365,"539657070":1505629599,"1506662943":1505606109,"538279690":1505640516,"1122078957":1505633239,"1426504238":1505614371,"1760126206":1505637897,"100009494169236":1505633218,"100000193088625":1505633785,"628050112":1505599301,"692803720":1505641333,"100000982526361":1505611187,"1567918281":1505549275,"562061542":1505641305,"680188549":1505637979,"201400626":1505510516,"709905371":1505635235,"100000921265645":1505637511,"100002576634271":1505633420,"100001152648289":1505640582,"1580474418":1505583268,"1093906498":1505635647,"1568491642":1505638670,"1759941492":1505592915,"1021502749":1505621933,"100001091369712":1505593740,"1201111516":1505631603,"511729394":1505637150,"1228064980":1505627119,"1484357891":1505632720,"773982263":1505641308,"610763631":1505581711,"581839860":1505641241,"100001509228647":1505550106,"100001496847848":1505520708,"553024640":1505631903,"1657607627":1505460838,"100008134920032":1505636261,"518105631":1505610763,"100000167522595":1505559871,"604094302":1505591423,"831534764":1505498705,"716402163":1505625063,"100005862197805":1505615273,"779160397":1505625381,"683029723":1505602056,"1105801871":1505641175,"1007323327":1505640781,"500432034":1505617899,"1019441248":1505593648,"1321064988":1505549642,"600465009":1505557526,"734790522":1505614982,"1139898038":1505597330,"762749332":1505595541,"100006926654236":1505637009,"100007887856728":1505580453,"1073032118":1505602788,"575893114":1505630287,"1463373342":1505640415}
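Without Perl or grep -P, a single sed substitution may be enough, assuming both markers sit on the same line (which is an assumption about the file; the shortened sample below is made up from the first two entries of the expected output):

```shell
# Shortened stand-in for the real line in home.html.
sample='x,lastActiveTimes:{"707514313":1505610703,"1568212945":1505638160},chatNotif:0,y'

# Keep only the text between "lastActiveTimes:" and ",chatNotif".
out=$(printf '%s\n' "$sample" \
    | sed -n 's/.*lastActiveTimes:\(.*\),chatNotif.*/\1/p')
printf '%s\n' "$out"
```

If the markers are on different lines, this will print nothing, and a multi-line approach such as the grep -Poz one is needed instead.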

How to grep with a pattern that includes a " quotation mark?

I want to grep lines that include a quotation mark ("), more specifically lines like:
#include "something.h"
and then pipe them into sed to return just something.h
A single grep will do this job.
grep -oP '(?<=")[^"]*(?=")' file
Example:
$ echo '#include "something.h"' | grep -oP '(?<=")[^"]*(?=")'
something.h
sed '#n
/"/ s/.*"\([^"]*\)" *$/\1/p' YourFile
There is no need for grep when using sed (unless performance on a huge file matters); sed can filter and rewrite the content directly.
In your case, the address /"/ would probably be replaced by /#include *"/.
If a line contains several quoted strings:
sed '#n
/"/ {s/"[^"]*$/"/;s/[^"]*"\([^"]*\)" */\1/gp;}' YourFile
You can use awk to get included filename:
awk -F'"' '{print $2}' file.c
something.h
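The awk one-liner prints the second "-separated field of every line, including lines that are not includes. A sketch that restricts it to #include "..." lines (the sample source text is made up for illustration):

```shell
# Made-up C source with a system include, two quoted includes and a string literal.
src='#include <stdio.h>
#include "something.h"
const char *s = "not a header";
  #include "other.h"'

# Only lines that look like: #include "name" (optional blanks allowed).
out=$(printf '%s\n' "$src" \
    | awk -F'"' '/^[ \t]*#[ \t]*include[ \t]*"/ { print $2 }')
printf '%s\n' "$out"
```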

Text Manipulation using sed or AWK

I get the following result in my script when I run it against my services. The result differs depending on the service but the text pattern showing below is similar. The result of my script is assigned to var1. I need to extract data from this variable
$var1=HOST1*prod*gem.dot*serviceList : svc1 HOST1*prod*kem.dot*serviceList : svc3, svc4 HOST1*prod*fen.dot*serviceList : svc5, svc6
I need to extract the service names from $var1. The end result should be printed on separate lines, as follows:
svc1
svc2
svc3
svc4
svc5
svc6
Can you please help with this?
Regards
Using sed and grep:
sed 's/[^ ]* :\|,\|//g' <<< "$var1" | grep -o '[^ ]*'
sed deletes every run of non-space characters that is followed by " :", as well as every comma. grep then outputs the remaining service names one per line.
Using gnu grep and gnu sed:
grep -oP ': *\K\w+(, \w+)?' <<< "$var1" | sed 's/, /\n/'
svc1
svc3
svc4
svc5
svc6
grep is the perfect tool for the job.
From man grep:
-o, --only-matching
Print only the matched (non-empty) parts of a matching line, with each such part on a separate output line.
Sounds perfect!
As far as I'm aware this will work on any grep:
echo "$var1" | grep -o 'svc[0-9]\+'
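This matches "svc" followed by one or more digits. Note that \+ in a basic regular expression is a GNU extension; svc[0-9][0-9]* is the portable spelling. You can also enable the "highly experimental" Perl regexp mode with -P, which means you can use the \d digit character class and don't have to escape the + any more: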
grep -Po 'svc\d+' <<<"$var1"
In bash you can use <<< (a Here String) which supplies "$var1" to grep on the standard input.
By the way, if your data was originally on separate lines, like:
HOST1*prod*gem.dot*serviceList : svc1
HOST1*prod*kem.dot*serviceList : svc3, svc4
HOST1*prod*fen.dot*serviceList : svc5, svc6
This would be a good job for awk:
awk -F': ' '{n=split($2,a,", "); for (i=1;i<=n;i++) print a[i]}'
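If grep -o is not available, the single-line form of the data can also be handled by turning separators into newlines and filtering. A minimal sketch, assuming every service name starts with svc:

```shell
var1='HOST1*prod*gem.dot*serviceList : svc1 HOST1*prod*kem.dot*serviceList : svc3, svc4 HOST1*prod*fen.dot*serviceList : svc5, svc6'

# Turn spaces and commas into newlines, then keep only tokens starting with "svc".
out=$(printf '%s\n' "$var1" | tr ' ,' '\n\n' | grep '^svc')
printf '%s\n' "$out"
```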
