How to convert multiple-parameter URLs into single-parameter URLs in bash

$ cat urls.txt
http://example.com/test/test/test?apple=&bat=&cat=&dog=
https://test.com/test/test/test?aa=&bb=&cc=
http://target.com/test/test?hmm=
I want output like the following. How can I do that in bash (as a single-line command)?
$ cat urls.txt
http://example.com/test/test/test?apple=
http://example.com/test/test/test?bat=
http://example.com/test/test/test?cat=
http://example.com/test/test/test?dog=
https://test.com/test/test/test?aa=
https://test.com/test/test/test?bb=
https://test.com/test/test/test?cc=
http://target.com/test/test?hmm=

With GNU awk:
$ awk -F'?|=&|=' '{for(i=2;i<NF;i++) print $1 "?" $i "="}' urls.txt
http://example.com/test/test/test?apple=
http://example.com/test/test/test?bat=
http://example.com/test/test/test?cat=
http://example.com/test/test/test?dog=
https://test.com/test/test/test?aa=
https://test.com/test/test/test?bb=
https://test.com/test/test/test?cc=
http://target.com/test/test?hmm=

I tried sed, but it gets complicated. With perl, something like this:
perl -pe 'if(/(.*\?)/){$url=$1;s#&#\n$url#g;}' urls.txt
works well.

With GNU awk using gensub():
awk '{print gensub(/^(https?:)(.*)(\?[[:alpha:]]+=)(.*)/,"\\1\\2\\3","g")}' file
http://example.com/test/test/test?apple=
https://test.com/test/test/test?aa=
http://target.com/test/test?hmm=
gensub() lets the replacement text refer to components of the regexp, which are marked with parentheses (four here). Only the first three are printed ("\\1\\2\\3"), so everything after the first parameter is dropped and each URL yields a single line.

This might work for you (GNU sed):
sed -E 's/(([^?]+\?)[^=]+=)&/\1\n\2/;P;D' file
Replace the first & with a newline followed by the URL prefix up to and including the ?, print and delete the first line of the pattern space, then repeat on the remainder until no & is left.
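For comparison, the same split can be sketched with shell parameter expansion alone, without awk, sed, or perl. The function name split_params is hypothetical, and the sketch assumes each URL has at most one ? and that parameters are separated by &.

```shell
# split_params: read URLs on stdin, print one line per query parameter.
# Hypothetical helper; assumes "&"-separated parameters after the first "?".
split_params() {
    while IFS= read -r url || [ -n "$url" ]; do
        case $url in
            *\?*) ;;                               # has a query string: keep going
            *) printf '%s\n' "$url"; continue ;;   # no "?": print unchanged
        esac
        base=${url%%\?*}      # everything before the first "?"
        query=${url#*\?}      # everything after the first "?"
        # Peel parameters off the front of the query one at a time.
        while [ -n "$query" ]; do
            param=${query%%\&*}                    # text up to the next "&"
            printf '%s?%s\n' "$base" "$param"
            case $query in
                *\&*) query=${query#*\&} ;;        # drop the parameter just printed
                *) query= ;;                       # that was the last one
            esac
        done
    done
}

printf '%s\n' 'http://example.com/test/test/test?apple=&bat=' | split_params
```

It would be called as split_params < urls.txt. A loop like this is much slower than awk on large files, so it is only worth considering when no external tools are wanted.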

Related

How to convert parts of a line to uppercase in a file

I have a file file.txt and it has the lines below. I want the queuename to be converted to uppercase, like this: queuename=SP00245B
# Queue name
#
queuename=sp00245b
awk '$1 == "queuename" {$2 = toupper($2)}1' FS== OFS== input-file
Note that this fails if the line contains more than one =: only the text between the first two = signs is uppercased, and whatever follows is left as is. If that's an issue, it's an easy fix (left as an exercise for the reader).
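One possible take on that exercise, as a sketch rather than the answerer's intended solution: locate the first = with index() and uppercase everything after it, so further = signs in the value do not matter. The wrapper name upper_queuename is hypothetical.

```shell
# upper_queuename [FILE...]: uppercase everything after the first "=" on
# lines whose key is exactly "queuename"; other lines pass through unchanged.
# Hypothetical wrapper; reads stdin when no file is given.
upper_queuename() {
    awk '{
        eq = index($0, "=")                      # position of the first "="
        if (eq && substr($0, 1, eq - 1) == "queuename")
            $0 = substr($0, 1, eq) toupper(substr($0, eq + 1))
        print
    }' "$@"
}

printf '%s\n' '# Queue name' 'queuename=sp00245b=x' | upper_queuename
```

Unlike the FS== OFS== version, this does not split the line into fields at all, so the rest of the line is reproduced exactly as it was.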
A simple Perl solution:
perl -i -pe 's/^\s*queuename=\K(.*)/\U$1/' file.txt
(Remove -i if you don't want to modify the file in place.)
With GNU sed:
sed -i 's/\(^[[:blank:]]*queuename=\)\(.*\)/\1\U\2/' file.txt
This uses two capture groups and the \U sequence to toggle uppercase substitution for the second group.
You can also use the GNU sed \U sequence to uppercase part of the text matched by the substitution command. To convert everything following the '=' sign you could use, e.g.
sed '/^queuename=/s/=.*$/\U&/' filename
To edit the file in-place, include the -i option, e.g.
sed -i '/^queuename=/s/=.*$/\U&/' filename
Example Use/Output
$ echo "queuename=sp00245b" | sed '/^queuename=/s/=.*$/\U&/'
queuename=SP00245B

Adding double quotes to beginning, end and around commas in bash variable

I have a shell script that accepts a parameter that is comma delimited,
-s 1234,1244,1567
That is passed to a curl PUT json field. Json needs the values in a "1234","1244","1567" format.
Currently, I am passing the parameter with the quotes already in it:
-s "\"1234\",\"1244\",\"1567\"", which works, but the users are complaining that it's too much typing and hard to do. So I'd like to just take a comma-delimited list like the one at the top and programmatically stick the quotes in.
Basically, I want a parameter to be passed in as 1234,2345 and end up as a variable that is "1234","2345"
I've come to read that easiest approach here is to use sed, but I'm really not familiar with it and all of my efforts are failing.
You can do this in BASH:
$> arg='1234,1244,1567'
$> echo "\"${arg//,/\",\"}\""
"1234","1244","1567"
awk to the rescue!
$ awk -F, -v OFS='","' -v q='"' '{$1=$1; print q $0 q}' <<< "1234,1244,1567"
"1234","1244","1567"
or shorter with sed
$ sed -r 's/[^,]+/"&"/g' <<< "1234,1244,1567"
"1234","1244","1567"
translating this back to awk
$ awk '{print gensub(/([^,]+)/,"\"\\1\"","g")}' <<< "1234,1244,1567"
"1234","1244","1567"
You can use this:
QV=$(echo 1234,2345,56788 | sed -e 's/^/"/' -e 's/$/"/' -e 's/,/","/g')
Result:
echo "$QV"
"1234","2345","56788"
This adds a double quote at the start and at the end, and replaces every comma with quote/comma/quote.
easy to do with sed
$ echo '1234,1244,1567' | sed 's/[0-9]*/"\0"/g'
"1234","1244","1567"
[0-9]* matches zero or more consecutive digits; since * is greedy, it matches as many as possible
"\0" double quote the matched pattern, entire match is by default saved in \0
g global flag, to replace all such patterns
In case, \0 isn't recognized in some sed versions, use & instead:
$ echo '1234,1244,1567' | sed 's/[0-9]*/"&"/g'
"1234","1244","1567"
Similar solution with perl
$ echo '1234,1244,1567' | perl -pe 's/\d+/"$&"/g'
"1234","1244","1567"
Note: Using * instead of + with perl will give
$ echo '1234,1244,1567' | perl -pe 's/\d*/"$&"/g'
"1234""","1244""","1567"""
""$
I think this difference between sed and perl is similar to this question: GNU sed, ^ and $ with | when first/last character matches
Using sed:
$ echo 1234,1244,1567 | sed 's/\([0-9]\+\)/\"\1\"/g'
"1234","1244","1567"
i.e. replace every run of digits with the same run of digits in quotes, using a backreference (\1).
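Inside the script from the question, any of these can be wrapped in a small function. The sketch below uses a portable spelling of the sed approach: -r and the + quantifier are extensions, so [^,]+ is written as [^,][^,]* in basic regular expression syntax. The names quote_csv and ids are hypothetical.

```shell
# quote_csv LIST: print the comma-separated LIST with each item in double quotes.
# Hypothetical helper; prints nothing for an empty argument.
quote_csv() {
    [ -n "$1" ] || return 0
    # [^,][^,]* is the basic-regex spelling of "one or more non-comma characters".
    printf '%s\n' "$1" | sed 's/[^,][^,]*/"&"/g'
}

ids=$(quote_csv '1234,1244,1567')   # in the real script this would be the -s value
printf '%s\n' "$ids"
```

Empty items are left empty (1,,2 becomes "1",,"2"), so input validation, if needed, would have to happen separately.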

How to extract lines after finding a specific string

My example text is,
AA BB CC
DDD
process.get('name1')
process.get('name2')
process.get('name3')
process.get('name4')
process.get('name5')
process.get('name6')
EEE
FFF
...
I want to search the string "process.get('name1')" first, if found then extract the lines from "process.get('name1')" to "process.get('name6')".
How do I extract the lines using sed?
This should work and... it uses sed as per OP request:
$ sed -n "/^process\.get('name1')$/,/^process\.get('name6')$/p" file
sed is best suited to simple substitutions on individual lines; for anything more involved you should be using awk:
$ awk -v beg="process.get('name1')" -v end="process.get('name6')" \
'index($0,beg){f=1} f; index($0,end){f=0}' file
process.get('name1')
process.get('name2')
process.get('name3')
process.get('name4')
process.get('name5')
process.get('name6')
Note that you could use a range in awk, just like you are forced to in sed:
awk -v beg="process.get('name1')" -v end="process.get('name6')" \
'index($0,beg),index($0,end)' file
and you could use regexps after escaping metachars in awk, just like you are forced to in sed:
awk "/process\.get\('name1'\)/,/process\.get\('name6'\)/" file
but the first awk version above using strings instead of regexps and a flag variable is simpler (in as much as you don't have to figure out which chars are/aren't RE metacharacters), more robust and more easily extensible in future.
It's important to note that sed CANNOT operate on strings, just regexps, so when you say "I want to search for a string" you should stop trying to force sed to behave as if it can do that.
Imagine your search strings are passed in to a script as positional parameters $1 and $2. With awk you'd just init the awk variables from them in the expected way:
awk -v beg="$1" -v end="$2" 'index($0,beg){f=1} f; index($0,end){f=0}' file
whereas with sed you'd have to do something like:
beg=$(sed 's/[^^]/[&]/g; s/\^/\\^/g' <<< "$1")
end=$(sed 's/[^^]/[&]/g; s/\^/\\^/g' <<< "$2")
sed -n "/^${beg}$/,/^${end}$/p" file
to deactivate any metacharacters present. See Is it possible to escape regex metacharacters reliably with sed for details on escaping RE metachars for sed.
Finally - as mentioned above you COULD use a range expression with strings in awk:
awk -v beg="$1" -v end="$2" 'index($0,beg),index($0,end)' file
but I personally have never found that useful; some slight change in requirements always comes along and makes me wish I'd started out using a flag. See Is a /start/,/end/ range expression ever useful in awk? for details on that.
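The flag-based command can be packaged as a shell function that takes the two strings as arguments. This is only a sketch of that idea: the name extract_between is hypothetical, and the demo input is a shortened version of the sample text.

```shell
# extract_between BEG END [FILE...]: print each block of lines that starts at a
# line containing the literal string BEG and ends at a line containing END.
# Hypothetical wrapper around the flag-based awk command.
# Caveat: awk -v interprets backslash escapes in the values it is given.
extract_between() {
    beg=$1 end=$2
    shift 2
    awk -v beg="$beg" -v end="$end" '
        index($0, beg) { f = 1 }    # start printing at the begin string
        f                           # print the line while the flag is set
        index($0, end) { f = 0 }    # stop after the end string
    ' "$@"
}

printf '%s\n' 'DDD' "process.get('name1')" "process.get('name2')" \
    "process.get('name6')" 'EEE' |
    extract_between "process.get('name1')" "process.get('name6')"
```

Because index() does a plain substring search, characters such as . ( and ) in the arguments need no escaping.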

insert a blank line between every two lines in a file using shell, sed or awk

I have a file with many lines. I want to insert a blank line between each two lines
for example
original file
xfdljflsad
fjdiaopqqq
dioapfdja;
I want to make it as:
xfdljflsad

fjdiaopqqq

dioapfdja;
How can I achieve this?
I would like to use a shell script, awk, or sed for this.
Thanks!
With sed, use
sed G input-file
If pilcrow is correct and you do not want an additional newline at the end of the file,
then do:
sed '$!G' input-file
Another alternative is to use pr:
pr -dt input-file
awk '{print nl $0; nl="\n"}' file
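My approach when I want to quickly apply a regex to a file is vim:
vim file.txt
:%s/\n/\r\r/
In the replacement part of a vim substitution, \r inserts a line break, whereas \n would insert a NUL character.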
Idiomatic awk:
awk 1 ORS='\n\n' file
Similar thing with perl:
perl -nE 'say' file
Append | head -n -1 if final newline is unwanted.
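Since the question also asks about a plain shell script, here is a minimal sketch using only a read loop. It prints the separator before every line except the first, so no trailing blank line is produced and no head -n -1 (a GNU extension) is needed. The function name blank_between is hypothetical.

```shell
# blank_between: copy stdin to stdout with one empty line between
# consecutive input lines and none after the last line.
# Hypothetical helper; also handles a final line without a trailing newline.
blank_between() {
    first=1
    while IFS= read -r line || [ -n "$line" ]; do
        [ -n "$first" ] || printf '\n'   # separator before every line but the first
        first=
        printf '%s\n' "$line"
    done
}

printf '%s\n' xfdljflsad fjdiaopqqq 'dioapfdja;' | blank_between
```

It would be called as blank_between < input-file. For large files the sed and awk versions will be considerably faster.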

How to grep -o without the -o

I've got BusyBox v1.01 providing my commands. Hence, -o is not included in the grep. How can I get grep -o behavior without the ... -o?
awk solution:
awk '/PATTERN/{match($0,/PATTERN/);print substr($0,RSTART,RLENGTH)}' inputFile
If you have sed you can use simple regex. (see linuxquestions.org)
sed -n 's/.*\(PATTERN\).*/\1/p' FILE
So to find only the text StackOverflow in a file file.txt you'd write
sed -n 's/.*\(StackOverflow\).*/\1/p' file.txt
Remember that the pattern in the sed command is a regular expression, so if your pattern contains any regular expression metacharacters, they need to be escaped.
You could use Perl instead:
perl -lne 'print $1 while /(pattern)/g' FILE
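grep -o prints every match on a line, each on its own output line. A sketch of that behaviour in awk loops over match() and cuts the line after each hit. The function name grep_o and the environment variable GREP_O_RE are hypothetical, the pattern is treated as an awk (extended) regular expression, and whether an awk as old as the BusyBox build in the question supports all of this is an assumption that would need checking.

```shell
# grep_o REGEX [FILE...]: print every match of REGEX, one per output line.
# Hypothetical helper. The pattern is passed through the environment rather
# than -v so that backslashes in it reach awk unchanged.
grep_o() {
    re=$1
    shift
    GREP_O_RE=$re awk '
        BEGIN { re = ENVIRON["GREP_O_RE"] }
        {
            s = $0
            while (match(s, re)) {
                if (RLENGTH == 0) break             # stop on an empty match
                print substr(s, RSTART, RLENGTH)
                s = substr(s, RSTART + RLENGTH)     # continue after this match
            }
        }
    ' "$@"
}

printf '%s\n' 'foo123bar45' 'none' | grep_o '[0-9]+'
```

One limitation of cutting the line this way: a pattern anchored with ^ can match again at the start of the remainder, which grep -o would not do.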
