Can I use sed to manipulate a variable in bash? - bash

In my program, I would like to first get the user input and insert a \ before each /.
So I wrote this, but it doesn't work:
echo "input a website"
read website
sed '/\//i\/' $website

Try this:
website=$(sed 's|/|\\/|g' <<< "$website")
Bash actually supports this sort of replacement natively:
${parameter/pattern/string} — replace the first match of pattern with string.
${parameter//pattern/string} — replace all matches of pattern with string.
Therefore you can do:
website=${website////\\/}
Explanation:
website=${website // / / \\/}
                  ^  ^ ^ ^
                  |  | | |
                  |  | | string, '\' needs to be backslashed
                  |  | delimiter
                  |  pattern
                  replace globally
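To make the difference between the single-slash and double-slash forms concrete, here is a minimal sketch. The variable names and the sample value are made up for illustration, and a colon is used as the replacement so the result is easy to read.

```bash
path="usr/local/bin"

# Single slash after the name: only the first match of the pattern is replaced.
first_only=${path/\//:}

# Double slash after the name: every match of the pattern is replaced.
all_matches=${path//\//:}

printf '%s\n' "$first_only" "$all_matches"
```

The first line printed should be usr:local/bin and the second usr:local:bin.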

echo "$website" | sed 's/\//\\\//g'
or, for better readability:
echo "$website" | sed 's|/|\\/|g'

You can also use parameter expansion to replace substrings in a variable.
For example:
website="https://stackoverflow.com/a/58899829/658497"
echo "${website//\//\\/}"
https:\/\/stackoverflow.com\/a\/58899829\/658497
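The two approaches from the answers above can be wrapped in small functions and compared side by side. This is only a sketch: the function names escape_slashes and escape_slashes_sed and the example.com URL are hypothetical. Note that sed reads from standard input or from files named on its command line, so the value has to be fed in with a here-string or a pipe rather than passed as an argument.

```bash
# Hypothetical helper: escape every / using bash parameter expansion.
escape_slashes() {
  local input=$1
  # Pattern \/ is a literal slash; replacement \\/ is a backslash followed by a slash.
  printf '%s\n' "${input//\//\\/}"
}

# Hypothetical helper: same result with sed, feeding the value on stdin.
escape_slashes_sed() {
  sed 's|/|\\/|g' <<< "$1"
}

escape_slashes "https://example.com/a/b"
escape_slashes_sed "https://example.com/a/b"
```

Both calls should print the same escaped string. The parameter expansion version avoids starting an external process.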

Related

sed capture to get string between slashes

I have a filepath like this: /bing/foo/bar/bin and I want to extract only the string between bing/ and the next slash.
So /bing/foo/bar/bin should just produce "foo".
I tried the following:
echo "/bing/foo/bar/bin" | sed -r 's/.*bing\/(.*)\/.*/\1/'
but this produces "foo/bar" instead of "foo".
Try this command
echo "/bing/foo/bar/bin" | sed -r 's|.*bing/([^/]*)/.*|\1|'
Using | as the delimiter instead of / is a good fit here. As noted in "Delimiters in sed substitution",
sed can use any character as a delimiter: it uses whichever character follows the s.
or
echo "/bing/foo/bar/bin" | grep -oP "/bing/\K(\w+)"
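The difference between the question's attempt and the fix comes down to what the capture group is allowed to match. A small sketch, using -E (extended regex) in place of -r and made-up variable names:

```bash
path="/bing/foo/bar/bin"

# (.*) is greedy: it grabs as much as it can while still leaving one / for the rest.
greedy=$(sed -E 's|.*bing/(.*)/.*|\1|' <<< "$path")

# ([^/]*) cannot cross a slash, so the capture stops at the next /.
bounded=$(sed -E 's|.*bing/([^/]*)/.*|\1|' <<< "$path")

printf '%s\n' "$greedy" "$bounded"
```

This should print foo/bar for the greedy version and foo for the bounded one.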

Extract values from a property file using bash

I have a variable which contains key/values separated by space:
echo $PROPERTY
server_geo=BOS db.jdbc_url=jdbc\:mysql\://mysql-test.com\:3306/db02 db.name=db02 db.hostname=/mysql-test.com datasource.class.xa=com.mysql.jdbc.jdbc2.optional.MysqlXADataSource server_uid=BOS_mysql57 hibernate33.dialect=org.hibernate.dialect.MySQL5InnoDBDialect hibernate.connection.username=db02 server_labels=mysql57,mysql5,mysql db.jdbc_class=com.mysql.jdbc.Driver db.schema=db02 hibernate.connection.driver_class=com.mysql.jdbc.Driver uuid=a19ua19 db.primary_label=mysql57 db.port=3306 server_label_primary=mysql57 hibernate.dialect=org.hibernate.dialect.MySQL5InnoDBDialect
I'd need to extract the values of the single keys, for example db.jdbc_url.
Using one code snippet I've found:
echo $PROPERTY | sed -e 's/ db.jdbc_url=\(\S*\).*/\1/g'
but that returns also other properties found before my key.
Any help how to fix it ?
Thanks
If db.name always follows db.jdbc_url, use grep with lookarounds:
$ echo "${PROPERTY}" | grep -oP '(?<=db\.jdbc_url=).*(?= db\.name)'
jdbc\:mysql\://mysql-test.com\:3306/db02
or add the VAR to an array,
$ myarr=($(echo $PROPERTY))
$ echo "${myarr[1]}" | grep -oP '(?<=db.jdbc_url=).*(?=$)'
jdbc\:mysql\://mysql-test.com\:3306/db02
This happens because the substitute command (sed s/.../.../) only replaces the text that the regex matches, so anything before the match is kept as is. Adding .* before db\.jdbc_url, together with the start (^) and end ($) anchors, makes the regex match the whole content of the variable.
To be totally safe, your regex should be:
sed -e 's/^.*db\.jdbc_url=\(\S*\).*$/\1/g'
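A short sketch of the effect described above, on a shortened, made-up property string. As an assumption for portability, [^ ]* stands in for the GNU-specific \S* used in the thread.

```bash
props='server_geo=BOS db.jdbc_url=jdbc:mysql db.name=db02'

# Unanchored: only the matched part is replaced, so "server_geo=BOS" survives.
unanchored=$(sed -e 's/ db\.jdbc_url=\([^ ]*\).*/\1/' <<< "$props")

# With ^.* in front, the whole line is matched and replaced by the capture.
anchored=$(sed -e 's/^.*db\.jdbc_url=\([^ ]*\).*$/\1/' <<< "$props")

printf '%s\n' "$unanchored" "$anchored"
```

The unanchored form should print server_geo=BOSjdbc:mysql, while the anchored form prints just jdbc:mysql.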
You can use grep for this, like so:
echo $PROPERTY | grep -oE "db.jdbc_url=\S+" | cut -d'=' -f2
The regex is very close to the one you used with sed.
The -o option is used to print the matched parts of the matching line.
Edit: if you want only the value, cut on the '='
Edit 2: egrep reports that it is deprecated, so use grep -oE instead; the result is the same. Just to cover all bases :-)
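As an alternative to the regex-based answers, the lookup can also be done in plain bash by splitting the string on whitespace and comparing key prefixes. This is a sketch of a different technique from the ones in the thread; the function name get_prop and the shortened sample string are hypothetical.

```bash
# Hypothetical helper: print the value for KEY from a space-separated key=value string.
get_prop() {
  local key=$1 props=$2 pair
  local -a pairs
  # -r keeps the backslashes in values such as jdbc\:mysql intact.
  read -r -a pairs <<< "$props"
  for pair in "${pairs[@]}"; do
    if [[ $pair == "$key="* ]]; then
      # Strip everything up to and including the first '='.
      printf '%s\n' "${pair#*=}"
      return 0
    fi
  done
  return 1
}

PROPERTY='server_geo=BOS db.jdbc_url=jdbc\:mysql\://mysql-test.com\:3306/db02 db.name=db02'
get_prop db.jdbc_url "$PROPERTY"
```

Because the key is compared literally, the dots in db.jdbc_url need no escaping, and the function returns a non-zero status when the key is absent.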

How to assign command output to a variable

I want to assign the output of the following command to a variable in shell:
${arr2[0]} | rev | cut -c 9- | rev
For example:
mod=${arr2[0]} | rev | cut -c 9- | rev
echo $mod
The above method is not working: the output is blank.
I also tried:
mod=( "${arr2[0]}" | rev | cut -c 9- | rev )
But I get the error:
34: syntax error near unexpected token `|'
line 34: ` mod=( "${arr2[0]}" | rev | cut -c 9- | rev ) '
To add an explanation to your correct answer:
You had to combine your variable assignment with a command substitution (var=$(...)) to capture the (stdout) output of your command in a variable.
By contrast, your original command used just var=(...) - no $ before the ( - which is used to create arrays[1], with each token inside ( ... ) becoming its own array element - which was clearly not your intent.
As for why your original command broke:
The tokens inside (...) are subject to the usual shell expansions and therefore the usual quoting requirements.
Thus, in order to use $ and the so-called shell metacharacters (| & ; ( ) < > space tab) as literals in your array elements, you must quote them, e.g., by prepending \.
All these characters - except $, space, and tab - cause a syntax error when left unquoted, which is what happened in your case (you had unquoted | chars.)
[1] In bash, and also in ksh and zsh. The POSIX shell spec. doesn't support arrays at all, so this syntax will always break in POSIX-features-only shells.
mod=$(echo "${arr2[0]}" | rev | cut -c 9- | rev )
echo "****:$mod"
or
mod=`echo "${arr2[0]}" | rev | cut -c 9- | rev`
echo "****:$mod"
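Here is a self-contained sketch of the working form. The array contents are made up for illustration. Since rev | cut -c 9- | rev simply drops the last eight characters, the sketch also shows a parameter expansion that should give the same result without a pipeline.

```bash
# Hypothetical sample data: the last eight characters are ".log.bak".
arr2=("server01.log.bak")

# Command substitution captures the pipeline's stdout in the variable.
mod=$(printf '%s\n' "${arr2[0]}" | rev | cut -c 9- | rev)

# Alternative: remove a suffix of exactly eight characters (one ? per character).
mod_expansion=${arr2[0]%????????}

printf '%s\n' "$mod" "$mod_expansion"
```

Both variables should end up holding server01.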

Get string between strings in bash

I want to get the string between <sometag param=' and '>
I tried to use the method from Get any string between 2 string and assign a variable in bash to get the "x":
echo "<sometag param='x'><irrelevant stuff='nonsense'>" | tr "'" _ | sed -n 's/.*<sometag param=_\(.*\)_>.*/\1/p'
The problem (apart from low efficiency because I just cannot manage to escape the apostrophe correctly for sed) is that sed matches the maximum, i.e. the output is:
x_><irrelevant stuff=_nonsense
but the correct output would be the minimum-match, in this example just "x"
Thanks for your help
You are probably looking for something like this:
sed -n "s/.*<sometag param='\([^']*\)'>.*/\1/p"
Test:
echo "<sometag param='x'><irrelevant stuff='nonsense'>" | sed -n "s/.*<sometag param='\([^']*\)'>.*/\1/p"
Results:
x
Explanation:
Instead of a greedy .* capture, use a negated character class such as [^']*, which matches any character except ' any number of times, so the capture cannot run past the closing quote. To anchor the end of the pattern, it is followed by '>.
You can also use double quotes so that you don't need to escape the single quotes. If you wanted to escape the single quotes, you'd do this:
... | sed -n 's/.*<sometag param='\''\([^'\'']*\)'\''>.*/\1/p'
Notice that the single quotes aren't really escaped. The single-quoted sed expression is closed, an escaped single quote is inserted, and the expression is re-opened. Think of '\'' as a four-character escape sequence.
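The quoting trick can be checked in isolation. In this small sketch (variable names are made up), both assignments should produce the same string:

```bash
# Single-quoted: close the string, add an escaped quote, reopen the string.
single_quoted='param='\''x'\'''

# Double-quoted: the single quotes are ordinary characters.
double_quoted="param='x'"

printf '%s\n' "$single_quoted" "$double_quoted"
```

Both lines print param='x'.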
Personally, I'd use GNU grep. It would make for a slightly shorter solution. Run like:
... | grep -oP "(?<=<sometag param=').*?(?='>)"
Test:
echo "<sometag param='x'><irrelevant stuff='nonsense'>" | grep -oP "(?<=<sometag param=').*?(?='>)"
Results:
x
You don't have to assemble regexes in those cases, you can just use ' as the field separator
in="<sometag param='x'><irrelevant stuff='nonsense'>"
IFS="'" read -r x whatiwant y <<< "$in" # bash
echo "$whatiwant"
awk -F\' '{print $2}' <<< "$in" # awk
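If the surrounding tag should still be checked, bash's own regex matching is another option that needs no external tool. This is a sketch of a technique not used in the thread; the variable names re and value are hypothetical. Keeping the pattern in a variable sidesteps the quoting problems described above.

```bash
in="<sometag param='x'><irrelevant stuff='nonsense'>"

# Store the regex in a variable so the single quotes need no special handling.
re="<sometag param='([^']*)'>"

value=""
if [[ $in =~ $re ]]; then
  # BASH_REMATCH[1] holds the text matched by the first parenthesized group.
  value=${BASH_REMATCH[1]}
fi
printf '%s\n' "$value"
```

For the sample input this should print x.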

Print word between two characters by going backward in the line

I'm having problems extracting a word from a line. I want to pick the first word before the # symbol but after the /, which is the only delimiter that stands out.
A line looks like this:
,["https://picasaweb.google.com/111560558537332305125/Programming#5743548966953176786",1,["https://lh6.googleusercontent.com/-Is8rb8G1sb8/T7UvWtVOTtI/AAAAAAAAG68/Cht3FzfHXNc/s0-d/Geek.jpg",1920,1200]
I want the word Programming.
To get that line I am using this, which narrows it down:
sed -n '/.*picasa.*.jpg/p' 5743548866439293105
So I want it to find # and then go backward until it hits the first /, then print what lies between. In this case the word should be Programming, but it could be anything.
I want it to be as short as possible and have experimented with:
sed -n '/.*picasa.*.jpg/p' 5743548866439293105 | awk '$0=$2' FS="/" RS="[$#]"
You can do that with sed (slightly shortened for formatting but works on your original string as well):
pax> echo ',["https://p.g.com/111/Prog#574' | sed 's/^[^#]*\/\([^#]*\)#.*$/\1/'
Prog
pax>
Explaining in more detail:
    /-----+--------------------> greedy capture up to '/'.
   /      |
   |      | /------+-----------> capture the stuff between '/' and '#'.
   |      |/       |
   |      ||       | /-+-------> everything from '#' to end of line.
   |      ||       |/  |
   |      ||       ||  |
's/^[^#]*\/\([^#]*\)#.*$/\1/'
                         ||
                         \+---> replace with captured group.
It basically searches for an entire line that has the pattern you want (first # following a /), whilst capturing (with the \( and \) brackets) just the stuff between / and #.
The substitution then replaces the entire line with just that captured text you're interested in (via \1).
Using grep with some Perl regex extensions:
echo $string | grep -P -o "(?<=/)[^/]+(?=#)"
-P tells grep to use Perl extensions. -o tells grep to display only the matched text. To understand what gets matched, break the regex into three parts: (?<=/), [^/]+, and (?=#). The first part says that the matched text must follow a '/', without including the '/' in the match. The second part matches a string of non-'/' characters. The last part says that the matched text must be immediately followed by a '#', without including the '#' in the match.
Another grep, using the "\K" feature to "throw away" the match up to the last '/' before the '#':
# Match as much as possible up to a '/', but throw it away, then match as much as you can
# up to the first #
echo $string | grep -oP ".*/\K.+(?=#)"
Using cut and awk to get the first field (splitting on #) followed by the last field (splitting on /):
echo $string | cut -d# -f1 | awk -F/ '{print $NF}'
Using some temporary variables and bash's parameter expansion facilities:
$ FOO=["https://picasaweb.google.com/111560558537332305125/Programming#5743548966953176786",1,["https://lh6.googleusercontent.com/-Is8rb8G1sb8/T7UvWtVOTtI/AAAAAAAAG68/Cht3FzfHXNc/s0-d/Geek.jpg",1920,1200]
$ BAR=${FOO%#*} # Strip the last # and everything after
$ echo $BAR
[https://picasaweb.google.com/111560558537332305125/Programming
$ BAZ=${BAR##*/} # Strip everything up to and including the last /
$ echo $BAZ
Programming
This might work for you:
sed '/.*\/\([^#]*\)#.*/{s//\1/;q};d' file
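The parameter expansion approach from earlier in this thread can be folded into a small function. This is a sketch: the name word_before_hash is hypothetical, and it cuts at the first # (%%#*) rather than the last one, which gives the same result when the line contains a single #.

```bash
# Hypothetical helper: print the text between the last '/' before the first '#' and that '#'.
word_before_hash() {
  local line=$1
  # Drop the first '#' and everything after it.
  local head=${line%%#*}
  # Drop everything up to and including the last remaining '/'.
  printf '%s\n' "${head##*/}"
}

word_before_hash ',["https://picasaweb.google.com/111560558537332305125/Programming#5743548966953176786",1'
```

For the sample line this should print Programming.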

Resources