sed pattern parts as input for other bash function - bash

I'm trying to replace floating-point numbers like 1.2e+3 with their integer value 1200. For this I use sed in the following way:
echo '"1.2e+04"' | sed "s/\"\([0-9]\+\.[0-9]\+\)e+\([0-9]\+\)\"/$(echo \1*10^\2|bc -l)/"
but the pattern parts \1 and \2 don't get evaluated in the echo.
Is there a way to solve this problem with sed?
Thanks in advance

Within the double quotes, \1 and \2 are interpreted as literal 1 and 2.
You need additional backslashes to pass them through to sed. Even then, a $(command substitution) in the
sed replacement cannot work with back references, because the shell runs it before sed starts matching.
If you are using GNU sed, you can instead say something like:
echo '"1.2e+04"' | sed "s/\"\([0-9]\+\.[0-9]\+\)e+\([0-9]\+\)\"/echo \"\\1*10^\\2\"|bc -l/;e"
which yields:
12000.0
If you want to chop off the decimal point, you'll know what to do ;-).
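For completeness, here is a minimal sketch of one way to get the integer without sed at all: the shell's printf can parse the exponent notation itself. The helper name sci_to_int is made up for this example, and it assumes a printf that accepts floating-point formats (bash's builtin does). Note that %.0f rounds rather than truncates.

```shell
#!/usr/bin/env bash
# sci_to_int: hypothetical helper, not part of the original answers.
# Strips double quotes from its argument and prints the value rounded
# to zero decimal places.
sci_to_int() {
  local value
  value=$(printf '%s' "$1" | tr -d '"')
  # LC_ALL=C keeps "." as the decimal separator regardless of the locale.
  LC_ALL=C printf '%.0f\n' "$value"
}

sci_to_int '"1.2e+04"'   # prints 12000
```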

If you are happy with awk, a command like this can do the work:
echo 1.2e+4|awk '{printf "%d",$0}'

It is perhaps better to use perl (or other typed language) to manage the variable types:
echo '"1.2e+04"' | perl -lane 'my $a=$_;$a=~ s/"//g;print sprintf("%.10g",$a);print $a;'
In any case, your sed expression is incorrect, it should be:
echo '"1.2e+04"' | sed "s/\"\([0-9]\+\.[0-9]\+\)e+\([0-9]\+\)\"/$(echo \1*10^\3 + \2*10^$(echo \3 - 1 | bc -l)|bc -l)/"

A more general way to solve the problem is to combine @tshiono's and @Romeo's solutions (this relies on the GNU sed e flag):
sed "s/\(.*\)\([0-9]\+\.[0-9]\+e+[0-9]\+\)\(.*\)/printf '\1'\; echo \2 |awk '{printf \"%d\",\$0}'\;printf '\3'\;/e"
This makes it possible to convert such floats in arbitrary contexts.
For example:
echo '"1.2e+04"' | sed "s/\(.*\)\([0-9]\+\.[0-9]\+e+[0-9]\+\)\(.*\)/printf '\1'\; echo \2 |awk '{printf \"%d\",\$0}'\;printf '\3'\;/e"
outputs
"12000"
and
echo 'abc"1.2e+04"def' | sed "s/\(.*\)\([0-9]\+\.[0-9]\+e+[0-9]\+\)\(.*\)/printf '\1'\; echo \2 |awk '{printf \"%d\",\$0}'\;printf '\3'\;/e"
outputs
abc"12000"def
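One caveat with the greedy \(.*\) above: it only converts the last number on a line, and it swallows leading mantissa digits (with 12.3e+04 the first group takes the 1). A sketch of an alternative that walks through every match with awk's match() follows. The name expand_sci is chosen here for illustration and is not from the answers above.

```shell
#!/usr/bin/env bash
# expand_sci: hypothetical filter that rewrites every number of the form
# D.De+D on each input line as a truncated integer.
expand_sci() {
  awk '{
    out = ""
    # Consume the line match by match, copying the text in between.
    while (match($0, /[0-9]+\.[0-9]+e\+[0-9]+/)) {
      out = out substr($0, 1, RSTART - 1) sprintf("%d", substr($0, RSTART, RLENGTH) + 0)
      $0 = substr($0, RSTART + RLENGTH)
    }
    print out $0
  }'
}

echo 'abc"1.2e+04"def "12.5e+02"' | expand_sci   # prints abc"12000"def "1250"
```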

Related

Replace pipe character "|" with escaped pipe character "\|" in string in bash script

I am trying to replace a pipe character in a string with the escaped character:
Input: "text|jdbc"
Output: "text\|jdbc"
I tried different things with tr:
echo "text|jdbc" | tr "|" "\\|"
...
But none of them worked.
Any help would be appreciated.
Thank you,
tr is good for one-to-one mapping of characters (read "translate").
\| is two characters, you cannot use tr for this. You can use sed:
echo 'text|jdbc' | sed -e 's/|/\\|/'
This example replaces one |. If you want to replace multiple, add the g flag:
echo 'text|jdbc' | sed -e 's/|/\\|/g'
An interesting tip by @JuanTomas is to use a different separator character for better readability, for example:
echo 'text|jdbc' | sed -e 's_|_\\|_g'
You can take advantage of the fact that | is a special character in bash, which means the %q modifier used by printf will escape it for you:
$ printf '%q\n' "text|jdbc"
text\|jdbc
A more general solution that doesn't require | to be treated specially is
$ f="text|jdbc"
$ echo "${f//|/\\|}"
text\|jdbc
${f//foo/bar} expands f and replaces every occurrence of foo with bar. The operator here is /; when followed by another /, it replaces all occurrences of the search pattern instead of just the first one. For example:
$ f="text|jdbc|two"
$ echo "${f/|/\\|}"
text\|jdbc|two
$ echo "${f//|/\\|}"
text\|jdbc\|two
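The same expansion can be wrapped in a small function so that the character to escape becomes a parameter. This is only a sketch: escape_char is a hypothetical name, and the character is matched literally because the pattern is quoted.

```shell
#!/usr/bin/env bash
# escape_char STRING CHAR: hypothetical helper that prefixes every CHAR
# in STRING with a backslash, using only parameter expansion.
escape_char() {
  local s="$1" c="$2" out
  # Quoting "$c" makes the pattern literal, so glob characters are safe.
  out=${s//"$c"/"\\$c"}
  printf '%s\n' "$out"
}

escape_char 'text|jdbc|two' '|'   # prints text\|jdbc\|two
```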
You can try with awk:
echo "text|jdbc" | awk -F'|' '$1=$1' OFS="\\\|"

Replacing Nth Occurrence from End of Line via Sed

Do you know of a better or easier way of using sed or Bash to replace the Nth occurrence of a regex match counted from the end of a line, without using rev?
Here is the rev way to replace the third-from-last occurrence of the letter 's': sed still matches forward, but rev makes it count from the last character.
echo "a declaration of sovereignty need no witness" | rev | sed 's/s/S/3' | rev
a declaration of Sovereignty need no witness
Update -- a generalized solution based on Cyrus's answer:
SEARCH="one"
OCCURRENCE=3
REPLACE="FOUR"
SED_DELIM=$'\001'
SEARCHNEG=$(sed 's/./[^&]*/g' <<< "${SEARCH}")
sed -r "s${SED_DELIM}${SEARCH}((${SEARCHNEG}${SEARCH}){$((${OCCURRENCE}-1))}${SEARCHNEG})\$${SED_DELIM}${REPLACE}\1${SED_DELIM}" <<< "one one two two one one three three one one"
Note: to be truly generic, it should also escape regex metacharacters in the search string (the left-hand side).
With GNU sed:
sed -r 's/s(([^s]*s){2}[^s]*)$/S\1/' file
Output:
a declaration of Sovereignty need no witness
See: The Stack Overflow Regular Expressions FAQ
Using awk:
str='a declaration of sovereignty need no witness'
awk -v r='S' -v n=3 'BEGIN{FS=OFS="s"} {
for (i=1; i<=NF; i++) printf "%s%s", $i, (i<NF)?(i==NF-n?r:OFS):ORS}' <<< "$str"
Output:
a declaration of Sovereignty need no witness
You first need to count the number of occurrences and figure out which one to replace:
echo "importantword1 importantword2 importantword3 importantword4 importantword5 importantword6" |
grep -o "importantword" | wc -l
Use that count to build your sed expression:
echo "importantword1 importantword2 importantword3 importantword4 importantword5 importantword6" |
sed 's/\(\(.*importantword.*\)\{3\}\)importantword\(\(.*importantword.*\)\{2\}\)/\1ExportAntword\3/'
importantword1 importantword2 importantword3 ExportAntword4 importantword5 importantword6
You get the same answer with setting variables for left and right:
l=3
r=2
echo "importantword1 importantword2 importantword3 importantword4 importantword5 importantword6" |
sed 's/\(\(.*importantword.*\)\{'$l'\}\)importantword\(\(.*importantword.*\)\{'$r'\}\)/\1ExportAntword\3/'
or
str="\(.*importantword.*\)"
echo "importantword1 importantword2 importantword3 importantword4 importantword5 importantword6" |
sed 's/\('${str}'\{'$l'\}\)importantword\('${str}'\{'$r'\}\)/\1ExportAntword\3/'
In Perl, you can use a look-ahead assertion:
echo "a declaration of sovereignty need no witness" \
| perl -pe 's/s(?=(?:[^s]*s[^s]*){2}$)/S/'
(?=...) is the look-ahead assertion, it means "if followed by ...", but the ... part is not matched and therefore not substituted
[^s]*s[^s]* means there's only one s wrapped possibly by non-ses
{2} is the number of repetitions, i.e. we want two s'es after the one we want to replace
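For fixed-string searches, the same "Nth from the end" replacement can also be sketched with plain shell parameter expansion, with no rev and no regex. The function name replace_nth_from_end and its argument order are assumptions made for this example. The search string is taken literally and must not be empty.

```shell
#!/usr/bin/env bash
# replace_nth_from_end STRING SEARCH N REPLACEMENT
# Hypothetical helper: replaces the Nth occurrence of the fixed string
# SEARCH, counted from the end of STRING. Prints STRING unchanged and
# returns 1 if there are fewer than N occurrences.
replace_nth_from_end() {
  local s="$1" search="$2" n="$3" repl="$4"
  local head="$s" i=0
  while [ "$i" -lt "$n" ]; do
    case $head in
      *"$search"*) head=${head%"$search"*} ;;   # cut at the last occurrence
      *) printf '%s\n' "$s"; return 1 ;;
    esac
    i=$((i + 1))
  done
  # Everything after head + search is the untouched tail.
  local tail="${s#"$head$search"}"
  printf '%s%s%s\n' "$head" "$repl" "$tail"
}

replace_nth_from_end "a declaration of sovereignty need no witness" s 3 S
# prints: a declaration of Sovereignty need no witness
```

Each pass of the loop uses ${head%"$search"*}, which removes the shortest suffix starting at a match, that is, everything from the last remaining occurrence onward.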

Replace string by regex

I have a bunch of strings like "{one}two", where "{one}" could be different and "two" is always the same. I need to replace the original string with "three{one}"; "three" is also constant. It could easily be done with Python, for example, but I need it to be done with shell tools like sed or awk.
If I understand correctly, you want:
{one}two --> three{one}
{two}two --> three{two}
{n}two --> three{n}
sed with a backreference will do that:
echo "{one}two" | sed 's/\(.*\)two$/three\1/'
The search stores all text up to your fixed string and then replaces the match with your new string prepended to the stored text. sed is greedy by default, so it grabs all text up to your fixed string even if the variable part repeats it (e.g., {two}two will still remap to three{two} properly).
Using sed:
s="{one}two"
sed 's/^\(.*\)two/three\1/' <<< "$s"
three{one}
echo "XXXtwo" | sed -E 's/(.*)two/three\1/'
Here's a Bash only solution:
string="{one}two"
echo "three${string/two/}"
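Note that ${string/two/} removes the first two it finds, so an input such as {two}two would come out as three{}two. A sketch that only strips a trailing two, assuming the constant part always sits at the end of the string (move_suffix is a made-up name):

```shell
#!/usr/bin/env bash
# move_suffix: hypothetical helper. If the argument ends in "two",
# drop that suffix and put "three" in front; otherwise print it as is.
move_suffix() {
  local s="$1"
  case $s in
    *two) printf 'three%s\n' "${s%two}" ;;
    *)    printf '%s\n' "$s" ;;
  esac
}

move_suffix '{one}two'   # prints three{one}
move_suffix '{two}two'   # prints three{two}
```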
awk '{a=gensub(/(.*)two/,"three\\1","g"); print a}' <<< "{one}two"
Output:
three{one}
awk '/{.*}two/ { split($0,s,"}"); print "three"s[1]"}" }' <<< "{one}two"
does also output
three{one}
Here, we are using awk to find the correct lines, and then split on "}" (which means your lines should not contain more than one "}").
Through GNU sed,
$ echo 'foo {one}two bar' | sed -r 's/(\{[^}]*\})two/three\1/g'
foo three{one} bar
Basic sed,
$ echo 'foo {one}two bar' | sed 's/\({[^}]*}\)two/three\1/g'
foo three{one} bar

Swap characters in a string using shell script

I need to swap characters of a string (which is mmddyyyy format) and rearrange them in yyyymmdd. This string is obtained from a file name (abc_def_08032011.txt).
string=$(ls abc_def_08032011.txt | awk '{print substr($0,9,8)}')
For example:
Current string: 08032011 (This may not necessarily be the current date)
Desired string: 20110803
I tried split function, but it won't work since the string does not have any delimiter.
Any ideas/suggestions greatly appreciated.
echo 08032011 | sed 's/\(....\)\(....\)/\2\1/'
or
echo 08032011 | perl -pe 's/(....)(....)/$2$1/'
Why not use awk all the way:
echo abc_def_08032011.txt | awk '{print substr($0,13,4) substr($0,9,4)}'
or sed all the way, avoiding one awk:
echo abc_def_08032011.txt | sed 's/^........\(....\)\(....\).*$/\2\1/'
or using ksh substitution all the way to avoid spawning an awk/sed process:
s=abc_def_08032011.txt
s1="${s#????????}"
s2="${s1%.*}"
echo "${s2#????}${s2%????}"
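In bash specifically, a regex match with BASH_REMATCH is another option that also checks that the eight characters really are digits. This is a sketch under the assumption that the first run of eight digits in the name is the mmddyyyy date. The function name is invented for the example.

```shell
#!/usr/bin/env bash
# mmddyyyy_to_yyyymmdd: hypothetical helper. Finds the first run of
# eight digits in its argument, reads it as mmddyyyy, and prints
# yyyymmdd. Returns 1 if no such run exists.
mmddyyyy_to_yyyymmdd() {
  local name="$1"
  local re='([0-9]{2})([0-9]{2})([0-9]{4})'
  if [[ $name =~ $re ]]; then
    # Groups: 1 = month, 2 = day, 3 = year.
    printf '%s%s%s\n' "${BASH_REMATCH[3]}" "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}"
  else
    return 1
  fi
}

mmddyyyy_to_yyyymmdd abc_def_08032011.txt   # prints 20110803
```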

sed help - convert a string of form ABC_DEF_GHI to AbcDefGhi

How can I convert a string of the form ABC_DEF_GHI to AbcDefGhi using any one-line command such as sed?
Here's a one-liner using gawk:
echo ABC_DEF_GHI | gawk 'function cap(s){return toupper(substr(s,1,1))tolower(substr(s,2))}{n=split($0,x,"_");for(i=1;i<=n;i++)o=o cap(x[i]); print o}'
AbcDefGhi
Optimized awk 1-liner
awk -v RS=_ '{printf "%s%s", substr($0,1,1), tolower(substr($0,2))}'
Optimized sed 1-liner
sed 's/\(.\)\(..\)_\(.\)\(..\)_\(.\)\(..\)/\1\L\2\U\3\L\4\U\5\L\6/'
Edit:
Here's a gawk version:
gawk -F_ '{for (i=1;i<=NF;i++) printf "%s%s",substr($i,1,1),tolower(substr($i,2)); printf "\n"}'
Original:
Using sed for this is pretty scary:
sed -r 'h;s/(^|_)./\n/g;y/ABCDEFGHIJKLMNOPQRSTUVWXYZ/abcdefghijklmnopqrstuvwxyz/;x;s/((^|_)(.))[^_]*/\3\n/g;G;:a;s/(^.*)([^\n])\n\n(.*)\n([^\n]*)$/\1\n\2\4\3/;ta;s/\n//g'
Here it is broken down:
# make a copy in hold space
h;
# replace all the characters which will remain upper case with newlines
s/(^|_)./\n/g;
# lowercase all the remaining characters
y/ABCDEFGHIJKLMNOPQRSTUVWXYZ/abcdefghijklmnopqrstuvwxyz/;
# swap the copy into pattern space and the lowercase characters into hold space
x;
# discard all but the characters which will remain upper case
s/((^|_)(.))[^_]*/\3\n/g;
# append the lower case characters to the end of pattern space
G;
# top of the loop
:a;
# shuffle the lower case characters back into their proper positions (see below)
s/(^.*)([^\n])\n\n(.*)\n([^\n]*)$/\1\n\2\4\3/;
# if a replacement was made, branch to the top of the loop
ta;
# remove all the newlines
s/\n//g
Here's how the shuffle works:
At the time it starts, this is what pattern space looks like:
A
D
G
bc
ef
hi
The shuffle loop picks up the string that's between the last newline and the end and moves it to the position before the two consecutive newlines (actually three) and moves the extra newline so it's before the character that it previously followed.
After the first step through the loop, this is what pattern space looks like:
A
D
Ghi
bc
ef
And processing proceeds similarly until there's nothing before the extra newline at which point the match fails and the loop branch is not taken.
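To watch the shuffle happen, here is a sketch that prints pattern space with GNU sed's l command at the top of each loop iteration (l shows newlines as \n and marks the end of pattern space with $). The script is the one broken down above with l, p and -n added. The wrapper name trace_shuffle is made up, and GNU sed is assumed.

```shell
#!/usr/bin/env bash
# trace_shuffle: hypothetical wrapper around the script explained above.
# "l" dumps pattern space unambiguously before every shuffle attempt,
# and "p" prints the final result.
trace_shuffle() {
  sed -n -r '
    h
    s/(^|_)./\n/g
    y/ABCDEFGHIJKLMNOPQRSTUVWXYZ/abcdefghijklmnopqrstuvwxyz/
    x
    s/((^|_)(.))[^_]*/\3\n/g
    G
    :a
    l
    s/(^.*)([^\n])\n\n(.*)\n([^\n]*)$/\1\n\2\4\3/
    ta
    s/\n//g
    p
  '
}

echo ABC_DEF_GHI | trace_shuffle
# A\nD\nG\n\n\nbc\nef\nhi$
# A\nD\n\nGhi\nbc\nef$
# A\n\nDefGhi\nbc$
# \nAbcDefGhi$
# AbcDefGhi
```

The first trace line shows the three consecutive newlines after G, and each later line shows the extra newline moving one letter to the left until nothing precedes it.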
If you want to title case a sequence of words separated by spaces, the script would be similar:
$ echo 'BEST MOVIE THIS YEAR' | sed -r 'h;s/(^| )./\n/g;y/ABCDEFGHIJKLMNOPQRSTUVWXYZ/abcdefghijklmnopqrstuvwxyz/;x;s/((^| ).)[^ ]*/\1\n/g;G;:a;s/(^.*)( [^\n]*)\n\n(.*)\n([^\n]*)$/\1\n\2\4\3/;ta;s/^([^\n]*)(.*)\n([^\n]*)$/\1\3\2/;s/\n//g'
Best Movie This Year
One liner using perl:
$ echo 'ABC_DEF_GHI' | perl -npe 's/([A-Z])([^_]+)_?/$1\L$2\E/g;'
AbcDefGhi
This might work for you:
echo "ABC_DEF_GHI" |
sed 'h;s/\(.\)[^_]*\(_\|$\)/\1/g;x;y/'$(printf "%s" {A..Z} / {a..z})'/;G;:a;s/\(\(^[a-z]\)\|_\([a-z]\)\)\([^\n]*\n\)\(.\)/\5\4/;ta;s/\n//'
AbcDefGhi
Or using GNU sed:
echo "ABC_DEF_GHI" | sed 's/\([A-Z]\)\([^_]*\)\(_\|$\)/\1\L\2/g'
AbcDefGhi
Less scary sed version with tr (note that this lowercases everything, so it prints abcdefghi rather than AbcDefGhi):
echo ABC_DEF_GHI | sed -e 's/_//g' - | tr 'A-Z' 'a-z'
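With bash 4 or later, the conversion can also be sketched without sed or awk, using the case-modifying expansions ${var,,} (lowercase everything) and ${var^} (uppercase the first character). The function name snake_to_camel is an assumption for this example.

```shell
#!/usr/bin/env bash
# snake_to_camel: hypothetical helper (needs bash 4+ for ,, and ^).
# Splits its argument on "_" and capitalizes each piece.
snake_to_camel() {
  local IFS=_ part out=""
  local -a parts
  read -r -a parts <<< "$1"
  for part in "${parts[@]}"; do
    part=${part,,}      # abc
    out+=${part^}       # Abc
  done
  printf '%s\n' "$out"
}

snake_to_camel ABC_DEF_GHI   # prints AbcDefGhi
```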

Resources