Replace string by regex - bash

I have a bunch of strings like "{one}two", where "{one}" can vary and "two" is always the same. I need to replace each original string with "three{one}", where "three" is also constant. This could easily be done in Python, for example, but I need to do it with shell tools like sed or awk.

If I understand correctly, you want:
{one}two --> three{one}
{two}two --> three{two}
{n}two --> three{n}
SED with a backreference will do that:
echo "{one}two" | sed 's/\(.*\)two$/three\1/'
The search stores all text up to your fixed string, and the replacement prepends your new string to the stored text. sed is greedy by default, so it grabs all text up to the last occurrence of your fixed string even if the variable part repeats it (e.g., {two}two will still remap to three{two} properly).
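To make the greedy behaviour concrete, here is a minimal sketch that wraps the substitution in a function. The name swap_two is hypothetical, and both anchors (^ and $) are added here as an assumption that each string is a whole line:

```shell
# Hypothetical helper: move a trailing "two" to a leading "three".
swap_two() {
  # \(.*\) is greedy, so it captures everything before the LAST "two"
  printf '%s\n' "$1" | sed 's/^\(.*\)two$/three\1/'
}

swap_two '{one}two'    # three{one}
swap_two '{two}two'    # three{two}
swap_two 'twotwotwo'   # threetwotwo
```

Strings that do not end in "two" pass through unchanged, because the pattern does not match them.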

Using sed:
s="{one}two"
sed 's/^\(.*\)two/three\1/' <<< "$s"
three{one}

echo "XXXtwo" | sed -E 's/(.*)two/three\1/'

Here's a Bash only solution:
string="{one}two"
echo "three${string/two/}"
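One caveat: ${string/two/} removes the first "two" it finds, so an input such as {two}two would come out as three{}two. A possible variant, assuming the constant part is always a suffix, uses the suffix-removal expansion ${var%two} instead. The function name prepend_three is hypothetical:

```shell
# Hypothetical helper: strip a trailing "two" and prepend "three".
prepend_three() {
  case $1 in
    *two) printf 'three%s\n' "${1%two}" ;;  # remove only the trailing "two"
    *)    printf '%s\n' "$1" ;;             # no trailing "two": leave unchanged
  esac
}

prepend_three '{one}two'   # three{one}
prepend_three '{two}two'   # three{two}
```

The % expansion is part of POSIX sh, so this sketch does not depend on Bash.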

awk '{a=gensub(/(.*)two/,"three\\1","g"); print a}' <<< "{one}two"
Output:
three{one}

awk '/{.*}two/ { split($0,s,"}"); print "three"s[1]"}" }' <<< "{one}two"
does also output
three{one}
Here, we are using awk to find the correct lines, and then split on "}" (which means your lines should not contain more than the one to indicate the field).

Through GNU sed,
$ echo 'foo {one}two bar' | sed -r 's/(\{[^}]*\})two/three\1/g'
foo three{one} bar
Basic sed,
$ echo 'foo {one}two bar' | sed 's/\({[^}]*}\)two/three\1/g'
foo three{one} bar

Related

sed pattern parts as input for other bash function

I'm trying to replace floating-point numbers like 1.2e+3 with their integer value 1200. For this I use sed in the following way:
echo '"1.2e+04"' | sed "s/\"\([0-9]\+\.[0-9]\+\)e+\([0-9]\+\)\"/$(echo \1*10^\2|bc -l)/"
but the pattern parts \1 and \2 don't get evaluated in the echo.
Is there a way to solve this problem with sed?
Thanks in advance
Within the double quotes, \1 and \2 are interpreted as literal 1 and 2.
You need to put additional backslashes to escape them. In addition, the shell expands the
$(command substitution) before sed ever runs, so it cannot see what the back references matched.
If you are using GNU sed, you can instead say something like:
echo '"1.2e+04"' | sed "s/\"\([0-9]\+\.[0-9]\+\)e+\([0-9]\+\)\"/echo \"\\1*10^\\2\"|bc -l/;e"
which yields:
12000.0
If you want to chop off the decimal point, you'll know what to do ;-).
If you are happy with awk, a command like this can do the work:
echo 1.2e+4|awk '{printf "%d",$0}'
It is perhaps better to use perl (or other typed language) to manage the variable types:
echo '"1.2e+04"' | perl -lane 'my $a=$_;$a=~ s/"//g;print sprintf("%.10g",$a);print $a;'
In any case, your sed expression is incorrect, it should be:
echo '"1.2e+04"' | sed "s/\"\([0-9]\+\.[0-9]\+\)e+\([0-9]\+\)\"/$(echo \1*10^\3 + \2*10^$(echo \3 - 1 | bc -l)|bc -l)/"
The best way to solve the problem properly is to combine the solutions from @tshiono and @Romeo:
sed "s/\(.*\)\([0-9]\+\.[0-9]\+e+[0-9]\+\)\(.*\)/printf '\1'\; echo \2 |awk '{printf \"%d\",\$0}'\;printf '\3'\;/e"
So it is possible to convert all such floats into arbitrary contexts.
for example:
echo '"1.2e+04"' | sed "s/\(.*\)\([0-9]\+\.[0-9]\+e+[0-9]\+\)\(.*\)/printf '\1'\; echo \2 |awk '{printf \"%d\",\$0}'\;printf '\3'\;/e"
outputs
"12000"
and
echo 'abc"1.2e+04"def' | sed "s/\(.*\)\([0-9]\+\.[0-9]\+e+[0-9]\+\)\(.*\)/printf '\1'\; echo \2 |awk '{printf \"%d\",\$0}'\;printf '\3'\;/e"
outputs
abc"12000"def
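The e flag used above is a GNU sed extension that hands the pattern space to a shell. As an alternative sketch, the same in-context conversion can be done in awk alone with match() and substr(). The function name expand_floats is hypothetical, and the token pattern is assumed to be the same digits.digits e+digits form as in the question:

```shell
# Hypothetical helper: expand every N.NNe+NN token on each input line.
expand_floats() {
  awk '{
    rest = $0; out = ""
    # match() sets RSTART/RLENGTH to the next float token in rest
    while (match(rest, /[0-9]+\.[0-9]+e\+[0-9]+/)) {
      token = substr(rest, RSTART, RLENGTH)
      # token + 0 forces numeric conversion; %d drops the fractional part
      out = out substr(rest, 1, RSTART - 1) sprintf("%d", token + 0)
      rest = substr(rest, RSTART + RLENGTH)
    }
    print out rest
  }'
}

printf '%s\n' 'abc"1.2e+04"def' | expand_floats   # abc"12000"def
```

Because the loop consumes the line token by token, several floats on one line are all converted, and no input text is ever executed as a command.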

Replace pipe character "|" with escaped pipe character "\|" in string in bash script

I am trying to replace a pipe character in a string with the escaped character:
Input: "text|jdbc"
Output: "text\|jdbc"
I tried different things with tr:
echo "text|jdbc" | tr "|" "\\|"
...
But none of them worked.
Any help would be appreciated.
Thank you,
tr is good for one-to-one mapping of characters (read "translate").
\| is two characters, you cannot use tr for this. You can use sed:
echo 'text|jdbc' | sed -e 's/|/\\|/'
This example replaces one |. If you want to replace multiple, add the g flag:
echo 'text|jdbc' | sed -e 's/|/\\|/g'
An interesting tip by @JuanTomas is to use a different separator character for better readability, for example:
echo 'text|jdbc' | sed -e 's_|_\\|_g'
You can take advantage of the fact that | is a special character in bash, which means the %q modifier used by printf will escape it for you:
$ printf '%q\n' "text|jdbc"
text\|jdbc
A more general solution that doesn't require | to be treated specially is
$ f="text|jdbc"
$ echo "${f//|/\\|}"
text\|jdbc
${f//foo/bar} expands f and replaces every occurrence of foo with bar. The operator here is /; when followed by another /, it replaces all occurrences of the search pattern instead of just the first one. For example:
$ f="text|jdbc|two"
$ echo "${f/|/\\|}"
text\|jdbc|two
$ echo "${f//|/\\|}"
text\|jdbc\|two
You can try with awk:
echo "text|jdbc" | awk -F'|' '$1=$1' OFS="\\\|"

Shell scripting - replace every 5 commas with a newline

How can I replace every 5th comma in some input with a newline?
For example:
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15
becomes
1,2,3,4,5
6,7,8,9,10
11,12,13,14,15
Looking for a one-liner using something like sed...
This should work:
sed 's/\(\([^,]*,\)\{4\}[^,]*\),/\1\n/g'
Example:
$ echo "1,2,3,4,5,6,7,8,9,10,11,12,13,14,15" |
> sed 's/\(\([^,]*,\)\{4\}[^,]*\),/\1\n/g'
1,2,3,4,5
6,7,8,9,10
11,12,13,14,15
This expression will do.
sed 's/\(\([0-9]\+,\)\{4\}\)\([0-9]\+\),/\1\3\n/g'
http://ideone.com/d4Va2
$ echo -n 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15 | xargs -d, printf '%d,%d,%d,%d,%d\n'
1,2,3,4,5
6,7,8,9,10
11,12,13,14,15
The accepted solution works, but is overly complicated. Try:
sed ':d s/,/\n/5; P; D; Td'
Not all sed implementations allow commands to be separated by semicolons, so you may need a literal newline after each semicolon. Also, I'm not sure that all sed implementations allow a label followed by a command, so a literal newline may be required before the s command. In other words:
sed ':d
s/,/\n/5
P
D
Td'
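For readers unfamiliar with P and D, here is an annotated sketch of the same idea. It assumes GNU sed (for \n in the replacement) and leaves out the :d label and Td branch, since D always ends the current cycle and the branch is never reached. The function name wrap5 is hypothetical:

```shell
# Hypothetical helper: break each input line after every 5th comma.
wrap5() {
  sed '
    # Turn the 5th comma in the pattern space into a newline
    # (\n in the replacement is a GNU sed extension).
    s/,/\n/5
    # Print the pattern space up to the first newline.
    P
    # Delete through that newline and rerun the script on what is left;
    # with no newline left, clear the pattern space and read the next line.
    D
  '
}

printf '%s\n' '1,2,3,4,5,6,7,8,9,10,11,12' | wrap5
# 1,2,3,4,5
# 6,7,8,9,10
# 11,12
```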
nawk -F, '{for(i=1;i<=NF;i++){printf("%s%s",$i,i%5?",":"\n")}}' file3
test:
pearl.246> nawk -F, '{for(i=1;i<=NF;i++){printf("%s%s",$i,i%5?",":"\n")}}' file3
1,2,3,4,5
6,7,8,9,10
11,12,13,14,15
pearl.247>
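The loop above ends a line only when the field index is a multiple of 5, so input whose field count is not a multiple of 5 ends with a trailing comma and no newline. A possible variant, with the group size passed in as a parameter and the last field always terminated by a newline, might look like this. The function name wrap_fields is hypothetical, and plain awk is assumed in place of nawk:

```shell
# Hypothetical helper: break a comma-separated line after every n-th field.
wrap_fields() {
  awk -F, -v n="$1" '{
    for (i = 1; i <= NF; i++)
      # newline after every n-th field and after the last field, comma otherwise
      printf("%s%s", $i, (i % n == 0 || i == NF) ? "\n" : ",")
  }'
}

printf '%s\n' '1,2,3,4,5,6,7' | wrap_fields 5
# 1,2,3,4,5
# 6,7
```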

SED script to remove whitespace from a region

I have a file with a bunch of strings that looks like this:
new Tab("Hello World")
I want to turn those particular lines into something like:
new Tab("helloWorld")
Is this possible using SED and if so, how can I accomplish this? I figure I have to use grouping and regions but I can't figure out the replacement string.
This is what I have so far
sed -n 's/new Tab("\(.*\)"/new Tab("\1")'
This solution is not perfect: it assumes the line contains just new Tab("some string blah blah blah") and nothing else on that line. Here is my remove_space.sed:
/new Tab/ {
s/ *//g
s/newTab/new Tab/
}
To invoke:
sed -f remove_space.sed data.txt
The first substitution blindly removes all spaces; the second puts back the space between new and Tab.
You don't have to put this in a file, the script works on command line as well:
sed '/new Tab/{s/ *//g;s/newTab/new Tab/;}' data.txt
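The script above strips spaces from the whole line and does not lowercase the first letter, which the requested output (helloWorld) needs. A rough sketch that touches only the text inside the quotes follows. It assumes GNU sed, since \l (lowercase the next character) is a GNU extension, and the function name camel_tabs is hypothetical:

```shell
# Hypothetical helper: camel-case the quoted argument of new Tab("...").
camel_tabs() {
  sed -E '/new Tab\("/{
    :a
    # remove one run of spaces inside the quoted string, then try again
    s/(new Tab\("[^" ]*) +/\1/
    ta
    # lowercase the first character after the opening quote (GNU \l)
    s/(new Tab\(")(.)/\1\l\2/
  }'
}

printf '%s\n' 'x = new Tab("Hello World") and more' | camel_tabs
# x = new Tab("helloWorld") and more
```

Only the first new Tab("...") on each line is handled, and text outside the quotes keeps its spaces.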
I'm not enough of a sed guru, but here's a piece of Perl:
perl -pe 's/(?<=new Tab\(")[^"]+/ lcfirst(join("", split(" ", $&))) /e'
My first thought was using awk but I came up with something I don't really like:
echo "new Tab(\"Hello World\")" | gawk 'match($0, /new Tab\("(.*)"\)/, r) {print r[1]}' | sed -e 's/ *//g'

sed regexp in a bash script

I want to extract a certain part of a string, if it exists. I'm interested in the XML filename, i.e. I want what's between an "_" and ".xml".
This is ok, it prints "555"
MYSTRING=`echo "/sdd/ee/publ/xmlfile_555.xml" | sed 's/^.*_\([0-9]*\).xml/\1/'`
echo "STRING = $MYSTRING"
This is not ok because it returns the whole string. In this case I don't want any result.
It prints "/sdd/ee/publ/xmlfile.xml"
MYSTRING=`echo "/sdd/ee/publ/xmlfile.xml" | sed 's/^.*_\([0-9]*\).xml/\1/'`
echo "STRING = $MYSTRING"
Any ideas how to get an "empty" result in the second case?
thanks!
You just need to tell sed to keep its mouth shut if it doesn't find a match. The -n option is used for that.
MYSTRING=`echo "/sdd/ee/publ/xmlfile_555.xml" | sed -n 's/^.*_\([0-9]*\)\.xml/\1/p'`
I only made two changes to what you had: the aforementioned -n option to sed, and the p flag that comes after the s/// command, which tells sed to print the output only if the substitution was successfully done.
EDIT: I've also escaped the final . as suggested in the comments.
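As a small sketch of how the -n/p combination behaves on both inputs, the command can be wrapped in a function and its result tested for emptiness. The function name xml_id is hypothetical, and the trailing $ anchor is an added assumption that .xml ends the string:

```shell
# Hypothetical helper: print the digits between the last "_" and ".xml", or nothing.
xml_id() {
  # -n suppresses default output; the p flag prints only when the substitution succeeded
  printf '%s\n' "$1" | sed -n 's/^.*_\([0-9]*\)\.xml$/\1/p'
}

for f in /sdd/ee/publ/xmlfile_555.xml /sdd/ee/publ/xmlfile.xml; do
  id=$(xml_id "$f")
  if [ -n "$id" ]; then
    echo "STRING = $id"
  else
    echo "no id in $f"
  fi
done
# STRING = 555
# no id in /sdd/ee/publ/xmlfile.xml
```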
Try this?
basename /sdd/ee/publ/xmlfile_555.xml | awk -F_ '{print $2}'
The output is 555.xml
With the other one.
basename /sdd/ee/publ/xmlfile.xml | awk -F_ '{print $2}'
The output is an empty string.
$ path=/sdd/ee/publ/xmlfile_555.xml
$ echo ${path##*/}
xmlfile_555.xml
$ path=${path##*/}
$ echo ${path%.xml}
xmlfile_555
$ path=${path%.xml}
$ echo ${path##*_}
555
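Note that for a name without an underscore, ${path##*_} leaves the string unchanged, so the steps above would yield xmlfile and not an empty result. A possible sketch that combines the same expansions with a guard is shown below. The function name xml_suffix is hypothetical:

```shell
# Hypothetical helper: print the part between the last "_" and ".xml", or an empty line.
xml_suffix() {
  name=${1##*/}        # drop the directory part
  name=${name%.xml}    # drop the extension
  case $name in
    *_*) printf '%s\n' "${name##*_}" ;;  # text after the last underscore
    *)   printf '\n' ;;                  # no underscore: empty result
  esac
}

xml_suffix /sdd/ee/publ/xmlfile_555.xml   # 555
xml_suffix /sdd/ee/publ/xmlfile.xml       # (empty line)
```

These expansions are all POSIX, so no external command is started.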
