How to replace text inside another string which matches a regex pattern

${[a-zA-Z0-9._:]*}
I have used the above pattern to grep for words similar to ${SomeText}. In my bash script, how can I replace each such placeholder with another string, i.e. ${SomeText} --> SomeString?
Example:
file.txt
text text text ${SomeText1} text text ${SomeText2} text text text text text text
text text text text ${SomeText3} text text text text text text ${SomeText4} text
...
my script file:
SomeText1="foo"
SomeText2="bar"
..
I want to replace ${SomeText1} with $SomeText1, which will in turn be expanded to foo.
Similarly:
${SomeText2} > $SomeText2 > bar
Expected output:
text text text foo text text bar text text text text text text
text text text text baz text text text text text text qux text
...

You can try this script:
while IFS='=' read -r k v; do
v=$(tr -d '"' <<< "$v") # strip the double quotes around the value
sed -i.bak "s/\${$k}/$v/g" file.txt # replace ${name} with its value
done < script
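The loop above can be tried end to end with a small self-contained sketch. The temporary directory and the sample file contents are assumptions made for the demonstration, not part of the original question; the quote stripping uses bash parameter expansion instead of tr.

```shell
#!/usr/bin/env bash
# Hypothetical sample files, created in a temporary directory for the demo.
tmp=$(mktemp -d)
printf '%s\n' 'text ${SomeText1} text ${SomeText2} text' > "$tmp/file.txt"
printf '%s\n' 'SomeText1="foo"' 'SomeText2="bar"' > "$tmp/script"

# For every name="value" line, replace ${name} in file.txt with value.
while IFS='=' read -r k v; do
  v=${v//\"/}                                  # strip the double quotes
  sed -i.bak "s/\${$k}/$v/g" "$tmp/file.txt"   # \$ keeps the dollar sign literal
done < "$tmp/script"

result=$(cat "$tmp/file.txt")
echo "$result"
rm -rf "$tmp"
```

Note that the value is pasted into the sed script unchanged, so this sketch assumes values without /, & or backslashes.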

You can use sed
echo '${SomeText}, SomeText' | sed "s/\${SomeText}/\${SomeOtherText}/g"
will output
${SomeOtherText}, SomeText
If you need to use variables in sed, do it like this:
a="SomeText"
b="SomeOtherText"
echo '${SomeText}, SomeText' | sed "s/\${$a}/\${$b}/g"
The output will be the same as above
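One caveat when shell variables are interpolated into a sed script: characters such as /, & and backslash are special in the replacement part. The following is a minimal sketch of one common workaround, escaping the value first. The placeholder name Path and its value are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical placeholder name and value, chosen to contain sed-special characters.
name='Path'
value='usr/local & more'

# Escape backslash, / and & so sed treats the value as literal replacement text.
escaped=$(printf '%s' "$value" | sed 's/[&/\]/\\&/g')

result=$(echo 'dir=${Path}' | sed "s/\${$name}/$escaped/g")
echo "$result"
```

Without the escaping step, the / inside the value would end the s command early and sed would report an error.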
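EDIT:
It is better to use an array to store those strings:
SomeText[1]="foo"
SomeText[2]="bar"
MAX_i=2 # number of entries
i=0
while (( i < MAX_i )) # arithmetic comparison; [[ $i < $MAX_i ]] compares strings
do
((i++))
sed -i "s/\${SomeText$i}/${SomeText[$i]}/g" file.txt
done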
If you really need your approach (separate variables named SomeText1, SomeText2, ...), you will need double evaluation. In bash, indirect expansion (${!name}) does this without eval:
SomeText1="foo"
SomeText2="bar"
MAX_i=2 # number of variables
i=0
while (( i < MAX_i ))
do
((i++))
toRepl="SomeText$i"
scriptToCall="s/\${$toRepl}/${!toRepl}/g"
sed -i "$scriptToCall" file
done
So you iterate through all indices (SomeText1, SomeText2, ...) and save the current variable name in toRepl. The indirect expansion ${!toRepl} is then evaluated twice: toRepl -> SomeText1 -> foo. For i=1 this gives scriptToCall="s/${SomeText1}/foo/g", and that script is passed to sed :)
Is it clear now?
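If the placeholder names are not known in advance, the same double evaluation can be done in pure bash by discovering each ${Name} with a regex match. This is only a sketch under some assumptions: names are restricted to valid shell identifiers (so the . and : from the question's pattern are not supported), and the variable names re, name and placeholder are illustrative.

```shell
#!/usr/bin/env bash
SomeText1="foo"
SomeText2="bar"
line='text ${SomeText1} text ${SomeText2} text'

# Matches ${Name} where Name is a valid shell variable name.
re='\$\{([A-Za-z_][A-Za-z0-9_]*)\}'

result=$line
while [[ $result =~ $re ]]; do
  name=${BASH_REMATCH[1]}        # e.g. SomeText1
  placeholder="\${$name}"        # the literal text ${SomeText1}
  # ${!name} is indirect expansion: the value of the variable named by $name.
  result=${result//"$placeholder"/${!name}}
done
echo "$result"
```

An unset variable expands to an empty string here, and a value that itself contains a ${...} placeholder would be expanded again, so the sketch assumes plain values.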

Related

Remove characters between markers in a bash variable

I'm trying to remove unknown characters between 2 known markers from a variable using bash.
eg
string="This text d #! more text jsdlj end and mo{re ;re end text.text"
I want to remove all the characters between the last occurrence of the word "text " (before the end word) and the first occurrence of the word "end" after it, while keeping both of these markers.
result="This text d #! more text end and mo{re ;re end text.text"
I'll be using it as part of a find -print0 | xargs -0 bash -c 'command; command...etc.' script.
I've tried
echo $string | sed 's/[de][ex][ft][^\-]*//' ;
but that does it from the first "ext" and "-" (not the last "ext" before the end marker) and also does not retain the markers.
Any suggestions?
EDIT: So far the outcomes are as follows:
string="text text text lk;sdf;-end end 233-end.txt"
start="text "
end="-end"
Method 1
[[ $string =~ (.*'"${start}"').*('"${end}"'.*) ]] || :
nstring="${BASH_REMATCH[1]}${BASH_REMATCH[2]}" ;
echo "$nstring" ;
>"text text text -end.txt"
Required output = "text text text -end end 233-end.txt"
Method 2
temp=${cname%'"$end"'*}
nend=${cname#"$temp"}
nstart=${temp%'"$start"'*}
echo "$nstart$nend"
>"text text -end.txt"
Required output = "text text text -end end 233-end.txt"
Method 3
nstring=$(sed -E "s/(.*'"$start"').*('"$end"')/\1\2/" <<< "$string")
echo "$nstring";
>"text text text -end.txt"
Required output = "text text text -end end 233-end.txt"
Method 4
nstring=$(sed -En "s/(^.*'"$start"').*('"$end"'.*$)/\1\2/p" <<< "$string")
echo "$nstring" ;
>"text text text -end.txt"
Required output = "text text text -end end 233-end.txt"

Using Bash's Regex match:
#!/usr/bin/env bash
string='This text and more text jsdlj-end.text'
[[ $string =~ (.*text\ ).*(-end.*) ]] || :
printf %s\\n "${BASH_REMATCH[1]}${BASH_REMATCH[2]}"

UPDATE: question has been updated with more details for dealing with a string that contains multiple start and end markers.
The new input string:
This text d #! more text jsdlj end and mo{re ;re end text.text
Test case:
start marker = 'text'
end marker = 'end'
objective = remove all text between last start marker and before the first end marker (actually replace all said text with a single space)
Input (the two markers of interest are the text just before jsdlj and the end just after it):
This text d #! more text jsdlj end and mo{re ;re end text.text
Desired result:
This text d #! more text end and mo{re ;re end text.text
While we can use sed to remove the desired text (replace <space>jsdlj<space> with <space>), we have to deal with the fact that sed does greedy matching (fine for finding the 'last' start marker) but does not do non-greedy matching (needed to find the 'first' end marker). We can get around this limitation by switching out our end marker with a single-character replacement, simulating a non-greedy match, then switching back to the original end marker. The replacement character must not occur in the input; the sample string already contains # (in #!), so @ is used here.
m1='text' # start marker
m2='end' # end marker
string="This text d #! more text jsdlj end and mo{re ;re end text.text"
sed -E "s/${m2}/@/g;s/(^.*${m1})[^@]*(@.*$)/\1 \2/;s/@/${m2}/g" <<< "${string}"
Where:
-E - enable Extended regex support (includes capture groups)
s/${m2}/@/g - replace our end marker with the single character @ (OP needs to determine what character cannot show up in expected input strings)
(^.*${m1}) - 1st capture group; greedy match from start of string up to last start marker before ...
[^@]* - match everything that's not the @ character
(@.*$) - 2nd capture group; everything from @ character until end of string
\1 \2 - replace entire string with 1st capture group + <space> + 2nd capture group
s/@/${m2}/g - replace single character @ with our end marker
This generates:
This text d #! more text end and mo{re ;re end text.text
Personally, I'd probably opt for a more straightforward parameter expansion approach (similar to Jetchisel's answer) but that could be a bit problematic for inline xargs processing ... ???
Original answer
One sed idea using capture groups:
$ string="This text and more text jsdlj-end.text"
$ sed -En 's/(^.*text ).*(-end.*$)/\1\2/p' <<< "${string}"
This text and more text -end.text
Where:
-En - enable Extended regex support (and capture groups) and (-n) disable default printing of pattern space
(^.*text ) - first capture group = start of line up to last text
.* - everything between the 2 capture groups
(-end.*$) - second capture group = from -end to end of string
\1\2/p - print the contents of the 2 capture groups.
Though this runs into issues if there are multiple -end strings on the 'end' of the string, eg:
$ string="This text and more text jsdlj-end -end.text"
$ sed -En 's/(^.*text ).*(-end.*$)/\1\2/p' <<< "${string}"
This text and more text -end.text
Whether this is correct or not depends on the desired output (and assuming this type of 'double' ending string is possible).

With Parameter Expansion.
string="This text and more text jsdlj-end.text"
temp=${string%-*}
end=${string#"$temp"}
start=${temp% *}
echo "$start$end"
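The parameter-expansion idea can be extended to the updated question, where the string contains several start and end markers. The sketch below is one possible way to do it, not the answerer's code; the function name strip_between is hypothetical, and it assumes the markers are treated as literal text.

```shell
#!/usr/bin/env bash
# strip_between STRING START END (hypothetical helper):
# remove everything between the last START that still has an END after it
# and the first END following that START, keeping both markers.
strip_between() {
  local string=$1 start=$2 end=$3
  local head prefix rest tail
  # Nothing to do unless some START is followed by an END.
  [[ $string == *"$start"*"$end"* ]] || { printf '%s\n' "$string"; return; }
  head=${string%"$end"*}          # everything before the last END
  prefix=${head%"$start"*}        # everything before the last START in that part
  rest=${string#"$prefix$start"}  # text following that START
  tail=${rest#*"$end"}            # text following the first END after it
  printf '%s\n' "$prefix$start$end$tail"
}

result=$(strip_between "text text text lk;sdf;-end end 233-end.txt" "text " "-end")
echo "$result"
```

The quotes around "$start" and "$end" inside the expansions keep characters such as * or ? in the markers from being treated as glob patterns.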

This is a bit tricky using only a POSIX extended regex (ERE), but easy with a Perl-compatible regex (PCRE). Therefore, we switch from sed to perl:
To get the last text (that still has an end afterwards), put a .* in front. The closest end to that text can then be matched using a non-greedy .*?.
Here we also put \b around text and end to avoid matching parts of other words (for example, the word send should not be matched even though it contains end too).
perl -pe 's/(.*\btext\b).*?(\bend\b)/$1 $2/' <<< "$string"

replace ending brackets with another string in bash

I would like to replace the last 3 lines with another string.. using sed, tr, or other bash solution.
Given file:
{
[
{
text text text
text text text
text text text
}
],
[
{
text text text
text text text
text text text
}
]
}
desired result:
{
[
{
text text text
text text text
text text text
}
],
[
{
text text text
text text text
text text text
bar
I tried this with sed
sed -i '' 's/\}\s+\]\s+\}/bar/g' foobar.hcl
tried this with tr
tr -s 's/\}[:blank:]\][:blank:]\}/bar/g' <foobar.hcl

With perl, you can read the entire input as a single string using the -0777 option. This is not suitable if the input is large enough to exhaust the available memory.
# this will replace all remaining whitespaces at the end
# with a single newline
perl -0777 -pe 's/\}\s+]\s+\}\s*\z/bar\n/' foobar.hcl
# this will preserve all remaining whitespaces, if any
perl -0777 -pe 's/\}\s+]\s+\}(?=\s*\z)/bar/' foobar.hcl
Once it is working, you can use perl -i -0777 ... for in-place editing.

This might work for you (GNU sed):
sed '1N;:a;N;/^\s*}\s*\n\s*]\s*\n}\s*$/{s//bar/;N;ba};P;D' file
Open a 3 line window and pattern match.

Using an array - assumes "text text text" has some actual nonspace, non-punctuation characters.
mapfile -t x < file # read the lines into an array, without trailing newlines
c=${#x[@]} # count the lines
let c-- # point c at last index
until [[ "${x[-1]}" =~ [^[:space:][:punct:]] ]] # until the last line has data
do x=( "${x[@]:0:$c}" ) # reassign array without last line
let c-- # decrement the last line pointer
done
x+=( bar ) # add the desired string
printf '%s\n' "${x[@]}" > file # write file without unwanted lines
Allows for any number of blank lines &c. Even }]} and such, so long as it isn't on the same line with the data.
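When the closing brackets are known to be exactly the last three lines of the file, a simpler alternative is to count the lines and keep all but the last three. This is a sketch under that assumption; the temporary file and its shortened contents are made up for the demonstration.

```shell
#!/usr/bin/env bash
# Hypothetical, shortened sample file.
tmp=$(mktemp)
printf '%s\n' '{' '[' '{' 'text text text' '}' ']' '}' > "$tmp"

total=$(wc -l < "$tmp")      # number of lines in the file
keep=$(( total - 3 ))        # everything except the last three lines

# Print the kept lines followed by the replacement string.
result=$( { head -n "$keep" "$tmp"; echo bar; } )
echo "$result"
rm -f "$tmp"
```

To change a real file this way, the output would be written to a new file that is then moved over the original, since redirecting straight into the input file would truncate it before it is read.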

Sed - Insert line with text after match pattern between two strings

I want to insert text below line with certain string only in block between two patterns.
Example input file:
text text
text text
...
[textabc pattern 1]
text text
text text
xyz = 123 #below this string I want to insert new text
text text
[textdef pattern 2]
text text
text text
I want to insert "NEW STRING" below line "xyz = 123" but only if string is between "[textabc pattern 1]" and "[textdef pattern 2]".
Output file:
text text
text text
...
[textabc pattern 1]
text text
text text
xyz = 123
NEW STRING
text text
[textdef pattern 2]
text text
text text
I have tried something like this:
sed -i '/^\[textabc pattern 1\]$/,/^\[textdef pattern 2\]/ ^xyz .*/a NEW STRING/' /folder/file.txt
How do I do this using sed?

Data set:
$ cat test.txt
text text
text text
[textabc pattern 1]
text text
text text
xyz = 123
text text
[textdef pattern 2]
text text
text text
A couple small changes to OPs current sed command:
# current
sed '/^\[textabc pattern 1\]$/,/^\[textdef pattern 2\]/ ^xyz .*/a NEW STRING/' test.txt
# new/proposed (2 lines); the 'a'ppend option requires a new line before the end '}'
sed -e '/^\[textabc pattern 1\]$/,/^\[textdef pattern 2\]/{/^xyz .*/aNEW STRING
}' test.txt
# new/proposed (1 line); break into 2 segments via a 2nd '-e' flag to eliminate need for embedded newline character
sed -e '/^\[textabc pattern 1\]$/,/^\[textdef pattern 2\]/{/^xyz .*/a'"NEW STRING" -e '}' test.txt
Both of the above new/proposed sed commands generate the following:
text text
text text
[textabc pattern 1]
text text
text text
xyz = 123
NEW STRING
text text
[textdef pattern 2]
text text
text text
NOTE: Once OP is satisfied with the results the -i flag can be reintroduced to allow sed to make in-place changes to data file.

This might work for you (GNU sed):
sed '/\[textabc pattern 1\]/{ # match the first pattern
:a # loop name
N # append next line
/\[textdef pattern 2\]/!ba # match the second pattern or repeat
s/^xyz = 123.*$/&\nNEW STRING/m}' file # match third pattern and append
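For comparison, the same "only inside the block" logic can be sketched with awk called from the shell, using a flag that is switched on and off by the two patterns. This is an alternative technique, not one of the sed answers above; the variable name inside and the sample data are assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical sample data with an xyz line before, inside and after the block.
tmp=$(mktemp)
printf '%s\n' 'xyz = 1' '[textabc pattern 1]' 'xyz = 123' 'text text' \
  '[textdef pattern 2]' 'xyz = 9' > "$tmp"

result=$(awk '
  /^\[textabc pattern 1\]$/ { inside = 1 }   # entering the block
  /^\[textdef pattern 2\]/  { inside = 0 }   # leaving the block
  { print }                                  # always print the current line
  inside && /^xyz / { print "NEW STRING" }   # append only inside the block
' "$tmp")
echo "$result"
rm -f "$tmp"
```

Only the xyz line between the two bracketed headers gets NEW STRING appended; the ones before and after the block are left alone.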

Print text between two strings on the same line

I've been searching for a long time, and have not been able to find a working answer for my problem.
I have a line from an HTML file extracted with sed '162!d' skinlist.html, which contains the text
<a href="/skin/dwarf-red-beard-734/" title="Dwarf Red Beard">.
I want to extract the text Dwarf Red Beard, but that text is modular (can be changed), so I would like to extract the text between title=" and ".
I cannot, for the life of me, figure out how to do this.

awk 'NR==162 {print $4}' FS='"' skinlist.html
set field separator to "
print only line 162
print field 4

Solution in sed
sed -n '162 s/^.*title="\(.*\)".*$/\1/p' skinlist.html
Extracts line 162 in skinlist.html and captures the title attribute's contents in \1.
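One thing to watch with the command above: \(.*\) is greedy, so if another double quote follows the title on the same line, the capture runs up to the last one. A small sketch of the difference, using a hypothetical line with an extra class attribute and [^"]* to stop at the first closing quote:

```shell
#!/usr/bin/env bash
# Hypothetical input line with an attribute after title.
line='<a href="/skin/dwarf-red-beard-734/" title="Dwarf Red Beard" class="skin">'

# Greedy capture: runs to the last double quote on the line.
greedy=$(printf '%s\n' "$line" | sed -n 's/^.*title="\(.*\)".*$/\1/p')

# Capture of non-quote characters only: stops at the first closing quote.
title=$(printf '%s\n' "$line" | sed -n 's/^.*title="\([^"]*\)".*$/\1/p')

echo "greedy: $greedy"
echo "title:  $title"
```

For the line in the question, where title is the last attribute, both variants give the same result.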

The shell's variable expansion syntax allows you to trim prefixes and suffixes from a string:
line="$(sed '162!d' skinlist.html)" # extract the relevant line from the file
temp="${line#* title=\"}" # remove from the beginning through the first match of ' title="'
if [ "$temp" = "$line" ]; then
echo "title not found in '$line'" >&2
else
title="${temp%%\"*}" # remove from the first '"' through the end
fi

You can pass it through another sed or add expressions to that sed like -e 's/.*title="//g' -e 's/">.*$//g'

also sed
sed -n '162 s/.*"\([a-zA-Z ]*\)"./\1/p' skinlist.html

How to split a string in bash delimited by tab

I'm trying to split a tab-delimited field in bash.
I am aware of this answer: how to split a string in shell and get the last field
But that does not answer for a tab character.
I want to get the part of a string before the tab character, so I'm doing this:
x=`head -1 my-file.txt`
echo ${x%\t*}
But the \t is matching on the letter 't' and not on a tab. What is the best way to do this?
Thanks

If your file look something like this (with tab as separator):
1st-field 2nd-field
you can use cut to extract the first field (operates on tab by default):
$ cut -f1 input
1st-field
If you're using awk, there is no need to use tail to get the last line. Changing the input to:
1:1st-field 2nd-field
2:1st-field 2nd-field
3:1st-field 2nd-field
4:1st-field 2nd-field
5:1st-field 2nd-field
6:1st-field 2nd-field
7:1st-field 2nd-field
8:1st-field 2nd-field
9:1st-field 2nd-field
10:1st-field 2nd-field
Solution using awk:
$ awk 'END {print $1}' input
10:1st-field
Pure bash-solution:
#!/bin/bash
while read a b;do last=$a; done < input
echo $last
outputs:
$ ./tab.sh
10:1st-field
Lastly, a solution using sed
$ sed '$s/\(^[^\t]*\).*$/\1/' input
10:1st-field
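Here, $ is the address of the last line, i.e. the substitution operates on the last line only.
For your original question, use a literal tab in the pattern (the whitespace inside the quotes and inside the expansion below is a single tab character), i.e.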
x="1st-field 2nd-field"
echo ${x% *}
outputs:
1st-field

Use $'ANSI-C' strings in the parameter expansion:
$ x=$'abc\tdef\tghi'
$ echo "$x"
abc def ghi
$ echo ">>${x%%$'\t'*}<<"
>>abc<<

read field1 field2 <<< ${tabDelimitedField}
or
read field1 field2 <<< $(command_producing_tab_delimited_output)

Use awk.
echo $yourfield | awk '{print $1}'
or, in your case, for the first field from the last line of a file
tail yourfile | awk '{x=$1}END{print x}'

There is an easy way for a tab separated string : convert it to an array.
Create a string with tabs ($ added before for '\t' interpretation) :
AAA=$'ABC\tDEF\tGHI'
Split the string into an array using parentheses :
BBB=($AAA)
Get access to any element :
echo ${BBB[0]}
ABC
echo ${BBB[1]}
DEF
echo ${BBB[2]}
GHI
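The unquoted BBB=($AAA) splits on spaces and newlines as well as tabs, and also applies glob expansion to the words. Here is a sketch of a variant that splits on tabs only, assuming fields that may contain spaces; the sample values are made up.

```shell
#!/usr/bin/env bash
# Hypothetical tab-separated string whose fields contain spaces.
AAA=$'New York\tLos Angeles\tSan Diego'

# Split on tab only; -r keeps backslashes literal, -a fills the array BBB.
IFS=$'\t' read -r -a BBB <<< "$AAA"

count=${#BBB[@]}
second=${BBB[1]}
echo "$count fields, second is: $second"
```

Because tab counts as IFS whitespace, consecutive tabs are treated as a single separator, so empty fields are not preserved by this approach.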

x=first$'\t'second
echo "${x%$'\t'*}"
See QUOTING in man bash

The answer from https://stackoverflow.com/users/1815797/gniourf-gniourf hints at the use of built-in field parsing in bash, but does not really complete the answer. Setting the IFS shell parameter as the input field separator completes the picture and makes it possible to parse tab-delimited files with a fixed number of fields in pure bash.
echo -e "a\tb\tc\nd\te\tf" > myfile
while IFS='<literaltab>' read f1 f2 f3;do echo "$f1 = $f2 + $f3"; done < myfile
a = b + c
d = e + f
Here <literaltab> is, of course, replaced by a real tab character, not \t. In most terminals, Control-V followed by Tab inserts one.
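In bash, typing a literal tab can be avoided by writing the separator as an ANSI-C quoted string, $'\t'. A minimal sketch of the same loop, assuming a temporary file in place of myfile:

```shell
#!/usr/bin/env bash
# Hypothetical temporary file standing in for myfile.
tmp=$(mktemp)
printf 'a\tb\tc\nd\te\tf\n' > "$tmp"

# IFS=$'\t' applies only to the read command, so the global IFS is unchanged.
out=$(while IFS=$'\t' read -r f1 f2 f3; do
  echo "$f1 = $f2 + $f3"
done < "$tmp")

echo "$out"
rm -f "$tmp"
```

Using printf instead of echo -e to create the file avoids differences in how shells interpret echo's escape handling.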
