Sed - Insert line with text after match pattern between two strings - bash

I want to insert text below a line containing a certain string, but only within a block between two patterns.
Example input file:
text text
text text
...
[textabc pattern 1]
text text
text text
xyz = 123 #below this string I want to insert new text
text text
[textdef pattern 2]
text text
text text
I want to insert "NEW STRING" below the line "xyz = 123", but only if that line is between "[textabc pattern 1]" and "[textdef pattern 2]".
Output file:
text text
text text
...
[textabc pattern 1]
text text
text text
xyz = 123
NEW STRING
text text
[textdef pattern 2]
text text
text text
I have tried something like this:
sed -i '/^\[textabc pattern 1\]$/,/^\[textdef pattern 2\]/ ^xyz .*/a NEW STRING/' /folder/file.txt
How do I do this using sed?

Data set:
$ cat test.txt
text text
text text
[textabc pattern 1]
text text
text text
xyz = 123
text text
[textdef pattern 2]
text text
text text
A couple of small changes to OP's current sed command:
# current
sed '/^\[textabc pattern 1\]$/,/^\[textdef pattern 2\]/ ^xyz .*/a NEW STRING/' test.txt
# new/proposed (2 lines); the 'a'ppend option requires a new line before the end '}'
sed -e '/^\[textabc pattern 1\]$/,/^\[textdef pattern 2\]/{/^xyz .*/aNEW STRING
}' test.txt
# new/proposed (1 line); break into 2 segments via a 2nd '-e' flag to eliminate need for embedded newline character
sed -e '/^\[textabc pattern 1\]$/,/^\[textdef pattern 2\]/{/^xyz .*/a'"NEW STRING" -e '}' test.txt
Both of the above new/proposed sed commands generate the following:
text text
text text
[textabc pattern 1]
text text
text text
xyz = 123
NEW STRING
text text
[textdef pattern 2]
text text
text text
NOTE: Once OP is satisfied with the results the -i flag can be reintroduced to allow sed to make in-place changes to data file.
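To check that the address range really limits the append to the block, here is a minimal sketch (assuming GNU sed) that runs the one-line form on a temporary file where "xyz = 123" also appears before and after the block. The file contents and the variable names demo_file and result are made up for illustration.

```shell
# Hypothetical sample data: "xyz = 123" occurs before, inside and after the block.
demo_file=$(mktemp)
printf '%s\n' 'xyz = 123' '[textabc pattern 1]' 'xyz = 123' 'text text' \
  '[textdef pattern 2]' 'xyz = 123' > "$demo_file"

# Same range + append command as above (GNU sed one-line 'a' form,
# closing brace supplied by a second -e fragment).
result=$(sed -e '/^\[textabc pattern 1\]$/,/^\[textdef pattern 2\]/{/^xyz .*/a NEW STRING' -e '}' "$demo_file")

printf '%s\n' "$result"
rm -f "$demo_file"
```

Only the occurrence inside the block should be followed by NEW STRING; the first and last lines are left alone.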

This might work for you (GNU sed):
sed '/\[textabc pattern 1\]/{ # match the first pattern
:a # loop name
N # append next line
/\[textdef pattern 2\]/!ba # match the second pattern or repeat
s/^xyz = 123.*$/&\nNEW STRING/m}' file # match third pattern and append

Related

Remove characters between markers in a bash variable

I'm trying to remove unknown characters between 2 known markers from a variable using bash.
eg
string="This text d #! more text jsdlj end and mo{re ;re end text.text"
I want to remove all the characters between the last occurrence of the word "text " (before the end word) and the first occurrence of "end" after it, keeping both of these markers.
result="This text d #! more text end and mo{re ;re end text.text"
I'll be using it as part of a find -print0 | xargs -0 bash -c 'command; command...etc.' script.
I've tried
echo $string | sed 's/[de][ex][ft][^\-]*//' ;
but that does it from the first "ext" and "-" (not the last "ext" before the end marker) and also does not retain the markers.
Any suggestions?
EDIT: So far the outcomes are as follows:
string="text text text lk;sdf;-end end 233-end.txt"
start="text "
end="-end"
Method 1
[[ $string =~ (.*'"${start}"').*('"${end}"'.*) ]] || :
nstring="${BASH_REMATCH[1]}${BASH_REMATCH[2]}" ;
echo "$nstring" ;
>"text text text -end.txt"
Required output = "text text text -end end 233-end.txt"
Method 2
temp=${cname%'"$end"'*}
nend=${cname#"$temp"}
nstart=${temp%'"$start"'*}
echo "$nstart$nend"
>"text text -end.txt"
Required output = "text text text -end end 233-end.txt"
Method 3
nstring=$(sed -E "s/(.*'"$start"').*('"$end"')/\1\2/" <<< "$string")
echo "$nstring";
>"text text text -end.txt"
Required output = "text text text -end end 233-end.txt"
Method 4
nstring=$(sed -En "s/(^.*'"$start"').*('"$end"'.*$)/\1\2/p" <<< "$string")
echo "$nstring" ;
>"text text text -end.txt"
Required output = "text text text -end end 233-end.txt"
Using Bash's Regex match:
#!/usr/bin/env bash
string='This text and more text jsdlj-end.text'
[[ $string =~ (.*text\ ).*(-end.*) ]] || :
printf %s\\n "${BASH_REMATCH[1]}${BASH_REMATCH[2]}"
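If the markers live in variables, as in the question's EDIT, a minimal bash sketch could look like the following. The function name strip_between and its parameters are hypothetical. Quoting "$start" and "$end" inside the regex makes bash match them literally, which avoids the nested '"..."' quoting used in the question's attempts.

```shell
# Hypothetical helper (bash only): keep everything up to the last start marker
# and everything from the end marker onwards.
strip_between() {
  local s="$1" start="$2" end="$3"
  # Quoted parts of an =~ pattern are matched as literal strings.
  if [[ $s =~ (.*"$start").*("$end".*) ]]; then
    printf '%s\n' "${BASH_REMATCH[1]}${BASH_REMATCH[2]}"
  else
    printf '%s\n' "$s"    # no match: return the input unchanged
  fi
}

strip_between 'This text and more text jsdlj-end.text' 'text ' '-end'
```

Like the one-liner above, this is greedy on both sides, so with several end markers it keeps only the last one.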
UPDATE: question has been updated with more details for dealing with a string that contains multiple start and end markers.
The new input string:
This text d #! more text jsdlj end and mo{re ;re end text.text
Test case:
start marker = 'text'
end marker = 'end'
objective = remove all text between last start marker and before the first end marker (actually replace all said text with a single space)
Input with all markers marked with [ ]:
This [text] d #! more [text] jsdlj [end] and mo{re ;re [end] [text].[text]
Input with only the two markers of interest marked with [ ]:
This text d #! more [text] jsdlj [end] and mo{re ;re end text.text
Desired result:
This text d #! more text end and mo{re ;re end text.text
While we can use sed to remove the desired text (replace <space>jsdlj<space> with <space>), we have to deal with the fact that sed does greedy matching (fine for finding the 'last' start marker) but does not do non-greedy matching (needed to find the 'first' end marker). We can get around this limitation by switching out our end marker with a single-character replacement, simulate a non-greedy match, then switch back to the original end marker.
m1='text' # start marker
m2='end' # end marker
string="This text d #! more text jsdlj end and mo{re ;re end text.text"
sed -E "s/${m2}/@/g;s/(^.*${m1})[^@]*(@.*$)/\1 \2/;s/@/${m2}/g" <<< "${string}"
Where:
-E - enable Extended regex support (includes capture groups)
s/${m2}/@/g - replace our end marker with the single character @ (OP needs to determine what character cannot show up in expected input strings; # is not usable here because the sample string already contains one)
(^.*${m1}) - 1st capture group; greedy match from start of string up to last start marker before ...
[^@]* - match everything that's not the @ character
(@.*$) - 2nd capture group; everything from @ character until end of string
\1 \2 - replace entire string with 1st capture group + <space> + 2nd capture group
s/@/${m2}/g - replace single character @ with our end marker
This generates:
This text d #! more text end and mo{re ;re end text.text
Personally, I'd probably opt for a more straightforward parameter expansion approach (similar to Jetchisel's answer), but that could be a bit problematic for inline xargs processing.
Original answer
One sed idea using capture groups:
$ string="This text and more text jsdlj-end.text"
$ sed -En 's/(^.*text ).*(-end.*$)/\1\2/p' <<< "${string}"
This text and more text -end.text
Where:
-En - enable Extended regex support (and capture groups) and (-n) disable default printing of pattern space
(^.*text ) - first capture group = start of line up to last text
.* - everything between the 2 capture groups
(-end.*$) - second capture group = from -end to end of string
\1\2/p - print the contents of the 2 capture groups.
Though this runs into issues if there are multiple -end strings on the 'end' of the string, eg:
$ string="This text and more text jsdlj-end -end.text"
$ sed -En 's/(^.*text ).*(-end.*$)/\1\2/p' <<< "${string}"
This text and more text -end.text
Whether this is correct or not depends on the desired output (and assuming this type of 'double' ending string is possible).
With Parameter Expansion.
string="This text and more text jsdlj-end.text"
temp=${string%-*}
end=${string#"$temp"}
start=${temp% *}
echo "$start$end"
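For the updated input with several markers, the same parameter expansion idea can be sketched as below. This is a sketch under one assumption: it first cuts at the first end marker and then looks for the last start marker before it, which gives the desired result for the sample string but may differ from other readings of the requirement. The function name collapse_between is hypothetical.

```shell
# Hypothetical helper: replace the text between the last start marker that
# precedes the first end marker and that end marker with a single space.
collapse_between() {
  local s="$1" start="$2" end="$3" head rest prefix
  case $s in
    *"$end"*) ;;                               # end marker present, carry on
    *) printf '%s\n' "$s"; return 0 ;;         # no end marker: unchanged
  esac
  head=${s%%"$end"*}          # everything before the first end marker
  rest=${s#"$head"}           # first end marker and everything after it
  case $head in
    *"$start"*) ;;
    *) printf '%s\n' "$s"; return 0 ;;         # no start marker before it
  esac
  prefix=${head%"$start"*}    # everything before the last start marker in head
  printf '%s\n' "${prefix}${start} ${rest}"
}

collapse_between 'This text d #! more text jsdlj end and mo{re ;re end text.text' 'text' 'end'
```

Because the markers are quoted inside the expansions, characters such as * or ? in them are matched literally rather than as glob patterns.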
This is a bit tricky using only a POSIX extended regex (ERE), but easy with a Perl-compatible regex (PCRE). Therefore, we switch from sed to perl:
To get the last text (that still has an end afterwards), put a .* in front. The closest end to that text can then be matched using a non-greedy .*?.
Here we also put \b around text and end to avoid matching parts of other words (for example, the word send should not be matched even though it contains end too).
perl -pe 's/(.*\btext\b).*?(\bend\b)/$1 $2/' <<< "$string"

replace ending brackets with another string in bash

I would like to replace the last 3 lines with another string.. using sed, tr, or other bash solution.
Given file:
{
[
{
text text text
text text text
text text text
}
],
[
{
text text text
text text text
text text text
}
]
}
desired result:
{
[
{
text text text
text text text
text text text
}
],
[
{
text text text
text text text
text text text
bar
I tried this with sed
sed -i '' 's/\}\s+\]\s+\}/bar/g' foobar.hcl
tried this with tr
tr -s 's/\}[:blank:]\][:blank:]\}/bar/g' <foobar.hcl
With perl where you can read entire input as a single string using -0777 option. Not suitable if input is large enough to run out of available memory.
# this will replace all remaining whitespaces at the end
# with a single newline
perl -0777 -pe 's/\}\s+]\s+\}\s*\z/bar\n/' foobar.hcl
# this will preserve all remaining whitespaces, if any
perl -0777 -pe 's/\}\s+]\s+\}(?=\s*\z)/bar/' foobar.hcl
Once it is working, you can use perl -i -0777 ... for in-place editing.
This might work for you (GNU sed):
sed '1N;:a;N;/^\s*}\s*\n\s*]\s*\n}\s*$/{s//bar/;N;ba};P;D' file
Open a 3 line window and pattern match.
Using an array - assumes "text text text" has some actual nonspace, non-punctuation characters.
mapfile x < file # throw into an array (each element keeps its newline)
c=${#x[@]} # count the lines
until [[ "${x[-1]}" =~ [^[:space:][:punct:]] ]] # until last line has data
do let c-- # one line fewer to keep
x=( "${x[@]:0:$c}" ) # reassign array without last line
done
x+=( $'bar\n' ) # add the desired string
printf '%s' "${x[@]}" > file # write file without unwanted lines
Allows for any number of blank lines &c. Even }]} and such, so long as it isn't on the same line with the data.
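If the closing brackets are known to be exactly the last three lines of the file, a much shorter sketch is possible with GNU head, whose negative count prints everything except the last N lines. The function name replace_last_three and the sample file are hypothetical.

```shell
# Hypothetical helper: print the file without its last three lines,
# then print the replacement line. Requires GNU head for "-n -3".
replace_last_three() {
  head -n -3 -- "$1"
  printf '%s\n' "$2"
}

# Small made-up sample to show the effect.
sample=$(mktemp)
printf '%s\n' '{' 'text text text' '}' ']' '}' > "$sample"
trimmed=$(replace_last_three "$sample" bar)
printf '%s\n' "$trimmed"
rm -f "$sample"
```

To change the file itself, the output would have to go to a temporary file that is then moved over the original, since redirecting straight into the input file would truncate it before head reads it.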

If line and the next line starts with a number, append text to the matching line

How can I add a string to the start of a line if the line and the next line start with numbers?
From:
random text
random text
65345
234
random text
random text
random text
random text
random text
random text
9875
789709
random text
random text
random text
To:
random text
random text
appended text 65345
234
random text
random text
random text
random text
random text
random text
appended text 9875
789709
random text
random text
random text
Adding to all lines that start with numbers is as simple as
$ printf "hello\n123\n" | sed 's/^[0-9]/appended text &/'
hello
appended text 123
No idea how to do what I am trying to do though.
"random text" might end in a number
Any ideas?
This sort of thing is best done with awk. Something like:
awk 'prev ~ /^[0-9]/ && /^[0-9]/ { prev = "prepended text " prev}
NR>1 {print prev}
{prev=$0}
END {print prev}' input
Actually, it's probably "best" done in perl, but that seems to be unfashionable these days:
perl -0777 -pe '1 while s/(?<=^)(\d.*\n\d)/prepended text $1/cm' input
Just read in the whole file
perl -0777pe's/^(?=\d.*\n\d)/prepended text /mg'
You could also work with a two-line rolling window.
perl -ne'
push @buf, $_;
next if @buf < 2;
$buf[0] = "prepended text $buf[0]" if $buf[0] =~ /^\d/ && $buf[1] =~ /^\d/;
print(shift(@buf));
END { print @buf; }
'
See Specifying file to process to Perl one-liner.
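The two-line rolling window can also be sketched in plain shell, for cases where neither awk nor perl is wanted. This is only an illustration: the function name prepend_if_numeric_pair is made up, and it reads standard input, holding back one line so it can look at the next one before printing.

```shell
# Hypothetical helper: hold back one line and decide whether to prefix it
# once the following line has been read.
prepend_if_numeric_pair() {
  local prev='' line='' have_prev=0
  while IFS= read -r line || [ -n "$line" ]; do
    if [ "$have_prev" -eq 1 ]; then
      case $prev in
        [0-9]*)
          case $line in
            [0-9]*) prev="prepended text $prev" ;;   # both lines start with a digit
          esac ;;
      esac
      printf '%s\n' "$prev"
    fi
    prev=$line
    have_prev=1
  done
  # Flush the last held-back line, if there was any input at all.
  if [ "$have_prev" -eq 1 ]; then
    printf '%s\n' "$prev"
  fi
  return 0
}

printf '%s\n' 'random text' '65345' '234' 'random text' | prepend_if_numeric_pair
```

A shell read loop is slow on large files, so the awk or perl versions are likely the better choice there.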
This might work for you (GNU sed):
sed -E ':a;N;s/\n/&/2;Ta;s/\n([0-9]+\n[0-9]+)$/ \1/;ta;P;D' file
Open a window of 3 lines in the pattern space. If the 2nd and 3rd lines are numbers only, replace the 1st newline with a space and refill the window. Otherwise print/delete the first line in the pattern space and repeat.

create simple lingvo .dsl dict using sed or awk?

I have some text files and every file contains a definition for a word and looks like this:
word1
<TAB> some text
<TAB> some text
title 1
<TAB> some text
<TAB> some text
title 2
<TAB> some text
.
.
I want to create a simple lingvo .DSL dictionary so the desired output should be like this :
word1
[m2][trn]
<TAB> some text
<TAB> some text
[b]title 1[/b]
<TAB> some text
<TAB> some text
[b]title 2[/b]
<TAB> some text
<TAB> some text
.
.
[/m2][/trn]
so what I need to do is :
add [m2][trn] after the first word .
if a line begins with a letter or a number (not a tab) so it's a title and should be [b]title[/b]
Add [/m2][/trn] to the end of the file .
Any help will be appreciated.
This sed command should do it:
sed -e '1s/$/\n[m2][trn]/' \
-e '1!s/^[a-zA-Z0-9].*/[b]&[\/b]/' \
-e '$s/$/\n[\/m2][\/trn]/' \
file
I'd say
sed '1! { /^[[:alnum:]]/ s/.*/[b]&[\/b]/; }; 1 s/$/\n[m2][trn]/; $ s/$/\n[\/trn][\/m2]/' filename
That is:
1! { /^[[:alnum:]]/ s/.*/[b]&[\/b]/; } # If the current line is not the first and
# starts with a letter or number, encase
# it in [b][/b]
1 s/$/\n[m2][trn]/ # If the current line is the first, put
# [m2][trn] behind it
$ s/$/\n[\/trn][\/m2]/ # If the current line is the last, put
# [/trn][/m2] behind it.
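The three rules can be tried on a small made-up sample as sketched below, assuming GNU sed (for \n in the replacement). The function name to_dsl is hypothetical, and the closing tags are written in the order the question asks for, [/m2][/trn].

```shell
# Hypothetical wrapper around the three sed rules; reads stdin, writes stdout.
to_dsl() {
  sed -e '1 s/$/\n[m2][trn]/' \
      -e '1! { /^[[:alnum:]]/ s/.*/[b]&[\/b]/; }' \
      -e '$ s/$/\n[\/m2][\/trn]/'
}

# Made-up input: a headword, tab-indented text, and one title line.
dsl=$(printf 'word1\n\tsome text\ntitle 1\n\tsome text\n' | to_dsl)
printf '%s\n' "$dsl"
```

Tab-indented lines are left untouched because they do not start with an alphanumeric character.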
analysis
add [m2][trn] after the first word.
use a head splitter to handle the first line differently from the rest.
and just printf this start tag.
if a line begins with a letter or a number (not a tab) so it's a title and should be [b]title[/b]
sed to search for lines starting with word characters \w
Add [/m2][/trn] to the end of the file .
printf to add end tag
example script
head -n 1 input.txt 1>output.txt;
printf "[m2][trn]\n" 1>>output.txt;
tail -n +2 input.txt |
sed 's/^\(\w\+.\+\)/[b]\1[\/b]/g' 1>>output.txt;
printf "[/m2][/trn]\n" 1>>output.txt;
output
word1
[m2][trn]
some text
some text
[b]title 1[/b]
some text
some text
[b]title 2[/b]
some text
[/m2][/trn]

How to replace text inside another string which matches a regex pattern

${[a-zA-Z0-9._:]*}
I have used the above pattern for grepping words similar to ${SomeText}. How can I replace 'SomeText' with some other string i.e ${SomeText} --> SomeString in my bash script?
Example:
file.txt
text text text ${SomeText1} text text ${SomeText2} text text text text text text
text text text text ${SomeText3} text text text text text text ${SomeText4} text
...
my script file:
SomeText1="foo"
SomeText2="bar"
..
I want to replace ${SomeText1} to $SomeText1 which will be replaced by foo.
Similarly
${SomeText2} > $SomeText2 > bar
text text text foo text text bar text text text text text text
text text text text baz text text text text text text qux text
...
You can try this script:
while IFS='=' read -r k v; do
v=$(tr -d '"' <<< "$v")
sed -i.bak "s/\${$k}/$v/g" file.txt
done < script
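A self-contained way to try this loop is sketched below, assuming GNU sed for -i without a suffix. The temporary directory, the file contents and the extra check that skips lines which are not NAME=value assignments (such as the ".." line in the question) are additions for illustration.

```shell
# Made-up working copies of file.txt and the script with the assignments.
workdir=$(mktemp -d)
printf '%s\n' 'text ${SomeText1} text ${SomeText2} text' > "$workdir/file.txt"
printf '%s\n' 'SomeText1="foo"' 'SomeText2="bar"' '..' > "$workdir/script"

while IFS='=' read -r k v; do
  case $k in
    ''|*[!A-Za-z0-9_]*) continue ;;      # skip lines that are not NAME=value
  esac
  v=$(printf '%s' "$v" | tr -d '"')      # drop the double quotes around the value
  sed -i "s/\${$k}/$v/g" "$workdir/file.txt"
done < "$workdir/script"

substituted=$(cat "$workdir/file.txt")
printf '%s\n' "$substituted"
rm -rf "$workdir"
```

Values containing / or & would need escaping before being placed in the sed replacement; this sketch does not handle that.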
You can use sed
echo '${SomeText}, SomeText' | sed "s/\${SomeText}/\${SomeOtherText}/g"
will output
${SomeOtherText}, SomeText
if you need to use variables in sed, do it like this
a="SomeText"
b="SomeOtherText"
echo '${SomeText}, SomeText' | sed "s/\${$a}/\${$b}/g"
The output will be the same as above
EDIT:
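Better to use arrays to store those strings
SomeText[1]="foo"
SomeText[2]="bar"
MAX_i=2 # highest index in use (assumed here; not set in the original)
i=0
while (( i < MAX_i ))
do
((i++))
sed -i "s/\${SomeText$i}/${SomeText[$i]}/g" file # ${SomeText1} in the file -> foo
done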
If you really need your approach, you will need double evaluation
SomeText1="foo"
SomeText2="bar"
MAX_i=2 # number of variables (assumed here; not set in the original)
i=0
while (( i < MAX_i ))
do
((i++))
toRepl="SomeText"$i
scriptToCall=$(eval echo "s/\\\${$toRepl}/$`eval echo ${toRepl}`/g")
sed -i "$scriptToCall" file
done
So you iterate through all indices (SomeText1, SomeText2, ...), save the current name in the variable toRepl, and evaluate the script you want to call. For i=1 it will be scriptToCall="s/\${SomeText1}/foo/g", because toRepl is evaluated twice (toRepl -> SomeText1 -> foo), and you pass that script to sed :)
Is it clear now?
