Replace multiple lines using sed in a bash script - macOS

I have a file with multiple lines. I'm looking for help to modify only the lines that match a regex pattern and then add some text after each match.
I use a Mac, but the bash script will run on Linux; I don't know if that is relevant.
For example:
someText
StringToSearch:
| isoCode |
someOthertext
StringToSearch:
| isoCode |
againSomOtherText
StringToSearch:
| isoCode |
After matching "StringToSearch:" I need to add "| uk |" after each "| isoCode |", so the result will be something like:
someText
StringToSearch:
| isoCode |
| uk |
someOtherText
StringToSearch:
| isoCode |
| uk |
againSomeOtherText
StringToSearch:
| isoCode |
| uk |
My regex is ^\s*StringToSearch:\n[^\n]+ and a full working example is available at regex101 (following the link).
I can't figure out how to implement it in bash using sed.
Currently my sed looks like this: sed -E 's,\^\s*StringToSearch:\n\([^\n]+\),| uk |,' < inputFile

$ awk '1; p~/StringToSearch/ && /isoCode/{print " | uk |"} {p=$0}' ip.txt
someText
StringToSearch:
| isoCode |
| uk |
someOthertext
StringToSearch:
| isoCode |
| uk |
againSomOtherText
StringToSearch:
| isoCode |
| uk |
1 is an idiomatic way to print the contents of $0, which holds the current record.
{p=$0} saves the current record in the variable p.
p~/StringToSearch/ && /isoCode/ checks whether the previous line contains StringToSearch and the current line contains isoCode.
If the condition is satisfied, print " | uk |" adds the new content you need.
As far as I know, this should work on all versions of awk, so the mac/linux difference will not affect you.
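To see the idiom in action, you can generate the sample input yourself (the file name ip.txt is just illustrative) and run the one-liner:

```shell
# Recreate a small version of the sample input.
printf '%s\n' 'someText' 'StringToSearch:' '| isoCode |' \
  'someOtherText' 'StringToSearch:' '| isoCode |' > ip.txt

# 1 prints every line; the block appends " | uk |" whenever the
# previous line (saved in p) matched StringToSearch and the current
# line contains isoCode.
awk '1; p~/StringToSearch/ && /isoCode/{print " | uk |"} {p=$0}' ip.txt
```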
If you insist on sed, you can use
sed '/StringToSearch/{N; s/$/\n | uk |/}' ip.txt
which I tested on GNU sed; I'm not sure whether the syntax/features vary across other implementations. The N command appends the next line of input to the pattern space. s/$/\n | uk |/ adds the new content after the two lines. sed prints the pattern space by default when the -n option is not used.
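One caveat: \n in the replacement is a GNU extension; BSD/macOS sed copies it literally. A more portable sketch uses a backslash-escaped literal newline in the replacement, which POSIX sed supports:

```shell
printf '%s\n' 'StringToSearch:' '| isoCode |' 'trailer' > ip.txt

# N joins the next line into the pattern space; the escaped literal
# newline in the replacement works on both GNU and BSD/macOS sed.
sed '/StringToSearch/{N; s/$/\
 | uk |/;}' ip.txt
```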

sed -E 's,\^\s*StringToSearch:\n\([^\n]+\),| uk |,'
\(...\) saves a backreference in basic regular expressions; in extended regex (-E) use (...). Also, you never actually use the backreference anywhere.
\n - sed parses one line at a time, so it can't match \n unless you append further lines to the pattern space with the N command.
\^ is strange - it matches a literal ^ character, and there is no such character in your text...
You can match across lines easily with GNU sed by using the -z option. Note that it loads the whole file into sed's memory, so it can be memory-consuming. Then write a regex that matches your expression globally.
Also, to avoid removing the matched string, use & in the replacement to restore it, then suffix it with the string you want to add.
The command:
$ sed -z -E 's,\nStringToSearch:\n[^\n]+\n,& | uk |\n,g' <<EOF
someText
StringToSearch:
| isoCode |
someOthertext
StringToSearch:
| isoCode |
againSomOtherText
StringToSearch:
| isoCode |
EOF
outputs:
someText
StringToSearch:
| isoCode |
| uk |
someOthertext
StringToSearch:
| isoCode |
| uk |
againSomOtherText
StringToSearch:
| isoCode |
| uk |

Use sed:
sed -e '/^[[:space:]]*StringToSearch:/{' -e n -e n -e 'i\
\ \ \ \ | uk |' -e '}' file > outputfile
Output:
someText
StringToSearch:
| isoCode |
| uk |
someOthertext
StringToSearch:
| isoCode |
| uk |
againSomOtherText
StringToSearch:
| isoCode |
| uk |
This will match any line with optional whitespace before StringToSearch:; then -e n -e n will print the current line and read the next one, twice in a row; then -e 'i\
\ \ \ \ | uk |' will insert a line of your choice, and -e '}' will close the block.
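You can try the block form on a generated sample. Note the trailing line matters: if the file ends right after | isoCode |, the second n hits end-of-input and the insert may be skipped on some seds.

```shell
printf '%s\n' 'someText' 'StringToSearch:' '| isoCode |' 'moreText' > file

# n prints the pattern space and reads the next line; after two n's the
# i\ command emits the inserted text before the current line is printed.
sed -e '/^[[:space:]]*StringToSearch:/{' -e n -e n -e 'i\
\ \ \ \ | uk |' -e '}' file
```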

This might work for you (GNU sed):
sed '/StringToSearch/{n;p;s/[^| ]\+/uk/}' file
Match on a line containing StringToSearch.
Print that line and fetch the next.
Print that line and substitute uk for the isoCode (this line will also be printed as part of the normal sed flow).

Related

Bash extract strings between two characters

I have the output of query result into a bash variable, stored as a single line.
-------------------------------- | NAME | TEST_DATE | ----------------
--------------------- | TESTTT_1 | 2019-01-15 | | TEST_2 | 2018-02-16 | | TEST_NAME_3 | 2020-03-17 | -------------------------------------
I would like to ignore the column names (NAME | TEST_DATE) and store the actual values of each name and test_date as a tuple in an array.
So here is the logic I am thinking of: I would like to extract from the third string onwards between two '|' characters. The tuples are comma-separated, and when a space is encountered we start the next tuple in the array.
Expected output:
array=(TESTTT_1,2019-01-15 TEST_2,2018-02-16 TEST_NAME_3,2020-03-17)
Any help is appreciated. Thanks.
Let's say your string is stored in variable a (or pipe your query output to the command below):
echo "$a"
-------------------------------- | NAME | TEST_DATE | ----------------
--------------------- | TESTTT_1 | 2019-01-15 | | TEST_2 | 2018-02-16 | | TEST_NAME_3 | 2020-03-17 | ------------------------------------
The command to obtain the desired result is:
array="$(echo "$a" | cut -d '|' -f2,3,5,6,8,9 | tail -n1 | sed 's/ | /,/g')"
The above will store the output in a variable named array, as you expected.
Output of above command is:
echo "$array"
TESTTT_1,2019-01-15,TEST_2,2018-02-16,TEST_NAME_3,2020-03-17
Explanation of the command: the output of echo "$a" is piped into cut, which, using '|' as the delimiter, extracts fields 2,3,5,6,8,9; the output is piped into tail to drop the undesired NAME and TEST_DATE headers and keep only the values; and then, per your expected output, " | " is converted to "," using sed.
In this string you have only three dates; if you have more, just add more field numbers to the cut command. Given the format of your string, the field numbers continue in the pattern 2,3,5,6,8,9,11,12,14,15 ... and so on.
Hope it solves your problem.
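If you really want the result as a bash array of name,date pairs (as in the question's expected output), here is a sketch using grep -o and sed; the name pattern [A-Z0-9_]+ is an assumption based on the sample values, and the array syntax requires bash:

```shell
a='-------------------------------- | NAME | TEST_DATE | --------------------------------------- | TESTTT_1 | 2019-01-15 | | TEST_2 | 2018-02-16 | | TEST_NAME_3 | 2020-03-17 | -------------------------------------'

# Extract each "| name | yyyy-mm-dd |" pair, reshape it to "name,date",
# and collect the results into a bash array via word splitting.
array=( $(printf '%s\n' "$a" |
  grep -oE '\| [A-Z0-9_]+ \| [0-9]{4}-[0-9]{2}-[0-9]{2} \|' |
  sed -E 's/^\| (.+) \| (.+) \|$/\1,\2/') )
printf '%s\n' "${array[@]}"
```

The header pair "| NAME | TEST_DATE |" is skipped automatically because TEST_DATE does not match the date pattern.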
echo "$a" | awk -F "|" '{ for(i=2; i<=NF; i++){ print $i }}' | sed -e '1,3d' -e '$d' | tr ' ' '\n' | sed '/^$/d' | sed 's/^/,/g' | sed -e 'N;s/\n/ /' | sed 's/^.//g' | xargs | sed 's/ ,/, /g'
The above is an awk-based solution.
Output:
TESTTT_1, 2019-01-15 TEST_2, 2018-02-16 TEST_NAME_3, 2020-03-17
Is that OK?

Replacing Spaces In File Names w/Only a Bash Cmd Script

Not actually a question, but a response to "How to replace spaces in file names using a bash script", as I've not yet accrued enough reputation to comment there. I too have recently suffered from 'rogue' pathnames (spaces included) arriving into secondary storage that needed Unix-side fixing. Expanding on Michael Krelin's solution there, I was finally able to work out an all-Unix command solution, with no extra special code. So, as a follow-on, check this out: it can change any 'roguely' named file/dir name into one composed only of well-behaved characters, deals with some I18N modification as well, and can clean up a full directory tree's worth of file and directory names. Have fun:
$ idx=0; find . -depth -name "*[ &;()]*" | while IFS= read -r pathNm ; do ((idx++)); printf "\n%d\t%s\t-->\n\t%s" "$idx" "$pathNm" "$(dirname "$pathNm" | tr '\050' '\137' | tr '\051' '\137')/$(basename "$pathNm" | tr '\040' '\055' | tr '\041' '\055' | tr '\042' '\055' | tr '\043' '\055' | tr '\044' '\055' | tr '\045' '\055' | tr '\046' '\055' | tr '\047' '\055' | tr '\050' '\137' | tr '\051' '\137' | tr '\052' '\055' | tr '\053' '\055' | tr '\054' '\055' | tr '\072' '\055' | tr '\073' '\055' | tr '\342' '\055' | tr '\200' '\055' | tr '\223' '\055' | sed s/[_-]/_/g | sed s/-_/_/g | sed 's/--/_/g' | sed s/\\.-/_/g | sed s/[_-]\\./_/g | sed 's!__!!g' )"; done
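Note that the loop above only prints the proposed names; it never renames anything. A minimal sketch that actually renames, replacing the long tr chain with bash parameter expansion (it handles only spaces, and the demo paths are made up):

```shell
# Build a tiny demo tree containing spaces in names.
mkdir -p 'demo dir' && touch 'demo dir/a file.txt'

# -depth processes children before their parents, so a directory is
# renamed only after everything inside it; ${name// /_} (bash) replaces
# every space with an underscore.
find . -depth -name '* *' -path './demo*' | while IFS= read -r p; do
  dir=$(dirname "$p"); name=$(basename "$p")
  mv -- "$p" "$dir/${name// /_}"
done

ls -d demo_dir demo_dir/a_file.txt
```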

Simplify lots of SED command

I have the following command that I use to rewrite some maxscale output to be able to use it in other software:
maxadmin list servers | sed -r 's/[^a-z 0-9]//gi;/^\s*$/d;1,3d;' | awk '$1=$1' | cut -d ' ' -f 1,5 | sed -e 's/ /":"/g' | sed -e 's/\(.*\)/"\1"/' | tr '\n' ',' | sed 's/.$/}\n/' | sed 's/^/{/'
I think this is way too complex for what I want to do, but I cannot see a simpler version myself. What I want is to rewrite this (the output of maxadmin list servers):
Servers.
-------------------+-----------------+-------+-------------+--------------------
Server | Address | Port | Connections | Status
-------------------+-----------------+-------+-------------+--------------------
svr_node1 | 192.168.178.1 | 3306 | 0 | Master, Synced, Running
svr_node2 | 192.168.178.1 | 3306 | 0 | Slave, Synced, Running
svr_node3 | 192.168.178.1 | 3306 | 0 | Slave, Synced, Running
-------------------+-----------------+-------+-------------+--------------------
Into this:
{"svrnode1":"Master","svrnode2":"Slave","svrnode3":"Slave"}
My command does a good job, but as I said, there should be a simpler way with fewer sed commands being run, hopefully.
You can use awk, like this:
json.awk
BEGIN {
    printf "{"
}
# Everything after line four and before the last ------ line,
# plus the last empty line (if any).
NR>4 && !/^([-]|$)/ {
    sub(/,/, "", $9)                    # Remove trailing comma
    printf "%s\"%s\":\"%s\"", s, $1, $9
    s = ","                             # Set comma separator after first iteration
}
END {
    print "}"
}
Run it like this:
maxadmin list servers | awk -f json.awk
Output:
{"svr_node1":"Master","svr_node2":"Slave","svr_node3":"Slave"}
In the comments the question came up of how to achieve this without an extra json.awk file:
maxadmin list servers | awk 'BEGIN{printf"{"}NR>4&&!/^([-]|$)/{sub(/,/,"",$9);printf"%s\"%s\":\"%s\"",s,$1,$9;s=","}END{print"}"}'
Ugly, but works. ;)
If you want to put this into a shell script, consider a multiline version like this:
maxadmin list servers | awk '
BEGIN{printf"{"}
NR>4&&!/^([-]|$)/{
sub(/,/,"",$9)
printf"%s\"%s\":\"%s\"",s,$1,$9
s=","
}
END{print"}"}'
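To check the one-liner without a maxscale installation at hand, you can feed it a captured sample of the table (file name illustrative):

```shell
cat > servers.txt <<'EOF'
Servers.
-------------------+-----------------+-------+-------------+--------------------
Server             | Address         | Port  | Connections | Status
-------------------+-----------------+-------+-------------+--------------------
svr_node1          | 192.168.178.1   |  3306 |           0 | Master, Synced, Running
svr_node2          | 192.168.178.1   |  3306 |           0 | Slave, Synced, Running
-------------------+-----------------+-------+-------------+--------------------
EOF

# Same program as above: skip the four header lines and the ---- footer,
# take the server name ($1) and the first status word ($9).
awk 'BEGIN{printf"{"}NR>4&&!/^([-]|$)/{sub(/,/,"",$9);printf"%s\"%s\":\"%s\"",s,$1,$9;s=","}END{print"}"}' servers.txt
```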

Cleaning up IP output on command line [duplicate]

This question already has answers here:
How to clean up masscan output (-oL)
(4 answers)
Closed 6 years ago.
I have a problem with the -oL option ("grep-able" output); for instance, it outputs this:
| 14.138.12.21:123 | unknown | disabled |
| 14.138.184.122:123 | unknown | disabled |
| 14.138.179.27:123 | unknown | disabled |
| 14.138.20.65:123 | unknown | disabled |
| 14.138.12.235:123 | unknown | disabled |
| 14.138.178.97:123 | unknown | disabled |
| 14.138.182.153:123 | unknown | disabled |
| 14.138.178.124:123 | unknown | disabled |
| 14.138.201.191:123 | unknown | disabled |
| 14.138.180.26:123 | unknown | disabled |
| 14.138.13.129:123 | unknown | disabled |
The above is neither very readable nor easy to understand.
How can I use Linux command-line utilities, e.g. sed, awk, or grep, to output something as follows, using the file above?
output
14.138.12.21
14.138.184.122
14.138.179.27
14.138.20.65
14.138.12.235
Using awk with space and : as field separators, and printing the second field:
awk -F '[ :]' '{print $2}' file.txt
Example:
% cat file.txt
| 14.138.12.21:123 | unknown | disabled |
| 14.138.184.122:123 | unknown | disabled |
| 14.138.179.27:123 | unknown | disabled |
| 14.138.20.65:123 | unknown | disabled |
| 14.138.12.235:123 | unknown | disabled |
| 14.138.178.97:123 | unknown | disabled |
| 14.138.182.153:123 | unknown | disabled |
| 14.138.178.124:123 | unknown | disabled |
| 14.138.201.191:123 | unknown | disabled |
| 14.138.180.26:123 | unknown | disabled |
| 14.138.13.129:123 | unknown | disabled |
% awk -F '[ :]' '{print $2}' file.txt
14.138.12.21
14.138.184.122
14.138.179.27
14.138.20.65
14.138.12.235
14.138.178.97
14.138.182.153
14.138.178.124
14.138.201.191
14.138.180.26
14.138.13.129
AWK is perfect for cases where you want to split the file into "columns" and you know that the order of the values/columns is constant. AWK splits each line on a field separator (which can be a regular expression like '[: ]'). The columns are accessible by position from the left: $1, $2, $3, etc.:
awk -F '[ :]' '{print $2}' src.log
awk -F '[ :|]' '{print $3}' src.log
awk 'BEGIN {FS="[ :|]"} {print $3}' src.log
You can also filter the lines with a regular expression:
awk -F '[ :]' '/138\.179\./ {print $2}' src.log
However, in POSIX awk it is impossible to capture substrings with regular expression groups.
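That limitation is about POSIX awk; GNU awk (gawk) can capture groups by passing an array as the third argument to match(). This is a gawk extension, so the sketch below is guarded on gawk being installed:

```shell
# match() fills m[1], m[2], ... with the capture groups (gawk only).
command -v gawk >/dev/null 2>&1 &&
  printf '%s\n' '| 14.138.12.21:123 | unknown | disabled |' |
  gawk 'match($0, /([0-9.]+):([0-9]+)/, m) { print m[1] }'
```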
SED is more flexible in regard to regular expressions:
sed -r 's/^[^0-9]*([0-9\.]+)\:.*/\1/' src.log
However, it lacks many useful features of the Perl-like regular expressions we are used to in everyday programming. For example, even the extended syntax (-r) fails to interpret \d as a digit.
Perhaps Perl is the most flexible tool for parsing files. You can opt for simple expressions:
perl -n -e '/^\D*([^:]+):/ and print "$1\n"' src.log
or make the matching as strict as you like:
perl -n -e '/^\D*((?:\d{1,3}\.){3}\d{1,3}):/ and print "$1\n"' src.log
Using sed:
sed -r 's/^ *[|] *([0-9]+[.][0-9]+[.][0-9]+[.][0-9]+):[0-9]{3}.*/\1/' src.log
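For an extract-only job like this, grep -o (print just the matched text) is often the shortest route; a sketch on a generated sample file:

```shell
printf '%s\n' '| 14.138.12.21:123 | unknown | disabled |' \
              '| 14.138.184.122:123 | unknown | disabled |' > src.log

# -o prints only the matching part of each line; the pattern is a
# rough dotted-quad shape, not a strict IP validator.
grep -oE '([0-9]{1,3}\.){3}[0-9]{1,3}' src.log
```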

shell script to extract the name and IP address

Is there a way to use a shell script to get only the name and net from the result below?
Result
6cb7f14e-6466-4211-9a09-2b8e7ad92703 | name-erkoev4ja3rv | 2e3900ff36574cf9937d88223403da77 | ACTIVE | Running | net0=10.1.1.2; ing-net=10.1.1.3; net=10.1.1.4;
Expected Result
name-erkoev4ja3rv: 10.1.1.4
$ input="6cb7f14e-6466-4211-9a09-2b8e7ad92703 | name-erkoev4ja3rv | 2e3900ff36574cf9937d88223403da77 | ACTIVE | Running | net0=10.1.1.2; ing-net=10.1.1.3; net=10.1.1.4;"
$ echo "$input" | sed -E 's,^[^|]+ \| ([^ ]+).* net=([0-9.]+).*$,\1: \2,g'
name-erkoev4ja3rv: 10.1.1.4
echo "6cb7f14e-6466-4211-9a09-2b8e7ad92703 | name-erkoev4ja3rv | 2e3900ff36574cf9937d88223403da77 | ACTIVE | Running | net0=10.1.1.2; ing-net=10.1.1.3; net=10.1.1.4;" | awk -F ' ' '{print $3}{print $13}'
Does this satisfy your case?