Simplify lots of SED command - bash

I have the following command that I use to rewrite some maxscale output to be able to use it in other software:
maxadmin list servers | sed -r 's/[^a-z 0-9]//gi;/^\s*$/d;1,3d;' | awk '$1=$1' | cut -d ' ' -f 1,5 | sed -e 's/ /":"/g' | sed -e 's/\(.*\)/"\1"/' | tr '\n' ',' | sed 's/.$/}\n/' | sed 's/^/{/'
I am thinking this is way to complex for what I want to do, but I am not able to see a simpler version of this myself. What I want is to rewrite this (output of maxadmin list servers):
Servers.
-------------------+-----------------+-------+-------------+--------------------
Server | Address | Port | Connections | Status
-------------------+-----------------+-------+-------------+--------------------
svr_node1 | 192.168.178.1 | 3306 | 0 | Master, Synced, Running
svr_node2 | 192.168.178.1 | 3306 | 0 | Slave, Synced, Running
svr_node3 | 192.168.178.1 | 3306 | 0 | Slave, Synced, Running
-------------------+-----------------+-------+-------------+--------------------
Into this:
{"svrnode1":"Master","svrnode2":"Slave","svrnode3":"Slave"}
My command does a good job but as I said, there should be a simpler way with less sed commands being run hopefully.

You can use awk, like this:
json.awk
BEGIN {
printf "{"
}
# Everything after line for and before the last ------ line
# plus the last empty line (if any).
NR>4&&!/^([-]|$)/{
sub(/,/,"",$9) # Remove trailing comma
printf "%s\"%s\":\"%s\"",s,$1,$9
s="," # Set comma separator after first iteration
}
END {
print "}"
}
Run it like this:
maxadmin list servers | awk -f json.awk
Output:
{"svr_node1":"Master","svr_node2":"Slave","svr_node3":"Slave"}
In comments there came up the question how to achieve that without an extra json.awk file:
maxadmin list servers | awk 'BEGIN{printf"{"}NR>4&&!/^([-]|$)/{sub(/,/,"",$9);printf"%s\"%s\":\"%s\"",s,$1,$9;s=","}END{print"}"}'
Ugly, but works. ;)
If you want to put this into a shell script, consider a multiline version like this:
maxadmin list servers | awk '
BEGIN{printf"{"}
NR>4&&!/^([-]|$)/{
sub(/,/,"",$9)
printf"%s\"%s\":\"%s\"",s,$1,$9
s=","
}
END{print"}"}'

Related

Bash extract strings between two characters

I have the output of query result into a bash variable, stored as a single line.
-------------------------------- | NAME | TEST_DATE | ----------------
--------------------- | TESTTT_1 | 2019-01-15 | | TEST_2 | 2018-02-16 | | TEST_NAME_3 | 2020-03-17 | -------------------------------------
I would like to ignore the column names(NAME | TEST_DATE) and store actual values of each name and test_date as a tuple in an array.
So here is the logic I am thinking, I would like to extract third string onwards between two '|' characters. These strings are comma separated and when a space is encountered we start the next tuple in the array.
Expected output:
array=(TESTTT_1,2019-01-15 TEST_2,2018-02-16 TEST_NAME_3,2020-03-17)
Any help is appreciated. Thanks.
let say your
String is stored in variable a (or pipe our query output to below command
echo "$a"
-------------------------------- | NAME | TEST_DATE | ----------------
--------------------- | TESTTT_1 | 2019-01-15 | | TEST_2 | 2018-02-16 | | TEST_NAME_3 | 2020-03-17 | ------------------------------------
Command to obtain desired results is:
array="$(echo "$a" | cut -d '|' -f2,3,5,6,8,9 | tail -n1 | sed 's/ | /,/g')
Above will store ourput in variable named array as you expected
Output of above command is:
echo "$array"
TESTTT_1,2019-01-15,TEST_2,2018-02-16,TEST_NAME_3,2020-03-17
Explanation of command: output of echo $a will be piped into cut and using '|' as delimeter it will cut fields 2,3,5,6,8,9 then the output is piped into tail to remove the undesired NAME and TEST_DATE columns and provide values only and then as per your expected output | will be converted to , using sed.
Here in this string you are having only three dates if you have more then just in cut command add more field numbers and as per format of your string field numbers will be in following style 2,3,5,6,8,9,11,12,14,15 .... and so on.
Hope it solved your problem.
echo "$a" | awk -F "|" '{ for(i=2; i<=NF; i++){ print $i }}' | sed -e '1,3d' -e '$d' | tr ' ' '\n' | sed '/^$/d' | sed 's/^/,/g' | sed -e 'N;s/\n/ /' | sed 's/^.//g' | xargs | sed 's/ ,/, /g'
Above is awk based solution
Output:
TESTTT_1, 2019-01-15 TEST_2, 2018-02-16 TEST_NAME_3, 2020-03-17
Is it ok.

From awk output, how to cut or trim characters in columns

At the moment
I want to trim .fmbi1a5nn9sp5o4qy3eyazeq5.eddvrl9sa8t448pb38vibj8ef: and .ilwio0k43fgqt4jqzyfadx19v: so the output take less space :)
First step:
docker ps --format "{{.Names}}: {{.Status}}" | sort -k1 | column -t
mon_node-exporter.fmbi1a5nn9sp5o4qy3eyazeq5.eddvrl9sa8t448pb38vibj8ef: Up 7 days
mon_prometheus.1.ilwio0k43fgqt4jqzyfadx19v: Up 7 days
I know
I can do something like:
docker ps --format "{{.Names}}: {{.Status}}" | sort -k1 | rev | cut -d"." -f2- | rev
mon_node-exporter.fmbi1a5nn9sp5o4qy3eyazeq5
mon_prometheus.1
The issue
is that I'm losing the other columns :-/
Idea
It would sound logical to do something like this (with awk) but it does not work. Any ideas?
docker ps --format "{{.Names}} : {{.Status}}" | sort -k1 | awk '{(print $1 | rev | cut -d"." -f2- | rev),$2,$3,$4,$5,$6}' | column -t
Thank you in advance!
P
to cut the last dot extension
$ docker ... | sort | awk '{sub(/\.[^.]*$/,"",$1)}1' file | column -t
mon_node-exporter.fmbi1a5nn9sp5o4qy3eyazeq5 Up 7 days
mon_prometheus.1 Up 7 days
or, delete anything longer than 20 chars after a dot.
$ ... | sed -e 's/\(\.[a-z0-9:]\{20,\}\)* / /' | column -t
mon_node-exporter Up 7 days
mon_prometheus.1 Up 7 days
Works! This trick will make my life so much easier.
(I removed file)
docker ps --format "{{.Names}}: {{.Status}}" | sort -k1 | awk '{sub(/\.[^.]*$/,"",$1)}1' | column -t;
mon_grafana.1 Up 24 hours
mon_node-exporter.fmbi1a5nn9sp5o4qy3eyazeq5 Up 23 hours
Question #2:
Now how would you proceed to cut the characters after the first dot?
Cheers!

Foreach loop in bash

I have two files, one with about 100 root domains, and second file with URLs only. Now I have to filter that URL list to get third file which contains only URLs that have domains from the list.
Example of URL list:
| URL |
| ------------------------------|
| http://github.com/name |
| http://stackoverflow.com/name2|
| http://stackoverflow.com/name3|
| http://www.linkedin.com/name3 |
Example of word list:
github.com
youtube.com
facebook.com
Resut:
| http://github.com/name |
My goal is to filter out whole row where URL contain specific word. This is what I tried:
for i in $(cat domains.csv);
do grep "$i" urls.csv >> filtered.csv ;
done
Result is strange, I've got some of the links, but not all of them that contain root domains from the first file. Then I tried to do the same thing with python and saw that bash doesn't do what I wanted, I've got better result with python script, but it takes more time to write python script than running bash commands.
How shoud I accomplish this with bash in further ?
Using grep:
grep -F -f domains.csv url.csv
Test Results:
$ cat wordlist
github.com
youtube.com
facebook.com
$ cat urllist
| URL |
| ------------------------------|
| http://github.com/name |
| http://stackoverflow.com/name2|
| http://stackoverflow.com/name3|
| http://www.linkedin.com/name3 |
$ grep -F -f wordlist urllist
| http://github.com/name |

Cleaning up IP output on command line [duplicate]

This question already has answers here:
How to clean up masscan output (-oL)
(4 answers)
Closed 6 years ago.
I have a problem with the output L options ("grep-able" output); for instance, it outputs this:
| 14.138.12.21:123 | unknown | disabled |
| 14.138.184.122:123 | unknown | disabled |
| 14.138.179.27:123 | unknown | disabled |
| 14.138.20.65:123 | unknown | disabled |
| 14.138.12.235:123 | unknown | disabled |
| 14.138.178.97:123 | unknown | disabled |
| 14.138.182.153:123 | unknown | disabled |
| 14.138.178.124:123 | unknown | disabled |
| 14.138.201.191:123 | unknown | disabled |
| 14.138.180.26:123 | unknown | disabled |
| 14.138.13.129:123 | unknown | disabled |
The above is neither very readable nor easy to understand.
How can I use Linux command-line utilities, e.g. sed, awk, or grep, to output something as follows, using the file above?
output
14.138.12.21
14.138.184.122
14.138.179.27
14.138.20.65
14.138.12.235
Using awk with field separator as space, and : and getting the second field:
awk -F '[ :]' '{print $2}' file.txt
Example:
% cat file.txt
| 14.138.12.21:123 | unknown | disabled |
| 14.138.184.122:123 | unknown | disabled |
| 14.138.179.27:123 | unknown | disabled |
| 14.138.20.65:123 | unknown | disabled |
| 14.138.12.235:123 | unknown | disabled |
| 14.138.178.97:123 | unknown | disabled |
| 14.138.182.153:123 | unknown | disabled |
| 14.138.178.124:123 | unknown | disabled |
| 14.138.201.191:123 | unknown | disabled |
| 14.138.180.26:123 | unknown | disabled |
| 14.138.13.129:123 | unknown | disabled |
% awk -F '[ :]' '{print $2}' file.txt
14.138.12.21
14.138.184.122
14.138.179.27
14.138.20.65
14.138.12.235
14.138.178.97
14.138.182.153
14.138.178.124
14.138.201.191
14.138.180.26
14.138.13.129
AWK is perfect for cases when you want to split the file by "columns", and you know exactly that the order of values/columns is constant. AWK splits the lines by a field separator (which can be a regular expression like '[: ]'). The column names are accessible by their positions from the left: $1, $2, $3, etc.:
awk -F '[ :]' '{print $2}' src.log
awk -F '[ :|]' '{print $3}' src.log
awk 'BEGIN {FS="[ :|]"} {print $3}' src.log
You can also filter the lines with a regular expression:
awk -F '[ :]' '/138\.179\./ {print $2}' src.log
However, it is impossible to capture substrings with the regular expression groups.
SED is more flexible in regard to regular expressions:
sed -r 's/^[^0-9]*([0-9\.]+)\:.*/\1/' src.log
However, it lacks many useful features of the Perl-like regular expressions we used to use in every day programming. For example, even the extended syntax (-r) fails to interpret \d as a number.
Perhaps, Perl is the most flexible tool for parsing files. You can opt to simple expressions:
perl -n -e '/^\D*([^:]+):/ and print "$1\n"' src.log
or make the matching as strict as you like:
perl -n -e '/^\D*((?:\d{1,3}\.){3}\d{1,3}):/ and print "$1\n"' src.log
using sed
sed -r 's/^ *[|] *([0-9]+[.][0-9]+[.][0-9]+[.][0-9]+):[0-9]{3}.*/\1/

shell script to extract the name and IP address

Is there a way to use shell script to get only the name and net from the result as below:
Result
6cb7f14e-6466-4211-9a09-2b8e7ad92703 | name-erkoev4ja3rv | 2e3900ff36574cf9937d88223403da77 | ACTIVE | Running | net0=10.1.1.2; ing-net=10.1.1.3; net=10.1.1.4;
Expected Result
name-erkoev4ja3rv: 10.1.1.4
$ input="6cb7f14e-6466-4211-9a09-2b8e7ad92703 | name-erkoev4ja3rv | 2e3900ff36574cf9937d88223403da77 | ACTIVE | Running | net0=10.1.1.2; ing-net=10.1.1.3; net=10.1.1.4;"
$ echo "$input" | sed -E 's,^[^|]+ \| ([^ ]+).* net=([0-9.]+).*$,\1: \2,g'
name-erkoev4ja3rv: 10.1.1.4
echo "6cb7f14e-6466-4211-9a09-2b8e7ad92703 | name-erkoev4ja3rv | 2e3900ff36574cf9937d88223403da77 | ACTIVE | Running | net0=10.1.1.2; ing-net=10.1.1.3; net=10.1.1.4;" | awk -F ' ' '{print $3}{print $13}'
Does this satisfy your case?

Resources