From awk output, how to cut or trim characters in columns - bash

At the moment
I want to trim .fmbi1a5nn9sp5o4qy3eyazeq5.eddvrl9sa8t448pb38vibj8ef: and .ilwio0k43fgqt4jqzyfadx19v: so the output takes less space :)
First step:
docker ps --format "{{.Names}}: {{.Status}}" | sort -k1 | column -t
mon_node-exporter.fmbi1a5nn9sp5o4qy3eyazeq5.eddvrl9sa8t448pb38vibj8ef: Up 7 days
mon_prometheus.1.ilwio0k43fgqt4jqzyfadx19v: Up 7 days
I know
I can do something like:
docker ps --format "{{.Names}}: {{.Status}}" | sort -k1 | rev | cut -d"." -f2- | rev
mon_node-exporter.fmbi1a5nn9sp5o4qy3eyazeq5
mon_prometheus.1
The issue
is that I'm losing the other columns :-/
Idea
It would sound logical to do something like this (with awk) but it does not work. Any ideas?
docker ps --format "{{.Names}} : {{.Status}}" | sort -k1 | awk '{(print $1 | rev | cut -d"." -f2- | rev),$2,$3,$4,$5,$6}' | column -t
Thank you in advance!
P

To cut the last dot extension:
$ docker ... | sort | awk '{sub(/\.[^.]*$/,"",$1)}1' file | column -t
mon_node-exporter.fmbi1a5nn9sp5o4qy3eyazeq5 Up 7 days
mon_prometheus.1 Up 7 days
Or delete every dot-prefixed segment of 20 or more characters:
$ ... | sed -e 's/\(\.[a-z0-9:]\{20,\}\)* / /' | column -t
mon_node-exporter Up 7 days
mon_prometheus.1 Up 7 days

Works! This trick will make my life so much easier.
(I removed the file argument, since the input comes from the pipe.)
docker ps --format "{{.Names}}: {{.Status}}" | sort -k1 | awk '{sub(/\.[^.]*$/,"",$1)}1' | column -t;
mon_grafana.1 Up 24 hours
mon_node-exporter.fmbi1a5nn9sp5o4qy3eyazeq5 Up 23 hours
Question #2:
Now how would you proceed to cut the characters after the first dot?
Cheers!
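For Question #2, one possible approach (a sketch, not taken from the thread) is to keep the same sub() call but let the pattern match from the first dot instead of the last. The function name trim_after_first_dot is made up for illustration; it reads the docker ps lines on stdin.

```shell
# Hypothetical helper: strip everything from the first dot onward in column 1.
trim_after_first_dot() {
  awk '{ sub(/\..*/, "", $1) } 1'
}

# Sample lines copied from the question above.
printf '%s\n' \
  'mon_node-exporter.fmbi1a5nn9sp5o4qy3eyazeq5.eddvrl9sa8t448pb38vibj8ef: Up 7 days' \
  'mon_prometheus.1.ilwio0k43fgqt4jqzyfadx19v: Up 7 days' |
  trim_after_first_dot
```

Note that this also drops the replica number (the .1 in mon_prometheus.1). In the real pipeline it would sit between sort -k1 and column -t.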

Related

How to sort data according to the date in bash?

I need to write a bash program that sorts the data according to the date and displays the name of the person who recently joined the organization.
I have an employees.txt file with data in it with delimiter |. But when I am trying to sort the data using sort command like
sort -t'|' -k5,5 employees.txt | head -1 | cut -d'|' -f2
but this sorts only on the first part of the date: with DD-MM-YYYY, the comparison starts with DD, so the rows end up ordered by day rather than by year.
employees.txt File data format
ID | NAME | POST | DEPARTMENT | JOINING DATE | SALARY
101 | Jhon McClare | Manager | Content | 23-02-2001 | 83000
102 | Alena Croft | Snr. Manager | Accounts | 01-01-2019 | 88888
103 | Jeremy | Director | Sales | 20-03-2012 | 89786
104 | Williams | Manager | Marketing | 23-06-2001 | 73000
The above data should give Alena Croft as the answer.
The relevant field must be rendered suitable for sorting, that is, in the form YYYY-MM-DD, using a utility such as sed or awk. For example, with GNU sed:
sed -E 's/([0-9]{2})-([0-9]{2})-([0-9]{4})/\3-\2-\1/' employees.txt |
sort -r -t'|' -k5,5 | head -n1 | cut -d'|' -f2
The trick is to change the format of date to YYYY-MM-DD.
$ sed -E 's/([0-9]+)-([0-9]+)-([0-9]+)/\3-\2-\1/' employees.txt | sort -t'|' -k5,5r | head -n 1 | cut -d'|' -f2
Alena Croft
Also note that we need to sort in reverse (descending) order, since we want the most recent date.
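If employees.txt really starts with the header row shown in the question, that row also takes part in the sort. A minimal sketch that skips it and trims the padding around the name, assuming the header is on line 1 and dates are DD-MM-YYYY in column 5 (the function name latest_joiner is made up for illustration):

```shell
# Hypothetical helper: print the NAME of the row with the latest JOINING DATE.
# Assumes line 1 is a header and column 5 holds DD-MM-YYYY dates.
latest_joiner() {
  awk -F' *[|] *' 'NR > 1 { split($5, d, "-"); print d[3] d[2] d[1] "|" $2 }' "$1" |
    sort -r | head -n 1 | cut -d'|' -f2
}

# Sample data copied from the question above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
ID | NAME | POST | DEPARTMENT | JOINING DATE | SALARY
101 | Jhon McClare | Manager | Content | 23-02-2001 | 83000
102 | Alena Croft | Snr. Manager | Accounts | 01-01-2019 | 88888
103 | Jeremy | Director | Sales | 20-03-2012 | 89786
104 | Williams | Manager | Marketing | 23-06-2001 | 73000
EOF
latest_joiner "$sample"   # prints: Alena Croft
rm -f "$sample"
```

Building a YYYYMMDD key in front of the name means a plain reverse sort is enough, and the original date column is left untouched.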

Inconsistency in output field separator

We have to find the difference (d) between the last two numbers and display the three rows with the highest value of d, in ascending order.
INPUT
1 | Latha | Third | Vikas | 90 | 91
2 | Neethu | Second | Meridian | 92 | 94
3 | Sethu | First | DAV | 86 | 98
4 | Theekshana | Second | DAV | 97 | 100
5 | Teju | First | Sangamithra | 89 | 100
6 | Theekshitha | Second | Sangamithra | 99 |100
Required OUTPUT
4$Theekshana$Second$DAV$97$100$3
5$Teju$First$Sangamithra$89$100$11
3$Sethu$First$DAV$86$98$12
awk 'BEGIN{FS="|";OFS="$";}{
avg=sqrt(($5-$6)^2)
print $1,$2,$3,$4,$5,$6,avg
}'|sort -nk7 -t "$"| tail -3
Output:
4 $ Theekshana $ Second $ DAV $ 97 $ 100$3
5 $ Teju $ First $ Sangamithra $ 89 $ 100$11
3 $ Sethu $ First $ DAV $ 86 $ 98$12
As you can see, there is a space before and after each $ sign, except before the last column (avg). Please explain why this is happening.
2)
awk 'BEGIN{FS=" | ";OFS="$";}{
avg=sqrt(($5-$6)^2)
print $1,$2,$3,$4,$5,$6,avg
}'|sort -nk7 -t "$"| tail -3
OUTPUT
4$|$Theekshana$|$Second$|$0
5$|$Teju$|$First$|$0
6$|$Theekshitha$|$Second$|$0
I have not specified | as the output field separator, but it still appears. Why is this happening, and why is the difference zero too?
I am just 6 days old in Unix, so please answer even if it's easy.
Your field separator is only the pipe symbol, so the surrounding whitespace is part of each field, and that's what you see in the output. When FS is longer than one character it is treated as a regular expression, where the pipe has a special meaning (alternation) and needs to be escaped. In your second case, " | " therefore means "space or space": every single space is a field separator, the pipes become fields of their own, and $5-$6 subtracts two non-numeric fields, which gives 0.
$ awk 'BEGIN {FS=" *\\| *"; OFS="$"}
{d=sqrt(($NF-$(NF-1))^2); $1=$1;
print d "\t" $0,d}' file | sort -n | tail -3 | cut -f2-
4$Theekshana$Second$DAV$97$100$3
5$Teju$First$Sangamithra$89$100$11
3$Sethu$First$DAV$86$98$12
This slight rewrite removes the dependency on the number of fields and fixes the format.
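The three FS settings discussed in this thread can be compared side by side. The following sketch prints the second field of one input row in brackets, so the stray spaces and the stray pipe become visible; the variable names literal, regex and escaped are only for illustration.

```shell
line='4 | Theekshana | Second | DAV | 97 | 100'

# FS="|": a single character is taken literally, so the spaces stay in the field.
literal=$(printf '%s\n' "$line" | awk 'BEGIN { FS = "|" } { print "[" $2 "]" }')

# FS=" | ": longer than one character, so it is a regex meaning "space OR space".
regex=$(printf '%s\n' "$line" | awk 'BEGIN { FS = " | " } { print "[" $2 "]" }')

# FS=" *\\| *": an escaped pipe with optional spaces on both sides.
escaped=$(printf '%s\n' "$line" | awk 'BEGIN { FS = " *\\| *" } { print "[" $2 "]" }')

printf '%s\n' "$literal" "$regex" "$escaped"
# [ Theekshana ]
# [|]
# [Theekshana]
```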

Bash extract strings between two characters

I have the output of query result into a bash variable, stored as a single line.
-------------------------------- | NAME | TEST_DATE | ----------------
--------------------- | TESTTT_1 | 2019-01-15 | | TEST_2 | 2018-02-16 | | TEST_NAME_3 | 2020-03-17 | -------------------------------------
I would like to ignore the column names(NAME | TEST_DATE) and store actual values of each name and test_date as a tuple in an array.
So here is the logic I am thinking, I would like to extract third string onwards between two '|' characters. These strings are comma separated and when a space is encountered we start the next tuple in the array.
Expected output:
array=(TESTTT_1,2019-01-15 TEST_2,2018-02-16 TEST_NAME_3,2020-03-17)
Any help is appreciated. Thanks.
Let's say your string is stored in variable a (or pipe your query output to the command below):
echo "$a"
-------------------------------- | NAME | TEST_DATE | ----------------
--------------------- | TESTTT_1 | 2019-01-15 | | TEST_2 | 2018-02-16 | | TEST_NAME_3 | 2020-03-17 | ------------------------------------
Command to obtain desired results is:
array="$(echo "$a" | cut -d '|' -f2,3,5,6,8,9 | tail -n1 | sed 's/ | /,/g')"
The above stores the output in a variable named array.
Output of above command is:
echo "$array"
TESTTT_1,2019-01-15,TEST_2,2018-02-16,TEST_NAME_3,2020-03-17
Explanation of the command: the output of echo "$a" is piped into cut, which uses '|' as the delimiter and keeps fields 2,3,5,6,8,9. That output is piped into tail to drop the unwanted NAME and TEST_DATE header line and keep only the values, and finally sed converts each " | " to "," as in your expected output.
This string contains only three dates. If you have more, add more field numbers to the cut command. Given the format of your string, the field numbers follow the pattern 2,3,5,6,8,9,11,12,14,15 and so on.
Hope it solved your problem.
echo "$a" | awk -F "|" '{ for(i=2; i<=NF; i++){ print $i }}' | sed -e '1,3d' -e '$d' | tr ' ' '\n' | sed '/^$/d' | sed 's/^/,/g' | sed -e 'N;s/\n/ /' | sed 's/^.//g' | xargs | sed 's/ ,/, /g'
The above is an awk-based solution.
Output:
TESTTT_1, 2019-01-15 TEST_2, 2018-02-16 TEST_NAME_3, 2020-03-17
Is that OK?
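Neither answer above produces exactly the space-separated name,date tuples from the question. A minimal sketch that does, assuming names consist only of letters, digits and underscores and dates are always YYYY-MM-DD (grep -o is a common extension found in GNU and BSD grep; the variable name tuples is made up):

```shell
# Sample value copied from the question, stored as a single line.
a='-------------------------------- | NAME | TEST_DATE | ---------------- --------------------- | TESTTT_1 | 2019-01-15 | | TEST_2 | 2018-02-16 | | TEST_NAME_3 | 2020-03-17 | -------------------------------------'

# Pick out every "name | date" pair, then turn the inner " | " into a comma
# and join the pairs with single spaces.
tuples=$(printf '%s\n' "$a" |
  grep -oE '[A-Za-z0-9_]+ +[|] +[0-9]{4}-[0-9]{2}-[0-9]{2}' |
  sed -E 's/ +[|] +/,/' |
  paste -sd' ' -)

printf '%s\n' "$tuples"
# TESTTT_1,2019-01-15 TEST_2,2018-02-16 TEST_NAME_3,2020-03-17
```

The header pair NAME | TEST_DATE is skipped automatically because TEST_DATE does not look like a date. In bash, read -r -a array <<< "$tuples" would then split the result into an array.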

Simplify lots of SED command

I have the following command that I use to rewrite some maxscale output to be able to use it in other software:
maxadmin list servers | sed -r 's/[^a-z 0-9]//gi;/^\s*$/d;1,3d;' | awk '$1=$1' | cut -d ' ' -f 1,5 | sed -e 's/ /":"/g' | sed -e 's/\(.*\)/"\1"/' | tr '\n' ',' | sed 's/.$/}\n/' | sed 's/^/{/'
I think this is way too complex for what I want to do, but I am not able to see a simpler version of this myself. What I want is to rewrite this (output of maxadmin list servers):
Servers.
-------------------+-----------------+-------+-------------+--------------------
Server | Address | Port | Connections | Status
-------------------+-----------------+-------+-------------+--------------------
svr_node1 | 192.168.178.1 | 3306 | 0 | Master, Synced, Running
svr_node2 | 192.168.178.1 | 3306 | 0 | Slave, Synced, Running
svr_node3 | 192.168.178.1 | 3306 | 0 | Slave, Synced, Running
-------------------+-----------------+-------+-------------+--------------------
Into this:
{"svrnode1":"Master","svrnode2":"Slave","svrnode3":"Slave"}
My command does a good job, but as I said, there should be a simpler way, hopefully with fewer sed commands being run.
You can use awk, like this:
json.awk
BEGIN {
    printf "{"
}
# Everything after line four and before the last ------ line,
# skipping the last empty line (if any).
NR>4 && !/^([-]|$)/ {
    sub(/,/,"",$9)   # Remove trailing comma
    printf "%s\"%s\":\"%s\"",s,$1,$9
    s=","            # Set comma separator after first iteration
}
END {
    print "}"
}
Run it like this:
maxadmin list servers | awk -f json.awk
Output:
{"svr_node1":"Master","svr_node2":"Slave","svr_node3":"Slave"}
In the comments, the question came up of how to achieve this without an extra json.awk file:
maxadmin list servers | awk 'BEGIN{printf"{"}NR>4&&!/^([-]|$)/{sub(/,/,"",$9);printf"%s\"%s\":\"%s\"",s,$1,$9;s=","}END{print"}"}'
Ugly, but works. ;)
If you want to put this into a shell script, consider a multiline version like this:
maxadmin list servers | awk '
    BEGIN { printf "{" }
    NR>4 && !/^([-]|$)/ {
        sub(/,/,"",$9)
        printf "%s\"%s\":\"%s\"",s,$1,$9
        s=","
    }
    END { print "}" }'
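The scripts above depend on the status being the ninth whitespace-separated word. A variant that splits on the pipes instead might look like the sketch below. It assumes every data row has exactly five pipe-separated columns and that the first comma-separated word of the Status column is the one wanted; the function name servers_to_json is made up for illustration.

```shell
# Hypothetical helper: read "maxadmin list servers" output on stdin, print JSON.
servers_to_json() {
  awk -F' *[|] *' '
    BEGIN { printf "{" }
    # Data rows have five pipe-separated columns; skip the header row.
    NF == 5 && $1 != "Server" {
      split($5, status, ",")   # status[1] is the first word, e.g. Master
      printf "%s\"%s\":\"%s\"", sep, $1, status[1]
      sep = ","
    }
    END { print "}" }'
}

# Sample output copied from the question above.
servers_to_json <<'EOF'
Servers.
-------------------+-----------------+-------+-------------+--------------------
Server | Address | Port | Connections | Status
-------------------+-----------------+-------+-------------+--------------------
svr_node1 | 192.168.178.1 | 3306 | 0 | Master, Synced, Running
svr_node2 | 192.168.178.1 | 3306 | 0 | Slave, Synced, Running
svr_node3 | 192.168.178.1 | 3306 | 0 | Slave, Synced, Running
-------------------+-----------------+-------+-------------+--------------------
EOF
```

In real use the here-document would be replaced by maxadmin list servers | servers_to_json.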

Find most frequent line in file in bash

Suppose I have a file similar to the following:
Abigail 85
Kaylee 25
Kaylee 25
kaylee
Brooklyn
Kaylee 25
kaylee 25
I would like to find the most repeated line, the output must be just the line.
I've tried
sort list | uniq -c
but I need clean output, just the most repeated line (in this example Kaylee 25).
$ sort zlist | uniq -c | sort -r | head -1| xargs | cut -d" " -f2-
Kaylee 25
Does this help?
IMHO, none of these answers will sort the results correctly. The reason is that sort, without the -n option, will sort counts like "1 10 11 2 3 4" instead of "1 2 3 4 10 11". So add -n, like so:
sort zlist | uniq -c | sort -n -r | head -1
You can then, of course, pipe that to either xargs or sed as described earlier.
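Putting the pieces together, a sketch of the whole pipeline with the count stripped by sed could look like this. The function name most_frequent_line is made up, and on a tie it simply prints whichever of the tied lines sort puts first.

```shell
# Hypothetical helper: print the most frequent line of the file given as $1.
most_frequent_line() {
  sort -- "$1" |                 # group identical lines together
    uniq -c |                    # prefix each distinct line with its count
    sort -rn |                   # highest count first, compared numerically
    head -n 1 |                  # keep the winner
    sed -E 's/^ *[0-9]+ //'      # drop the padded count added by uniq -c
}

# Sample data copied from the question above.
list=$(mktemp)
cat > "$list" <<'EOF'
Abigail 85
Kaylee 25
Kaylee 25
kaylee
Brooklyn
Kaylee 25
kaylee 25
EOF
most_frequent_line "$list"   # prints: Kaylee 25
rm -f "$list"
```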
With awk:
awk '{a[$0]++; if(m<a[$0]){ m=a[$0];s[m]=$0}} END{print s[m]}' t.lis
$ sort list | uniq -c | sort -rn | head -n 1 | awk '{$1=""; sub(/^ /,"")}1'
Kaylee 25
Is this what you're looking for?
