Bash Substring multiple parameters

I need to extract two parameters from each line of an svn log, but I am not able to do it with grep.
My svn log command,
svn log http://svn.apache.org/repos/asf/xerces/java/trunk/ | grep "^r[0-9]\+ | " | cut -c2-
outputs the result in this format:
318150 | lehors | 2002-01-28 20:48:11 +0100 (Mon, 28 Jan 2002) | 2 lines
318149 | elena | 2002-01-28 20:46:33 +0100 (Mon, 28 Jan 2002) | 12 lines
318148 | lehors | 2002-01-28 20:33:36 +0100 (Mon, 28 Jan 2002) | 2 lines
318147 | lehors | 2002-01-28 20:22:51 +0100 (Mon, 28 Jan 2002) | 2 lines
How can I extract the revision number (the first field) and the date, in this format?
318150 2002-01-28
318149 2002-01-28
318148 2002-01-28
318147 2002-01-28

Use the more robust awk for this, to pattern-match and extract individual columns:
.. | awk 'BEGIN{FS="|"}{split($3,temp, " "); print $1,temp[1]}'
318150 2002-01-28
318149 2002-01-28
318148 2002-01-28
318147 2002-01-28
The .. | part stands for the command that produces the required output, which is piped into awk.
The logic is straightforward: split the input lines with the delimiter |, which is what FS="|" does. Now $1 is the first field you want. For the second, take $3 and use the split() function to break it on whitespace into the array temp, so the date can be accessed as temp[1]; the remaining space-separated parts sit in the array from the next index onwards.
So ideally, applied to your full command, it should be:
svn log http://svn.apache.org/repos/asf/xerces/java/trunk/ | \
grep "^r[0-9]\+ | " | cut -c2- | \
awk 'BEGIN{FS="|"}{split($3,temp, " "); print $1,temp[1]}'
The grep and cut stages have to stay, since raw svn log output also contains separator and message lines, and the revision field carries a leading r.
Alternatively you could use GNU grep with its -E extended-regular-expression support, but grep cannot print two separate matches on the same line. With -o you can extract each part on its own,
grep -oE '[[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}' file
(and)
grep -oE '^[[:digit:]]{6}' file
but not both together, because -o prints each match on a line of its own.
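If you really wanted a grep-only pipeline, one workaround (a sketch using bash process substitution, and assuming the filtered output above is saved in file) is to run both extractions and stitch the columns back together with paste:
paste -d' ' <(grep -oE '^[[:digit:]]{6}' file) \
            <(grep -oE '[[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}' file)
This only stays correct if every line yields exactly one match for each pattern, so the two streams stay in step.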

As your file's fields are separated by single spaces and you want the first and fifth columns, here is another solution using cut:
cut -d' ' -f1,5 < svn_log_output_file
(or pipe your command into cut -d' ' -f1,5)
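One caveat: cut treats every single occurrence of the delimiter as a field boundary, so this only works because the sample has exactly one space on each side of every |; a run of two spaces would introduce an empty field and shift the numbering.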

A much simpler approach uses multiple delimiters:
awk -F '[| ]' '{print $1, $7}' file
Where file contains the output you showed in your question. With -F '[| ]' every single space or | is a separator, so the empty fields around each ' | ' push the date out to $7.
Output-
318150 2002-01-28
318149 2002-01-28
318148 2002-01-28
318147 2002-01-28
Of course, you don't need to store the output in an intermediate file. You can do:
svn log http://svn.apache.org/repos/asf/xerces/java/trunk/ \
| grep "^r[0-9]\+ | " | cut -c2- | \
awk -F '[| ]' '{print $1, $7}'

Or simply, with awk's default whitespace splitting:
awk '{print $1,$5}' file
318150 2002-01-28
318149 2002-01-28
318148 2002-01-28
318147 2002-01-28

Related

Bash Shell: How do I sort by values on last column, but ignoring the header of a file?

file
ID First_Name Last_Name(s) Average_Winter_Grade
323 Popa Arianna 10
317 Tabarcea Andreea 5.24
326 Balan Ionut 9.935
327 Balan Tudor-Emanuel 8.4
329 Lungu Iulian-Gabriel 7.78
365 Brailean Mircea 7.615
365 Popescu Anca-Maria 7.38
398 Acatrinei Andrei 8
How do I sort it by the last column, except for the header?
This is what file should look like after the changes:
ID First_Name Last_Name(s) Average_Winter_Grade
323 Popa Arianna 10
326 Balan Ionut 9.935
327 Balan Tudor-Emanuel 8.4
398 Acatrinei Andrei 8
329 Lungu Iulian-Gabriel 7.78
365 Brailean Mircea 7.615
365 Popescu Anca-Maria 7.38
317 Tabarcea Andreea 5.24
If it's always the 4th column:
head -n 1 file; tail -n +2 file | sort -n -r -k 4,4
If all you know is that it's the last column:
head -n 1 file; tail -n +2 file | awk '{print $NF,$0}' | sort -n -r | cut -f2- -d' '
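Note that each of these one-liners is really two commands separated by a semicolon, so only the second half takes part in the pipe. To redirect the combined output somewhere, group them:
{ head -n 1 file; tail -n +2 file | sort -n -r -k 4,4; } > sorted_file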
You'd like to just sort by the last column, but sort doesn't allow you to do that easily. So rewrite the data with the column to be sorted at the beginning of each line:
Ignoring the header for the moment (although this will often work by itself):
awk '{print $NF, $0 | "sort -nr" }' input | cut -d ' ' -f 2-
If you do need to keep the header in place (e.g., it's getting mixed into the sort), you can do things like:
< input awk 'NR==1; NR>1 {print $NF, $0 | "sh -c \"sort -nr | cut -d \\\ -f 2-\"" }'
or
awk 'NR==1{ print " ", $0} NR>1 {print $NF, $0 | "sort -nr" }' OFS=\; input | cut -d \; -f 2-

From awk output, how to cut or trim characters in columns

At the moment
I want to trim .fmbi1a5nn9sp5o4qy3eyazeq5.eddvrl9sa8t448pb38vibj8ef: and .ilwio0k43fgqt4jqzyfadx19v: so the output takes less space :)
First step:
docker ps --format "{{.Names}}: {{.Status}}" | sort -k1 | column -t
mon_node-exporter.fmbi1a5nn9sp5o4qy3eyazeq5.eddvrl9sa8t448pb38vibj8ef: Up 7 days
mon_prometheus.1.ilwio0k43fgqt4jqzyfadx19v: Up 7 days
I know
I can do something like:
docker ps --format "{{.Names}}: {{.Status}}" | sort -k1 | rev | cut -d"." -f2- | rev
mon_node-exporter.fmbi1a5nn9sp5o4qy3eyazeq5
mon_prometheus.1
The issue
is that I'm losing the other columns :-/
Idea
It would sound logical to do something like this (with awk) but it does not work. Any ideas?
docker ps --format "{{.Names}} : {{.Status}}" | sort -k1 | awk '{(print $1 | rev | cut -d"." -f2- | rev),$2,$3,$4,$5,$6}' | column -t
Thank you in advance!
To cut the last dot extension:
$ docker ... | sort | awk '{sub(/\.[^.]*$/,"",$1)}1' file | column -t
mon_node-exporter.fmbi1a5nn9sp5o4qy3eyazeq5 Up 7 days
mon_prometheus.1 Up 7 days
Or, delete anything longer than 20 chars after a dot:
$ ... | sed -e 's/\(\.[a-z0-9:]\{20,\}\)* / /' | column -t
mon_node-exporter Up 7 days
mon_prometheus.1 Up 7 days
Works! This trick will make my life so much easier.
(I removed the file argument, since the input now comes from the pipe.)
docker ps --format "{{.Names}}: {{.Status}}" | sort -k1 | awk '{sub(/\.[^.]*$/,"",$1)}1' | column -t;
mon_grafana.1 Up 24 hours
mon_node-exporter.fmbi1a5nn9sp5o4qy3eyazeq5 Up 23 hours
Question #2:
Now how would you proceed to cut the characters after the first dot?
Cheers!
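For the record, a minimal sketch for that follow-up, reusing the same sub() trick but anchoring at the first dot rather than the last (the regex is greedy and its match starts at the leftmost dot, so everything from there to the end of the field goes):
docker ps --format "{{.Names}}: {{.Status}}" | sort -k1 | awk '{sub(/\..*/,"",$1)}1' | column -t
Note this would also shorten mon_prometheus.1 to mon_prometheus.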

Count of Request Patterns

I want to find the count of request patterns in a requests.log file. The requests.log file has requests in the following format:
102.232.32.322 "/v1/places?name=ass&lat=22.3&lng=12.12 HTTP 1.1" 23 111
102.232.32.322 "/v1/places/23232 HTTP 1.1" 23 111
102.232.32.322 "/v1/places?name=bcdd&lat=22.23&lng=12.12&quality_score=true HTTP1.1" 23 111
.....
I have so far only been able to cut strings and strip out numbers:
cat requests.log | grep /v1/places | cut -c53- | cut -d '"' -f 1 | cut -d' ' -f1 | sed 's/[0-9]//g'
The output should be in the following manner, for all the possible patterns, ignoring the order of request params:
100 /v1/places?name=<name>&lat=<lng>
110 /v1/places/<placeid>
10 /v1/places?name=<name>&lat=<lat>&lng=<&lng>&country_code=<country>
Another major problem is that the parameter order is not guaranteed.
Using awk:
awk '{sub(/^"/, "", $2); a[$2]++} END{for (i in a) print a[i], i}' OFS='\t' log.file
2 /v1/places/23232
1 /v1/places?name=ass&lat=22.3&lng=12.12
1 /v1/places?name=bcdd&lat=22.23&lng=12.12&quality_score=true
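Note that this counts exact request strings, so the same endpoint with its parameters in a different order lands in different buckets. A sketch that also canonicalises the query string before counting (assuming GNU awk, for its asort() function):
awk -F'"' '{
    split($2, req, " ")                  # req[1] is path?query; drops the trailing "HTTP 1.1"
    n = split(req[1], parts, "?")
    key = parts[1]
    if (n > 1) {
        m = split(parts[2], kv, "&")
        asort(kv)                        # GNU awk only: sort the key=value pairs
        key = key "?"
        for (i = 1; i <= m; i++)
            key = key kv[i] (i < m ? "&" : "")
    }
    a[key]++
} END { for (k in a) print a[k], k }' requests.log
Collapsing the values into placeholders like <name>, as in the desired output, could then be layered on with sed substitutions in the style of the question's sed 's/[0-9]//g'.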

Print unique names of users logged on with finger

I'm trying to write a shell script that prints the full names of users logged on to a machine. The finger command gives me a list of users, but there are many duplicates. How can I loop through and print out only the unique ones?
Edit:
This is the format of what finger gives me:
xxxx XX of group XXX pts/59 1:00 Feb 13 16:38
xxxx XX of group XXX pts/71 1:11 Feb 13 16:27
xxxx XX of group XXX pts/105 1d Feb 12 15:22
xxxx YY of group YYY pts/102 2:19 Feb 13 14:13
xxxx ZZ of group ZZZ pts/42 2d Feb 7 12:11
I'm trying to extract the full name (i.e. whatever comes before 'of group' in column 2), so I would be using awk together with finger.
What you want is actually fairly difficult in a shell script. Here is, for example, my full output of finger(1):
Login Name TTY Idle Login Time Office Phone
martin Martin Tournoij *v0 1d Wed 14:11
martin Martin Tournoij pts/2 22 Wed 15:37
martin Martin Tournoij pts/5 41 Thu 23:16
martin Martin Tournoij pts/7 31 Thu 23:24
martin Martin Tournoij pts/8 Thu 23:29
You want the full name, but this may contain 1 space (as per my example), or it may just be 'Teller' (no space), or it may be 'Captain James T. Kirk' (3 spaces). So you can't just use the space as delimiter. You could use the character position of 'TTY' in the header as an indicator, but that's not very elegant IMHO (especially with shell scripting).
My solution is therefore slightly different: we get only the username from finger(1), then look up the full name in /etc/passwd:
#!/bin/sh
prev=""
for u in $(finger | tail +2 | cut -w -f1 | sort); do
    [ "$u" = "$prev" ] && continue
    echo "$u $(grep "^$u:" /etc/passwd | cut -d: -f5)"
    prev="$u"
done
Which gives me both the username and the real name:
martin Martin Tournoij
Obviously, you can also print just the real name (without the $u).
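A portability note: tail +2 and cut -w are BSD forms; on GNU userland the first stage would be something like finger | tail -n +2 | awk '{print $1}' | sort -u. It may also be safer to look the name up with getent passwd "$u" | cut -d: -f5 instead of grepping /etc/passwd, since getent also sees accounts that come from NSS sources such as LDAP.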
The sort and uniq coreutils commands can be used to remove duplicates.
finger | sort -u
This will remove all duplicate lines, but you will still see similar lines because of how verbose the finger command is. If you just want a list of usernames, you can filter it down further:
finger | cut -d ' ' -f1 | sort -u
Now, you can take this one step further, and remove the "header/label" line printed out by the finger command.
finger | cut -d ' ' -f1 | sort -u | grep -iv login
Hope this helps.
Other possible solution:
finger | tail -n +2 | awk '{ print $1 }' | sort | uniq
tail -n +2 to omit the first line.
awk '{ print $1 }' to extract the first column.
sort to prepare input for uniq (which only collapses adjacent duplicate lines).
uniq removes the duplicates. (sort -u would combine these two steps.)
If you want to iterate use:
for user in $(finger | tail -n +2 | awk '{ print $1 }' | sort | uniq)
do
    echo "$user"
done
Could this be simpler?
No spaces or any other special characters to worry about!
finger -l | awk '/^Login/'
Edit: To remove the content after "of group":
finger -l | awk '/^Login/' | sed 's/of group.*//g'
Output:
Login: xx Name: XX
Login: yy Name: YY
Login: zz Name: ZZ

Find most frequent line in file in bash

Suppose I have a file similar to as follows:
Abigail 85
Kaylee 25
Kaylee 25
kaylee
Brooklyn
Kaylee 25
kaylee 25
I would like to find the most repeated line; the output must be just the line.
I've tried
sort list | uniq -c
but I need clean output, just the most repeated line (in this example Kaylee 25).
$ sort zlist | uniq -c | sort -r | head -1 | xargs | cut -d" " -f2-
Kaylee 25
Does this help?
IMHO, none of these answers will sort the results correctly. The reason is that sort, without the -n option, sorts like "1 10 11 2 3 4" instead of "1 2 3 4 10 11 12". So, add -n like so:
sort zlist | uniq -c | sort -n -r | head -1
You can then, of course, pipe that through xargs | cut as described earlier to strip the count.
With awk:
awk '{a[$0]++; if(m<a[$0]){ m=a[$0];s[m]=$0}} END{print s[m]}' t.lis
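(This keeps a per-line count in a[], tracks the running maximum m, and remembers the line that reached it, so the most frequent line is found in a single pass with no sorting at all.)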
$ sort list | uniq -c | sort -nr | head -1 | awk '{$1=""}1'
Kaylee 25
Is this what you're looking for? (The input must be sorted first, since uniq -c only counts adjacent duplicates, and -n keeps the count comparison numeric.)
