Sort a matrix by timestamp - bash

I am not sure whether this is possible, but I need to sort the output below by the timestamp in the FROM column (starting at column 2), with the newest entry on the first line and the oldest on the last line. The time format must stay as it is; I only need the rows reordered by date.
COUNT FROM TO
97 Oct 10 10:00:56 Oct 10 10:18:35
9 Mar 10 10:02:09 Oct 10 10:02:55
768 Oct 10 10:01:09 Oct 10 10:18:24
764 Oct 10 10:00:53 Oct 10 10:18:24
33 Oct 10 10:18:35 Oct 10 10:18:39
306 May 10 10:00:52 Oct 10 10:21:20
3 Oct 10 10:00:52 Oct 10 10:00:52
3 Oct 12 15:33:26 Nov 2 03:30:06
2 Oct 17 09:16:53 Oct 17 09:17:05
18 Nov 2 00:07:24 Nov 2 01:03:13
11 Oct 10 10:00:52 Oct 10 10:00:56
10095 Jun 10 10:00:52 Oct 10 10:18:24
10 Oct 10 10:18:40 Oct 10 10:18:45
1 Nov 2 03:21:32 Nov 2 03:21:32
1 Feb 2 01:31:53 Nov 2 01:31:53
1 Aug 2 03:26:24 Nov 2 03:26:24
1 Nov 2 03:21:32 Nov 2 03:21:32
1 Oct 10 10:18:05 Oct 10 10:18:05
1 Oct 17 09:16:52 Oct 17 09:16:52
1 Jan 10 10:02:55 Oct 10 10:02:55
1 Nov 2 23:24:09 Nov 2 23:29:09
1 Oct 10 10:00:52 Oct 10 10:00:52
1 Oct 10 10:00:53 Oct 10 10:00:53
1 Nov 2 03:22:22 Nov 2 03:22:22
1 Apr 2 06:41:29 Nov 2 06:41:29
The output should keep the same header, with the line below as the first data line:
1 Nov 2 23:24:09 Nov 2 23:29:09
and the line below as the last line:
1 Jan 10 10:02:55 Oct 10 10:02:55

Take a look at man sort: the -k option sorts by column.
It takes a column number (writing N,N restricts the key to that single column) and optional sort-method flags.
For your case this might work:
sort -k2,2Mr -k3,3nr -k4,4r file.txt
-k2,2Mr does a month sort on column two and reverses it.
-k3,3nr does a numeric sort on column three and reverses it.
-k4,4r sorts on column four and reverses it.
The header line is sorted along with the data, so it has to be stripped before sorting and printed separately.
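A minimal sketch of one way to keep the header in place while sorting the rest, assuming the first line of the file is always the header and that all rows fall within one year (the data carries no year, so ordering across a year boundary cannot be recovered). The function name sort_by_from is hypothetical, and -M is a GNU/BSD sort extension rather than POSIX:

```shell
# sort_by_from FILE (hypothetical helper name)
# Prints the header line unchanged, then the remaining rows newest-first,
# ordered by the FROM timestamp held in columns 2-4.
sort_by_from() {
  head -n 1 "$1"
  # LC_ALL=C makes -M recognise the English month abbreviations.
  tail -n +2 "$1" | LC_ALL=C sort -k2,2Mr -k3,3nr -k4,4r
}
```

Called as sort_by_from file.txt, it leaves the timestamps exactly as they appear in the input.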

Related

How can I get AWK to start reading from the end?

I need to parse a whole file into a cleaner format with comma-delimited columns, so that I can export the content as a CSV file.
This is an example of my input:
. D 0 Mon Dec 10 11:07:46 2018
.. D 0 Mon Feb 19 11:38:06 2018
RJ9-5 D 0 Fri Nov 30 10:34:24 2018
WorkingOnClass D 0 Wed Feb 28 09:37:52 2018
ML-Test001 D 0 Fri Dec 7 16:38:56 2018
TestML4Testing D 0 Wed Aug 22 08:58:42 2018
ML-NewDataSE SetCases1.xlsx A 1415577 Wed Aug 29 14:00:16 2018
DR0001-Dum01 D 0 Thu Aug 16 08:24:25 2018
DR0002-Dum02 D 0 Thu Aug 16 09:04:50 2018
Readme File for Documentation And Data Description.docx A 16136 Wed Aug 29 14:00:24 2018
ML Database Prototype D 0 Thu Dec 6 15:11:11 2018
OneNote D 0 Mon Dec 3 09:39:20 2018
Data A 0 Mon Dec 10 11:07:46 2018
\RJ9-5
. D 0 Fri Nov 30 10:34:24 2018
.. D 0 Mon Dec 10 11:07:46 2018
KLR0151_Set023_Files_RJ9_05.xlsx A 182462 Wed Apr 4 02:48:55 2018
KLR0152_Set023_Files_RJ9_05.xlsx A 525309 Wed Apr 4 02:53:57 2018
\ML-Test001
. D 0 Wed Feb 28 09:37:52 2018
.. D 0 Mon Dec 10 11:07:46 2018
WT_Conforming_Format1_1.docx A 500914 Mon Feb 26 08:50:55 2018
Conforming_Format_1_1.xlsx A 130647 Mon Feb 26 08:52:33 2018
DR0135_Dum01_text.xls A 974848 Mon Feb 12 08:11:11 2018
DR0139_Dum02_body.xls A 1061888 Tue Jun 19 13:43:54 2018
DataSet_File_mod0874953.xlsx A 149835 Mon Feb 26 14:17:02 2018
File Path For Dataset-2018.07.11.xlsx A 34661 Mon Feb 12 09:27:17
This script almost does the job:
#!/bin/bash
awk -v OFS=, '
BEGIN { print "PATH, FILENAME, SIZE, TIMESTAMP" }
/[\\]/ { path=$0 }
$2 ~ /A/ {print path"\\"$1,$3,$4 " " $5 " " $6 " " $7 " "$8 }
' "$@"
But it ignores names that contain spaces, so I need to handle them with something like:
awk -v FS="\t" '{print $1}'
But I couldn't integrate that into the shell script because of the way the script works, so I was thinking of making AWK start reading from the end of each line, since the end always has the same layout, and treating whatever is left as the name.
The output should look something like this:
/RJ9-5/KLR0151_Set023_Files_RJ9_05.xlsx,182462,Wed Apr 4 02:48:55 2018
/RJ9-5/KLR0152_Set023_Files_RJ9_05.xlsx,525309,Wed Apr 4 02:53:57 2018
/ML-Test001/WT_Conforming_Format1_1.docx,500914,Mon Feb 26 08:50:55 2018
/ML-Test001/Conforming_Format_1_1.xlsx,130647,Mon Feb 26 08:52:33 2018
/ML-Test001/DR0135_Dum01_text.xls,974848,Mon Feb 12 08:11:11 2018
/ML-Test001/DR0139_Dum02_body.xls,1061888,Tue Jun 19 13:43:54 2018
/ML-Test001/DataSet_File_mod0874953.xlsx,149835,Mon Feb 26 14:17:02 2018
/ML-Test001/File Path For Dataset-2018.07.11.xlsx,34661,Mon Feb 12 09:27:17 2018
With GNU awk for the 3rd arg to match() (and far less importantly \s shorthand for [[:space:]]):
$ cat tst.awk
BEGIN { OFS="," }
{ gsub(/^\s+|\s+$/,"") }
sub(/^\\/,"/") { path = $0; next }
path == "" { next }
match($0,/^(.*[^ ]) +A +([^ ]+) +(.*)/,a) { print path "/" a[1], a[2], a[3] }
$ awk -f tst.awk file
/RJ9-5/KLR0151_Set023_Files_RJ9_05.xlsx,182462,Wed Apr 4 02:48:55 2018
/RJ9-5/KLR0152_Set023_Files_RJ9_05.xlsx,525309,Wed Apr 4 02:53:57 2018
/ML-Test001/WT_Conforming_Format1_1.docx,500914,Mon Feb 26 08:50:55 2018
/ML-Test001/Conforming_Format_1_1.xlsx,130647,Mon Feb 26 08:52:33 2018
/ML-Test001/DR0135_Dum01_text.xls,974848,Mon Feb 12 08:11:11 2018
/ML-Test001/DR0139_Dum02_body.xls,1061888,Tue Jun 19 13:43:54 2018
/ML-Test001/DataSet_File_mod0874953.xlsx,149835,Mon Feb 26 14:17:02 2018
/ML-Test001/File Path For Dataset-2018.07.11.xlsx,34661,Mon Feb 12 09:27:17
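The "read from the end" idea in the question can also be sketched in plain POSIX awk by indexing fields relative to NF. This is only a sketch under stated assumptions: every file line ends with a complete five-field timestamp preceded by the size and the A flag (a line with a truncated timestamp is skipped), and runs of blanks inside a file name collapse to single spaces. The function name parse_listing is hypothetical:

```shell
# parse_listing [FILE...] (hypothetical helper name)
# Prints "path/name,size,timestamp" for every entry flagged A.
parse_listing() {
  awk '
    BEGIN { OFS = "," }
    # A line starting with a backslash names the current directory.
    /^\\/ {
      path = $0
      sub(/[ \t]+$/, "", path)   # drop trailing blanks
      gsub(/\\/, "/", path)      # backslashes become forward slashes
      next
    }
    # Count from the end: 5 timestamp fields, then the size, then the flag.
    path != "" && NF >= 8 && $(NF-6) == "A" {
      name = $1
      for (i = 2; i <= NF - 7; i++) name = name " " $i
      ts = $(NF-4) " " $(NF-3) " " $(NF-2) " " $(NF-1) " " $NF
      print path "/" name, $(NF-5), ts
    }
  ' "$@"
}
```

Because everything left of the A flag is treated as the name, file names containing spaces survive without needing a tab separator.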
Try this Perl solution:
$ perl -lane ' if(/^\s*$/) { $x=0;$y=0} if(/^\\/) {$x=1 ;($a=$_)=~s/\s*$//g;$a=~s/\\/\//g; } $y++ if $x==1 ; if($y>3) { s/^\s*//g; $_=~s/(.+?)\s+\S+\s+((\d+)\s+.+)/$1 $2/g;print "$a/$_" } ' essparaq.txt
/RJ9-5/KLR0151_Set023_Files_RJ9_05.xlsx 182462 Wed Apr 4 02:48:55 2018
/RJ9-5/KLR0152_Set023_Files_RJ9_05.xlsx 525309 Wed Apr 4 02:53:57 2018
/ML-Test001/WT_Conforming_Format1_1.docx 500914 Mon Feb 26 08:50:55 2018
/ML-Test001/Conforming_Format_1_1.xlsx 130647 Mon Feb 26 08:52:33 2018
/ML-Test001/DR0135_Dum01_text.xls 974848 Mon Feb 12 08:11:11 2018
/ML-Test001/DR0139_Dum02_body.xls 1061888 Tue Jun 19 13:43:54 2018
/ML-Test001/DataSet_File_mod0874953.xlsx 149835 Mon Feb 26 14:17:02 2018
/ML-Test001/File Path For Dataset-2018.07.11.xlsx 34661 Mon Feb 12 09:27:17
$ cat essparaq.txt
. D 0 Mon Dec 10 11:07:46 2018
.. D 0 Mon Feb 19 11:38:06 2018
RJ9-5 D 0 Fri Nov 30 10:34:24 2018
WorkingOnClass D 0 Wed Feb 28 09:37:52 2018
ML-Test001 D 0 Fri Dec 7 16:38:56 2018
TestML4Testing D 0 Wed Aug 22 08:58:42 2018
ML-NewDataSE SetCases1.xlsx A 1415577 Wed Aug 29 14:00:16 2018
DR0001-Dum01 D 0 Thu Aug 16 08:24:25 2018
DR0002-Dum02 D 0 Thu Aug 16 09:04:50 2018
Readme File for Documentation And Data Description.docx A 16136 Wed Aug 29 14:00:24 2018
ML Database Prototype D 0 Thu Dec 6 15:11:11 2018
OneNote D 0 Mon Dec 3 09:39:20 2018
Data A 0 Mon Dec 10 11:07:46 2018
\RJ9-5
. D 0 Fri Nov 30 10:34:24 2018
.. D 0 Mon Dec 10 11:07:46 2018
KLR0151_Set023_Files_RJ9_05.xlsx A 182462 Wed Apr 4 02:48:55 2018
KLR0152_Set023_Files_RJ9_05.xlsx A 525309 Wed Apr 4 02:53:57 2018
\ML-Test001
. D 0 Wed Feb 28 09:37:52 2018
.. D 0 Mon Dec 10 11:07:46 2018
WT_Conforming_Format1_1.docx A 500914 Mon Feb 26 08:50:55 2018
Conforming_Format_1_1.xlsx A 130647 Mon Feb 26 08:52:33 2018
DR0135_Dum01_text.xls A 974848 Mon Feb 12 08:11:11 2018
DR0139_Dum02_body.xls A 1061888 Tue Jun 19 13:43:54 2018
DataSet_File_mod0874953.xlsx A 149835 Mon Feb 26 14:17:02 2018
File Path For Dataset-2018.07.11.xlsx A 34661 Mon Feb 12 09:27:17

Shell Script to print a calendar after the user specifies the month and a day

*****Shell Script*******
Given a month and the day of the week that's the first of that month, print a calendar for the month. (Remember, number of days in months is different and use \n to go to a new line.)
Unix has a cal command especially for this purpose.
By default, cal shows the current month's calendar.
mayankp@mayank:~/$ cal
   November 2018
Su Mo Tu We Th Fr Sa
             1  2  3
 4  5  6  7  8  9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30
If you want a calendar for a specific month of a specific year, do this:
mayankp@mayank:~/$ cal 1 2018
    January 2018
Su Mo Tu We Th Fr Sa
    1  2  3  4  5  6
 7  8  9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28 29 30 31
This displays the calendar for January 2018.
So your shell script (for example, calendar.sh) would be:
#!/usr/bin/env bash
month=$1
year=$2
cal "$month" "$year"
Run the script like this:
mayankp@mayank:~/$ sh calendar.sh 3 2018
     March 2018
Su Mo Tu We Th Fr Sa
             1  2  3
 4  5  6  7  8  9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30 31
Let me know if this helps.
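The cal-based script takes a year, whereas the exercise as stated supplies the weekday of the 1st instead. If the calendar has to be built from those two inputs alone, a minimal sketch could look like the following. The function name print_month and its argument convention (month as a plain integer 1-12, weekday as 0 for Sunday through 6 for Saturday, optional third argument 1 for a leap year) are assumptions made for illustration:

```shell
# print_month MONTH FIRST_WEEKDAY [LEAP] (hypothetical helper name)
# MONTH: 1-12 without leading zero; FIRST_WEEKDAY: 0=Su ... 6=Sa;
# LEAP: 1 if February should have 29 days (default 0).
print_month() {
  month=$1 first=$2 leap=${3:-0}
  case $month in
    4|6|9|11) days=30 ;;
    2)        days=$((28 + leap)) ;;
    *)        days=31 ;;
  esac
  printf 'Su Mo Tu We Th Fr Sa\n'
  # Pad the first row so that day 1 lands under the right weekday.
  i=0
  while [ "$i" -lt "$first" ]; do
    printf '   '
    i=$((i + 1))
  done
  day=1
  while [ "$day" -le "$days" ]; do
    printf '%2d' "$day"
    # Break the line after Saturday and after the last day of the month.
    if [ $(( (first + day) % 7 )) -eq 0 ] || [ "$day" -eq "$days" ]; then
      printf '\n'
    else
      printf ' '
    fi
    day=$((day + 1))
  done
}
```

For example, print_month 3 4 prints a March that starts on a Thursday, which matches the day grid cal shows for March 2018.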

Listing filenames and timestamps in subdirectories recursively in UNIX

The command that I've been using is "ls -lR". The results usually look like this:
.:
total 4
lrwxrwxrwx 1 root root 9 Oct 11 03:35 dos -> /root/dos
drwxr-xr-x 2 root root 80 Oct 11 03:35 folder1
drwxr-xr-x 2 root root 100 Oct 11 03:35 folder2
-rw-r--r-- 1 root root 242 Oct 11 03:35 hello.c
./folder1:
total 0
-rw-r--r-- 1 root root 0 Oct 11 03:25 file1001
-rw-r--r-- 1 root root 0 Oct 11 03:35 file1002
./folder2:
total 0
-rw-r--r-- 1 root root 0 Oct 11 03:39 file2001
-rw-r--r-- 1 root root 0 Oct 11 03:45 file2002
How do I optimize the command so that it would only display the following?
./folder1:
Oct 11 03:25 | file1001
Oct 11 03:35 | file1002
./folder2:
Oct 11 03:39 | file2001
Oct 11 03:45 | file2002
Here's something that might work:
~/mydir ls -lR | grep -vi "total" | egrep -o "^\.\/.*|^\..*|([A-Z])\w+\s+[0-9]+\s+[0-9]+\s+.*"
.:
Sep 4 2015 es_jira.py
Sep 4 2015 es_slurp_jira.py
Aug 21 2015 __init__.py
./plugins:
Sep 4 2015 __init__.py
Sep 4 2015 lrt_report.py
Sep 11 2015 mr_fingerprint.py
Mar 6 2016 mr_tunable.py
Dec 1 2015 plugin.py
Dec 1 2015 test
Dec 1 2015 utils.py
./plugins/test:
Sep 4 2015 _test_ca_space_plugin.py
Sep 4 2015 _test_lrt_report_plugin.py
Sep 4 2015 _test_mr_failover_plugin.py
Sep 4 2015 _test_mr_fingerprint_plugin.py
Dec 1 2015 _test_mr_tunable_plugin.py
Sep 4 2015 _test_spacedays_plugin.py
If you want to indent nested entries with tabs and the like, you will need a script with some variable handling. That is doable as a one-liner, but it gets more complicated than a quick-and-dirty grep.
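For the exact "date | name" layout asked for in the question, an awk filter over the ls -lR output is one possible sketch. It assumes the usual nine-column long format with the date in columns 6-8 (which depends on locale and ls implementation), keeps only regular files, and also prints the top-level ".:" section. The function name format_listing is hypothetical:

```shell
# format_listing (hypothetical helper name)
# Reads ls -lR output on stdin; prints each directory header followed by
# "Mon DD HH:MM | name" for every regular file in it.
format_listing() {
  awk '
    # Directory headers look like ".:" or "./folder1:".
    /^\..*:$/ { print; next }
    # Regular files start with "-"; rebuild names that contain spaces.
    /^-/ {
      name = $9
      for (i = 10; i <= NF; i++) name = name " " $i
      print $6, $7, $8, "|", name
    }
  '
}
```

It would be used as ls -lR | format_listing. Parsing ls output stays fragile for unusual file names, so treat it as a convenience rather than a robust tool.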

extract data date wise and do average calculation

How can I extract the data date-wise and calculate the average per date from the output shown below? The last column holds the values to average.
Sun Jul 5 00:00:02 IST 2015, 97
Sun Jul 5 00:02:01 IST 2015, 97
Sun Jul 5 00:04:02 IST 2015, 97
Mon Jul 6 00:00:01 IST 2015, 73
Mon Jul 6 00:02:02 IST 2015, 93
Mon Jul 6 00:04:02 IST 2015, 97
Tue Jul 7 00:00:02 IST 2015, 97
Tue Jul 7 00:02:02 IST 2015, 97
Tue Jul 7 00:04:01 IST 2015, 97
Wed Jul 8 00:00:01 IST 2015, 98
Wed Jul 8 00:02:02 IST 2015, 98
Wed Jul 8 00:04:01 IST 2015, 98
Thu Jul 9 00:00:02 IST 2015, 100
Thu Jul 9 00:02:01 IST 2015, 100
Thu Jul 9 00:04:01 IST 2015, 100
Fri Jul 10 00:00:01 IST 2015, 100
Fri Jul 10 00:02:02 IST 2015, 100
Fri Jul 10 00:04:02 IST 2015, 100
Sat Jul 11 00:00:01 IST 2015, 73
Sat Jul 11 00:02:01 IST 2015, 73
Sat Jul 11 00:04:02 IST 2015, 73
I want output like:
Jun 6 - 97
Jun 7 - 86.66
...
You can use this awk:
awk -F ', ' '{
    split($1, a, " ");
    k = a[2] OFS a[3];
    if (!(k in c))
        b[++n] = k;
    c[k]++;
    sum[k] += $2
}
END {
    for (i = 1; i <= n; i++)
        printf "%s - %.2f\n", b[i], (sum[b[i]] / c[b[i]])
}' file
Jul 5 - 97.00
Jul 6 - 87.67
Jul 7 - 97.00
Jul 8 - 98.00
Jul 9 - 100.00
Jul 10 - 100.00
Jul 11 - 73.00

how to print 3 lines in 1 column?

I have a file like the one below.
ab13p29im-sss29511
0
Jan 12 22:43
ab13p29im-sss29531
0
Jan 12 22:43
ab13p29im-sss29512
0
Feb 2 16:11
ab13p29im-sss29522
0
Feb 2 16:12
ab13p29im-sss29532
0
Feb 2 16:12
ab21p30im-sss30511
0
Jan 12 22:43
ab21p30im-sss30531
0
Jan 12 22:43
ab21p30im-sss30512
0
Feb 2 16:13
ab21p30im-sss30522
3
Feb 2 16:12
I want to print it in the format below.
ab13p29im-sss29511 0 Jan 12 22:43
ab21p30im-sss30522 0 Feb 2 16:12
ab21p30im-sss30531 0 Jan 12 22:43
I'm using the command paste - - - < inputfile. But if any value is null, the format gets messed up, like below:
ab13p29im-sss29511 0 Jan 12 22:43
ab21p30im-sss30522 0 ab21p30im-sss30531
0 Jan 12 22:43 ab21p30im-sss30523.
If there is no date for a host, or if any value is null, it breaks the 3,3,3 pattern.
You may want something like this:
awk 'ORS=NR%3?" ":RS' file
ab13p29im-sss29511 0 Jan 12 22:43
ab13p29im-sss29531 0 Jan 12 22:43
ab13p29im-sss29512 0 Feb 2 16:11
ab13p29im-sss29522 0 Feb 2 16:12
ab13p29im-sss29532 0 Feb 2 16:12
ab21p30im-sss30511 0 Jan 12 22:43
ab21p30im-sss30531 0 Jan 12 22:43
ab21p30im-sss30512 0 Feb 2 16:13
ab21p30im-sss30522 3 Feb 2 16:12
sed 'N;N;s/\n/ /g' YourFile
Load 2 more lines, replace the newlines with spaces before printing, then start the next cycle.
You could make it more robust by adding a pattern check that starts the cycle, such as /[a-z0-9]\{9\}-[a-z0-9]\{8\}/!d; before the first N.
Using awk with explicit concatenation:
$ awk 'NR % 3 == 1 { lines=$0 ; next } { lines=lines" "$0 } NR % 3 == 0 { print lines ; lines="" }' file
ab13p29im-sss29511 0 Jan 12 22:43
ab13p29im-sss29531 0 Jan 12 22:43
ab13p29im-sss29512 0 Feb 2 16:11
ab13p29im-sss29522 0 Feb 2 16:12
ab13p29im-sss29532 0 Feb 2 16:12
ab21p30im-sss30511 0 Jan 12 22:43
ab21p30im-sss30531 0 Jan 12 22:43
ab21p30im-sss30512 0 Feb 2 16:13
ab21p30im-sss30522 3 Feb 2 16:12
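All of the commands above rely on a strict three-lines-per-record rhythm, which is exactly what breaks when a value is missing. A possible sketch that tolerates short records is to start a new output line whenever a host name appears, rather than counting lines. It assumes host names always look like the samples (lowercase letters and digits around a single hyphen, with no spaces) and that no other value has that shape. The function name join_records is hypothetical:

```shell
# join_records (hypothetical helper name)
# Reads the file on stdin and prints one line per host, however many
# value lines follow it.
join_records() {
  awk '
    NF == 0 { next }                      # skip empty lines
    # A host name starts a new record; flush the previous one first.
    /^[a-z0-9]+-[a-z0-9]+$/ {
      if (rec != "") print rec
      rec = $0
      next
    }
    { rec = (rec == "" ? $0 : rec " " $0) }
    END { if (rec != "") print rec }
  '
}
```

Run as join_records < inputfile, a host with a missing date simply yields a shorter line instead of shifting every record after it.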

Resources