How do I start a loop at a dynamic line of a file - bash

I have log file like this:
...
Tue Nov 18 11:54:59 2014 1 10.8.0.6 148 /home/spy/test/spy.csv b _ i r spy ftp 0 * c
Tue Nov 18 11:55:00 2014 1 10.8.0.6 428 /home/spy/test/spy-atma.csv b _ i r spy ftp 0 * c
Tue Nov 18 11:55:01 2014 1 10.8.0.6 289 /home/spy/test/spy-xfer.csv b _ i r spy ftp 0 * c
Tue Nov 18 11:55:02 2014 1 10.8.0.6 148 /home/spy/test/spy.csv b _ o r spy ftp 0 * c
Tue Nov 18 11:55:03 2014 1 10.8.0.6 428 /home/spy/test/spy-atma.csv b _ o r spy ftp 0 * c
Tue Nov 18 11:55:04 2014 1 10.8.0.6 289 /home/spy/test/spy-xfer.csv b _ o r spy ftp 0 * c
END OF FILE
I need to print the last 5-minute interval (counted from the last date in the file), like this:
Tue Nov 18 11:55:00 2014 1 10.8.0.6 428 /home/spy/test/spy-atma.csv b _ i r spy ftp 0 * c
Tue Nov 18 11:55:01 2014 1 10.8.0.6 289 /home/spy/test/spy-xfer.csv b _ i r spy ftp 0 * c
Tue Nov 18 11:55:02 2014 1 10.8.0.6 148 /home/spy/test/spy.csv b _ o r spy ftp 0 * c
Tue Nov 18 11:55:03 2014 1 10.8.0.6 428 /home/spy/test/spy-atma.csv b _ o r spy ftp 0 * c
Tue Nov 18 11:55:04 2014 1 10.8.0.6 289 /home/spy/test/spy-xfer.csv b _ o r spy ftp 0 * c
I have function for date:
entry_time() {
date -d "$(cut -c 1-24 <<< "$1")" +%s
}
And I can compute the cutoff (start) date like this:
cutoff=$(( $(entry_time "$(tail -n1 "$LOG")") - $(entry_time "$(tail -n1 "$LOG")") % (5 * 60) ))
I can loop over every line, check whether its date is >= the cutoff, and print it if so. But the log file is so big that checking it this way takes 10-15 hours. It might work, but I need a simpler (faster) way. How can I do this?
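A minimal sketch of that filtering loop, reading from the end with tac so it can stop early instead of scanning the whole file (it assumes GNU coreutils for tac and date -d, and, like the cutoff command above, that the last line of the file is a normal log entry; LOG is a hypothetical path):
LOG=/path/to/xferlog             # hypothetical path
last=$(date -d "$(tail -n1 "$LOG" | cut -c1-24)" +%s)
cutoff=$(( last - last % 300 ))  # round down to a 5-minute boundary
tac "$LOG" | while IFS= read -r line; do
    t=$(date -d "$(cut -c1-24 <<< "$line")" +%s 2>/dev/null) || continue
    (( t < cutoff )) && break    # everything older than the cutoff can be skipped
    printf '%s\n' "$line"
done | tac                       # restore chronological order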

If I understand what you want properly, then this should work (it needs GNU awk for mktime() and systime()):
awk '{
    # month number from the position of $2 in the month-name string
    Month = (index("JanFebMarAprMayJunJulAugSepOctNovDec", $2) + 2) / 3
    split($4, a, ":")            # a[1]=hour, a[2]=minute, a[3]=second
    Time = mktime($5 " " Month " " $3 " " a[1] " " a[2] " " a[3])
}
(systime() - Time) < 300' file   # keep entries from the last 5 minutes
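The filter above compares against the current clock time (systime()); to anchor the 5-minute window to the last timestamp in the file instead, as in the cutoff formula from the question, a hedged variant (still GNU awk) can pass that epoch in with -v, reusing the entry_time function defined above:
last=$(entry_time "$(tail -n1 "$LOG")")    # entry_time as defined in the question
awk -v last="$last" '{
    Month = (index("JanFebMarAprMayJunJulAugSepOctNovDec", $2) + 2) / 3
    split($4, a, ":")
    Time = mktime($5 " " Month " " $3 " " a[1] " " a[2] " " a[3])
}
Time >= last - last % 300' "$LOG"          # same cutoff: last time rounded down to 5 min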

Related

Split a column into multiple rows

I have data like this in my table:
Year    Month   Daytype
"2020"  "01"    "HBBBBHHBBBBBHHBBBBBHHBBBBBHHBBB"
"2020"  "02"    "BBHHBBBBBHHBBBBBHHBBBBBHHBBB"
Now I need to convert this table to the one below:
Year  Month  Day  Daytype
2020  01     01   H
2020  01     02   B
2020  01     03   B
2020  01     04   B
2020  01     05   B
2020  01     06   H
2020  01     07   H
I tried an option using regular expressions, but my problem is that the daytype string doesn't have any delimiter.
Can someone assist me? I want this in SQL only.
You can use a recursive query and simple string functions (which are much faster than regular expressions):
WITH days (year, month, daytypes, day, daytype) AS (
  -- anchor member: day 1 of each row, with the first character of the string
  SELECT year, month, daytype, 1, SUBSTR(daytype, 1, 1)
  FROM   table_name
  UNION ALL
  -- recursive member: step to the next day and pick the next character
  SELECT year, month, daytypes, day + 1, SUBSTR(daytypes, day + 1, 1)
  FROM   days
  WHERE  day < LENGTH(daytypes)
)
SELECT year, month, day, daytype
FROM   days
ORDER BY year, month, day;
Which, for the sample data:
CREATE TABLE table_name (Year,Month,Daytype) AS
SELECT 2020, 1, 'HBBBBHHBBBBBHHBBBBBHHBBBBBHHBBB' FROM DUAL UNION ALL
SELECT 2020, 2, 'BBHHBBBBBHHBBBBBHHBBBBBHHBBB' FROM DUAL;
Outputs:
YEAR  MONTH  DAY  DAYTYPE
2020  1      1    H
2020  1      2    B
2020  1      3    B
2020  1      4    B
2020  1      5    B
2020  1      6    H
2020  1      7    H
2020  1      8    B
2020  1      9    B
2020  1      10   B
2020  1      11   B
2020  1      12   B
2020  1      13   H
2020  1      14   H
2020  1      15   B
2020  1      16   B
2020  1      17   B
2020  1      18   B
2020  1      19   B
2020  1      20   H
2020  1      21   H
2020  1      22   B
2020  1      23   B
2020  1      24   B
2020  1      25   B
2020  1      26   B
2020  1      27   H
2020  1      28   H
2020  1      29   B
2020  1      30   B
2020  1      31   B
2020  2      1    B
2020  2      2    B
2020  2      3    H
2020  2      4    H
2020  2      5    B
2020  2      6    B
2020  2      7    B
2020  2      8    B
2020  2      9    B
2020  2      10   H
2020  2      11   H
2020  2      12   B
2020  2      13   B
2020  2      14   B
2020  2      15   B
2020  2      16   B
2020  2      17   H
2020  2      18   H
2020  2      19   B
2020  2      20   B
2020  2      21   B
2020  2      22   B
2020  2      23   B
2020  2      24   H
2020  2      25   H
2020  2      26   B
2020  2      27   B
2020  2      28   B

concat two files side-by-side, append difference between fields, and print in tabular format

Consider I have two files as below; I need to concatenate them and find the difference between the fields in the new file.
a.txt
a 2019 66
b 2020 50
c 2018 48
b.txt
a 2019 50
b 2019 40
c 2018 45
Desired output:
a 2019 66 a 2019 50 16
b 2020 50 b 2019 40 10
c 2018 48 c 2018 45 3
I tried:
awk -F, -v OFS=" " '{$7=$3-$6}1' file3.txt
it prints
a 2019 66 a 2019 50 0
b 2020 50 b 2019 40 0
c 2018 48 c 2018 45 0
Can you also help with printing in tabular format?
Your awk command seems fine except for -F,: your data is space-separated, not comma-separated, so with -F, the whole line becomes $1, $3 and $6 are empty, and the difference is always 0. You should paste those files together first.
$ paste a.txt b.txt | awk '{print $0,$3-$6}' | column -t
a 2019 66 a 2019 50 16
b 2020 50 b 2019 40 10
c 2018 48 c 2018 45 3
Within a single awk, you could try the following:
awk 'FNR==NR{a[FNR]=$0;b[FNR]=$NF;next} {print a[FNR],$0,b[FNR]-$NF}' a.txt b.txt | column -t
Output will be as follows.
a 2019 66 a 2019 50 16
b 2020 50 b 2019 40 10
c 2018 48 c 2018 45 3
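For readability, here is the same one-liner expanded with comments (logic unchanged):
awk '
FNR == NR {             # true only while reading the first file (a.txt)
    a[FNR] = $0         # remember the whole line, keyed by its line number
    b[FNR] = $NF        # remember its last field (the number to diff against)
    next
}
{ print a[FNR], $0, b[FNR] - $NF }   # second file: same line numbers
' a.txt b.txt | column -t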

How can I get AWK to start reading by the end?

I need to parse a whole file into a better format, producing output with columns delimited by commas, so that I can export the content as a CSV file.
This is an example of my input:
. D 0 Mon Dec 10 11:07:46 2018
.. D 0 Mon Feb 19 11:38:06 2018
RJ9-5 D 0 Fri Nov 30 10:34:24 2018
WorkingOnClass D 0 Wed Feb 28 09:37:52 2018
ML-Test001 D 0 Fri Dec 7 16:38:56 2018
TestML4Testing D 0 Wed Aug 22 08:58:42 2018
ML-NewDataSE SetCases1.xlsx A 1415577 Wed Aug 29 14:00:16 2018
DR0001-Dum01 D 0 Thu Aug 16 08:24:25 2018
DR0002-Dum02 D 0 Thu Aug 16 09:04:50 2018
Readme File for Documentation And Data Description.docx A 16136 Wed Aug 29 14:00:24 2018
ML Database Prototype D 0 Thu Dec 6 15:11:11 2018
OneNote D 0 Mon Dec 3 09:39:20 2018
Data A 0 Mon Dec 10 11:07:46 2018
\RJ9-5
. D 0 Fri Nov 30 10:34:24 2018
.. D 0 Mon Dec 10 11:07:46 2018
KLR0151_Set023_Files_RJ9_05.xlsx A 182462 Wed Apr 4 02:48:55 2018
KLR0152_Set023_Files_RJ9_05.xlsx A 525309 Wed Apr 4 02:53:57 2018
\ML-Test001
. D 0 Wed Feb 28 09:37:52 2018
.. D 0 Mon Dec 10 11:07:46 2018
WT_Conforming_Format1_1.docx A 500914 Mon Feb 26 08:50:55 2018
Conforming_Format_1_1.xlsx A 130647 Mon Feb 26 08:52:33 2018
DR0135_Dum01_text.xls A 974848 Mon Feb 12 08:11:11 2018
DR0139_Dum02_body.xls A 1061888 Tue Jun 19 13:43:54 2018
DataSet_File_mod0874953.xlsx A 149835 Mon Feb 26 14:17:02 2018
File Path For Dataset-2018.07.11.xlsx A 34661 Mon Feb 12 09:27:17
This script right here can do the job:
#!/bin/bash
awk -v OFS=, '
BEGIN { print "PATH, FILENAME, SIZE, TIMESTAMP" }
/[\\]/ { path=$0 }
$2 ~ /A/ {print path"\\"$1,$3,$4 " " $5 " " $6 " " $7 " "$8 }
' "$@"
But it ignores names that have spaces in them, so I need to handle them with something like:
awk -v FS="\t" '{print $1}'
But I couldn't integrate that into the shell script because of the way it works, so I was thinking of making AWK start reading from the end, since the end of each line always has the same format, and leave the rest as the filename.
The output should be something like this:
/RJ9-5/KLR0151_Set023_Files_RJ9_05.xlsx,182462,Wed Apr 4 02:48:55 2018
/RJ9-5/KLR0152_Set023_Files_RJ9_05.xlsx,25309,Wed Apr 4 02:53:57 2018
/ML-Test001/WT_Conforming_Format1_1.docx,500914,Mon Feb 26 08:50:55 2018
/ML-Test001/Format_1_1.xlsx,130647,Mon Feb 26 08:52:33 2018
/ML-Test001/DR0135_Dum01_text.xls,974848,Mon Feb 12 08:11:11 2018
/ML-Test001/DR0139_Dum02_body.xls,1061888,Tue Jun 19 13:43:54 2018
/ML-Test001/DataSet_File_mod0874953.xlsx,149835,Mon Feb 26 14:17:02 2018
/ML-Test001/File Path For Dataset-2018.07.11.xlsx,34661,Mon Feb 12 09:27:17 2018
With GNU awk for the 3rd arg to match() (and far less importantly \s shorthand for [[:space:]]):
$ cat tst.awk
BEGIN { OFS="," }
{ gsub(/^\s+|\s+$/,"") }
sub(/^\\/,"/") { path = $0; next }
path == "" { next }
match($0,/^(.*[^ ]) +A +([^ ]+) +(.*)/,a) { print path "/" a[1], a[2], a[3] }
$ awk -f tst.awk file
/RJ9-5/KLR0151_Set023_Files_RJ9_05.xlsx,182462,Wed Apr 4 02:48:55 2018
/RJ9-5/KLR0152_Set023_Files_RJ9_05.xlsx,525309,Wed Apr 4 02:53:57 2018
/ML-Test001/WT_Conforming_Format1_1.docx,500914,Mon Feb 26 08:50:55 2018
/ML-Test001/Conforming_Format_1_1.xlsx,130647,Mon Feb 26 08:52:33 2018
/ML-Test001/DR0135_Dum01_text.xls,974848,Mon Feb 12 08:11:11 2018
/ML-Test001/DR0139_Dum02_body.xls,1061888,Tue Jun 19 13:43:54 2018
/ML-Test001/DataSet_File_mod0874953.xlsx,149835,Mon Feb 26 14:17:02 2018
/ML-Test001/File Path For Dataset-2018.07.11.xlsx,34661,Mon Feb 12 09:27:17
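If GNU awk is not available, here is a sketch of the "count fields from the end" idea from the question, in plain awk. It assumes the timestamp is always the last five fields, so the final sample line with its truncated timestamp would be missed:
awk -v OFS=, '
/^\\/ { path = $0; sub(/^\\/, "/", path); next }
NF >= 8 && path != "" && $(NF-6) == "A" {
    # last 5 fields = timestamp, 6th from the end = size, 7th from the end = "A"
    ts   = $(NF-4) " " $(NF-3) " " $(NF-2) " " $(NF-1) " " $NF
    size = $(NF-5)
    name = $1                      # rebuild the (possibly space-containing) name
    for (i = 2; i <= NF-6; i++) name = name " " $i
    print path "/" name, size, ts
}' file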
Try this Perl solution:
$ perl -lane ' if(/^\s*$/) { $x=0;$y=0} if(/^\\/) {$x=1 ;($a=$_)=~s/\s*$//g;$a=~s/\\/\//g; } $y++ if $x==1 ; if($y>3) { s/^\s*//g; $_=~s/(.+?)\s+\S+\s+((\d+)\s+.+)/$1 $2/g;print "$a/$_" } ' essparaq.txt
/RJ9-5/KLR0151_Set023_Files_RJ9_05.xlsx 182462 Wed Apr 4 02:48:55 2018
/RJ9-5/KLR0152_Set023_Files_RJ9_05.xlsx 525309 Wed Apr 4 02:53:57 2018
/ML-Test001/WT_Conforming_Format1_1.docx 500914 Mon Feb 26 08:50:55 2018
/ML-Test001/Conforming_Format_1_1.xlsx 130647 Mon Feb 26 08:52:33 2018
/ML-Test001/DR0135_Dum01_text.xls 974848 Mon Feb 12 08:11:11 2018
/ML-Test001/DR0139_Dum02_body.xls 1061888 Tue Jun 19 13:43:54 2018
/ML-Test001/DataSet_File_mod0874953.xlsx 149835 Mon Feb 26 14:17:02 2018
/ML-Test001/File Path For Dataset-2018.07.11.xlsx 34661 Mon Feb 12 09:27:17
$ cat essparaq.txt
. D 0 Mon Dec 10 11:07:46 2018
.. D 0 Mon Feb 19 11:38:06 2018
RJ9-5 D 0 Fri Nov 30 10:34:24 2018
WorkingOnClass D 0 Wed Feb 28 09:37:52 2018
ML-Test001 D 0 Fri Dec 7 16:38:56 2018
TestML4Testing D 0 Wed Aug 22 08:58:42 2018
ML-NewDataSE SetCases1.xlsx A 1415577 Wed Aug 29 14:00:16 2018
DR0001-Dum01 D 0 Thu Aug 16 08:24:25 2018
DR0002-Dum02 D 0 Thu Aug 16 09:04:50 2018
Readme File for Documentation And Data Description.docx A 16136 Wed Aug 29 14 :00:24 2018
ML Database Prototype D 0 Thu Dec 6 15:11:11 2018
OneNote D 0 Mon Dec 3 09:39:20 2018
Data A 0 Mon Dec 10 11:07:46 2018
\RJ9-5
. D 0 Fri Nov 30 10:34:24 2018
.. D 0 Mon Dec 10 11:07:46 2018
KLR0151_Set023_Files_RJ9_05.xlsx A 182462 Wed Apr 4 02:48:55 2018
KLR0152_Set023_Files_RJ9_05.xlsx A 525309 Wed Apr 4 02:53:57 2018
\ML-Test001
. D 0 Wed Feb 28 09:37:52 2018
.. D 0 Mon Dec 10 11:07:46 2018
WT_Conforming_Format1_1.docx A 500914 Mon Feb 26 08:50:55 2018
Conforming_Format_1_1.xlsx A 130647 Mon Feb 26 08:52:33 2018
DR0135_Dum01_text.xls A 974848 Mon Feb 12 08:11:11 2018
DR0139_Dum02_body.xls A 1061888 Tue Jun 19 13:43:54 2018
DataSet_File_mod0874953.xlsx A 149835 Mon Feb 26 14:17:02 2018
File Path For Dataset-2018.07.11.xlsx A 34661 Mon Feb 12 09:27:17

List all the mondays of this month

I'm pretty new to bash and the terminal in general. I've been messing around with cal and date, wondering if there is any way to list all the dates of the Mondays of the current month.
My thought process is to go through the cal command, list out the dates, and maybe cut a column from that output. Is that possible?
You can do it with the date command. Print 10 Mondays, starting from 5 weeks ago:
for x in $(seq 0 9)
do
date -d "$x monday 5 week ago"
done
And grep only the current month. Full command: for x in $(seq 0 9); do date -d "$x monday 5 week ago"; done | grep "$(date +%b)"
Output:
Mon Jun 5 00:00:00 MSK 2017
Mon Jun 12 00:00:00 MSK 2017
Mon Jun 19 00:00:00 MSK 2017
Mon Jun 26 00:00:00 MSK 2017
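A hedged alternative sketch (GNU date only): walk every day of the current month and keep the ones whose weekday number is 1 (Monday):
month=$(date +%Y-%m)
lastday=$(date -d "$month-01 +1 month -1 day" +%d)   # last day of the current month
for day in $(seq -w 1 "$lastday"); do
    [ "$(date -d "$month-$day" +%u)" = 1 ] && date -d "$month-$day" '+%a %b %e %Y'
done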
Given:
$ cal
     June 2017
Su Mo Tu We Th Fr Sa
             1  2  3
 4  5  6  7  8  9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30
You can do:
$ cal | awk 'NF>5{print $2}'
Mo
5
12
19
26
If you want something that will support any day column of cal, use field widths (gawk only):
$ cal | gawk -v n=5 '
BEGIN{
FIELDWIDTHS = "3 3 3 3 3 3 3"
}
FNR>1{print $n}'
Th
1
8
15
22
29
Or, as pointed out in comments:
$ ncal | awk '/^Mo/'
Mo 5 12 19 26
A combination of the cal and cut commands can also achieve the output:
cal -h | cut -c4,5
The -h option suppresses the highlighting of today's date, and cut extracts the two character columns that fall under Monday.
ncal | sed -n '/^Mo/p'
The output as below:
Mo 5 12 19 26
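If you also want the full dates rather than just the day numbers, a small follow-up sketch (assumes GNU date and the current month/year):
ncal | awk '/^Mo/ { for (i = 2; i <= NF; i++) print $i }' |
while read -r d; do
    date -d "$(date +%Y-%m)-$(printf '%02d' "$d")" '+%a %b %e %Y'
done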

Shell: replace \n with space between two specific lines

This is my first post here, thanks in advance for your prompt support.
I did some greps on a big file and came up with a file like the one below. I want to join all the lines that follow a line starting with XX# onto that same line.
So, I want the following:
XX#DEV1>
Feb 23 07:00:03
Feb 23 07:00:05
Sent : 4608
Received : 4227
Feb 23 07:00:07
Feb 23 07:00:09
XX#DEV2>
Feb 23 07:00:32
Feb 23 07:00:34
Sent : 4608
Received : 4232
Feb 23 07:00:36
Feb 23 07:00:38
XX#DEV1>
Feb 23 08:00:03
Feb 23 08:00:06
Sent : 4608
Received : 4265
Feb 23 08:00:07
Feb 23 08:00:09
XX#DEV2>
...
To become:
XX#DEV1> Feb 23 07:00:03 Feb 23 07:00:05 Sent : 4608 Received : 4227 Feb 23 07:00:07 Feb 23 07:00:09
XX#DEV2> Feb 23 07:00:32 Feb 23 07:00:34 Sent : 4608 Received : 4232 Feb 23 07:00:36 Feb 23 07:00:38
XX#DEV1> Feb 23 08:00:03 Feb 23 08:00:06 Sent : 4608 Received : 4265 Feb 23 08:00:07 Feb 23 08:00:09
XX#DEV2> ...
You can do that with awk pretty easily:
awk '/^XX/{if(length(out))print out;out=$0;next}{out=out" "$0}END{print out}' yourfile
That says: if the line starts with "XX", print whatever we have accumulated in the variable out, then save the current line in out. If the line is anything else, append it to out after adding a space. At the end, print whatever we have accumulated in out.
Output:
XX#DEV1> Feb 23 07:00:03 Feb 23 07:00:05 Sent : 4608 Received : 4227 Feb 23 07:00:07 Feb 23 07:00:09
XX#DEV2> Feb 23 07:00:32 Feb 23 07:00:34 Sent : 4608 Received : 4232 Feb 23 07:00:36 Feb 23 07:00:38
XX#DEV1> Feb 23 08:00:03 Feb 23 08:00:06 Sent : 4608 Received : 4265 Feb 23 08:00:07 Feb 23 08:00:09
Or you can do it in bash, in much the same way as the awk in my other answer:
#!/bin/bash
while IFS= read -r line
do
   if [[ $line == XX* ]] ;then
      [ -n "$OUT" ] && echo "$OUT"  # Output whatever we have accumulated in $OUT
      OUT=$line                     # Restart accumulating from this XX line
   else
      OUT="$OUT $line"              # Append this line to $OUT
   fi
done < yourfile
echo "$OUT"                         # Output any trailing stuff at the end
sed -n '
# use -n to prevent sed from printing each line
# if the line read starts with XX, go (b) to :printit
/^XX/ b printit
# if it is the last line ($), append it to the hold space first,
# then go to :printit so the final block is flushed
$ {
H
b printit
}
# for all other lines, append the read line to the hold space (H)
# and then go to :end (so we read the next line)
H; b end
:printit
# swap the hold and the pattern space; the pattern space now holds the
# previous block, the hold space starts the new one
x
# skip the empty print the very first XX line would otherwise produce
/^$/ b end
# replace all (s/../../g) single or multiple newlines (\n\n*) with a space
s/\n\n*/ /g
# print the pattern space
p
:end
' your_file
