How to check on FTP if there are files on the list older than 7 days - bash

I have a list of files from remote FTP Server:
drwxrwxrwx 2 test-backup everyone 4096 Jul 8 02:30 .
drwxrwxrwx 5 0 0 4096 Jul 23 07:02 ..
-rw-rw-rw- 1 test-backup everyone 352696 Jul 18 02:30 expdp_TEST11P2_custom_Fri.dmp.gz
-rw-rw-rw- 1 test-backup everyone 352796 Jul 21 02:30 expdp_TEST11P2_custom_Mon.dmp.gz
-rw-rw-rw- 1 test-backup everyone 352615 Jul 19 02:30 expdp_TEST11P2_custom_Sat.dmp.gz
-rw-rw-rw- 1 test-backup everyone 352626 Jul 20 02:30 expdp_TEST11P2_custom_Sun.dmp.gz
-rw-rw-rw- 1 test-backup everyone 10511523642 Jul 24 03:08 expdp_TEST11P2_custom_Thu.dmp.gz
-rw-rw-rw- 1 test-backup everyone 10496881744 Jul 22 03:03 expdp_TEST11P2_custom_Tue.dmp.gz
-rw-rw-rw- 1 test-backup everyone 10504557195 Jul 23 03:03 expdp_TEST11P2_custom_Wed.dmp.gz
I need to check if there are any files older than 7 days. Do you have any ideas how I can do this in Bash?

As I understand the issue, you have a file listing received via ftp (and you do not have access to find on the remote server). Assuming that you have the directory listing stored in a file called ftptimes, you can identify files older than 7 days via:
$ awk -v cutoff="$(date -d "7 days ago" +%s)" '{line=$0; "date -d \""$6" " $7" " $8 "\" +%s" |getline; fdate=$1} fdate < cutoff {print line} ' ftptimes
From your sample data, the output would be:
drwxrwxrwx 2 test-backup everyone 4096 Jul 8 02:30 .
Addressing the parts of the awk command, one by one:
-v cutoff="$(date -d "7 days ago" +%s)"
This defines an awk variable called cutoff that will have the Unix time (seconds since 1970-01-01 00:00:00 UTC) corresponding to seven days ago.
line=$0;
This saves for later use the current input line into the variable line.
"date -d \""$6" " $7" " $8 "\" +%s" |getline; fdate=$1
This converts the date given by ftp into Unix time, reads that time in, and saves it in a variable called fdate.
fdate < cutoff {print line}
If the file date is less than the cutoff date, then the line is printed.
In the sample data that you provided, the only file older than seven days is the current directory (.) which dates to Jul 8.
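A small caveat about the getline pipe: if two lines ever produced an identical date command, the second getline would read from the already-exhausted pipe and silently keep the previous value. A minimal variant of the same command that closes each pipe explicitly (and reads into a variable, which leaves $0 intact, so the line no longer needs to be saved):
awk -v cutoff="$(date -d "7 days ago" +%s)" '{
    cmd = "date -d \"" $6 " " $7 " " $8 "\" +%s"   # date command built for this line
    cmd | getline fdate                            # epoch seconds for the file date
    close(cmd)                                     # avoid stale pipes on long listings
    if (fdate < cutoff) print                      # print the original listing line
}' ftptimes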
As an example, if we wanted files older than 5 days, then more files would be printed:
$ awk -v cutoff="$(date -d "5 days ago" +%s)" '{line=$0; "date -d \""$6" " $7" " $8 "\" +%s" |getline; fdate=$1} fdate < cutoff {print line} ' ftptimes
drwxrwxrwx 2 test-backup everyone 4096 Jul 8 02:30 .
-rw-rw-rw- 1 test-backup everyone 352696 Jul 18 02:30 expdp_TEST11P2_custom_Fri.dmp.gz
-rw-rw-rw- 1 test-backup everyone 352615 Jul 19 02:30 expdp_TEST11P2_custom_Sat.dmp.gz
In the above, I assumed that the info from ftp was stored in a file. It is also possible to pipe it in:
echo ls | ftp host port | awk -v cutoff="$(date -d "5 days ago" +%s)" '{line=$0; "date -d \""$6" " $7" " $8 "\" +%s" |getline; fdate=$1} fdate < cutoff {print line} '
where host and port are replaced by the host and port of your server.
Bash version
The above can also be accomplished in bash although it requires explicit looping. Again, assuming the ftp information in the file ftptimes:
$ cutoff="$(date -d "7 days ago" +%s)"; while read -r line; do set -- $line; fdate=$(date -d "$6 $7 $8" +%s) ; [ "$fdate" -lt "$cutoff" ] && echo "$line" ; done <ftptimes
drwxrwxrwx 2 test-backup everyone 4096 Jul 8 02:30 .
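The same loop written out as a short script, with each step commented (same assumptions as above: GNU date and the listing stored in ftptimes):
#!/bin/bash
cutoff=$(date -d "7 days ago" +%s)       # epoch seconds for the cutoff
while read -r line; do
    set -- $line                         # split the listing line into fields $1..$9
    fdate=$(date -d "$6 $7 $8" +%s)      # month, day, time -> epoch seconds
    [ "$fdate" -lt "$cutoff" ] && echo "$line"
done < ftptimes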

The find command is the most flexible for date ranges. You have 3 basic tests to choose from: -atime +n (last access time was more than n*24 hours ago); -ctime +n (file status changed more than n*24 hours ago); and -mtime +n (file was modified more than n*24 hours ago). Note: plain n means exactly n*24 hours ago, +n means more than n*24 hours ago, and -n means less than n*24 hours ago. Also note that any fractional part of a 24-hour period is discarded, which means you may need +6 rather than +7 to catch every file that is at least 7 days old. Example:
find /path/to/files -type f -mtime +6
will find all files (not directories) under /path/to/files that were modified more than 6*24 hours ago (i.e. at least 7 days old). You can test with -atime, -ctime, and -mtime to see which fits your needs.
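If the 24-hour granularity of -mtime gets in the way, GNU find (assuming you have it) also accepts an explicit cutoff via -newermt; negating the test selects files last modified before that point:
find /path/to/files -type f ! -newermt "7 days ago"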

How to split string with unequal amount of spaces [duplicate]

I am currently struggling, with splitting a string with a varying amount of spaces, coming from a log file.
An excerpt of the log file:
ProcessA Mon Nov 9 09:59 - 10:48 (00:48)
ProcessB Sun Nov 8 11:16 - 11:17 (00:00)
ProcessC Sat Nov 7 12:52 - 12:53 (00:00)
ProcessD Fri Nov 6 09:31 - 11:25 (01:54)
ProcessE Thu Nov 5 16:41 - 16:41 (00:00)
ProcessF Thu Nov 5 11:39 - 11:40 (00:00)
As you can see, the number of spaces between the process name and the date of execution varies between 2 and 5.
I would like to split it up into three parts: process, date of execution, and execution time.
However, I don't see a solution because of the unequal number of spaces. Am I wrong, or is splitting such a string incredibly hard?
Hopefully somebody out there is way smarter than me and can provide me with a solution for that 😊
Thanks in advance to everybody who is willing to try to help me with that!
You can also assign fields directly in a read.
while read -r prc wd mon md start _ end dur _; do
echo "prc='$prc' wd='$wd' mon='$mon' md='$md' start='$start' end='$end' dur='${dur//[)(]/}'"
done < file
Output:
prc='ProcessA' wd='Mon' mon='Nov' md='9' start='09:59' end='10:48' dur='00:48'
prc='ProcessB' wd='Sun' mon='Nov' md='8' start='11:16' end='11:17' dur='00:00'
prc='ProcessC' wd='Sat' mon='Nov' md='7' start='12:52' end='12:53' dur='00:00'
prc='ProcessD' wd='Fri' mon='Nov' md='6' start='09:31' end='11:25' dur='01:54'
prc='ProcessE' wd='Thu' mon='Nov' md='5' start='16:41' end='16:41' dur='00:00'
prc='ProcessF' wd='Thu' mon='Nov' md='5' start='11:39' end='11:40' dur='00:00'
read generally doesn't care how much whitespace is between the fields.
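A quick way to see that for yourself:
read -r a b c <<< 'foo     bar   baz'
echo "$a|$b|$c"    # prints foo|bar|baz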
In bash, you can use a regex to parse each line:
#! /bin/bash
while IFS=' ' read -r line ; do
if [[ "$line" =~ ([^\ ]+)\ +(.+[^\ ])\ +'('([^\)]+)')' ]] ; then
process=${BASH_REMATCH[1]}
date=${BASH_REMATCH[2]}
time=${BASH_REMATCH[3]}
echo "$process $date $time."
fi
done < file
Or, use parameter expansions:
#! /bin/bash
shopt -s extglob
while IFS=' ' read -r process datetime ; do
date=${datetime%%+( )\(*}
time=${datetime#*\(}
time=${time%\)}
echo "$process $date $time."
done < file
Using awk:
awk '{printf "%s", $1; for (i=2; i<NF; i++) printf " %s", $i; print "", $NF}' < file.txt
produces:
ProcessA Mon Nov 9 09:59 - 10:48 (00:48)
ProcessB Sun Nov 8 11:16 - 11:17 (00:00)
ProcessC Sat Nov 7 12:52 - 12:53 (00:00)
ProcessD Fri Nov 6 09:31 - 11:25 (01:54)
ProcessE Thu Nov 5 16:41 - 16:41 (00:00)
ProcessF Thu Nov 5 11:39 - 11:40 (00:00)
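If you want the three parts captured separately inside awk rather than just reprinted, one possible sketch (assuming the last field is always the parenthesised duration):
awk '{
    proc = $1                                  # process name
    dur  = $NF; gsub(/[()]/, "", dur)          # duration with the parentheses stripped
    date = $2                                  # rebuild the date/time range from the middle fields
    for (i = 3; i < NF; i++) date = date " " $i
    printf "process=%s date=%s time=%s\n", proc, date, dur
}' file.txt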

Does anybody have a script that counts the number of consecutive files which contain a specific word?

Any resources or advice would help, since I am pretty rubbish at scripting
So, I need to go to this path: /home/client/data/storage/customer/data/2020/09/15
And check to see if there are 5 or more consecutive files that contain the word "REJECTED":
ls -ltr
-rw-rw-r-- 1 root root 5059 Sep 15 00:05 customer_rlt_20200915000514737_20200915000547948_8206b49d-b585-4360-8da0-e90b8081a399.zip
-rw-rw-r-- 1 root root 5023 Sep 15 00:06 customer_rlt_20200915000547619_20200915000635576_900b44dc-1cf4-4b1b-a04f-0fd963591e5f.zip
-rw-rw-r-- 1 root root 39856 Sep 15 00:09 customer_rlt_20200915000824108_20200915000908982_b87b01b3-a5dc-4a80-b19d-14f31ff667bc.zip
-rw-rw-r-- 1 root root 39719 Sep 15 00:09 customer_rlt_20200915000901688_20200915000938206_38261b59-8ebc-4f9f-9e2d-3e32eca3fd4d.zip
-rw-rw-r-- 1 root root 12829 Sep 15 00:13 customer_rlt_20200915001229811_20200915001334327_1667be2f-f1a7-41ae-b9ca-e7103d9abbf8.zip
-rw-rw-r-- 1 root root 12706 Sep 15 00:13 customer_rlt_20200915001333922_20200915001357405_609195c9-f23a-4984-936f-1a0903a35c07.zip
Example of rejected file:
customer_rlt_20200513202515792_20200513202705506_5b8deae0-0405-413c-9a81-d1cc2171fa51REJECTED.zip
What I have so far:
#!/bin/bash
YYYY=$(date +%Y);
MM=$(date +%m)
DD=$(date +%d)
#Set constants
CODE_OK=0
CODE_WARN=1
CODE_CRITICAL=2
CODE_UNKNOWN=3
#Set Default Values
FILE="/home/client/data/storage/customer/data/${YYYY}/${MM}/${DD}"
if [ ! -d "$FILE" ]
then
echo "NO TRANSACTIONS FOUND"
exit $CODE_CRITICAL
fi
You can do something quick in AWK:
$ cat consec.awk
/REJECTED/ {
if (match_line == NR - 1) {
consecutives++
} else {
consecutives = 1
}
if (consecutives == 5) {
print "5 REJECTED"
exit
}
match_line = NR
}
$ touch 1 2REJECTED 3REJECTED 5REJECTED 6REJECTED 7REJECTED 8
$ ls -1 | awk -f consec.awk
5 REJECTED
$ rm 3REJECTED; touch 3
$ ls -1 | awk -f consec.awk
$
This works by matching lines containing REJECTED, counting consecutive matches (checked with match_line == NR - 1, which means "the last matching line was the previous line"), and printing "5 REJECTED" once the count of consecutive lines reaches 5.
I've used ls -1 (note digit 1, not letter l) to sort by filename in this example. You could use ls -1rt (digit 1 again) to sort by file modification time, as in your original post.
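To run it against your actual directory (assuming the YYYY/MM/DD variables from your script are set), something like:
ls -1rt "/home/client/data/storage/customer/data/${YYYY}/${MM}/${DD}" | awk -f consec.awk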

How to get a filename list with ncftp?

So I tried
ncftpls -l
which gives me a list
-rw-r--r-- 1 100 ftpgroup 3817084 Jan 29 15:50 1548773401.tar.gz
-rw-r--r-- 1 100 ftpgroup 3817089 Jan 29 15:51 1548773461.tar.gz
-rw-r--r-- 1 100 ftpgroup 3817083 Jan 29 15:52 1548773521.tar.gz
-rw-r--r-- 1 100 ftpgroup 3817085 Jan 29 15:53 1548773582.tar.gz
-rw-r--r-- 1 100 ftpgroup 3817090 Jan 29 15:54 1548773642.tar.gz
But all I want is to check the timestamp (which is the name of the tar.gz)
How do I only get the timestamp list?
As requested: all I wanted to do was delete old backups, so awk was a good idea (at least it was effective), even if those weren't the right parameters. My method for deleting old backups is probably not the best, but it works:
ncftpls *authParams* | (awk '{match($9,/^[0-9]+/, a)}{ print a[0] }') | while read fileCreationDate; do
VALIDITY_LIMIT="$((`date +%s`-600))"
a=$VALIDITY_LIMIT
b=$fileCreationDate
if [ $b -lt $a ];then
deleteFtpFile $b
fi
done;
You can use awk to print just the timestamps (the file names without the .tar.gz suffix) from the output, like so:
ncftpls -l | awk '{ sub(/\.tar\.gz$/, "", $9); print $9 }'
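If the goal is to list only the backups older than a cutoff, the comparison can also move into awk itself; a sketch assuming the long listing and the 600-second window from your loop (*authParams* stays a placeholder for your connection options):
cutoff=$(( $(date +%s) - 600 ))
ncftpls -l *authParams* | awk -v cutoff="$cutoff" '{
    ts = $9                          # file name, e.g. 1548773401.tar.gz
    sub(/\.tar\.gz$/, "", ts)        # strip the extension to leave the epoch timestamp
    if (ts + 0 < cutoff) print $9    # names of backups older than the cutoff
}'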

Unix script for checking logs for last 10 days

I have a log table which is maintained for a single day, so the data in the table is only present for one day. However, the logs for it are present in the unix directory.
My requirement is to check the logs for the last 10 days and find the count of records that got loaded.
In the log file the pattern is something like this( fastload log of teradata).
**** 13:16:49 END LOADING COMPLETE
Total Records Read = 443303
Total Error Table 1 = 0 ---- Table has been dropped
Total Error Table 2 = 0 ---- Table has been dropped
Total Inserts Applied = 443303
Total Duplicate Rows = 0
I want the script to be parametrized (the parameter will be the stage table name) and to find the records inserted into the table and the error tables for the last 10 days.
Is this possible? Can anyone help me build the unix script for this?
There are many logs in the logs directory. What if I want to check only the ones below:
bash-3.2$ ls -ltr 2018041*S_EVT_ACT_FLD*
-rw-rw----+ 1 edwops abgrp 52610 Apr 10 17:37 20180410173658_S_EVT_ACT_FLD.log
-rw-rw----+ 1 edwops abgrp 52576 Apr 11 18:12 20180411181205_S_EVT_ACT_FLD.log
-rw-rw----+ 1 edwops abgrp 52646 Apr 13 18:04 20180413180422_S_EVT_ACT_FLD.log
-rw-rw----+ 1 edwops abgrp 52539 Apr 14 16:16 20180414161603_S_EVT_ACT_FLD.log
-rw-rw----+ 1 edwops abgrp 52538 Apr 15 14:15 20180415141523_S_EVT_ACT_FLD.log
-rw-rw----+ 1 edwops abgrp 52576 Apr 16 15:38 20180416153808_S_EVT_ACT_FLD.log
Thanks.
find . -ctime -10 -type f -print|xargs awk -F= '/Total Records Read/ {print $2}'|paste -sd+| bc
find . -ctime -10 -type f -print gets the names of files changed within the last 10 days in the current working directory. To run it on a different directory, replace . with that path.
awk -F= '/Total Records Read/ {print $2}' uses = as the field separator and prints the part after the = on any line containing the key phrase
Total Records Read
paste -sd+ joins those numbers into a single line separated by + signs
bc evaluates the resulting expression into a single answer
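Since the table name appears in the log file name (as in your listing), the same pipeline can be restricted to one stage table passed as a parameter; a sketch (S_EVT_ACT_FLD is just your example value):
TABLE="$1"    # e.g. S_EVT_ACT_FLD
find . -ctime -10 -type f -name "*${TABLE}*" -print | xargs awk -F= '/Total Records Read/ {print $2}' | paste -sd+ | bc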
I could not use find because the system is Solaris, and its find doesn't have the -maxdepth feature. I use a case statement to create FILTER2 and use it with
ls -l --time-style=long-iso FOLDER | grep -E $FILTER2
but I know it's not a good way.
LOCAL_DAY=`date "+%d"`
LOCAL_MONTH=`date "+%Y-%m"`
LASTTENDAY_MONTH=`date --date='10 days ago' "+%Y-%m"`
case $LOCAL_DAY in
0*)
FILTER2="$LASTTENDAY_MONTH-[2-3][0-9]|$LOCAL_MONTH";;
1*)
FILTER2="$LOCAL_MONTH-0[0-9]|$LOCAL_MONTH-1[0-9]";;
2*)
FILTER2="$LOCAL_MONTH-1[0-9]|$LOCAL_MONTH-2[0-9]";;
esac

pick up files based on dates in ksh script

I have this list of files. Now I have to pick the latest file based on some conditions:
3679 Jul 21 23:59 belk_rpo_error_**po9324892**_07212014.log
0 Jul 22 23:59 belk_rpo_error_**po9324892**_07222014.log
3679 Jul 23 23:59 belk_rpo_error_**po9324892**_07232014.log
22 Jul 22 06:30 belk_rpo_error_**po9324267**_07012014.log
0 Jul 20 05:50 belk_rpo_error_**po9999992**_07202014.log
411 Jul 21 06:30 belk_rpo_error_**po9999992**_07212014.log
742 Jul 21 07:30 belk_rpo_error_**po9999991**_07212014.log
0 Jul 23 2014 belk_rpo_error_**po9999991**_07232014.log
For a PARTICULAR Order_No (marked with ** **):
If the latest file is 0 kB, then we will discard it (and the rest of the files with the same Order_no as well).
If the latest file is non-zero, then I will take it (only the latest one).
Then append the contents to a txt file.
My expected output would be ::
411 Jul 21 06:30 belk_rpo_error_**po9999992**_07212014.log
3679 Jul 23 23:59 belk_rpo_error_**po9324892**_07232014.log
22 Jul 22 06:30 belk_rpo_error_**po9324267**_07012014.log
I am at my wits' end here. I can't seem to figure out how to compare dates in Unix. Any help is much appreciated.
You can try something like:
touch test.txt
for var in $(find . ! -empty -exec ls -r {} \;)
do
cat "$var" >> test.txt
done
untested
use stat to emit date (epoch time), size and filename.
use awk to filter out zero-length files and extract order number.
sort by order number and date
awk to pick up the last filename for each order number
stat -c $'%Y\t%s\t%n' *.log |
awk -F'\t' -v OFS='\t' '
$2 > 0 {
split($3, a, /_/)
print a[4], $1, $3
}' |
sort -t $'\t' -k1,1 -k2,2n |
awk -F'\t' '
NR > 1 && $1 != prev_order {print filename}
{filename = $3; prev_order = $1}
END {print filename}
'
The sort command might be wrong: In order to group by order number, you might need to sort first by file time then by order number.
If I understand your question, the resulting files need to be concatenated and appended to a file. If the above pipeline is working OK, then pipe into | xargs cat >> something.log
