error in script containing grep - shell

I am getting a grep "invalid option" error in the code below:
file=$(find . -mtime -4 | ls -lt)
for f in $file
do
    po=$(echo $f | cut -d"_" -f2)
    find . -mtime -4 | ls -lt | grep "$po" | while read fn
    do
        if [ -s $fn ]; then    # checks that the file is not empty
            if [ -d tmp ]; then
                rm -r tmp
            fi
            mkdir tmp
            cp -p $fn /tmp/$fn
            break
        fi
    done
done
Basically I am trying to sort the list I get from find, then loop through it, taking the latest non-zero file for each PO.
The list of files is:
-rw-rw-r-- 1 loneranger loneranger 37 Jul 21 06:30 belk_po12345_20140721.log
-rw-rw-r-- 1 loneranger loneranger 24 Jul 22 06:30 belk_po12345_20140722.log
-rw-rw-r-- 1 loneranger loneranger 0 Jul 23 06:30 belk_po12345_20140723.log
-rw-rw-r-- 1 loneranger loneranger 11 Jul 24 12:00 belk_po12348_20140723.log
PO refers to the order number in the file name: po12345 or po12348 here...

Basically I am trying to sort the list I get from find, then loop through it, taking the latest non-zero file for each PO.
You might use find for all of that except the final sort:
find . -size '+1c' -type f -printf "%f %T@\n" | sort -k2
The find part searches for files (-type f) more than one byte long (-size '+1c') and, for each one, prints the file's base name (%f) and the modification time as seconds since Jan 1, 1970, 00:00 GMT (%T@). After that, it is a simple sort on the second field (the timestamp).
Of course, you can add any other search criteria you need to find, but that's the basic idea.
And if you want to loop over the result, do as usual:
find . -size '+1c' -type f -printf "%f %T@\n" |
sort -k2 |
while read fname mts; do
    # ...
done
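For the original task (keeping only the newest non-empty log per PO), the loop could be replaced by an awk pass. A minimal sketch, assuming GNU find/sort and the belk_<po>_<date>.log naming from the listing above; latest_per_po.txt is a hypothetical output file:
find . -size '+1c' -type f -name 'belk_*.log' -printf '%T@ %f\n' |
sort -n |                      # oldest first, so later entries win
awk '{
    split($2, a, "_")          # a[2] is the PO field, e.g. po12345
    latest[a[2]] = $2          # newer files overwrite older ones
}
END {
    for (po in latest) print latest[po]
}' > latest_per_po.txt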

Related

Unix script for checking logs for last 10 days

I have a log table which is maintained for a single day, so the data in the table is only present for one day. However, the logs for it are present in a Unix directory.
My requirement is to check the logs for the last 10 days and find the count of records that got loaded.
In the log files the pattern is something like this (a Teradata FastLoad log):
**** 13:16:49 END LOADING COMPLETE
Total Records Read = 443303
Total Error Table 1 = 0 ---- Table has been dropped
Total Error Table 2 = 0 ---- Table has been dropped
Total Inserts Applied = 443303
Total Duplicate Rows = 0
I want the script to be parametrized (the parameter will be the stage table name) so that it finds the records inserted into the table and the error tables for the last 10 days.
Is this possible? Can anyone help me build the Unix script for this?
There are many logs in the logs directory. What if I want to check only the ones below?
bash-3.2$ ls -ltr 2018041*S_EVT_ACT_FLD*
-rw-rw----+ 1 edwops abgrp 52610 Apr 10 17:37 20180410173658_S_EVT_ACT_FLD.log
-rw-rw----+ 1 edwops abgrp 52576 Apr 11 18:12 20180411181205_S_EVT_ACT_FLD.log
-rw-rw----+ 1 edwops abgrp 52646 Apr 13 18:04 20180413180422_S_EVT_ACT_FLD.log
-rw-rw----+ 1 edwops abgrp 52539 Apr 14 16:16 20180414161603_S_EVT_ACT_FLD.log
-rw-rw----+ 1 edwops abgrp 52538 Apr 15 14:15 20180415141523_S_EVT_ACT_FLD.log
-rw-rw----+ 1 edwops abgrp 52576 Apr 16 15:38 20180416153808_S_EVT_ACT_FLD.log
Thanks.
find . -ctime -10 -type f -print | xargs awk -F= '/Total Records Read/ {print $2}' | paste -sd+ | bc
find . -ctime -10 -type f -print gets the names of the files changed within the last 10 days under the current working directory. To run it on a different directory, replace . with that path.
awk -F= '/Total Records Read/ {print $2}' uses = as the field separator and prints the second half of any line containing the key phrase Total Records Read.
paste -sd+ joins those numbers into one line with plus signs between them.
bc evaluates the resulting stream of numbers and operators into a single answer.
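To parametrize this by stage table name, as asked, a minimal sketch could look like the one below; /path/to/logs is a placeholder, and the *_<table>.log naming is taken from the listing above:
#!/bin/sh
# Usage: ./count_loads.sh S_EVT_ACT_FLD
# Sums "Total Records Read" over one table's logs from the last 10 days.
table="$1"
find /path/to/logs -ctime -10 -type f -name "*_${table}.log" -print |
xargs awk -F= '/Total Records Read/ {print $2}' |
paste -sd+ | bc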
I could not use find because the system is Solaris, and its find doesn't have the -maxdepth feature. I use case to create a FILTER2 and use it with
ls -l --time-style=long-iso FOLDER | grep -E $FILTER2
but I know it's not a good way.
LOCAL_DAY=`date "+%d"`
LOCAL_MONTH=`date "+%Y-%m"`
LASTTENDAY_MONTH=`date --date='10 days ago' "+%Y-%m"`
case $LOCAL_DAY in
    0*)
        FILTER2="$LASTTENDAY_MONTH-[2-3][0-9]|$LOCAL_MONTH";;
    1*)
        FILTER2="$LOCAL_MONTH-0[0-9]|$LOCAL_MONTH-1[0-9]";;
    2*)
        FILTER2="$LOCAL_MONTH-1[0-9]|$LOCAL_MONTH-2[0-9]";;
esac
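Putting it together, the filter would then be applied along these lines (FOLDER standing in for the actual log directory):
# Keep only lines whose long-iso date falls inside the 10-day window.
ls -l --time-style=long-iso "$FOLDER" | grep -E "$FILTER2"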

Csh - Fetching fields via awk inside xargs

I'm struggling to understand this behavior:
Script behavior: read a file (containing dates); list the files in a multi-level directory tree and get their sizes; print the file size only (future step: sum the overall file sizes).
Starting script:
cat dates | xargs -I {} sh -c "echo '{}: '; du -d 2 "/folder/" | grep {} | head"
2000-03:
1000 /folder/2000-03balbasldas
2000-04:
12300 /folder/2000-04asdwqdas
[and so on]
But when I try to filter via awk on the first field, I still get the whole line:
cat dates | xargs -I {} sh -c "echo '{}: '; du -d 2 "/folder/" | grep {} | awk '{print $1}'"
2000-03:
1000 /folder/2000-03balbasldas
2000-04:
12300 /folder/2000-04asdwqdas
I've already approached it via divide-et-impera, and the following command works just fine:
du -d 2 "/folder/" | grep '2000-03' | awk '{print $1}'
1000
I'm afraid that I'm missing something very trivial, but I haven't found anything so far.
Any idea? Thanks!
Input: directory containing folders named YYYY-MM-random_data and a file containing strings:
ls -l
drwxr-xr-x 2 user staff 68 Apr 24 11:21 2000-03-blablabla
drwxr-xr-x 2 user staff 68 Apr 24 11:21 2000-04-blablabla
drwxr-xr-x 2 user staff 68 Apr 24 11:21 2000-05-blablabla
drwxr-xr-x 2 user staff 68 Apr 24 11:21 2000-06-blablabla
drwxr-xr-x 2 user staff 68 Apr 24 11:21 2000-06-blablablb
drwxr-xr-x 2 user staff 68 Apr 24 11:21 2000-06-blablablc
[...]
cat dates
2000-03
2000-04
2000-05
[...]
Expected output: the sum of the disk space occupied by all the files contained in the folders whose names include a string from the file dates:
2000-03: 1000
2000-04: 2123
2000-05: 1222112
[...]
======
But in particular, I'm interested in why awk is not able to fetch column $1 as I asked it to.
OK, it seems I found the answer myself after a lot of research. :D
I'll post it here, hoping that it will help somebody else out.
https://unix.stackexchange.com/questions/282503/right-syntax-for-awk-usage-in-combination-with-other-command-inside-xargs-sh-c
The trick was to escape the $ sign: because the script passed to sh -c is wrapped in double quotes, the outer shell expands $1 before sh ever runs, leaving awk with '{print }', which prints the whole line. Escaping it as \$1 hands awk a literal $1.
cat dates | xargs -I {} sh -c "echo '{}: '; du -d 2 "/folder/" | grep {} | awk '{print \$1}'"
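An alternative sketch that avoids the escape on $1 for the shell entirely: single-quote the script passed to sh -c and hand {} over as a positional parameter (the trailing sh only fills $0):
cat dates | xargs -I {} sh -c 'printf "%s: " "$1"; du -d 2 /folder/ | grep "$1" | awk "{print \$1}"' sh {}
Here the outer shell expands nothing inside the single quotes, and the inner sh turns \$1 into a literal $1 for awk.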
Using GNU Parallel it looks like this:
parallel --tag "eval du -s folder/{}* | perl -ne '"'$s+=$_ ; END {print "$s\n"}'"'" :::: dates
--tag prepends the line with the date.
{} is replaced with the date.
eval du -s folder/{}* finds all the dirs starting with the date and gives the total du from those dirs.
perl -ne '$s+=$_ ; END {print "$s\n"}' sums up the output from du
Finally, there is a bit of quoting trickery to get it all quoted correctly.
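If GNU Parallel isn't available, a plain loop can produce the expected "date: total" output; a minimal sketch assuming the same /folder/ layout and dates file, with the summing done in awk:
while read d; do
    printf '%s: ' "$d"
    # Sum the size column of every matching du line; s+0 prints 0 when nothing matches.
    du -d 2 /folder/ | grep "$d" | awk '{s += $1} END {print s + 0}'
done < dates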

List files that have been modified for the last x days, excluding today

This lists the files that have been modified within the last day (24 hours), -mtime -1, and that contain the string "UGW" in their name, -name '*UGW*':
find ./ -mtime -1 -type f -name '*UGW*' -printf '%Tc %p\n' | sort
Is there an easy way to modify this so that I only list the last X days but exclude today?
Note:
I am sorting here, but I do not think this sorts 100% correctly by timestamp.
EDIT 1: based on the answer below, I am getting the error below:
:~/tmp$ find ./ -mtime -1 -type f -name '*' -printf '%Tc %p\n'
Tue 08 Mar 2016 12:25:01 NZDT ./compareKPIs-log
Mon 07 Mar 2016 18:05:02 NZDT ./log-file
Tue 08 Mar 2016 12:25:01 NZDT ./compareKPIs-error
Mon 07 Mar 2016 18:05:02 NZDT ./backup_public_html_20160307.tgz
:~/tmp$ comm -13 <(find ./ -daystart -mtime -1 -type f -printf '%Tc %p\n' ) <(find ./ -daystart -mtime -3 -type f -printf '%Tc %p\n' )
Sun 06 Mar 2016 18:05:00 NZDT ./backup_public_html_20160306.tgz
comm: file 2 is not in sorted order
Mon 07 Mar 2016 18:05:02 NZDT ./log-file
comm: file 1 is not in sorted order
Mon 07 Mar 2016 18:05:02 NZDT ./backup_public_html_20160307.tgz
EDIT: The actual way to do this:
find ./ -daystart -mtime -3 -type f ! -mtime -1 -printf '%Tc %p\n'
Uses:
! -mtime -1 to exclude today
-daystart to start at 00:00
Really hackish, but what about
comm -13 <(find ./ -daystart -mtime -1 -type f -printf '%Tc %p\n' | sort) <(find ./ -daystart -mtime -3 -type f -printf '%Tc %p\n' | sort)
The command line options for comm are:
-1 suppress column 1 (lines unique to FILE1)
-2 suppress column 2 (lines unique to FILE2)
-3 suppress column 3 (lines that appear in both files)
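As a side note on the sorting concern above: %Tc prints locale-formatted dates that do not sort chronologically as text; printing epoch seconds with %T@ and sorting numerically does, as in this small variation:
# Files modified 1-3 days ago, counted from today's 00:00 (-daystart);
# %T@ (seconds since the epoch) sorts reliably with a numeric sort.
find ./ -daystart -mtime -3 ! -mtime -1 -type f -printf '%T@ %p\n' | sort -n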

pick up files based on dates in ksh script

I have this list of files. Now I have to pick the latest file based on some conditions:
3679 Jul 21 23:59 belk_rpo_error_**po9324892**_07212014.log
0 Jul 22 23:59 belk_rpo_error_**po9324892**_07222014.log
3679 Jul 23 23:59 belk_rpo_error_**po9324892**_07232014.log
22 Jul 22 06:30 belk_rpo_error_**po9324267**_07012014.log
0 Jul 20 05:50 belk_rpo_error_**po9999992**_07202014.log
411 Jul 21 06:30 belk_rpo_error_**po9999992**_07212014.log
742 Jul 21 07:30 belk_rpo_error_**po9999991**_07212014.log
0 Jul 23 2014 belk_rpo_error_**po9999991**_07232014.log
For a PARTICULAR Order_No (marked with ** **):
If the latest file is 0 kB then we will discard it (and the rest of the files with the same Order_No as well).
If the latest file is non-zero then I will take it (only the latest one).
Then append the contents to a txt file.
My expected output would be:
411 Jul 21 06:30 belk_rpo_error_**po9999992**_07212014.log
3679 Jul 23 23:59 belk_rpo_error_**po9324892**_07232014.log
22 Jul 22 06:30 belk_rpo_error_**po9324267**_07012014.log
I am at my wits' end here. I can't seem to figure out how to compare dates in Unix. Any help is very much appreciated.
You can try something like:
touch test.txt
for var in `find . ! -empty -exec ls -r {} \;`
do
    cat $var >> test.txt
done
untested
use stat to emit date (epoch time), size and filename.
use awk to filter out zero-length files and extract order number.
sort by order number and date
awk to pick up the last filename for each order number
stat -c $'%Y\t%s\t%n' *.log |
awk -F'\t' -v OFS='\t' '
$2 > 0 {
split($3, a, /_/)
print a[4], $1, $3
}' |
sort -t $'\t' -k1,1 -k2,2n |
awk -F'\t' '
NR > 1 && $1 != prev_order {print filename}
{filename = $3; prev_order = $1}
END {print filename}
'
The sort command might be wrong: in order to group by order number, you might need to sort first by file time and then by order number.
If I understand your question, the resulting files need to be concatenated and appended to a file. If the above pipeline is working OK, then pipe it into | xargs cat >> something.log
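End to end, under the same untested assumptions, that would be (something.log is the hypothetical destination mentioned above):
stat -c $'%Y\t%s\t%n' *.log |
awk -F'\t' -v OFS='\t' '$2 > 0 {split($3, a, /_/); print a[4], $1, $3}' |
sort -t $'\t' -k1,1 -k2,2n |
awk -F'\t' 'NR > 1 && $1 != prev_order {print filename}
            {filename = $3; prev_order = $1}
            END {print filename}' |
xargs cat >> something.log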

Can sed be used with find to print Fields and Folder Name

This small script :
touch ilFldsN9LS.txt
ls -l | grep "^d" > /home/userB/PLAY/LibTESTxOutputFiles/ilFldsN9LS_testTEST.txt
produces file content of this format:
drwxr-xr-x 2 userB userB 4096 Mar 23 22:40 BASH_Collection_FolderNESTY
drwxr-xr-x 2 userB userB 4096 Mar 24 17:33 BASH_Collection_Functionality
What I wish to achieve is to get output very much like the above, but using find.
Turning to find, this script (which, unlike the previous one, is recursive):
find . -type d \( ! -iname ".*" \) -exec ls -ld {} \; | grep "^d" \
| tee -a /home/UserB/PLAY/LibTESTxOutputFiles/ilFldsR9FB.txt
produces file content of this format:
drwxr-xr-x 2 UserB UserB 4096 Mar 24 17:33 ./BASH_Collection_Functionality
drwxr-xr-x 2 UserB UserB 4096 Mar 25 16:04 ./LibTESTxOutputFiles/AdditionalTESTresults
Adding some awk to the script, like so:
find . -type d \( ! -iname ".*" \) -exec ls -ld {} \; | grep "^d" | awk '{ sub(/\.\//, " "); print }' \
| tee -a /home/innocentxlii/PLAY/LibTESTxOutputFiles/ilFldsR9FB.txt
produces output with the leading ./ stripped from the front of the path
and the gap padded with an extra space to ease reading:
drwxr-xr-x 2 UserB UserB 4096 Mar 24 17:33 BASH_Collection_Functionality
drwxr-xr-x 2 UserB UserB 4096 Mar 25 16:04 LibTESTxOutputFiles/AdditionalTESTresults
Where I have gotten stuck is that I have been trying to use sed to keep the fields but list only the last folder in the path. For example, the last item above would become:
drwxr-xr-x 2 UserB UserB 4096 Mar 25 16:04 AdditionalTESTresults
Ideas? I tried literally dozens of sed variants but have realized something must be wrong with this approach.
How about this: sed -r 's/(.* )(.*)\/(.*)$/\1\3/g'
Check my test runs below:
$ echo "drwxr-xr-x 2 UserB UserB 4096 Mar 25 16:04 dir1/dir2/dir3/dir4" | sed -r 's/(.* )(.*)\/(.*)$/\1\3/g'
drwxr-xr-x 2 UserB UserB 4096 Mar 25 16:04 dir4
$ echo "drwxr-xr-x 2 UserB UserB 4096 Mar 25 16:04 dir1/dir2/dir3" | sed -r 's/(.* )(.*)\/(.*)$/\1\3/g'
drwxr-xr-x 2 UserB UserB 4096 Mar 25 16:04 dir3
$ echo "drwxr-xr-x 2 UserB UserB 4096 Mar 25 16:04 dir1/dir2" | sed -r 's/(.* )(.*)\/(.*)$/\1\3/g'
drwxr-xr-x 2 UserB UserB 4096 Mar 25 16:04 dir2
$ echo "drwxr-xr-x 2 UserB UserB 4096 Mar 25 16:04 dir1" | sed -r 's/(.* )(.*)\/(.*)$/\1\3/g'
drwxr-xr-x 2 UserB UserB 4096 Mar 25 16:04 dir1
NOTE: Please note that there is a space after the first asterisk in the sed command (inside the first group, (.* )).
You don't need sed, you can do it all with awk
find . -type d \( ! -iname ".*" \) -exec ls -ld {} \; |
awk '{n=split($NF,a,"/"); sub($NF, " "a[n])}1'
grep "^d" is redundant since you are already finding directories with find . -type d
awk Explanation:
n=split($NF,a,"/") splits the last field (denoted by $NF) by / and assigns it to array a
n gives the length of the array
a[n] will therefore return the string following the last / (i.e. the inner most directory)
sub($NF, " "a[n]) replaces the last field (denoted by $NF) with a space for padding (as per example) + inner most directory (denoted by a[n])
awk '{...}1': the trailing 1 is an always-true pattern whose default action is print, so every (modified) line is printed
EDIT: for cases where the directory name contains spaces:
find . -type d \( ! -iname ".*" \) -exec ls -ld {} \; |
awk -F '[0-9]+:[0-9]+ ' '{n=split($NF,a,"/"); sub($NF, " "a[n])}1'
Specifying the input separator with -F '[0-9]+:[0-9]+ ' (matching the modification time) ensures the last field ($NF) is the file name, regardless of whether the directory name contains spaces.
What about
sed 's|\./||'
which removes the first ./? So your line becomes:
find . -type d \( ! -iname ".*" \) -exec ls -ld {} \; | sed 's|\./||'
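For completeness, GNU find can print an ls -ld style line with just the final directory name by itself, with no ls, grep, sed, or awk at all; a sketch, assuming GNU find's -printf directives are available:
# %M perms, %n links, %u user, %g group, %s size, %Tb %Td %TH:%TM mtime, %f basename
find . -type d ! -iname ".*" -printf '%M %n %u %g %s %Tb %Td %TH:%TM %f\n'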
