List files that have been modified in the last x days, excluding today - bash

This lists the files modified in the last day (24 hours) (-mtime -1) whose names contain the string "UGW" (-name '*UGW*'):
find ./ -mtime -1 -type f -name '*UGW*' -printf '%Tc %p\n' | sort
Is there an easy way to modify this so that I only list the last X days but exclude today?
Note:
I am sorting here, but I do not think this sorts 100% correctly by timestamp, since the %Tc output starts with the weekday name and sorts lexically rather than chronologically.
EDIT1: based on the answer below, I am getting the following error:
:~/tmp$ find ./ -mtime -1 -type f -name '*' -printf '%Tc %p\n'
Tue 08 Mar 2016 12:25:01 NZDT ./compareKPIs-log
Mon 07 Mar 2016 18:05:02 NZDT ./log-file
Tue 08 Mar 2016 12:25:01 NZDT ./compareKPIs-error
Mon 07 Mar 2016 18:05:02 NZDT ./backup_public_html_20160307.tgz
:~/tmp$ comm -13 <(find ./ -daystart -mtime -1 -type f -printf '%Tc %p\n' ) <(find ./ -daystart -mtime -3 -type f -printf '%Tc %p\n' )
Sun 06 Mar 2016 18:05:00 NZDT ./backup_public_html_20160306.tgz
comm: file 2 is not in sorted order
Mon 07 Mar 2016 18:05:02 NZDT ./log-file
comm: file 1 is not in sorted order
Mon 07 Mar 2016 18:05:02 NZDT ./backup_public_html_20160307.tgz

EDIT2: The actual way to do this:
find ./ -daystart -mtime -3 -type f ! -mtime -1 -printf '%Tc %p\n'
Uses:
! -mtime -1 to exclude today
-daystart to start at 00:00
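For example, a sketch that parameterizes the window (DAYS is an assumed variable name, not from the original) and sorts numerically on %T@, the epoch-seconds timestamp, which sidesteps the %Tc sorting problem noted above:
# Sketch: files modified in the last $DAYS days, excluding today
DAYS=3
find ./ -daystart -type f -name '*UGW*' -mtime -"$DAYS" ! -mtime -1 -printf '%T@ %p\n' | sort -n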
Really hackish, but what about
comm -13 <(find ./ -daystart -mtime -1 -type f -printf '%Tc %p\n' | sort) <(find ./ -daystart -mtime -3 -type f -printf '%Tc %p\n' | sort)
The command line options for comm are:
-1 suppress column 1 (lines unique to FILE1)
-2 suppress column 2 (lines unique to FILE2)
-3 suppress column 3 (lines that appear in both files)
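For instance, -13 leaves only the lines unique to FILE2, as this quick demonstration with two tiny hypothetical inputs shows:
$ comm -13 <(printf 'a\nb\n') <(printf 'b\nc\n')
c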

Related

Concatenate many files into one file without the header

I have three csv files (with the same name, e.g. A_bestInd.csv) that are located in different subfolders. I want to copy all of them into one file (e.g. All_A_bestInd.csv). To do that, I did the following:
{ find . -type f -name A_bestInd.csv -exec cat '{}' \; ; } >> All_A_bestInd.csv
The result of this command is the following:
Class Conf 1 2 3 4 //header of file1
A Reduction 5 1 2 1
A Reduction 1 8 1 10
Class Conf 1 2 3 4 //header of file2
A No_red 2 1 3 2
A No_red 3 6 1 9
Class Conf 1 2 3 4 //header of file3
A Reduction 5 5 8 9
A Reduction 7 2 1 11
As you can see, the issue is that the header of each file is copied. How can I change my command to keep only one header and skip the rest?
Use tail -n +2 to trim the header from each file.
find . -type f -name A_bestInd.csv -exec tail -n +2 {} \; >> All_A_bestInd.csv
To keep just one header you could combine it with head -1.
{ find . -type f -name A_bestInd.csv -exec head -1 {} \; -quit
find . -type f -name A_bestInd.csv -exec tail -n +2 {} \; ; } >> All_A_bestInd.csv
There are solutions with tail -n +2 and awk, but it seems to me the classic way to print all but the first line of a file is sed: sed -e 1d. (Note this strips the header from every file, including the first; prepend a single header with head -1 as above if you need one.) So:
find . -type f -name A_bestInd.csv -exec sed -e 1d '{}' \; >> All_A_bestInd.csv
Use awk to filter out the header lines from all files but the first (unless you have thousands of them, in which case find may invoke awk more than once and NR will restart):
find . -type f -name 'A_bestInd.csv' -exec awk 'NR==1 || FNR>1' {} + > 'All_A_bestInd.csv'
NR==1 || FNR>1 means: if the number of the current line from the start of input is 1, or the number of the current line from the start of the current file is greater than 1, print the current line.
$ cat A_bestInd.csv
Class Conf 1 2 3 4 //header of file3
A Reduction 5 5 8 9
A Reduction 7 2 1 11
$
$ cat foo/A_bestInd.csv
Class Conf 1 2 3 4 //header of file1
A Reduction 5 1 2 1
A Reduction 1 8 1 10
$
$ cat bar/A_bestInd.csv
Class Conf 1 2 3 4 //header of file2
A No_red 2 1 3 2
A No_red 3 6 1 9
$
$ find . -type f -name 'A_bestInd.csv' -exec awk 'NR==1 || FNR>1' {} + > 'All_A_bestInd.csv'
$
$ cat All_A_bestInd.csv
Class Conf 1 2 3 4 //header of file1
A Reduction 5 1 2 1
A Reduction 1 8 1 10
A Reduction 5 5 8 9
A Reduction 7 2 1 11
A No_red 2 1 3 2
A No_red 3 6 1 9

error in script containing grep

I am getting a grep "invalid option" error from the code below:
file=$(find . -mtime -4 |ls -lt)
for f in $file
do
po=$(echo $f|cut -d"_" -f2)
find . -mtime -4 |ls -lt|grep "$po"|while read fn
do
if [ -s $fn ]; then #checks if the file is not empty
if [ -d tmp ]; then
rm -r tmp
fi
mkdir tmp
cp -p $fn /tmp/$fn
break
fi
done
done
Basically I am trying to sort the list I am getting from find, then loop through it, taking the latest non-empty file for each PO.
The list of files is:
-rw-rw-r-- 1 loneranger loneranger 37 Jul 21 06:30 belk_po12345_20140721.log
-rw-rw-r-- 1 loneranger loneranger 24 Jul 22 06:30 belk_po12345_20140722.log
-rw-rw-r-- 1 loneranger loneranger 0 Jul 23 06:30 belk_po12345_20140723.log
-rw-rw-r-- 1 loneranger loneranger 11 Jul 24 12:00 belk_po12348_20140723.log
PO: these are po12345 or po12348...
Basically I am trying to sort the list I am getting from find,
then loop through it, taking the latest non-empty file for each PO.
You might use find for all of that except the final sort:
find . -size '+1c' -type f -printf "%f %T@\n" | sort -k2
The find part searches for files (-type f) more than 1 byte long (-size '+1c') and for each one prints the file's base name (%f) and the modification time as seconds since Jan. 1, 1970, 00:00 GMT (%T@). After that, it is a simple sort on the second field (the timestamp).
Of course, you might add any other search criteria you need to find, but that's the basic idea.
And if you want to loop over the result, do as usual:
find . -size '+1c' -type f -printf "%f %T@\n" |
sort -k2 |
while read fname mts; do
# ...
done
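Putting it together for the original goal, here is a sketch (assuming bash 4+ for associative arrays, GNU find, and file names of the form belk_<po>_<date>.log with no spaces in the paths):
# Sketch: keep the newest non-empty file per PO, then copy it to ./tmp
declare -A latest
while read -r path mts; do
    base=${path##*/}      # strip leading directories
    po=${base#belk_}      # drop the "belk_" prefix
    po=${po%%_*}          # keep only the PO field
    latest[$po]=$path     # newer entries overwrite older ones
done < <(find . -size +0c -type f -name 'belk_*.log' -printf '%p %T@\n' | sort -k2 -n)
mkdir -p tmp
for po in "${!latest[@]}"; do
    cp -p "${latest[$po]}" tmp/
done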

Can sed be used with find to print Fields and Folder Name

This small script:
touch ilFldsN9LS.txt
ls -l | grep "^d" > /home/userB/PLAY/LibTESTxOutputFiles/ilFldsN9LS_testTEST.txt
produces file content of this format:
drwxr-xr-x 2 userB userB 4096 Mar 23 22:40 BASH_Collection_FolderNESTY
drwxr-xr-x 2 userB userB 4096 Mar 24 17:33 BASH_Collection_Functionality
What I wish to achieve is output very much like the above, but using find.
Turning to find, this script (which, unlike the previous one, is recursive):
find . -type d \( ! -iname ".*" \) -exec ls -ld {} \; | grep "^d" |
tee -a /home/UserB/PLAY/LibTESTxOutputFiles/ilFldsR9FB.txt
produces file content of this format:
drwxr-xr-x 2 UserB UserB 4096 Mar 24 17:33 ./BASH_Collection_Functionality
drwxr-xr-x 2 UserB UserB 4096 Mar 25 16:04 ./LibTESTxOutputFiles/AdditionalTESTresults
Adding some awk to the script, like so:
find . -type d \( ! -iname ".*" \) -exec ls -ld {} \; | grep "^d" | awk '{ sub(/\.\//, " ");print}' |
tee -a /home/innocentxlii/PLAY/LibTESTxOutputFiles/ilFldsR9FB.txt
produces output with the leading ./ stripped from the front of the path, padding the gap with an extra space to ease reading:
drwxr-xr-x 2 UserB UserB 4096 Mar 24 17:33 BASH_Collection_Functionality
drwxr-xr-x 2 UserB UserB 4096 Mar 25 16:04 LibTESTxOutputFiles/AdditionalTESTresults
Where I have gotten stuck is in trying to use sed to keep the fields but list only the last folder in the path. For example, the last item above would become:
drwxr-xr-x 2 UserB UserB 4096 Mar 25 16:04 AdditionalTESTresults
Ideas? I have tried literally dozens of sed variants but have realized something must be wrong with this approach.
How about this: sed -r 's/(.* )(.*)\/(.*)$/\1\3/g'
Check my test runs below:
$ echo "drwxr-xr-x 2 UserB UserB 4096 Mar 25 16:04 dir1/dir2/dir3/dir4" | sed -r 's/(.* )(.*)\/(.*)$/\1\3/g'
drwxr-xr-x 2 UserB UserB 4096 Mar 25 16:04 dir4
$ echo "drwxr-xr-x 2 UserB UserB 4096 Mar 25 16:04 dir1/dir2/dir3" | sed -r 's/(.* )(.*)\/(.*)$/\1\3/g'
drwxr-xr-x 2 UserB UserB 4096 Mar 25 16:04 dir3
$ echo "drwxr-xr-x 2 UserB UserB 4096 Mar 25 16:04 dir1/dir2" | sed -r 's/(.* )(.*)\/(.*)$/\1\3/g'
drwxr-xr-x 2 UserB UserB 4096 Mar 25 16:04 dir2
$ echo "drwxr-xr-x 2 UserB UserB 4096 Mar 25 16:04 dir1" | sed -r 's/(.* )(.*)\/(.*)$/\1\3/g'
drwxr-xr-x 2 UserB UserB 4096 Mar 25 16:04 dir1
NOTE: there is a space after the first asterisk in the sed command.
You don't need sed; you can do it all with awk:
find . -type d \( ! -iname ".*" \) -exec ls -ld {} \; |
awk '{n=split($NF,a,"/"); sub($NF, " "a[n])}1'
grep "^d" is redundant since you are already finding directories with find . -type d
awk Explanation:
n=split($NF,a,"/") splits the last field (denoted by $NF) by / and assigns it to array a
n gives the length of the array
a[n] will therefore return the string following the last / (i.e. the innermost directory)
sub($NF, " "a[n]) replaces the last field (denoted by $NF) with a space for padding (as per the example) plus the innermost directory (denoted by a[n])
awk '{...}1': the trailing 1 outside the braces is an always-true pattern whose default action is print, so every (modified) line is printed
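A minimal illustration of the trailing 1 idiom, on a hypothetical input line:
$ echo "one two three" | awk '{$NF="three!"}1'
one two three!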
EDIT: for cases where directory names contain spaces:
find . -type d \( ! -iname ".*" \) -exec ls -ld {} \; |
awk -F '[0-9]+:[0-9]+ ' '{n=split($NF,a,"/"); sub($NF, " "a[n])}1'
specifying the input separator with -F '[0-9]+:[0-9]+ ' (matching the modification time) ensures the last field ($NF) is the file name, regardless of whether the directory name contains spaces
what about
sed 's|\./||'
that removes the first ./, so in your line:
find . -type d \( ! -iname ".*" \) -exec ls -ld {} \; | sed 's|\./||'

Finding which files are taking up most space

On a mac terminal, I want to find out which files are the biggest in my project.
I try:
du -h | sort
But this sorts by path first, and only then by file size within each path. How do I sort just by file size?
Thanks
Try
du -scm * | sort -n
If you want to have it as a nice zsh function you can use this:
function dudir () { du -scm ${1:-*(ND)} | sort -n }
Sort by numeric/reversed:
$ du -sk * | sort -nr
190560 find_buggy_pos.out
126676 DerivedData
29460 fens.txt
11108 cocos2d_html.tar.gz
484 ccore.log
164 ccore.out
16 a.out.dSYM
12 x
12 p
12 o
12 a.out
4 x.txt
4 trash.c
4 test2.cpp
4 test.cpp
4 stringify.py
4 ptest.c
4 o.cpp
4 mismatch.txt
4 games.pgn
It appears that you want to list files by size. Try:
find . -type f -printf "%s %p\n" | sort -n
(By default, du doesn't list counts for files. Use the -a or --all option to list count for files as well.)
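For instance, a minimal variant that includes files (du -a and -k work with both GNU and BSD du):
du -ak . | sort -nr | head -n 20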
On OSX the following works:
find . -maxdepth 1 -type f -exec du -k {} \; | sort -nr
Use the -k option:
du -sk * | sort -n

Script using find to count lines of code

I'm trying to create a shell script that will count the number of lines of code in one folder.
I got this:
h=find . -type f -name \*.[h]* -print0 | xargs -0 cat | wc -l
m=find . -type f -name \*.[m]* -print0 | xargs -0 cat | wc -l
expr $m + $h
But when I try to run it I get this:
lines-of-code: line 6: .: -t: invalid option
.: usage: . filename [arguments]
0
lines-of-code: line 7: .: -t: invalid option
.: usage: . filename [arguments]
0
+
I know I have to do something to make it run on the specific folder I'm in. Is this even possible?
DDIYS (don't do it yourself): use cloc instead. It is an excellent tool written in Perl that does the counting for you, as well as other things, and it recognizes more than 80 languages.
Example output:
prompt> cloc perl-5.10.0.tar.gz
4076 text files.
3883 unique files.
1521 files ignored.
http://cloc.sourceforge.net v 1.50 T=12.0 s (209.2 files/s, 70472.1 lines/s)
-------------------------------------------------------------------------------
Language files blank comment code
-------------------------------------------------------------------------------
Perl 2052 110356 130018 292281
C 135 18718 22862 140483
C/C++ Header 147 7650 12093 44042
Bourne Shell 116 3402 5789 36882
Lisp 1 684 2242 7515
make 7 498 473 2044
C++ 10 312 277 2000
XML 26 231 0 1972
yacc 2 128 97 1549
YAML 2 2 0 489
DOS Batch 11 85 50 322
HTML 1 19 2 98
-------------------------------------------------------------------------------
SUM: 2510 142085 173903 529677
-------------------------------------------------------------------------------
Wrap the commands in command substitution, like:
h=$(find . -type f -name \*.[h]* -print0 | xargs -0 cat | wc -l)
Please also have a look at sloccount for counting lines of code. You can install it on debian/ubuntu with sudo apt-get install sloccount
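A minimal usage sketch, pointing it at the current directory:
sloccount .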
For this specific problem, I have a different solution:
find . -type f -print0 | wc -l --files0-from=-
Maybe I misunderstood the question, but does this work for you?
wc -l *.[mh]*
Now it works!
h=$(find . -type f -name \*.[h]* -print0 | xargs -0 cat | wc -l)
m=$(find . -type f -name \*.[m]* -print0 | xargs -0 cat | wc -l)
expr $m + $h
