How to tell find command to escape space characters in file names? - bash

I have a single-line find command which recursively checks and prints out the size, owner, and name of files of a specific type created in a specific time frame. But in the result, the filename column is cut off at the first space character in the directory or file name.
Is there any way to fix this within this single command, without writing any loop in bash? Thanks!
Here is the command:
find /path/to/dist -type f -iname "*.png" -newermt 2015-01-01 ! -newermt 2016-12-31 -ls | sort -k5 | sort -k5 | awk '{print $5"\t"$7"\t"$11}'

Try changing your awk command to this:
awk '{$1=$2=$3=$4=$6=$8=$9=$10="" ; print $0}'
So that the whole command becomes this:
find /path/to/dist -type f -iname "*.png" -newermt 2015-01-01 ! -newermt 2016-12-31 -ls | sort -k5 | awk '{$1=$2=$3=$4=$6=$8=$9=$10="" ; print $0}'
This leaves some extra spaces at the beginning of the line, hopefully it works for your purpose.
I have removed the second instance of sort, as it sorts on the same key as the first, which does not seem likely to do anything.
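If the leading blanks are a problem, another option is to print the two wanted fields and then rejoin everything from field 11 onward (a sketch; note that it collapses runs of consecutive spaces inside a name to single spaces):
find /path/to/dist -type f -iname "*.png" -newermt 2015-01-01 ! -newermt 2016-12-31 -ls | sort -k5 | awk '{printf "%s\t%s\t", $5, $7; for (i = 11; i <= NF; i++) printf "%s%s", $i, (i < NF ? " " : "\n")}'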

Well, thanks to Arno's input, the following line does the job. I used exec (-exec ls -lh {} \;) to make the size human-readable:
find /Path/To/Dest/ -type f -iname "*.pdf" -newermt 2015-01-01 ! -newermt 2016-12-31 -exec ls -lh {} \; | sed 's/\\ /!!!/g' | sort -k5 | awk '{gsub("!!!"," ",$9);print $3"\t"$5"\t"$9}'

I found the following solution: hide the spaces in the filename. I did it with sed, using an unlikely string, "!!!", to replace "\ ", and then restored the spaces in the awk command. Here is the command I used for my tests:
find . -type f -iname "*.pdf" -newermt 2015-01-01 -ls | sed 's/\\ /!!!/g' | sort -k5 | awk '{gsub("!!!"," ",$11);print $5"\t"$7"\t"$11}'

The -print0 action of find is probably the starting point. From find's manual page:
-print0
       True; print the full file name on the standard output, followed by a null character (instead of the newline character that -print uses). This allows file names that contain newlines or other types of white space to be correctly interpreted by programs that process the find output. This option corresponds to the -0 option of xargs.
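For instance, paired with xargs -0 the names survive whitespace intact. A sketch, assuming GNU stat (its --printf with %U, %s, and %n prints owner, size, and name):
find /path/to/dist -type f -iname "*.png" -newermt 2015-01-01 ! -newermt 2016-12-31 -print0 | xargs -0 stat --printf '%U\t%s\t%n\n'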
But find has the nice -printf action, which is even better:
find /path/to/dist -type f -iname "*.png" -newermt 2015-01-01 ! -newermt 2016-12-31 -printf "%u\t%s\t%p\n" | sort
probably does the job.

Related

I want to grep a pattern inside a file and list that file based on current date

ls -l | grep "Feb 22" | grep -l "good" *
This is the command I am using. I have 4 files, among which one file contains the word "good". I want to list that file, and that file's creation date is the current date. Based on both criteria I want to list that file.
Try this:
find . -type f -newermt 2018-02-21 ! -newermt 2018-02-22 -exec grep -l good {} \;
or
find . -type f -newermt 2018-02-21 ! -newermt 2018-02-22 | xargs grep -l good
And please, don't parse ls output.
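Since the original question at the top of this page is about spaces in file names, note that plain xargs splits on whitespace; a null-separated variant is more robust (a sketch):
find . -type f -newermt 2018-02-21 ! -newermt 2018-02-22 -print0 | xargs -0 grep -l good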
Try the command below. How does it work? find with the parameter -mtime -1 searches the current directory and its subdirectories for files modified within the last day. Each file found is passed to grep one at a time, and grep checks for the string in that file.
find . -mtime -1 -type f | xargs grep -i "good"
The above command lists all the files with the current date. To list files of a specific kind, use the command below. Here I am listing only txt files.
find . -name "*.txt" -mtime -1 -type f | xargs grep -i "good"
find . runs it from the current directory (the dot means the current directory). To run it from a specific directory path, modify it like below:
find /yourpath/ -name "*.txt" -mtime -1 -type f | xargs grep -i "good"
Also, grep -i means "ignore case". For a case-sensitive match just use grep "good".
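The same whitespace caveat applies here; a null-separated form of the last command would be (a sketch):
find /yourpath/ -name "*.txt" -mtime -1 -type f -print0 | xargs -0 grep -i "good"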

BASH script: list all files including subdirectories and sort them by date

I have a bash script:
for entry in "/home/pictures"/*
do
echo "ls -larth $entry"
done
I also want to list the files in subfolders and include their paths
I want to sort the results by date
It must be a bash script, because some other software (Jenkins) will call it .
Try find.
find /home/pictures -type f -exec ls -l --full-time {} \; | sort -k 6
If there are no newlines in file names, use:
find /home/pictures -type f -printf '%T@ %p\n' | sort -n
If you cannot tolerate timestamps in the output, use:
find /home/pictures -type f -printf '%28T@ %p\n' | sort -n | cut -c30-
If there is a possibility of newlines in file names, and if you can make the program that consumes the output accept null-terminated records, you can use:
find /home/pictures -type f -printf '%T@,%p\0' | sort -nz
For no timestamps in the output, use:
find /home/pictures -type f -printf '%28T@ %p\0' | sort -nz | cut -zc30-
P.S.
I have assumed that you want to sort by last modification time.
I found the solution to my question:
find . -name '*' -exec ls -larth {} +
Note that the * has to be quoted so the shell does not expand it, and that with very many files -exec ... + runs ls more than once, so the date ordering only holds within each batch.

Bash Filtering strings with other than alphanumeric symbols

I have this line in my script
find $DIR -type f \( -iname "*.*" ! -iname ".*" \) | awk -F. '{print $NF}' | sort -u
And it basically just finds every non-hidden file and prints its extension, one per line, then sorts and de-duplicates them, so an output could be, for example:
exe
c
x
png
lg_CNG
new
lib-old
s
I made that up (it should be in alphabetical order as well), but my question is: can I somehow exclude extensions containing any non-alphabetical symbol (_, -, /, ...)? Thank you
I have made a little change to your original command; it works for me:
find . -type f -iname '*' ! -iname '.*' | sed -r -e '/[-_]/d' | awk -F'.' '{ print $NF }' | sort -u
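Note that the sed step deletes any line containing - or _ anywhere in the path, not just in the extension. A variant that filters after extracting the extension, keeping only purely alphabetic ones, could look like this (a sketch using grep's POSIX character class):
find . -type f -iname '*.*' ! -iname '.*' | awk -F. '{ print $NF }' | grep -E '^[[:alpha:]]+$' | sort -u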

Use find, wc, and sed to count lines

I was trying to use sed to count all the lines based on a particular extension.
find -name '*.m' -exec wc -l {} \; | sed ...
I was trying to do the following; how would I include sed in this particular line to get the totals?
You may also get the nice formatting from wc with:
wc `find -name '*.m'`
Most of the answers here won't work well for a large number of files. Some will break if the list of file names is too long for a single command line call, others are inefficient because -exec starts a new process for every file. I believe a robust and efficient solution would be:
find . -type f -name "*.m" -print0 | xargs -0 cat | wc -l
Using cat in this way is fine, as its output is piped straight into wc so only a small amount of the files' content is kept in memory at once. If there are too many files for a single invocation of cat, cat will be called multiple times, but all the output will still be piped into a single wc process.
You can cat all files through a single wc instance to get the total number of lines:
find . -name '*.m' -exec cat {} \; | wc -l
On modern GNU platforms, find and wc take -print0 and --files0-from parameters that can be combined into a command that counts the lines in each file, with a total at the end. Example:
find . -name '*.c' -type f -print0 | wc -l --files0-from=-
You could also use sed in place of wc for counting lines:
find . -name '*.m' -exec sed -n '$=' {} \;
Here $ addresses the last line and = prints its line number, so each file's line count is printed.
EDIT
You could also try something like sloccount.
Hm, the solution with cat may be problematic if you have many files, especially big ones. The second solution doesn't give a total, just lines per file, as I tested. I'd prefer something like this:
find . -name '*.m' | xargs wc -l | tail -1
This will do the job fast, no matter how many or how big the files are.
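Note, though, that with enough files xargs itself will run wc more than once, and tail -1 then only reports the last batch's total. Summing the per-file counts instead avoids that (a sketch, using the null-separated form for safety with odd file names):
find . -name '*.m' -print0 | xargs -0 wc -l | awk '!/ total$/ { s += $1 } END { print s }'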
sed is not the proper tool for counting. Use awk instead:
find . -name '*.m' -exec awk 'END { print NR }' {} +
Using + instead of \; forces find to call awk every N files found (like with xargs).
For big directories we should use:
find . -type f -name '*.m' -exec sed -n '$=' '{}' + 2>/dev/null | awk '{ total+=$1 }END{print total}'
# alternative using awk twice
find . -type f -name '*.m' -exec awk 'END {print NR}' '{}' + 2>/dev/null | awk '{ total+=$1 }END{print total}'

total size of group of files selected with 'find'

For instance, I have a large filesystem that is filling up faster than I expected. So I look for what's being added:
find /rapidly_shrinking_drive/ -type f -mtime -1 -ls | less
And I find, well, lots of stuff. Thousands of files of six or seven types. I can single out a type and count them:
find /rapidly_shrinking_drive/ -name "*offender1*" -mtime -1 -ls | wc -l
but what I'd really like is to be able to get the total size on disk of these files:
find /rapidly_shrinking_drive/ -name "*offender1*" -mtime -1 | howmuchspace
I'm open to a Perl one-liner for this, if someone's got one, but I'm not going to use any solution that involves a multi-line script, or File::Find.
The command du tells you about disk usage. Example usage for your specific case:
find rapidly_shrinking_drive/ -name "offender1" -mtime -1 -print0 | du --files0-from=- -hc | tail -n1
(Previously I wrote du -hs, but on my machine that appears to disregard find's input and instead summarises the size of the cwd.)
Darn, Stephan202 is right. I didn't think about du -s (summarize), so instead I used xargs with du and awk:
find rapidly_shrinking_drive/ -name "offender1" -mtime -1 | xargs du | awk '{total+=$1} END{print total}'
I like the other answer better though, and it's almost certainly more efficient.
With GNU find,
find /path -name "offender" -printf "%s\n" | awk '{t+=$1}END{print t}'
I'd like to promote jason's comment above to the status of answer, because I believe it's the most mnemonic (though not the most generic, if you really gotta have the file list specified by find):
$ du -hs *.nc
6.1M foo.nc
280K foo_region_N2O.nc
8.0K foo_region_PS.nc
844K foo_region_xyz.nc
844K foo_region_z.nc
37M ETOPO1_Ice_g_gmt4.grd_region_zS.nc
$ du -ch *.nc | tail -n 1
45M total
$ du -cb *.nc | tail -n 1
47033368 total
Recently I faced the (almost) same problem and came up with this solution:
find "$path" -type f -printf '%s '
It'll show the file sizes in bytes. From man find:
-printf format
       True; print format on the standard output, interpreting `\' escapes and `%' directives. Field widths and precisions can be specified as with the `printf' C function. Please note that many of the fields are printed as %s rather than %d, and this may mean that flags don't work as you might expect. This also means that the `-' flag does work (it forces fields to be left-aligned). Unlike -print, -printf does not add a newline at the end of the string.
       ...
       %s     File's size in bytes.
       ...
And to get a total I used this:
echo $[ $(find "$path" -type f -printf %s+)0] #b
echo $[($(find "$path" -type f -printf %s+)0)/1024] #Kb
echo $[($(find "$path" -type f -printf %s+)0)/1024/1024] #Mb
echo $[($(find "$path" -type f -printf %s+)0)/1024/1024/1024] #Gb
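The $[ ... ] form is obsolete bash arithmetic syntax; the same trick written with the modern $(( )) form would be (a sketch):
echo $(( ( $(find "$path" -type f -printf '%s+') 0 ) / 1024 / 1024 )) # MiB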
I have tried all these commands, but no luck.
So I have found this one that gives me an answer:
find . -type f -mtime -30 -exec ls -l {} \; | awk '{ s+=$5 } END { print s }'
Since OP specifically said:
I'm open to a Perl one-liner for this, if someone's got one, but I'm
not going to use any solution that involves a multi-line script, or
File::Find.
...and there's none yet, here is the Perl one-liner:
find . -name "*offender1*" | perl -lne '$total += -s $_; END { print $total }'
You could also use ls -l to find their size, then awk to extract and sum the size:
find /rapidly_shrinking_drive/ -name "offender1" -mtime -1 -print0 | xargs -0 ls -l | awk '{ total += $5 } END { print total }'
