The script runs normally on Ubuntu Linux, where I can call bin_packing.awk, but when I try to run it on Solaris (Unix) I get an error:
find: bad option -printf
find: [-H | -L] path-list predicate-list
awk: syntax error near line 1
awk: bailing out near line 1
This is the command that works on Ubuntu:
$ find . -type f -iname '*pdf' -printf "%s %p\n" \
| awk -v c=100000 -f bin_packing.awk
I have tried this, and it works, but without the | awk ... part:
$ find . -type f -name '*.pdf' -print | perl -lne '$,=" "; @s=stat $_; print $s[7],$_' \
| awk -v c=100000 -f bin_packing.awk
On modern systems, you can use GNU stat or GNU find to extract size without needing to do something awful like parse ls.
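For illustration, a sketch of the GNU stat route (its -c format supports %s, the size in bytes, and %n, the file name):

$ find . -type f -iname '*.pdf' -exec stat -c '%s %n' {} + \
| awk -v c=100000 -f bin_packing.awk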
Unfortunately, you're not on a modern system, so it's time to do something awful. Fortunately, size is one of the fields of ls that can be semi-reliably parsed (when running it over only one file at a time) as long as you're on a platform that doesn't allow crazy things like usernames with spaces.
find . -type f -iname '*.pdf' -exec bash -c '
  for name; do
    # field 5 of "ls -l" output is the size in bytes
    read -r _ _ _ _ size _ < <(ls -l -- "$name")
    printf "%s %s\n" "$size" "$name"
  done
' _ {} + | awk -v c=100000 -f bin_packing.awk
If the -exec ... {} + syntax doesn't work, you can change the + to a \; to make this slower but more compatible.
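Spelled out, that variant looks like this; since \; passes exactly one file per invocation, the loop can be dropped (and each file now costs one bash startup, which is where the slowdown comes from):

find . -type f -iname '*.pdf' -exec bash -c '
  read -r _ _ _ _ size _ < <(ls -l -- "$1")
  printf "%s %s\n" "$size" "$1"
' _ {} \; | awk -v c=100000 -f bin_packing.awk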
Solaris' own find doesn't support -printf. On current Solaris releases, GNU find ships as /usr/gnu/bin/find; for older Solaris, install GNU findutils and use gfind.
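With GNU find installed (here assumed to be on the PATH as gfind), the original pipeline works unchanged:

$ gfind . -type f -iname '*.pdf' -printf "%s %p\n" \
| awk -v c=100000 -f bin_packing.awk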
Related
I have a command that finds all .png image files on my Linux machine, but is there something I can add to it so that it also copies each one to a mapped drive/location I have set up?
The command to find the .png files is here:
find / -type f -exec file --mime-type {} \; | awk '{if ($NF == "image/png") print $0 }'
The mapped drive on the machine is as follows, as reported via df -h
/mnt/nas
If you need to check the MIME type, take your command, extract the file-name part of its output with sed, and run cp via xargs:
find / -type f -exec file --mime-type {} \; | awk '{if ($NF == "image/png") print $0 }' \
| sed 's/:.\+$//' | xargs -i cp -p {} /mnt/nas
If the .png extension is enough to determine the file type, the command is simply:
find / -name '*.png' -exec cp -p {} /mnt/nas \;
or
find / -name '*.png' | xargs -i cp -p {} /mnt/nas
Note that I checked the commands on GNU bash only.
It's a bit strange that you run file --mime-type to determine whether the file is a png. Couldn't you just look at the file extension like so?
find / -type f -iname '*.png'
find uses short-circuit evaluation: if one of its tests (e.g. -iname, but also -exec) fails for a file, the subsequent tests are skipped for that file. Therefore you can use
find / -type f -iname '*.png' -exec cp {} /mnt/nas \;
For faster execution you might want to switch to -exec cp -t /mnt/nas {} + if your cp supports -t.
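Spelled out (GNU cp's -t option names the target directory up front, which lets find append many files to a single cp invocation):

find / -type f -iname '*.png' -exec cp -t /mnt/nas {} +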
If you want to stick to file --mime-type, use
find / -type f -exec sh -c \
'file -b --mime-type "$0" | grep -qx image/png && cp "$0" /mnt/nas' {} \;
Both of these approaches correctly handle all filenames, even those with linebreaks in them.
You could adjust your original command by printing just the filename instead of the whole line in awk, then copy the file with xargs (you could also use awk's system() function):
find / -type f -exec file --mime-type {} \; | \
awk '{if ($NF == "image/png") print $1}' | \
xargs -I% cp % /mnt/nas
Note that printing the filename this way is brittle: awk's default delimiter is spaces/tabs, so a filename containing spaces gets truncated, and $1 also keeps the trailing ':' that file appends. I would specify the ': ' delimiter instead, since that is the one used by file --mime-type (and there's no need for an if; awk has a direct syntax for that case):
find / -type f -exec file --mime-type {} \; | \
awk -F ': ' '$NF=="image/png" {print $1}' | \
xargs -I% cp % /mnt/nas
If ":" is used in the filenames you could use any other separator such as #####
I run this from a user's home dir to show me the most recent files while omitting the shell profile files:
find ./ -type f -printf "%T# %p\n"|grep -vP "/\.(bash|emacs|gtkrc|kde/|zshrc)" |sort -n| tail -10|cut -f2- -d" "|while read EACH; do ls -l "$EACH"; done;
This works, but just not as well when placed in my .bashrc as an alias:
alias recentfiles='find ./ -type f -printf "%T# %p\n"|grep -vP "/\.\(bash|emacs|gtkrc|kde/|zshrc\)"|sort -n| tail -10|cut -f2- -d" "|while read EACH; do ls -l "$EACH"; done;'
In the image you see the results without any filtering, followed by the desired result using grep -v for filtering, which works on the command line. Then the final result, from the alias, which only partially succeeds in weeding out those files.
I have tried using bash_ and [b]ash. Not even bas (which fails to even catch .basin) works?! And I can use macs or acs and still get .emacs omitted, so the syntax in my alias is obviously not respecting the /. part either. It's not a problem with reserved words, as I originally thought.
I DO get the expected results if I place my original command as is in a file and then use the alias that way:
alias recentfiles='. /root/mycommands/recentfiles'
Can someone explain or point me to a reference to understand what is at play here? I don't know which terms to search for.
This should fix your problems:
alias recentfiles='find ./ -type f -printf "%T# %p\n"|grep -vP "/\.(bash|emacs|gtkrc|kde/|zshrc)"|sort -n| tail -10|cut -f2- -d" "|while read EACH; do ls -l "$EACH"; done;'
The issue is with grep -P: the -P makes it use Perl-compatible regular expressions, and in Perl regexes there is no need to escape the grouping parentheses. So it's (bash|emacs|...) instead of \(bash|emacs|...\). I really doubt the escaped version worked outside of .bashrc, unless you have some alias for grep that makes it behave differently there.
As others have said in the comments, your filtering is inefficient. Better to rewrite your command as:
find ./ \( -name ".bash*" -o -name ".emacs*" -o -name .gtkrc -o -name .kde -o -name .zshrc \) -prune -o \( -type f -printf "%T# %p\n" \) |sort -n| tail -10|cut -f2- -d" "| tr "\n" "\0" | xargs -0 ls -l;
This way it will not waste time searching files inside .emacs.d/ or inside .kde/, and will immediately prune the search. Also, xargs -0 ls -l is so much shorter and clearer than the while loop.
To avoid issues with filenames that contain newlines, it is better to use \0 (NUL) characters, which can never appear in a file name:
find ./ \( -name ".bash*" -o -name .emacs -o -name .gtkrc -o -name .kde -o -name .zshrc \) -prune -o \( -type f -printf "%T# %p\0" \) |sort -n -z | tail -z -n -10| cut -z -f2- -d" " | xargs -0 ls -l
Part 1: Fixing The Issue
Use a function instead.
There are several major issues with aliases:
Because the alias body is passed as a quoted string when the alias is defined, it's parsed differently than it would be when typed directly at the command line.
Because an alias is simple prefix substitution, they don't have their own arguments ($1, $2, etc); they don't have a call stack; debugging mechanisms like PS4=':$BASH_SOURCE:$LINENO+'; set -x can't tell you which file code from an alias originated in; etc.
Aliases are an interactive feature; POSIX doesn't mandate that shells support them at all, and they're turned off by default during script execution.
Functions solve all these problems.
recentfiles() {
find ./ \
'(' -name '.bash*' -o -name '.emacs*' -o -name .gtkrc -o -name .kde -o -name .zshrc ')' -prune \
-o -type f -printf "%T# %p\0" |
sort -nz |
tail -z -n -10 |
while read -d' ' _ && IFS= read -r -d '' file; do
printf '%s\0' "$file"
done |
xargs -0 ls -ld --
}
Note that I also made several other changes:
Instead of using \n as a separator, the above code uses \0. This is because newlines can be found in filenames; a file that contained newlines in its name could look like any number of files, with any arbitrary sizes it wanted, to the rest of your pipeline. (Unfortunately, POSIX doesn't require that sort and tail support NUL delimiters, so the -z options used above are GNUisms.)
Instead of using grep -v to remove dotfiles, I used the -prune option to find. This is particularly important for directories like .kde, since it stops find from spending the time and I/O bandwidth to recurse down directories for which you intend to throw the results away anyhow.
For documentation of the importance of the IFS= and -r arguments used in the while read loop, see BashFAQ #1. Both of these improve behavior in presence of unusual filenames (clearing IFS prevents trailing whitespace from being stripped; passing -r prevents literal backslashes from being elided).
Instead of grep -P -- a GNU extension which is only available if grep was compiled with libpcre support -- my first cut (prior to moving to find -prune) switched to grep -E, which is adequately expressive, much more widely available, and lends itself to higher performance implementations.
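For reference, a sketch of that grep -E form of the original filter (alternation and grouping are unescaped in ERE, just as in PCRE):

grep -vE '/\.(bash|emacs|gtkrc|kde/|zshrc)'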
Part 2: Explaining The Issue
Running your alias after set -x, we see:
+ find ./ -type f -printf '%T# %p\n'
+ grep -vP '/\.\(bash|emacs|gtkrc|kde/|zshrc\)'
+ sort -n
+ tail -10
+ cut -f2- '-d '
+ read EACH
By contrast, running the command it was intended to wrap, we see:
+ find ./ -type f -printf '%T# %p\n'
+ grep -vP '/\.(bash|emacs|gtkrc|kde/|zshrc)'
+ sort -n
+ tail -10
+ cut -f2- '-d '
+ read EACH
In the command itself, there are no literal backslashes before ( and ).
My command was this:
ls -l|grep "\-[r,-][w,-]x*"|tr -s " " | cut -d" " -f9
But in the result I get all the files, not only the ones the user has the right to execute (the first x bit set).
I'm running Ubuntu Linux.
You can use find with the -perm option:
find . -maxdepth 1 -type f -perm -u+x
OK -- if you MUST use grep:
ls -l | grep '^[^d]..[sx]' | awk '{ print $9 }'
Don't use grep. If you want to know if a file is executable, use test -x. To check all files in the current directory, use find or a for loop:
for f in *; do test -f "$f" -a -x "$f" && echo "$f"; done
or
find . -maxdepth 1 -type f -exec test -x {} \; -print
Use awk with match:
ls -l|awk 'match($1,/^...x/) {print $9}'
match($1, /^...x/): match the first field against the regular expression ^...x, i.e. look for an owner permission string ending in x.
The example below shows the file search and output format I need; it works well with a local find.
> find /DBBACKMEUP/ -not -name "localhost*" -type f -name "*2012-10-26*" -exec du -b {} \; | awk '{print $2 "\t" $1}' | awk -F'/' '{print $NF}'
monitor_2012-10-26_22h00m.11.29.135.Friday.sql.gz 119601
test_2012-10-26_22h00m.10.135.Friday.sql.gz 530
status_2012-10-26_22h00m.1.29.135.Friday.sql.gz 944
But I need to run the same command on many servers, so I planned to execute it like this:
>ssh root@192.168.87.80 "find /DBBACKMEUP/ -not -name "localhost*" -type f -name "*2012-10-26*" -exec du -b {} \; | awk '{print $2 "\t" $1}' | awk -F'/' '{print $NF}'"
Of course this gives me blank output. Is there any way to quote such a search command in the shell and get the output I desire over ssh?
Thanks!!
Looks like your ssh command there has lots of quotes and double-quotes, which may be the root of your problem (no pun intended). I'd recommend that you create a shell script that runs the find command you desire, then place a copy of it on each server. After that, simply use ssh to execute that shell script instead of trying to pass in a complex command.
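A minimal sketch of that idea; the script name, its location, and the date argument are hypothetical:

#!/bin/sh
# dbreport.sh -- print "filename<TAB>bytes" for backup files matching the given date
find /DBBACKMEUP/ -not -name 'localhost*' -type f -name "*$1*" -exec du -b {} \; |
awk '{print $2 "\t" $1}' | awk -F'/' '{print $NF}'

Then each remote run needs no tricky quoting at all: ssh root@192.168.87.80 /root/dbreport.sh 2012-10-26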
Edit:
I think I misunderstood; please correct me if I'm wrong. Are you looking for a way to create a loop that will run the command on a range of IP addresses? If so, here's a recommendation - create a shell script like this:
#!/bin/bash
for ((C=0; C<255; C++)) ; do
for ((D=0; D<255; D++)) ; do
IP="192.168.$C.$D"
ssh root@$IP "find /DBBACKMEUP/ -not -name "localhost*" -type f -name "*2012-10-26*" -exec du -b {} \; | awk '{print "\$"2 \"\\t\" "\$"1}' | awk -F'/' '{print "\$"NF}'"
done
done
Each server?? That must be 749 servers. Your option is good for hard workers; my approach is good for a lazy goose ;) Just a trial did the trick ;)
ssh root@192.168.47.203 "find /DBBACKMEUP/ -not -name "localhost*" -type f -name "*2012-10-26*" -exec du -b {} \; | awk '{print "\$"2 \"\\t\" "\$"1}' | awk -F'/' '{print "\$"NF}'"
Tel_Avaya_Log_2012-10-26_22h00m.105.23.Friday.sql.gz 2119
test_2012-10-26_22h00m.10.25.Friday.sql.gz 529
OBD_2012-10-26_22h00m.103.2.203.Friday.sql.gz 914
For instance, I have a large filesystem that is filling up faster than I expected. So I look for what's being added:
find /rapidly_shrinking_drive/ -type f -mtime -1 -ls | less
And I find, well, lots of stuff: thousands of files of six or seven types. I can single out a type and count them:
find /rapidly_shrinking_drive/ -name "*offender1*" -mtime -1 -ls | wc -l
but what I'd really like is to be able to get the total size on disk of these files:
find /rapidly_shrinking_drive/ -name "*offender1*" -mtime -1 | howmuchspace
I'm open to a Perl one-liner for this, if someone's got one, but I'm not going to use any solution that involves a multi-line script, or File::Find.
The command du tells you about disk usage. Example usage for your specific case:
find rapidly_shrinking_drive/ -name "offender1" -mtime -1 -print0 | du --files0-from=- -hc | tail -n1
(Previously I wrote du -hs, but on my machine that appears to disregard find's input and instead summarises the size of the cwd.)
Darn, Stephan202 is right. I didn't think about du -s (summarize), so instead I used awk:
find rapidly_shrinking_drive/ -name "offender1" -mtime -1 | xargs du | awk '{total+=$1} END{print total}'
I like the other answer better though, and it's almost certainly more efficient.
With GNU find:
find /path -name "offender" -printf "%s\n" | awk '{t+=$1}END{print t}'
I'd like to promote jason's comment above to the status of answer, because I believe it's the most mnemonic (though not the most generic, if you really gotta have the file list specified by find):
$ du -hs *.nc
6.1M foo.nc
280K foo_region_N2O.nc
8.0K foo_region_PS.nc
844K foo_region_xyz.nc
844K foo_region_z.nc
37M ETOPO1_Ice_g_gmt4.grd_region_zS.nc
$ du -ch *.nc | tail -n 1
45M total
$ du -cb *.nc | tail -n 1
47033368 total
Recently I faced the same (almost) problem and I came up with this solution.
find $path -type f -printf '%s '
It'll show file sizes in bytes. From man find:
-printf format
       True; print format on the standard output, interpreting `\' escapes and `%' directives. Field widths and precisions can be specified as with the `printf' C function. Please note that many of the fields are printed as %s rather than %d, and this may mean that flags don't work as you might expect. This also means that the `-' flag does work (it forces fields to be left-aligned). Unlike -print, -printf does not add a newline at the end of the string.
       ...
       %s     File's size in bytes.
       ...
And to get a total I used this:
echo $[ $(find $path -type f -printf %s+)0] #b
echo $[($(find $path -type f -printf %s+)0)/1024] #Kb
echo $[($(find $path -type f -printf %s+)0)/1024/1024] #Mb
echo $[($(find $path -type f -printf %s+)0)/1024/1024/1024] #Gb
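The trick here is that -printf %s+ emits size1+size2+...+ and the trailing 0 completes the arithmetic expression, which the shell then evaluates. The $[ ... ] syntax is deprecated bash arithmetic; a sketch of the same idea in the modern $(( )) form:

echo $(( $(find "$path" -type f -printf '%s+') 0 ))   # total in bytes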
I have tried all these commands, but no luck.
So I found this one, which gives me an answer:
find . -type f -mtime -30 -exec ls -l {} \; | awk '{ s+=$5 } END { print s }'
Since OP specifically said:
I'm open to a Perl one-liner for this, if someone's got one, but I'm
not going to use any solution that involves a multi-line script, or
File::Find.
...and there's none yet, here is the perl one-liner:
find . -name "*offender1*" | perl -lne '$total += -s $_; END { print $total }'
You could also use ls -l to list the sizes, then awk to extract and sum the size column:
find /rapidly_shrinking_drive/ -name "offender1" -mtime -1 | xargs ls -l | awk '{ total += $5 } END { print total }'