Use string as argument in a bash script - bash

I'm trying to write a script that uses VLC to create a playlist.
#!/bin/bash
filename=/media/*/*.mp3
while [ "$1" != "" ]; do
    case $1 in
        -f | --filepath ) shift
            filename=$1
            ;;
        -h | --help ) usage
            exit
            ;;
        * ) usage
            exit 1
    esac
    shift
done
#echo $filename
vlc $filename --novideo --quiet
This script works, but it only finds mp3 files in the root of any USB device, so I want to change the filename variable. This code gives similar results, but it lists everything:
filename=$(find /media/* -name *.mp3 -print)
filename=$(tr '\n' ' ' <<<$filename)
Now the problem is that I can't pass it as an argument. I tried:
vlc $filename --novideo --quiet
or
vlc $*filename --novideo --quiet
or
vlc "$filename" --novideo --quiet
Nothing worked. Any suggestions?
UPDATE:
The problem I want help with is how to make VLC accept the filename variable as the argument (or arguments) listing the files to use in the playlist. filename contains:
/media/MULTIBOOT/Linkin Park - In The End.mp3 /media/MULTIBOOT/Man with a
Mission ft. Takuma - Database.mp3 /media/MULTIBOOT/Sick Puppies - You're
Going Down.mp3 /media/MULTIBOOT/Skillet - Rise.mp3 /media/MULTIBOOT/Song
Riders - Be.mp3 /media/MULTIBOOT/30 Seconds to Mars - This is War.mp3
/media/MULTIBOOT/Fade - One Reason.mp3
Now this is one string; how do I use it as file path arguments?

I would use bash's recursive globbing and arrays:
#!/bin/bash
shopt -s globstar nullglob
files=()
while [[ $1 ]]; do
    case $1 in
        -f | --filepath ) shift
            files+=("$1")
            ;;
        -h | --help ) usage
            exit
            ;;
        * ) usage
            exit 1
    esac
    shift
done
if [[ ${#files[@]} -eq 0 ]]; then
    files=( /media/**/*.mp3 )
    if [[ ${#files[@]} -eq 0 ]]; then
        echo "no mp3 files found"
        exit 1
    fi
fi
#printf "%s\n" "${files[@]}"
vlc "${files[@]}" --novideo --quiet
With this code, you can specify -f filename multiple times to play a few songs.
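For example, if the script were saved as playmp3.sh (a name assumed here just for illustration), both of these invocations would work:
./playmp3.sh                 # no -f given: plays every .mp3 found anywhere under /media
./playmp3.sh -f '/media/usb/Song One.mp3' -f '/media/usb/Song Two.mp3'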

You need to quote *.mp3, otherwise the shell expands it in the current directory before find ever sees it.
filename=$(find /media/* -name '*.mp3' -print)
You also don't need to remove the newlines. When you use the variable without quoting it, all whitespace, including newlines, will be converted to word delimiters.
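A quick sketch of that word splitting (the file names here are made up):
files=$'a.mp3\nb.mp3'
printf '<%s>\n' $files    # unquoted: the newline delimits two words, <a.mp3> and <b.mp3>
printf '<%s>\n' "$files"  # quoted: one word with an embedded newline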

Rather than storing all filenames in a variable, you can tell find to call an application with all the files. This will prevent problems with whitespace, newlines and the like:
find /media -name '*.mp3' -exec vlc --novideo --quiet \{\} \+
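If you prefer a pipeline, a null-delimited equivalent (assuming GNU find and xargs) would be:
find /media -name '*.mp3' -print0 | xargs -0 vlc --novideo --quiet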

A better way to handle options in your script might be to use getopts, if you don't mind losing support for long options. For example:
#!/usr/bin/env bash
while getopts f:h opt; do
    case "$opt" in
        f) filename+=("$OPTARG") ;;
        h) usage; exit 0 ;;
        *) usage; exit 1 ;;
    esac
done
shift $((OPTIND - 1))
# fall back to find if no -f options were given
if [[ ${#filename[@]} -eq 0 ]]; then
    filename=($(find /media/ -name \*.mp3 -type f))
fi
vlc --novideo --quiet "${filename[@]}"
I don't know this usage of VLC, but the effect of this script is to build a command line with all the files found by the find command, which were stored in an array called filename.
An advantage of handling things in an array is that it lends itself to use in for loops.
for thisfile in "${filename[@]}"; do
    vlc "$thisfile" # with options to convert just one file
done
NOTE that since you're using bash, you may not need to use find at all.
shopt -s globstar
filelist=(/media/**/*.mp3)
Check man bash for discussion of globstar.

Related

How can I find whether a file exists in a shell script with a * argument [duplicate]

If I want to check for the existence of a single file, I can test for it using test -e filename or [ -e filename ].
Supposing I have a glob and I want to know whether any files exist whose names match the glob. The glob can match 0 files (in which case I need to do nothing), or it can match 1 or more files (in which case I need to do something). How can I test whether a glob has any matches? I don't care how many matches there are, and it would be best if I could do this with one if statement and no loops, simply because I find that most readable.
(test -e glob* fails if the glob matches more than one file.)
Bash-specific solution:
compgen -G "<glob-pattern>"
Escape the pattern or it'll get pre-expanded into matches.
Exit status is:
1 for no-match,
0 for 'one or more matches'
stdout is a list of files matching the glob.
I think this is the best option in terms of conciseness and minimizing potential side effects.
Example:
if compgen -G "/tmp/someFiles*" > /dev/null; then
echo "Some files exist."
fi
The nullglob shell option is indeed a bashism.
To avoid the need for a tedious save and restore of the nullglob state, I'd only set it inside the subshell that expands the glob:
if test -n "$(shopt -s nullglob; echo glob*)"
then
echo found
else
echo not found
fi
For better portability and more flexible globbing, use find:
if test -n "$(find . -maxdepth 1 -name 'glob*' -print -quit)"
then
echo found
else
echo not found
fi
Explicit -print -quit actions are used for find instead of the default implicit -print action so that find will quit as soon as it finds the first file matching the search criteria. Where lots of files match, this should run much faster than echo glob* or ls glob* and it also avoids the possibility of overstuffing the expanded command line (some shells have a 4K length limit).
If find feels like overkill and the number of files likely to match is small, use stat:
if stat -t glob* >/dev/null 2>&1
then
echo found
else
echo not found
fi
I like
exists() {
[ -e "$1" ]
}
if exists glob*; then
echo found
else
echo not found
fi
This is both readable and efficient (unless there are a huge number of files).
The main drawback is that it's much more subtle than it looks, and I sometimes feel compelled to add a long comment.
If there's a match, "glob*" is expanded by the shell and all the matches are passed to exists(), which checks the first one and ignores the rest.
If there's no match, "glob*" is passed to exists() and found not to exist there either.
Edit: there may be a false positive, see comment
#!/usr/bin/env bash
# If it is set, then an unmatched glob is swept away entirely --
# replaced with a set of zero words --
# instead of remaining in place as a single word.
shopt -s nullglob
M=(*px)
if [ "${#M[*]}" -ge 1 ]; then
echo "${#M[*]} matches."
else
echo "No such files."
fi
If you have failglob set you can use this crazy method (which you really should not):
shopt -s failglob # exit if * does not match
( : * ) && echo 0 || echo 1
or
q=( * ) && echo 0 || echo 1
test -e has the unfortunate caveat that it considers broken symbolic links to not exist. So you may want to check for those, too.
function globexists {
test -e "$1" -o -L "$1"
}
if globexists glob*; then
echo found
else
echo not found
fi
I have yet another solution:
if [ "$(echo glob*)" != 'glob*' ]
This works nicely for me. There may be some corner cases I missed. (One such case: if a file is literally named 'glob*', the expansion equals the unexpanded pattern and the test wrongly reports no match.)
Based on flabdablet's answer, it looks like the easiest (though not necessarily the fastest) approach is to use find itself, while leaving glob expansion to the shell:
find /some/{p,long-p}ath/with/*globs* -quit &> /dev/null && echo "MATCH"
Or in if like:
if find $yourGlob -quit &> /dev/null; then
echo "MATCH"
else
echo "NOT-FOUND"
fi
To simplify miku's answer somewhat, based on his idea:
M=(*py)
if [ -e "${M[0]}" ]; then
echo Found
else
echo Not Found
fi
In Bash, you can glob to an array; if the glob didn't match, your array will contain a single entry that doesn't correspond to an existing file:
#!/bin/bash
shellglob='*.sh'
scripts=($shellglob)
if [ -e "${scripts[0]}" ]
then stat "${scripts[@]}"
fi
Note: if you have nullglob set, scripts will be an empty array, and you should test with [ "${scripts[*]}" ] or with [ "${#scripts[*]}" != 0 ] instead. If you're writing a library that must work with or without nullglob, you'll want
if [ "${scripts[*]}" ] && [ -e "${scripts[0]}" ]
An advantage of this approach is that you then have the list of files you want to work with, rather than having to repeat the glob operation.
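For instance, once the array is populated you can act on every match directly (a sketch; bash -n merely syntax-checks each script):
for script in "${scripts[@]}"; do
    bash -n "$script" # syntax-check each matched shell script
done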
If you want to test if the files exist before iterating over them, you can use this pattern:
for F in glob*; do
if [[ ! -f $F ]]; then break; fi
...
done
If the glob does not match anything, $F will be the non-expanded glob ('glob*' in this case), and since no file with that name exists, the break skips the rest of the loop.
#!/bin/bash
shopt -s nullglob
touch /tmp/foo1 /tmp/foo2 /tmp/foo3
FOUND=0
for FILE in /tmp/foo*
do
    FOUND=$((FOUND + 1))
done
if [ ${FOUND} -gt 0 ]; then
    echo "I found ${FOUND} matches"
else
    echo "No matches found"
fi
set -- glob*
if [ -f "$1" ]; then
echo "It matched"
fi
Explanation
When there isn't a match for glob*, then $1 will contain 'glob*'. The test -f "$1" won't be true because the glob* file doesn't exist.
Why this is better than alternatives
This works with sh and derivatives: KornShell and Bash. It doesn't create any sub-shell. $(..) and `...` commands create a sub-shell; they fork a process, and therefore are slower than this solution.
Like this in Bash (test files containing pattern):
shopt -s nullglob
compgen -W *pattern* &>/dev/null
case $? in
0) echo "only one file match" ;;
1) echo "more than one file match" ;;
2) echo "no file match" ;;
esac
It's far better than compgen -G: because we can discriminates more cases and more precisely.
It can work with only one wildcard *.
This abomination seems to work:
#!/usr/bin/env bash
shopt -s nullglob
if [ "`echo *py`" != "" ]; then
echo "Glob matched"
else
echo "Glob did not match"
fi
It probably requires bash, not sh.
This works because the nullglob option causes the glob to evaluate to an empty string if there are no matches. Thus any non-empty output from the echo command indicates that the glob matched something.
A solution for extended globs (extglob) in Bash:
bash -c $'shopt -s extglob \n /bin/ls -1U <ext-glob-pattern>'
Exit status is 0 if there is at least one match, and non-zero (2) when there is no match. Standard output contains a newline-separated list of matching files (file names containing spaces are quoted).
Or, slightly different:
bash -c $'shopt -s extglob \n compgen -G <ext-glob-pattern>'
Differences to the ls-based solution: probably faster (not measured), file names with spaces not quoted in output, exit code 1 when there is no match (not 2 :shrug:).
Example usage:
No match:
$ bash -c $'shopt -s extglob \n /bin/ls -1U @(*.foo|*.bar)'; echo "exit status: $?"
/bin/ls: cannot access '@(*.foo|*.bar)': No such file or directory
exit status: 2
At least one match:
$ bash -c $'shopt -s extglob \n /bin/ls -1U @(*.ts|*.mp4)'; echo "exit status: $?"
'video1 with spaces.mp4'
video2.mp4
video3.mp4
exit status: 0
Concepts used:
ls' exit code behavior (adds -U for efficiency, and -1 for output control).
Does not enable extglob in current shell (often not desired).
Makes use of the $'...' quoting so that the \n is interpreted, putting the extended glob pattern on a different line than the shopt -s extglob -- otherwise the extended glob pattern would be a syntax error!
Note 1: I worked towards this solution because the compgen -G "<glob-pattern>" approach suggested in other answers does not seem to work smoothly with brace expansion; and yet I needed some more advanced globbing features.
Note 2: lovely resource for the extended glob syntax: extglob
Both nullglob and compgen are available only in bash, not in other shells.
A (non-recursive) solution that works on most shells is:
set -- ./glob* # or /path/dir/glob*
[ -f "$1" ] || shift # remove the glob if present.
if [ "$#" -ge 1 ]
then echo "at least one file found"
fi
Assuming you may want to do something with the files if they exist:
mapfile -t exists < <(find "$dirName" -type f -iname '*.zip')
if [[ ${#exists[@]} -ne 0 ]]; then
    echo "Zip files found"
else
    echo "Zip files not found"
fi
You can then loop through the exists array if you need to do something with the files.
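For example, a minimal follow-up loop over the array might look like this (the echo is a placeholder for real work):
for zip in "${exists[@]}"; do
    echo "processing $zip"
done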
(ls glob* &>/dev/null && echo Files found) || echo No file found
if ls -d $glob > /dev/null 2>&1; then
echo Found.
else
echo Not found.
fi
Note that this can be very time consuming if there are a lot of matches or file access is slow.
ls | grep -q "glob.*"
Not the most efficient solution (if there's a ton of files in the directory it might be slowish), but it's simple, easy to read and also has the advantage that regexes are more powerful than plain Bash glob patterns.
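As a sketch of that extra power (the file names are hypothetical), here is a match a plain glob cannot express, alternation plus an anchored numeric suffix:
ls | grep -qE '^(backup|archive)[0-9]+\.tar\.gz$' && echo "found a numbered backup"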
[ `ls glob* 2>/dev/null | head -n 1` ] && echo true

Shell script on Mac and Ubuntu 14.04: Boolean command line args

I'm trying to write a generic shell script to archive files older than X days that match a pattern passed as a parameter. I'm having a tough time making the boolean parameter, the pattern parameter, and the pattern variable in the script work across Mac and Ubuntu 14.04. I'm new to shell scripting. Any suggestions related to the problem, or best practices, are welcome. Following is the script:
#!/usr/bin/env bash
#Default source dir
SOURCE_DIR=./logs/
# Delete by default
DELETE=YES
# Archive files older by these many days
OLD=7
# Pattern to archive
PATTERN="*.log*"
# Use -gt 1 to consume two arguments per pass in the loop
# (each argument has a corresponding value to go with it).
while [[ $# -gt 1 ]]; do
    key="$1"
    case $key in
        -s|--source_dir)
            SOURCE_DIR="$2"
            shift # past argument
            ;;
        -d|--dest_dir)
            DEST_DIR="$2"
            shift # past argument
            ;;
        -o|--days)
            OLD="$2"
            shift # past argument
            ;;
        -p|--pattern)
            PATTERN="$2"
            shift # past argument
            ;;
        -n|--no-delete)
            DELETE=NO
            ;;
        *)
            # unknown option
            ;;
    esac
    shift # past argument or value
done
if [[ ! -d "$SOURCE_DIR" ]]; then
echo 'Archive source does not exist'
exit 1
fi
SOURCE_DIR=${SOURCE_DIR%/}
if [[ -z "$DEST_DIR" ]]; then
DEST_DIR="${SOURCE_DIR%/}/backup"
fi
DEST_DIR=${DEST_DIR%/}
if [[ ! -d "$DEST_DIR" ]]; then
echo 'Creating destination '$DEST_DIR
mkdir -p -- "$DEST_DIR"
fi
echo $SOURCE_DIR
echo $DEST_DIR
echo $OLD
echo $PATTERN
echo $DELETE
files=$(find $SOURCE_DIR -mtime +$OLD -type f -name $PATTERN)
echo $files
if [[ $DELETE = YES ]]; then
    echo "Delete files"
else
    echo "Don't delete files"
fi
Output on Mac:
(mysql30):recon-etl anshuc$ ./archive.sh -s junk/ -p *.py -o 10 -n
junk
junk/backup
10
*.py
YES
junk/__init__.py junk/client.py
Delete files
(mysql30):recon-etl anshuc$
Output on Ubuntu 14.04:
anshuc:~/workspace/xyz$ ./archive.sh -s ae/tools/ -d ae/logs/backup/ -p *.py -n -o 420
ae/tools
ae/logs/backup
420
*.py
NO
ae/tools/services/__init__.py ae/tools/__init__.py
Don't delete files
anshuc:~/workspace/xyz$
DELETE not working on Mac is the concern. Also, I was having a problem with the PATTERN argument before; through trial and error I have come across a way to make it work, but I'm not sure of the side effects in case someone doesn't use quotes, or of any other intricacies that may be involved. A little input on that would be appreciated. :-)
TIA

How to find latest modified files and delete them with SHELL code

I need some help with a shell script. Currently I have this code:
find $dirname -type f -exec md5sum '{}' ';' | sort | uniq --all-repeated=separate -w 33 | cut -c 35-
This code finds duplicated files (with the same content) in a given directory. What I need to do is update it: find the latest modified file (by date) from the duplicated files list, print that file name, and also offer the opportunity to delete that file in the terminal.
Doing this in pure bash is a tad awkward; it would be a lot easier to write this in Perl or Python. Also, if you were looking to do this with a bash one-liner, it might be feasible, but I really don't know how. Anyhoo, if you really want a pure bash solution, below is an attempt at doing what you describe.
Please note that:
I am not actually calling rm, just echoing it - don't want to destroy your files
There's a "read -u 1" in there that I'm not entirely happy with.
Here's the code:
#!/bin/bash
buffer=''
function process {
    if test -n "$buffer"
    then
        nbFiles=$(printf "%s" "$buffer" | wc -l)
        echo "================================================================================="
        echo "The following $nbFiles files are byte identical and sorted from oldest to newest:"
        ls -lt -c -r $buffer
        lastFile=$(ls -lt -c -r $buffer | tail -1)
        echo
        while true
        do
            read -u 1 -p "Do you wish to delete the last file $lastFile (y/n/q)? " answer
            case $answer in
                [Yy]* ) echo rm $lastFile; break;;
                [Nn]* ) echo skipping; break;;
                [Qq]* ) exit;;
                * ) echo "please answer yes, no or quit";;
            esac
        done
        echo
    fi
}
find . -type f -exec md5sum '{}' ';' |
sort |
uniq --all-repeated=separate -w 33 |
cut -c 35- |
while read -r line
do
    if test -z "$line"
    then
        process
        buffer=''
    else
        buffer=$(printf "%s\n%s" "$buffer" "$line")
    fi
done
process
echo "done"
Here's a "naive" solution implemented in bash (except for two external commands: md5sum, of course, and stat used only for user's comfort, it's not part of the algorithm). The thing implements a 100% Bash quicksort (that I'm kind of proud of):
#!/bin/bash

# Finds similar (based on md5sum) files (recursively) in given
# directory. If several files with same md5sum are found, sort
# them by modified (most recent first) and prompt user for deletion
# of the oldest

die() {
    printf >&2 '%s\n' "$@"
    exit 1
}

quicksort_files_by_mod_date() {
    if ((!$#)); then
        qs_ret=()
        return
    fi
    # the return array is qs_ret
    local first=$1
    shift
    local newers=()
    local olders=()
    qs_ret=()
    for i in "$@"; do
        if [[ $i -nt $first ]]; then
            newers+=( "$i" )
        else
            olders+=( "$i" )
        fi
    done
    quicksort_files_by_mod_date "${newers[@]}"
    newers=( "${qs_ret[@]}" )
    quicksort_files_by_mod_date "${olders[@]}"
    olders=( "${qs_ret[@]}" )
    qs_ret=( "${newers[@]}" "$first" "${olders[@]}" )
}

[[ -n $1 ]] || die "Must give an argument"
[[ -d $1 ]] || die "Argument must be a directory"

dirname=$1

shopt -s nullglob
shopt -s globstar

declare -A files
declare -A hashes

for file in "$dirname"/**; do
    [[ -f $file ]] || continue
    read md5sum _ < <(md5sum -- "$file")
    files[$file]=$md5sum
    ((hashes[$md5sum]+=1))
done

has_found=0
for hash in "${!hashes[@]}"; do
    ((hashes[$hash]>1)) || continue
    files_with_same_md5sum=()
    for file in "${!files[@]}"; do
        [[ ${files[$file]} = $hash ]] || continue
        files_with_same_md5sum+=( "$file" )
    done
    has_found=1
    echo "Found ${hashes[$hash]} files with md5sum=$hash, sorted by modified (most recent first):"
    # sort them by modified date (using quicksort :p)
    quicksort_files_by_mod_date "${files_with_same_md5sum[@]}"
    for file in "${qs_ret[@]}"; do
        printf " %s %s\n" "$(stat --printf '%y' -- "$file")" "$file"
    done
    read -p "Do you want to remove the oldest? [yn] " answer
    if [[ ${answer,,} = y ]]; then
        echo rm -fv -- "${qs_ret[@]:1}"
    fi
done

if ((!has_found)); then
    echo "Didn't find any similar files in directory \`$dirname'. Yay."
fi
I guess the script is self-explanatory (you can read it like a story). It uses the best practices I know of, and is 100% safe regarding any silly characters in file names (e.g., spaces, newlines, file names starting with hyphens, file names ending with a newline, etc.).
It uses bash's globs, so it might be a bit slow if you have a bloated directory tree.
There are a few error checks, but many are missing, so don't use as-is in production! (It's a trivial but rather tedious task to add them.)
The algorithm is as follows: scan each file in the given directory tree; for each file, compute its md5sum and store it in associative arrays:
files, with the file names as keys and the md5sums as values.
hashes, with the hashes as keys and, as values, the number of files whose md5sum is the key.
After this is done, we scan through all the found md5sums, select only the ones that correspond to more than one file, then select all files with this md5sum, quicksort them by modified date, and prompt the user.
A sweet effect when no dups are found: the script nicely informs the user about it.
I would not say it's the most efficient way of doing things (might be better in, e.g., Perl), but it's really a lot of fun, surprisingly easy to read and follow, and you can potentially learn a lot by studying it!
It uses a few bashisms and features that are only in bash version ≥ 4.
Hope this helps!
Remark. If on your system date has the -r switch, you can replace the stat command by:
date -r "$file"
Remark. I left the echo in front of rm. Remove it if you're happy with how the script behaves. Then you'll have a script that uses 3 external commands :).

Batch localize using IBTools?

Is there a way to run ibtool on a bunch of NIB files with a single command? I'm trying to extract strings from NIBs. Am I supposed to run ibtool once for each NIB?
I find it tedious to run IBTools so many times. (I have only 9 NIB files. It could be worse...)
I don't think ibtool can take multiple files as arguments. The only way I see would be to write a bash script to perform this task.
#!/bin/bash
find . -name "*.xib" | while read FILENAME
do
    ibtool --export-strings-file "$FILENAME.strings" "$FILENAME"
done
Here is a much more full featured script I made to use for the same operation:
#!/bin/bash
# Argument = -o output_dir -i input_dir
usage()
{
cat << EOF
usage: $0 [options]

This script generates strings files from all xibs in a given directory.

OPTIONS:
   -h      Show this message
   -i      Input directory where XIBs are located [./]
   -o      Output directory where .strings files will be generated
EOF
}

INPUT_DIRECTORY="."
OUTPUT_DIRECTORY="."

while getopts "hi:o:" OPTION
do
    case $OPTION in
        h)
            usage
            exit 1
            ;;
        i)
            INPUT_DIRECTORY=$OPTARG
            ;;
        o)
            OUTPUT_DIRECTORY=$OPTARG
            ;;
        ?)
            usage
            exit
            ;;
    esac
done

if [[ -z $INPUT_DIRECTORY ]] || [[ -z $OUTPUT_DIRECTORY ]]
then
    usage
    exit 1
fi

# do the generation
find "$INPUT_DIRECTORY" -name "*.xib" | while read FILENAME
do
    XIBNAME=$(basename "$FILENAME")
    XIBNAME="${XIBNAME%.*}"
    ibtool --generate-strings-file "$OUTPUT_DIRECTORY/$XIBNAME.strings" "$FILENAME"
done
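A usage example, assuming the script is saved as genstrings.sh (the script name and directory paths are hypothetical):
./genstrings.sh -i ./Resources/xibs -o ./Resources/strings
Each Foo.xib under the input directory then gets a corresponding Foo.strings file in the output directory.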
