Moving images based on IPTC metadata in bash

I am trying to reorganise images based on the species that appears in each image. Among other information, the species name can be found in the IPTC metadata (see the linked Inspector image). I am attempting to do this in bash on macOS and have tried the following code (with a placeholder species name and directory):
find . -iname "*.jpg" -print0 | xargs -0 grep -l "Species Name" | xargs -0 -I {} mv {} ~/example/directory
I have also tried using the exiftool package, where the relevant information is in the Subject tag:
find . -iname "*.jpg" -print0 | xargs -0 exiftool -Subject | grep "Species Name" | xargs -0 -I {} mv {} ~/example/directory
However, I receive the following error message, which I assume results from a misuse of grep or the last xargs:
mv: rename (standard input) to ~/example/directory(standard input): No such file or directory
Any ideas what could fix this issue? Thank you in advance.

Exiftool can move and rename files based upon the file metadata, and it is much faster to call it once for an entire directory than to call it individually for each file in a loop (see exiftool's Common Mistake #3).
I believe you could use this command to do your sorting:
exiftool -if '$subject=~/Species Name/i' -directory=~/example/directory .
Breakdown:
-if '$subject=~/Species Name/i' Does a regex comparison of the Subject tag against "Species Name". I added the i at the end to make the comparison case-insensitive; modify as desired.
-directory=~/example/directory If the -if condition is met, then the file will be moved to the stated directory.
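If your images live in subfolders, exiftool can recurse as well. A sketch using exiftool's standard -r (recurse) and -ext (filter by extension) options, otherwise identical to the command above; "Species Name" and the target directory remain placeholders:
exiftool -r -ext jpg -if '$subject=~/Species Name/i' -directory=~/example/directory .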

It looks like grep's newline-terminated output is interfering there. Since you're using a clean -0 (NUL-delimited) pipeline, you should make grep emit NULs too: on Linux that would be grep -Z (or --null); check whether macOS's grep offers a similar option.
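For example, assuming your grep accepts --null (GNU grep has it, and the BSD grep shipped with recent macOS does too), a fully NUL-delimited version of the original pipeline would be (an untested sketch):
find . -iname "*.jpg" -print0 | xargs -0 grep -l --null "Species Name" | xargs -0 -I {} mv {} ~/example/directory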

Related

sorting output of find before running the command in -exec

I have a series of directories containing multiple mp3 files with filenames 001.mp3, 002.mp3, ..., 030.mp3.
What I want to do is to put them all together in order into a single mp3 file and add some meta data to that.
Here's what I have at the moment (removed some variable definitions, for clarity):
#!/bin/bash
for d in */; do
    cd "$d"
    find . -iname '*.mp3' -exec lame --decode '{}' - ';' | lame --tt "$title_prefix$name" --ty "${name:5}" --ta "$artist" --tl "$album" -b 64 - $final_path"${d%/}".mp3
    cd ..
done
Sometimes this works and I get a single file with all the "tracks" in the correct order.
However, more often than not I get a single file with all the "tracks" in reverse order, which really isn't good.
What I can't understand is why the order varies between different runs of the script, as all the directories contain the same set of filenames. I've pored over the man page and can't find a sort option for find.
I could run find . -iname '*.mp3' | sort -n >> temp.txt to put the files in a temporary file and then try and loop through that, but I can't get that to work with lame.
Is there any way I can put a sort in before find runs the exec? I can find plenty of examples here and elsewhere of doing this with -exec ls but not where one needs to execute something more complicated with exec.
find . -iname '*.mp3' -print0 | sort -zn | xargs -0 -I '{}' lame --decode '{}' - | lame --tt "$title_prefix$name" --ty "${name:5}" --ta "$artist" --tl "$album" -b 64 - $final_path"${d%/}".mp3
Untested but might be worth a try.
Normally xargs appends the arguments to the end of the command you give it. The -I option tells it to replace the given string instead ({} in this case).
Edit: I've added -print0, -z, -0 to make sure the pipeline still works even if your filenames contain newlines.
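Since shell globs expand in collation order, which already sorts zero-padded names like 001.mp3 through 030.mp3 correctly, here is a sketch of the same idea without find or sort (using the variables from your script, run inside each directory just like your loop):
for f in *.mp3; do
    lame --decode "$f" -    # decode each track to stdout, in glob (sorted) order
done | lame --tt "$title_prefix$name" --ty "${name:5}" --ta "$artist" --tl "$album" -b 64 - "$final_path${d%/}".mp3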

execute command on files returned by grep

Say I want to edit every .html file in a directory one after the other using vim, I can do this with:
find . -name "*.html" -exec vim {} \;
But what if I only want to edit every html file containing a certain string, one after the other? I use grep to find files containing those strings, but how can I pipe each one to vim, similar to the find command above? Perhaps I should use something other than grep, or somehow pipe the find command to grep and then exec vim. Does anyone know how to edit files containing a certain string one after the other, in the same fashion as the find example above?
grep -l 'certain string' *.html | xargs vim
This assumes you don't have eccentric file names with spaces etc. in them. If you have to deal with eccentric file names, check whether your grep has a -Z or --null option to terminate the output file names with null bytes (and whether your xargs has a -0 option to read such input), and if so, then:
grep -Zl 'certain string' *.html | xargs -0 vim
If you need to search subdirectories, maybe your version of Bash has support for ** (in bash 4+, enable it with shopt -s globstar):
grep -Zl 'certain string' **/*.html | xargs -0 vim
Note: these commands run vim on batches of files. If you must run it once per file, then you need to use -n 1 as extra options to xargs before you mention vim. If you have GNU xargs, you can use -r to prevent it running vim when there are no file names in its input (none of the files scanned by grep contain the 'certain string').
The variations can be continued as you invent new ways to confuse things.
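Putting those options together, the one-vim-per-file variant described above would look like this (GNU grep and GNU xargs assumed):
grep -Zl 'certain string' *.html | xargs -0 -n 1 -r vim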
With find:
find . -type f -name '*.html' -exec bash -c 'grep -q "yourtext" "${1}" && vim "${1}"' _ {} \;
For each file, this calls bash to grep the file for yourtext and open it with vim if the text matches.
Solution with a for loop:
for i in $(find . -type f -name '*.html'); do vim "$i"; done
This opens each file in a separate vim session, one after another as you close the previous one. Note that it breaks on file names containing whitespace, because the command substitution is word-split.
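If the file names may contain whitespace, a safer sketch of the same loop reads NUL-delimited names from find; vim's input is redirected from the terminal because the pipe occupies its stdin:
find . -type f -name '*.html' -print0 | while IFS= read -r -d '' i; do
    vim "$i" < /dev/tty
done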

When to use xargs when piping?

I am new to bash and I am trying to understand the use of xargs, which is still not clear to me. For example:
history | grep ls
Here I am searching for the command ls in my history. In this command, I did not use xargs and it worked fine.
find /etc -name "*.txt" | xargs ls -l
In this one, I had to use xargs, but I still cannot understand the difference, and I am not able to decide correctly when to use xargs and when not to.
xargs can be used when you need to take the output from one command and use it as an argument to another. In your first example, grep takes the data from standard input, rather than as an argument. So, xargs is not needed.
xargs takes data from standard input and executes a command. By default, the data is appended to the end of the command as an argument. It can be inserted anywhere however, using a placeholder for the input. The traditional placeholder is {}; using that, your example command might then be written as:
find /etc -name "*.txt" | xargs -I {} ls -l {}
If you have 3 text files in /etc you'll get a full directory listing of each. Of course, you could just as easily have written ls -l /etc/*.txt and saved the trouble.
Another example lets you rename those files, and requires the placeholder {} to be used twice.
find /etc -name "*.txt" | xargs -I {} mv {} {}.bak
These are both bad examples, and will break as soon as you have a filename containing whitespace. You can work around that by telling find to separate filenames with a null character.
find /etc -name "*.txt" -print0 | xargs -0 -I {} mv {} {}.bak
My personal opinion is that there are almost always alternatives to using xargs (such as the -exec argument to find) and you will be better served by learning those.
When you use piping without xargs, the actual data is fed into the next command. On the other hand, when using piping with xargs, the actual data is viewed as a parameter to the next command. To give a concrete example, say you have a folder with a.txt and b.txt. a.txt contains just a single line 'hello world!', and b.txt is just empty.
If you do
ls | grep txt
you would end up getting the output:
a.txt
b.txt
Yet, if you do
ls | xargs grep txt
you would get nothing since neither file a.txt nor b.txt contains the word txt.
If the command is
ls | xargs grep hello
you would get:
hello world!
That's because with xargs, the two filenames given by ls are passed to grep as arguments, rather than the actual content.
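In other words, given those two files, the pipelines are effectively equivalent to:
ls | grep hello         # grep searches the listing text itself: no match
ls | xargs grep hello   # becomes: grep hello a.txt b.txt  (searches file contents)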
Short answer: Avoid xargs for now. Return to xargs when you have written dozens or hundreds of scripts.
Commands can get their input from arguments (like rm bad_example) or from stdin (not just the y you type at the prompt after rm -i is_this_bad_too, but also read answer). Other commands, like grep and sed, look for file arguments first and fall back to reading stdin when no files are given.
Your grep example works fine reading from stdin, nothing special needed.
Your ls needs the output of find as arguments. xargs is just one way to turn things around; see man xargs for more. Alternatives:
find /etc -name "*.txt" -exec ls -l {} \;
find /etc -name "*.txt" -ls
ls -l $(find /etc -name "*.txt" )
ls /etc/*.txt
First, try each of these commands and see which behaves best when /etc contains a file with a nasty filename with spaces.txt.
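A quick way to run that experiment in a scratch directory (hypothetical paths) before pointing anything at /etc:
mkdir -p /tmp/xargs-demo && cd /tmp/xargs-demo
touch 'nasty filename with spaces.txt' plain.txt
find . -name '*.txt' | xargs ls -l        # breaks: the name is split on the spaces
find . -name '*.txt' -exec ls -l {} \;    # works: find passes each name intact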
xargs(1) is dangerous (broken, exploitable, etc.) when reading non-NUL-delimited input.
If you're working with filenames, use find's -exec [command] {} + instead.
If you can get NUL-delimited output, use xargs -0.
GNU Parallel can do the same as xargs, but does not have the broken and exploitable "features".
You can learn GNU Parallel by looking at examples http://www.gnu.org/software/parallel/man.html#EXAMPLE:-Working-as-xargs--n1.-Argument-appending and walking through the tutorial http://www.gnu.org/software/parallel/parallel_tutorial.html
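For instance, the find/ls pipeline from the question, written with Parallel's NUL-delimited mode (a sketch; parallel -0 pairs with find -print0 the same way xargs -0 does):
find /etc -name "*.txt" -print0 | parallel -0 ls -l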

Scaling up grep find and copy to large folder (xargs?)

I would like to search a directory for any file that matches any of a list of words. If a file matches, I would like to copy that file into a new directory. I created a small batch of test files and got the following code working:
cp `grep -lir 'word\|word2\|word3\|word4\|word5' '/Users/originallocation'` '/Users/newlocation'
Unfortunately, when I run this code on a large folder with a few thousand files, it says the argument list is too long for cp. I think I need to loop this or use xargs, but I can't figure out how to make the conversion.
The minimal change from what you have would be:
grep -lir 'word\|word2\|word3\|word4\|word5' '/Users/originallocation' | \
xargs cp -t '/Users/newlocation'
But don't use that. Because you never know when you will encounter a filename with spaces or newlines in it, null-terminated strings should be used. On Linux/GNU, add the -Z option to grep and -0 to xargs:
grep -Zlir 'word\|word2\|word3\|word4\|word5' '/Users/originallocation' | \
xargs -0 cp -t '/Users/newlocation'
On Macs (and AIX, HP-UX, Solaris, *BSD), the grep options change slightly but, more importantly, the GNU cp -t option is not available. A workaround is:
grep -lir --null 'word\|word2\|word3\|word4\|word5' '/Users/originallocation' | \
xargs -0 -I fname cp fname '/Users/newlocation'
This is less efficient because a new instance of cp has to be run for each file to be copied.
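If the per-file cost matters, one way to keep the batching without GNU cp's -t is to let a small sh wrapper put the target directory last (a sketch; the trailing _ fills $0 so the file names land in "$@"):
grep -lir --null 'word\|word2\|word3\|word4\|word5' '/Users/originallocation' | \
  xargs -0 sh -c 'cp "$@" "/Users/newlocation"' _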
An alternative for systems without grep -r, using find + egrep + xargs. Beware that files with the same name in different folders will overwrite each other in the destination. I also replaced the ugly word\|word2\|word3\|word4\|word5 style with egrep's plain alternation:
find . -type f -exec egrep -l 'word|word2|word3|word4|word5' {} \; | xargs -I {} cp {} /LARGE_FOLDER

Bash find filter and copy - trouble with spaces

So after a lot of searching and trying to interpret others' questions and answers to my needs, I decided to ask for myself.
I'm trying to take a directory structure full of images and place all the images (regardless of extension) in a single folder. In addition to this, I want to be able to remove images matching certain filenames in the process. I have a find command working that outputs all the filepaths for me
find -type f -exec file -i -- {} + | grep -i image | sed 's/\:.*//'
but if I try to use that to copy files, I have trouble with the spaces in the filenames.
cp `find -type f -exec file -i -- {} + | grep -i image | sed 's/\:.*//'` out/
What am I doing wrong, and is there a better way to do this?
With the caveat that it won't work if files have newlines in their names:
find . -type f -exec file -i -- {} + |
awk -vFS=: -vOFS=: '$NF ~ /image/{NF--;printf "%s\0", $0}' |
xargs -0 cp -t out/
(Based on the answer by Jonathan Leffler and the subsequent discussion in the comments with him and @devnull.)
The find command works well if none of the file names contain any newlines. Within broad limits, the grep command works OK under the same circumstances. The sed command works fine as long as there are no colons in the file names. However, given that there are spaces in the names, the use of $(...) (command substitution, also indicated by back-ticks `...`) is a disaster. Unfortunately, xargs isn't readily a part of the solution; it splits on spaces by default. Because you have to run file and grep in the middle, you can't easily use the -print0 option to (GNU) find and the -0 option to (GNU) xargs.
In some respects, it is crude, but in many ways, it is easiest if you write an executable shell script that can be invoked by find:
#!/bin/bash
for file in "$@"
do
    if file -i -- "$file" | grep -i -q "$file:.*image"
    then cp "$file" out/
    fi
done
This is a little painful in that it invokes file and grep separately for each name, but it is reliable. The file command is even safe if the file name contains a newline; the grep is probably not.
If that script is called 'copyimage.sh', then the find command becomes:
find . -type f -exec ./copyimage.sh {} +
And, given the way the grep command is written, the copyimage.sh file won't be copied, even though its name contains the magic word 'image'.
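If the grep in that script worries you, a variant of the test keeps the file name out of the pattern entirely, since --brief suppresses the name in file's output (assuming your file supports --brief and --mime-type, as recent GNU and macOS versions do):
if file --brief --mime-type -- "$file" | grep -q '^image/'
then cp "$file" out/
fi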
Pipe the results of your find command to
xargs -l --replace cp "{}" out/
Example of how this works for me on Ubuntu 10.04:
atomic@atomic-desktop:~/temp$ ls
img.png  img space.png
atomic@atomic-desktop:~/temp$ mkdir out
atomic@atomic-desktop:~/temp$ find -type f -exec file -i \{\} \; | grep -i image | sed 's/\:.*//' | xargs -l --replace cp -v "{}" out/
`./img.png' -> `out/img.png'
`./img space.png' -> `out/img space.png'
atomic@atomic-desktop:~/temp$ ls out
img.png  img space.png
atomic@atomic-desktop:~/temp$
