So I have 20 subfolders full of files in my main folder and have around 200 files in every subfolder. I've been trying to write a script to convert every picture in every subfolder to DNG.
I have done some research and was able to batch convert images from the current folder.
I've tried developing the idea to get it to work for subfolders, but with no success.
Here is the code I've written:
for D in `find . -type d`; do for i in *.RW2; do sips -s format jpeg $i --out "${i%.*}.jpg"; cd ..; done; done;
The easiest and fastest way to do this is with GNU Parallel like this:
find . -iname \*rw2 -print0 | parallel -0 sips -s format jpeg --out {.}.jpg {}
because that will use all your CPU cores in parallel. But before you launch any commands you haven't tested, it is best to use the --dry-run option like this so that it shows you what it is going to do, but without actually doing anything:
find . -iname \*rw2 -print0 | parallel --dry-run -0 sips -s format jpeg --out {.}.jpg {}
Sample Output
sips -s format jpeg --out ./sub1/b.jpg ./sub1/b.rw2
sips -s format jpeg --out ./sub1/a.jpg ./sub1/a.RW2
sips -s format jpeg --out ./sub2/b.jpg ./sub2/b.rw2
If you like the way it looks, remove the --dry-run and run it again. Note that the -iname parameter means it is insensitive to upper/lower case, so it will work for ".RW2" and ".rw2".
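For the curious, the `{.}` replacement string is what strips the extension; the equivalent transformation in plain bash is `${f%.*}`, which you can try out safely with echo (the filename below is made up):

```shell
# parallel's {.} is the same as bash's ${f%.*}: strip the last extension
f="./sub1/photo.RW2"
echo "${f%.*}.jpg"   # prints ./sub1/photo.jpg
```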
GNU Parallel is easily installed on macOS via homebrew with:
brew install parallel
It can also be installed without a package manager (like homebrew) because it is actually a Perl script and Macs come with Perl. So you can install by doing this in Terminal:
(wget pi.dk/3 -qO - || curl pi.dk/3/) | bash
Your question seems confused as to whether you want DNG files like your title suggests, or JPEG files like your code suggests. My code generates JPEGs as it stands. If you want DNG output, you will need to install Adobe DNG Converter, and then run:
find . -iname \*rw2 -print0 | parallel --dry-run -0 "/Applications/Adobe DNG Converter.app/Contents/MacOS/Adobe DNG Converter"
There are some other options you can append to the end of the above command:
-e will embed the original RW2 file in the DNG
-u will create the DNG file uncompressed
-fl will add fast load information to the DNG
DNG Converter seems happy enough to run multiple instances in parallel, but I did not test with thousands of files. If you run into issues, just run one job at a time by changing to parallel -j 1 ...
Adobe DNG Converter is easily installed under macOS using homebrew as follows:
brew install caskroom/cask/adobe-dng-converter
Related
I am thinking about the best and fastest way to convert 5 million TIFF files (spread across folders, subfolders and sub-subfolders) into 5 million PNG files in the same directories.
Is there any way to parallelise this job?
How could I then check whether all the files have been converted?
ls *.tif | wc -l # compared to
ls *.png | wc -l
but for every folder.
Thanks.
Marco
Your question is very vague on details, but you can use GNU Parallel and ImageMagick like this:
find STARTDIRECTORY -iname "*.tif" -print0 | parallel -0 --dry-run magick {} {.}.png
If that looks correct, I would make a copy of a few files in a temporary location and try it for real by removing the --dry-run. If it works ok, you can add --bar for a progress bar too.
In general, GNU Parallel will keep N jobs running, where N is the number of CPU cores you have. You can change this with -j parameter.
You can set up GNU Parallel to halt on failure, on success, after a number of failures, or after the currently running jobs complete, and so on. In general you will get an error message if any file fails to convert, but your jobs will continue till completion. Run man parallel and search for the --halt option.
Note that the above starts a new ImageMagick process for each image, which is not the most efficient approach, although it will still be pretty fast on a decent machine with a good CPU, disk subsystem and RAM. You could consider different tools such as vips if you feel like experimenting - there are a few ideas and benchmarks here.
Depending on how your files are actually laid out, you might do better using ImageMagick's mogrify command, and getting GNU Parallel to pass as many files to each invocation as your maximum command line length permits. So, for example, if you had a whole directory of TIFFs that you wanted to make into PNGs, you can do that with a single mogrify like this:
magick mogrify -format PNG *.tif
You could pair that command with a find looking for directories something like this:
find STARTDIRECTORY -type d -print0 | parallel -0 'cd {} && magick mogrify -format PNG *.tif'
Or you could find TIFF files and pass as many as possible to each mogrify something like this:
find STARTDIRECTORY -iname "*.tif" -print0 | parallel -0 -X magick mogrify -format PNG {}
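To answer the verification part of the question, here is a sketch that compares TIFF and PNG counts folder by folder and reports any mismatch (it assumes lowercase extensions; adapt the -name patterns if yours differ):

```shell
# Compare the number of .tif and .png files in every folder under the
# current directory and report any folder where the counts differ.
find . -type d | while IFS= read -r d; do
    tifs=$(find "$d" -maxdepth 1 -name '*.tif' | wc -l)
    pngs=$(find "$d" -maxdepth 1 -name '*.png' | wc -l)
    [ "$tifs" -eq "$pngs" ] || echo "MISMATCH in $d: $tifs tif vs $pngs png"
done
```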
I've found a solution that claims to handle one folder, but I have a deep folder hierarchy of sheet music that I'd like to batch convert from PNG to PDF. What might a solution look like?
I will run into a further problem down the line, which may complicate things. Maybe I should write a script? (I'm a total n00b fyi)
The "further problem" is that some of my sheet music spans more than one page, so if the script can parse filenames that include "1of2" and "2of2" to be turned into a single pdf, that'd be neat.
What are my options here?
Thank you so much.
Updated Answer
As an alternative, the following should be faster (as it does the conversions in parallel) and also able to handle larger numbers of files:
find . -name \*.png -print0 | parallel -0 convert {} {.}.pdf
It uses GNU Parallel which is readily available on Linux/Unix and which can be simply installed on OSX with homebrew using:
brew install parallel
Original Answer (as accepted)
If you have bash version 4 or better, you can use extended globbing to recurse directories and do your job very simply:
First enable extended globbing with:
shopt -s globstar
Then recursively convert PNGs to PDFs:
mogrify -format pdf **/*.png
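If you want to confirm what `**/*.png` will expand to before letting mogrify loose on it, printf is a harmless way to preview the matches (the demo directory below is created just for illustration):

```shell
# Preview what **/*.png will match before handing it to mogrify
shopt -s globstar
demo=$(mktemp -d)
mkdir -p "$demo/sheets/book1"
touch "$demo/top.png" "$demo/sheets/page.png" "$demo/sheets/book1/page.png"
(cd "$demo" && printf '%s\n' **/*.png)   # lists all three PNGs, at every depth
rm -rf "$demo"
```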
You can loop over png files in a folder hierarchy, and process each one as follows:
find /path/to/your/files -name '*.png' |
while read -r f; do
    g=$(basename "$f" .png).pdf   # note: output lands in the current directory
    your_conversion_program <"$f" >"$g"
done
To merge PDFs, you can use pdftk. You need to find all PDF files that have 1of2 and 2of2 in their names, and run pdftk on those:
find /path/to/your/files -name '*1of2*.pdf' |
while read -r f1; do
    f2=${f1/1of2/2of2}                         # name of second file
    [ -f "$f1" ] && [ -f "$f2" ] || continue   # check both exist
    g=${f1/1of2/}                              # name of output file (1of2 removed)
    [ -f "$g" ] && continue                    # if output exists, skip
    pdftk "$f1" "$f2" output "$g"
done
See:
bash string substitution
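The two substitutions used above, in miniature (the filename is made up):

```shell
# ${var/pattern/replacement} replaces the first match of pattern
f="Sonata_1of2.pdf"
echo "${f/1of2/2of2}"   # prints Sonata_2of2.pdf
echo "${f/1of2/}"       # empty replacement deletes the match: Sonata_.pdf
```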
Regarding a deep folder hierarchy, you may use find with the -exec option.
First you find all the PNGs in every subfolder and convert them to PDF:
find ./ -name \*\.png -exec convert {} {}.pdf \;
You'll get new PDF files with the extension ".png.pdf" (image.png would be converted to image.png.pdf, for example).
To correct the extensions you may run the find command again, this time with rename after the -exec option.
find ./ -name \*\.png\.pdf -exec rename s/\.png\.pdf/\.pdf/ {} \;
If you want to delete source PNG files, you may use this command, which deletes all files with ".png" extension recursively in every subfolder:
find ./ -name \*\.png -exec rm {} \;
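A note on that last command: modern find has a built-in -delete that avoids spawning one rm process per file; same effect, and the same caution about data loss applies. A small self-contained demonstration (the demo files are made up):

```shell
# find's built-in -delete removes matches without forking rm for each file
demo=$(mktemp -d)
mkdir -p "$demo/sub"
touch "$demo/a.png" "$demo/sub/b.png" "$demo/keep.pdf"
find "$demo" -name '*.png' -delete
find "$demo" -type f   # only keep.pdf is left
rm -rf "$demo"
```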
If I understand correctly:
you want to concatenate all your PNG files from a deep folder structure into one single PDF.
So...
make sure your PNGs are ordered as you want within your folders
be aware that you can pass the output of a command (say, a search one ;) ) to convert as its input, and tell convert to output a single PDF.
General syntax of convert :
convert 1.png 2.png ... global_png.pdf
The following command :
convert `find . -name '*'.png -print` global_png.pdf
searches for PNG files in the folders below the current directory
passes the list of files found to convert as its arguments - this is done by backquoting the find command
convert then does its work and outputs a single PDF file
(this very simple command line only works with unspaced filenames; don't forget to quote the wildcard, and to backquote the find command ;) )
Edit: take care...
Be sure of what you are doing:
if you delete your PNG files, you lose your original sources...
it might be a very bad practice...
using convert without a -quality output option can create an enormous PDF file, and you might have to re-convert with -quality 60, for instance...
so keep your original sources until you no longer need them
I need to list all images in a folder and its sub-folders with a certain size, say all images that are 320x200. I guess I need to do ls -R *.png and pipe the output to some other command that filters images of that size. My command-line skills are poor - can anyone help? Thanks a lot!
You can use sips to get pixelHeight and pixelWidth from images. By combining the command with find you'll be able to recursively search images of a specific size.
example:
results=$HOME/Desktop/results.txt
find . -type f -name "*.png" -exec sips -g pixelHeight -g pixelWidth {} \; > "$results"
cat $results | grep "\w\{11\}\:\s\(320\)" -B 1 -A 1 | grep "\w\{10\}\:\s\(200\)" -B 1
results.txt:
/Users/Me/Desktop/nsfw.png
pixelHeight: 320
pixelWidth: 200
info:
The first command sets up a variable using the path to results.txt
Next, the find command writes a list of all images found with dimensions to results.txt
Finally we check the results.txt for the specific dimensions (320 x 200) using grep.
These commands can be refined however you want and possibly condensed, but should work as is.
On macOS there are more helpful terminal commands using metadata (similar to Spotlight):
mdfind, mdls etc. (manual pages exist and can be shown with man mdls …). For what you want to do, try mdfind, as shown in the following example, which finds all files in a given folder (and only in that folder) with a pixel size greater than 900 x 1100:
mdfind -onlyin /Users/hg/Pictures/2014/01/01 "kMDItemPixelHeight>1100 && kMDItemPixelWidth>900"
The (a bit strange looking) query parameter names can be found in the documentation at DataManagement --> File Management --> MDItemReference. Try mdls filename to see some of these parameters.
man find and ImageMagick's identify are what you need.
I'm trying to download a large batch of images from a website onto my Mac. I can download the smaller images with DownloadThemAll, SiteSucker, etc but they don't quite dig deep enough. So I've had to jump into Terminal which is a little bit out of my comfort zone and my skills are a bit rusty.
I've had a try with the script below:
curl -O http://www.domain.co.uk/system/images/[1-1000]/original/*.jpg
This script works, and I can see Terminal downloading the image files; the issue is that the files are being overwritten as *.jpg rather than saved sequentially as 1.jpg, 2.jpg, 3.jpg etc, or with their original names. The original JPG names use random numbers/letters (such as LIC0145_websource.jpg), which is why I've tried to supplement them with *.jpg. I'm wondering which piece of code I'm missing to tell Terminal to download these images.
I've also tried calling the shell script below, but I run into an 'Unexpected end of file' error:
#!/bin/bash
for i in `seq 1 1000`;
do
input=http://www.domain.co.uk/system/images/$i/original/*.jpg
output=$i.jpg
# echo $input, $output
curl --output $output --remote-name $input
done
I think the curl option might still be a better option but if anyone has any fixes or other solutions let me know.
You can do something like this with wget (I know that's not curl):
wget --no-parent --accept=jpg,jpeg,htm,html --mirror http://somedomain/
Then cd into the directory and issue:
find ./ \( -iname '*.htm' -o -iname '*.html' \) -exec rm {} \;
I have a stack of hundreds of pictures and I want to use pngcrush to reduce the file sizes.
I know how to crush one file in Terminal, but all over the web I find partial explanations that assume previous knowledge.
Can someone please explain clearly how to do it?
Thanks
Shani
You can use following script:
#!/bin/bash
# uncomment the following line for more aggressive but slower compression
# pngcrush_options="-reduce -brute -l9"
find . -name '*.png' -print | while read -r f; do
    pngcrush $pngcrush_options -e '.pngcrushed' "$f"
    mv "${f%.png}.pngcrushed" "$f"   # replace the original with the crushed file
done
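The renaming in that loop relies on pngcrush's -e swapping the .png extension for .pngcrushed; the name juggling can be sanity-checked in plain bash without pngcrush installed (the path is made up):

```shell
# -e '.pngcrushed' writes the output next to the input with a swapped extension
f="icons/logo.png"
crushed="${f%.png}.pngcrushed"
echo "$crushed"   # prints icons/logo.pngcrushed
```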
Current versions of pngcrush support this functionality out of the box.
(I am using pngcrush 1.7.81)
pngcrush -dir outputFolder inputFolder/*.png
will create "outputFolder" if it does not exist and process all the .png files in the "inputFolder" placing them in "outputFolder".
Obviously you can add other options e.g.
pngcrush -dir outputFolder -reduce -brute -l9 inputFolder/*.png
As of 2023, there are better tools to optimize PNG images, such as OptiPNG.
install
sudo apt-get install optipng
use for one picture
optipng image.png
use for all pictures in folder
find /path/to/files/ -name '*.png' -exec optipng -o7 {} \;
Optionally, the -o flag sets the optimization level, from 1 to 7, where 7 is the maximum compression level of the image:
optipng -o7 image.png
The highest-rated fix appears dangerous to me: it started compressing all the PNG files on my iMac. What is needed is a command restricted to a specified directory. I am no UNIX expert; I undid the new files by searching for all files ending in .pngcrushed and deleting them.