Converting all .pdf files in folder tree to .png - bash

I'm on Ubuntu, and I have a tree of folders containing .pdf files. I need to convert each one to a .png format. The bash script I am currently using is:
for f in $(find ./polkadots -name 'image.pdf'); do
convert -transparent white -fuzz 10% $f image.png;
done
I have tested the for loop by itself, and it works (it produces a list of all the .pdf files under the ./polkadots folder that I need to convert):
for f in $(find ./polkadots -name 'image.pdf'); do
echo "$f";
done
I have tested the ImageMagick convert command by itself, and it works (it converts a single file in my current directory from .pdf to .png):
convert -transparent white -fuzz 10% image.pdf image.png
However, when I combine them, the console sits and thinks for a while and then finishes, but no files have been created or changed, and no error messages are produced. What am I doing wrong?
EDIT: The new .png files are being created, but they are being created in my current directory, instead of in the sub-directory where the .pdf was found. How do I fix this?

Try using find alone. No need to use a loop.
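I haven't tested this command, but it should work. It uses -execdir rather than -exec so that convert runs inside the directory of each match and image.png lands next to its PDF:
find ./polkadots -name 'image.pdf' -execdir convert -transparent white -fuzz 10% {} image.png \; -print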
The -print at the end is optional. I prefer to see which files have been modified.

You may be able to give convert the output path directly, so that it writes the .png into the folder you expect. Following your approach, though, here is the updated code:
find ./polkadots -type f -name "image.pdf" | while IFS= read -r line
do
dir=${line%/*}
convert -transparent white -fuzz 10% "$line" image.png
mv image.png "${dir}/image.png"
done
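If you need to convert all .pdf files under the polkadots folder, try this:
find ./polkadots -type f -name "*.pdf" | while IFS= read -r line
do
dir=${line%/*}      # directory part
file=${line##*/}    # file name without the directory
file=${file%.*}     # file name without the extension
convert -transparent white -fuzz 10% "$line" "${file}.png"
# dry run: remove "echo" to actually move the file
echo mv "${file}.png" "${dir}/${file}.png"
done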
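The three expansions used in the loop above can be tried on their own. This is only a minimal sketch: png_target is a hypothetical helper name and the sample path is made up.

```shell
#!/usr/bin/env bash
# png_target is a hypothetical helper: it prints where the .png for a
# given source path would go, using only parameter expansion.
png_target() {
    local line="$1"
    local dir="${line%/*}"     # strip the shortest "/..." suffix -> directory
    local file="${line##*/}"   # strip the longest "*/" prefix    -> file name
    file="${file%.*}"          # strip the extension
    printf '%s\n' "${dir}/${file}.png"
}

png_target "./polkadots/red/big dots.pdf"
```

For the sample path this prints ./polkadots/red/big dots.png. A path with no slash in it has no directory part to strip, so the sketch assumes find-style paths such as ./dir/file.pdf.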

If you are using bash 4+, you should use globstar instead of find
#!/bin/bash
shopt -s globstar
for f in ./polkadots/**/image.pdf; do
convert -transparent white -fuzz 10% "$f" "${f%/*}/image.png"
done
If you're using an older bash, your original script is close to fine, but has a few bugs or potential bugs:
If your directories have spaces in them, your original script will cause errors, so use the find -print0 | while read -r -d '' f syntax.
Don't put a ; at the end of command lines.
Quote all variables to prevent expansion problems.
As you pointed out, the main issue in your case was not specifying the destination directory. You can use the ${f%/*} parameter expansion to get it: this deletes the last / in $f and everything after it. Then put the / back and append the filename, as below.
#!/bin/bash
find ./polkadots -name "image.pdf" -print0 | while read -r -d '' f; do
convert -transparent white -fuzz 10% "$f" "${f%/*}/image.png"
done
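To see what the NUL-delimited loop would do before any file is touched, the same pipeline can print its plan instead of calling convert. A minimal sketch, assuming bash and a find that supports -print0; plan_conversions is a hypothetical name.

```shell
#!/usr/bin/env bash
# plan_conversions is a hypothetical helper: for every image.pdf under the
# given directory it prints "source -> target" instead of running convert.
plan_conversions() {
    find "$1" -type f -name 'image.pdf' -print0 |
        while IFS= read -r -d '' f; do
            # ${f%/*} is the directory of the match, so the target sits next to it
            printf '%s -> %s\n' "$f" "${f%/*}/image.png"
        done
}
```

Calling plan_conversions ./polkadots lists one line per match, and directory names with spaces come through intact because each record ends at the NUL byte rather than at whitespace.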

Related

Loop through the directories to get the image files

I have a directory that includes multiple sub-directories. I would like to go through the directories and subdirectories, find the jpg files, and resize them using the mogrify command. I would like it to be as dynamic as possible, which is why I wrote a script. $1 is the first argument that I pass when executing the bash script. After running the script, it gives me an error about 'mogrify can not read [#]%'. I guess something is very wrong with my code, and I am not experienced in bash. Can anyone tell me how to write this script dynamically so that it is fast?
p.s.: the names of the jpg files are not in any special format, just a bunch of numbers.
for folder in $1/*
do
for file in "$folder"/*
do
if [ -e "${file[#]%.jpg}" ]; then
mogrify -resize 112x112! "${file[#]%.jpg}"
fi
done
done
If you're open to using find, then this becomes pretty easy:
#!/usr/bin/env bash
find "$1" \( -iname \*.jpg -o -iname \*.jpeg \) -print0 | while IFS= read -r -d $'\0' file; do
# base="${file##*/}"  # $base is the file name with all the directory stuff stripped off
# dir="${file%/*}"    # $dir is the directory with the file name stripped off
mogrify -resize '112x112!' "$file"
done
Put that in a file named mymog.bash, then:
$ chmod 755 mymog.bash
$ ./mymog.bash /some/dir
Notes:
! can be special to bash (it triggers history expansion in interactive shells), so putting it in single quotes makes it "unspecial", passing it along to the mogrify command unmolested.
The double quotes around $1 and $file are needed in case a directory or file name has spaces in it. If you had a directory named /Users/alice/my pictures and didn't use the quotes, mogrify would get one argument named /Users/alice/my and another one named pictures.
Make sure you use the \( and \) for find. That makes the whole condition ("match *.jpg" OR "match *.jpeg") apply to the action -print0.
I used find 's -print0 action which prints each matching file name with a null-terminated (zero-terminated) string. You can have filenames that have newline characters in the middle. This protects against that.
bash 's built-in read command reads until a newline by default. I used the -d $'\0' to make it read each "line" or "record" (filename) up to the null (zero) character at its end. (Each ends with null because of the -print0.)
This solution (one of many) has two parts:
It uses the find utility to find (under the directory given) all files that end in .jpg or .jpeg, ignoring the case of the filenames. [So it will match .JPG or even .JpEg.]
It spits out one record for each file.
If you give it an absolute path like /some/dir, it will find /some/dir/a.jpg and /some/dir/sub1/sub2/sub3/b.jpg.
If you give it a relative path like ../../nearby/dir, it will find ../../nearby/dir/c.jpg and ../../nearby/dir/sub1/sub2/sub3/d.jpeg.
The find part ends with the first | on that line. After that, it is a bash while…do loop.
The variable file takes on the value of each record spit out by find.
The loop (everything between do and done) runs once for each value that file takes on.
The two rows that start with # are comments. They contain commands that are ignored (skipped). You can remove the # to have bash run those commands too. I included them as examples in case you needed the directory part or just the filename part of the record.
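The commented-out base and dir expansions can be exercised without running mogrify at all. The following is a sketch under the same assumptions as the script above (bash, a find with -iname and -print0); list_jpegs is a hypothetical name.

```shell
#!/usr/bin/env bash
# list_jpegs is a hypothetical helper: for every JPEG under the given
# directory it prints "directory<TAB>file name" instead of running mogrify.
list_jpegs() {
    find "$1" -type f \( -iname '*.jpg' -o -iname '*.jpeg' \) -print0 |
        while IFS= read -r -d '' file; do
            base="${file##*/}"   # file name only
            dir="${file%/*}"     # directory only
            printf '%s\t%s\n' "$dir" "$base"
        done
}
```

Running list_jpegs /some/dir is a quick way to confirm which files the case-insensitive match picks up before letting mogrify overwrite them.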
find "$1" -type f -name "*.jpg" -exec mogrify -resize 112x112! {} \;
Try find and a while read loop:
find "$1" -type f -name '*.jpg' -print | while read fname
do
....
done
If your filenames may contain special characters such as line feeds, then use:
find "$1" -type f -name '*.jpg' -print0 | while IFS= read -r -d '' fname
do
....
done
There are many more options to tune the search.

Bash convert resize recursively preserving filenames

Have images in subfolders that need to be limited in size (720 max width or 1100 max height). Their filenames must be preserved. Started with:
for img in *.jpg; do filename=${img%.*}; convert -resize 720x1100\> "$filename.jpg" "$filename.jpg"; done
which works within each directory, but have a lot of subfolders with these images. Tried find . -iname "*.jpg" -exec cat {} but it did not create a list as expected.
This also didn't work:
grep *.jpg | while read line ; do `for img in *.jpg; do filename=${img%.jpg}; convert -resize 720x1100\> "$filename.jpg" "$filename.jpg"`; done
Neither did this:
find . -iname '*jpg' -print0 | while IFS= read -r -d $'\0' line; do convert -resize 720x1100\> $line; done
which gives me error message "convert: no images defined." And then:
find . -iname '*.jpg' -print0 | xargs -0 -I{} convert -resize 720x1100\> {}
gives me the same error message.
It seems you're looking for simply this:
find /path/to/dir -name '*.jpg' -exec mogrify -resize 720x1100\> {} \;
In your examples, you strip the .jpg extension, and then you add it back. No need to strip at all, and that simplifies things a lot.
Also, convert filename filename is really the same as mogrify filename. mogrify is part of ImageMagick, it's useful for modifying files in-place, overwriting the original file. convert is useful for creating new files, preserving originals.
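Because mogrify accepts several file names at once, find can also hand it matches in batches by ending -exec with + instead of \;. The sketch below only prints the command line that would be built; plan_resize is a hypothetical name, and -iname is an addition to catch .JPG as well.

```shell
#!/usr/bin/env bash
# plan_resize is a hypothetical helper. It prints the mogrify command line
# that find would build; drop the "echo" to run it for real.
plan_resize() {
    find "$1" -type f -iname '*.jpg' -exec echo mogrify -resize '720x1100>' {} +
}
```

With + the file names are appended to one command line (split into several only when it grows too long), so far fewer processes are started than with one mogrify per file.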
Since all of the subdirectories are two levels down, found this worked:
for img in **/*/*.jpg ; do filename=${img%.*}; convert -resize 720x1100\> "$filename.jpg" "$filename.jpg"; done
Thanks to @pjh for getting me started. This also worked:
shopt -s globstar ; for img in */*/*.jpg ; do filename=${img%.*}; convert -resize 720x1100\> "$filename.jpg" "$filename.jpg"; done
I got the error message "-bash: shopt: globstar: invalid shell option name", which means this bash is older than version 4 and has no globstar, so ** behaves like a plain *. Still, all of the images larger than specified were resized with filenames preserved.
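One way to cope with a bash that may or may not have globstar is to test for the option and fall back to find. This is a sketch, not a tested recipe for every system; list_jpgs is a hypothetical name.

```shell
#!/usr/bin/env bash
# list_jpgs is a hypothetical helper: it prints every .jpg below the given
# directory, using ** when the shell has globstar and find otherwise.
# The ( ) body runs in a subshell so the shopt settings do not leak out.
list_jpgs() (
    root=$1
    if shopt -s globstar nullglob 2>/dev/null; then
        for img in "$root"/**/*.jpg; do
            if [ -f "$img" ]; then
                printf '%s\n' "$img"
            fi
        done
    else
        find "$root" -type f -name '*.jpg'
    fi
)
```

The output of list_jpgs . can then feed a resize loop; note that the glob branch lists paths in sorted order while find does not guarantee any order.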

Batch convert PNGs to individual PDFs while maintaining deep folder hierarchy in bash

I've found a solution that claims to do one folder, but I have a deep folder hierarchy of sheet music that I'd like to batch convert from png to pdf. What do my solutions look like?
I will run into a further problem down the line, which may complicate things. Maybe I should write a script? (I'm a total n00b fyi)
The "further problem" is that some of my sheet music spans more than one page, so if the script can parse filenames that include "1of2" and "2of2" to be turned into a single pdf, that'd be neat.
What are my options here?
Thank you so much.
Updated Answer
As an alternative, the following should be faster (as it does the conversions in parallel) and also able to handle larger numbers of files:
find . -name \*.png -print0 | parallel -0 convert {} {.}.pdf
It uses GNU Parallel which is readily available on Linux/Unix and which can be simply installed on OSX with homebrew using:
brew install parallel
Original Answer (as accepted)
If you have bash version 4 or later, you can use the globstar option to recurse directories and do your job very simply:
First enable globstar with:
shopt -s globstar
Then recursively convert PNGs to PDFs:
mogrify -format pdf **/*.png
You can loop over the png files in a folder hierarchy and process each one as follows:
find /path/to/your/files -name '*.png' |
while IFS= read -r f; do
g=${f%.png}.pdf # output next to the source, keeping the hierarchy
your_conversion_program <"$f" >"$g"
done
To merge PDFs, you could use pdftk. You need to find all pdf files that have 1of2 and 2of2 in their names, and run pdftk on those:
find /path/to/your/files -name '*1of2*.pdf' |
while IFS= read -r f1; do
f2=${f1/1of2/2of2} # name of second file
[ -f "$f1" ] && [ -f "$f2" ] || continue # check both exist
g=${f1/1of2/} # name of output file
[ -f "$g" ] && continue # if output exists, skip
pdftk "$f1" "$f2" output "$g"
done
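The two substitutions that drive the merge loop can be checked in isolation. A small sketch, assuming bash (the ${var/pattern/replacement} form is not POSIX); the helper names and the sample path are made up.

```shell
#!/usr/bin/env bash
# Hypothetical helpers that only compute names; nothing is merged here.
second_page() { printf '%s\n' "${1/1of2/2of2}"; }   # name of the 2of2 file
merged_name() { printf '%s\n' "${1/1of2/}"; }       # name with the marker removed

f1='scores/bach/prelude_1of2.pdf'   # made-up example path
second_page "$f1"
merged_name "$f1"
```

This prints scores/bach/prelude_2of2.pdf and scores/bach/prelude_.pdf. Only the first occurrence of 1of2 is replaced, and any separator around the marker (the underscore here) stays in the merged name.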
See:
bash string substitution
Regarding a deep folder hierarchy you may use find with -exec option.
First you find all the PNGs in every subfolder and convert them to PDF:
find ./ -name \*\.png -exec convert {} {}.pdf \;
You'll get new PDF files with extension ".png.pdf" (image.png would be converted to image.png.pdf for example)
To correct the extensions, you may run find again, this time with rename after -exec (the expression below is for the Perl-based rename; quoting it keeps the shell from stripping the backslashes):
find ./ -name '*.png.pdf' -exec rename 's/\.png\.pdf$/.pdf/' {} \;
If you want to delete source PNG files, you may use this command, which deletes all files with ".png" extension recursively in every subfolder:
find ./ -name \*\.png -exec rm {} \;
If I understand correctly, you want to concatenate all the png files from a deep folder structure into one single pdf.
So:
Ensure your png files are ordered the way you want in your folders.
Be aware that you can pass the output of a command (say, a search) to convert as its argument list, and tell convert to write a single pdf.
General syntax of convert:
convert 1.png 2.png ... global_png.pdf
The following command:
convert `find . -name '*'.png -print` global_png.pdf
searches for png files in folders below the current directory,
substitutes the output of the find command into convert's argument list (this is done by backquoting the find command),
and has convert write the result to a pdf file.
(This very simple command line works only with filenames that contain no spaces. Don't forget to quote the wildcard and to backquote the find command ;) )
[edit] Take care:
Be sure of what you are doing.
If you delete your png files, you will lose your original sources.
That might be very bad practice.
Using convert without a -quality output option could create an enormous pdf file, and you might have to re-convert with -quality "60", for instance.
So keep your original sources until you no longer need them.
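If the file names may contain spaces, one alternative to backquoting find is to collect the paths into a bash array first. This is a sketch that swaps in a different technique from the answer above; it assumes bash plus GNU find and GNU sort, and collect_pngs and pngs are hypothetical names.

```shell
#!/usr/bin/env bash
# collect_pngs is a hypothetical helper: it fills the global array "pngs"
# with every .png under the given directory, sorted by path.
# Assumes GNU find and GNU sort (for -print0 and -z).
collect_pngs() {
    local f
    pngs=()
    while IFS= read -r -d '' f; do
        pngs+=("$f")
    done < <(find "$1" -type f -name '*.png' -print0 | sort -z)
}
```

After collect_pngs . the single pdf could be built with convert "${pngs[@]}" global_png.pdf, where the quoted array expansion passes each path as one argument. Sorting by path is only one possible page order; check that it matches the order you want.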

How to automate conversion of images

I can convert an image like this:
convert -resize 50% foo.jpg foo_50.jpg
How can I automate such a command to convert all the images in a folder?
You can assume every image has .jpg extension.
A solution easily adaptable to automate the conversion of all the images inside the subdirectories of the working directory is preferable.
You can use a for loop with pattern expansion:
for img in */*.jpg ; do
convert -resize 50% "$img" "${img%.jpg}"_50.jpg
done
${variable%pattern} removes the pattern from the right side of the $variable.
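The % operator has three siblings, and seeing all four on one value makes the difference clearer. A minimal sketch; the path is a made-up example.

```shell
#!/usr/bin/env bash
img='photos/2020/trip.v2.jpg'   # hypothetical path

shortest_suffix=${img%.*}     # photos/2020/trip.v2
longest_suffix=${img%%.*}     # photos/2020/trip
shortest_prefix=${img#*/}     # 2020/trip.v2.jpg
longest_prefix=${img##*/}     # trip.v2.jpg
resized="${img%.jpg}_50.jpg"  # photos/2020/trip.v2_50.jpg

printf '%s\n' "$shortest_suffix" "$longest_suffix" \
    "$shortest_prefix" "$longest_prefix" "$resized"
```

A single % or # removes the shortest match and a doubled one removes the longest, which is why ${img%.jpg} is the safer choice when a file name may contain more than one dot.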
You can use find -exec:
find . -type f -name '*.jpg' -exec \
bash -c 'convert -resize 50% "$0" "${0%.jpg}"_50.jpg' {} \;
find . -type f -name '*.jpg' finds all .jpg files (including those in subdirectories) and hands each one to the command after -exec, where it can be referenced using {}.
Because we want to use parameter expansion, we can't use -exec convert -resize directly; we have to call bash -c and supply {} as a positional parameter to it ($0 inside the command). \; marks the end of the -exec command.
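A common variant passes a placeholder as $0 so that the file name arrives as $1, which reads a little more naturally inside the command string. The sketch below is a dry run; plan_half_size is a hypothetical name, and the ! -name '*_50.jpg' test is an addition so that a second run does not pick up earlier outputs.

```shell
#!/usr/bin/env bash
# plan_half_size is a hypothetical helper. It prints one convert command per
# .jpg; remove the "echo" inside the bash -c string to run convert for real.
plan_half_size() {
    find "$1" -type f -name '*.jpg' ! -name '*_50.jpg' -exec \
        bash -c 'echo convert -resize 50% "$1" "${1%.jpg}_50.jpg"' _ {} \;
}
```

The first word after the command string becomes $0 of the inner shell, so the _ fills that slot and {} is received as $1.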
You can also try this (less elegant) one-liner using ls+awk:
ls *.jpg | awk -F '.' '{print "convert -resize 50% "$0" "$1"_50.jpg"}' | sh
This assumes that all the .jpg files are in the current directory and that their names contain no spaces or extra dots. Before running it, remove the | sh and check what is printed on the screen.

Batch resize and rename image files in subdirectories

I'm trying to resize and rename several hundred subdirectories of images. The files that I need to be changed:
End with A.jpg
Need to be resized down to 400x400
Renamed to A@2x.jpg
Example:
images/**/A6919994719A@2x.jpg
I got the resizing bit down in one directory. I'm having some trouble finding a way to rename just the end of the file, not the extension, and executing it through the subdirectories.
Any help would be appreciated.
#!/bin/bash
for i in $( ls *A.jpg); do convert -resize 400x400 $i
You can do this:
#!/bin/bash
find . -name "*A.jpg" | while IFS= read -r f
do
newname=${f%A.jpg}A@2x.jpg
echo convert "$f" -resize 400x400 "$newname"
done
done
Remove the word echo if it looks correct, and run only on files you have backed up.
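The renaming step can be checked by itself before any image is converted. A minimal sketch, assuming the target suffix is @2x.jpg; retina_name is a hypothetical name and the sample path is made up.

```shell
#!/usr/bin/env bash
# retina_name is a hypothetical helper: it prints the new name for a file
# ending in A.jpg, inserting @2x before the extension.
retina_name() {
    printf '%s\n' "${1%A.jpg}A@2x.jpg"
}

retina_name 'images/shoes/A6919994719A.jpg'
```

This prints images/shoes/A6919994719A@2x.jpg. The % expansion only touches the end of the string, so an A.jpg elsewhere in the path is left alone; the helper should only be given names that really end in A.jpg, which the find pattern guarantees.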
You can also do it in a one-liner, if you really want to:
find . -name "*A.jpg" -exec bash -c 'old="$0";new=${old%A.jpg}A@2x.jpg;echo convert "$old" -resize 400x400 "$new"' {} \;

Resources