Find unreferenced png files in an Xcode workspace

I have an Xcode workspace with several hundred png files and would like to list those which are unreferenced.
Example pngs:
capture-bg-1.png
capture-bg-1@2x.png
promo_icon.png
promo_icon@2x.png
Reference example for "promo_icon" (XML file):
<string>promo_icon</string>
Reference example for "promo_icon" (Objective-C):
[UIImage imageNamed:@"promo_icon"]
I want to get a list of filenames including "capture-bg-1" (presuming it has no matches like "promo_icon" does).
A little wrinkle is that there is a .pbxproj file (XML) that has a reference to every png file in the workspace so that file needs to be excluded from the search.
The following command gets all the unique filename parts (excluding the folder and everything after '@' and '.') for evaluation:
find . -name '*.png' -exec basename {} \; | sed 's/[.@].*$//' | sort -u
The grep part into which I would pipe the filename parts is the problem. This grep lists the references to 'promo_icon'; an empty result (no references) would mean the png belongs on the list I'm after:
grep -I -R promo_icon . | grep -v pbxproj
However I can't figure out how to combine the two in a functional way. There is this snippet (https://stackoverflow.com/a/16258198/26235) for doing this in sh but it doesn't work.

An easier way to do this might be to put the list of all PNG names into one file, one per line. Then put the list of all references to PNG names into another file, one per line. Then grep -v -f the first file against the second. Whatever is returned is your answer.
First,
find . -name '*.png' -printf '%f\n' | sed -e 's/[.@].*$//' | sort -u > pngList
Then,
grep -RI --exclude '*.pbxproj' -e '<string>.*</string>' \
-e 'UIImage imageNamed' . > pngRefs
Finally,
grep -v -f pngList pngRefs
And you can clean up the results with sed and sort -u from there.
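Note that grep -v -f pngList pngRefs prints the reference lines that match no PNG name, i.e. references to images you don't have. If what you want is the unreferenced PNG names themselves, one way is to run the test in the other direction with a small loop (a minimal sketch over the two files built above):
while read -r name; do
    grep -q "$name" pngRefs || echo "$name"
done < pngList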
::edit::
The above approach could produce some wrong answers if you have any PNGs whose names are proper substrings of other PNGs. For example, if you have promo_icon and cheese_promo_icon and promo_icon is never referenced but cheese_promo_icon is referenced, the above approach will not detect that promo_icon is unreferenced.
To deal with this problem, you can surround your PNG name patterns with \b (word-boundary) sequences:
find . -name '*.png' -printf '%f\n' | sed -e 's/[.@].*$//' -e 's/^/\\b/' -e 's/$/\\b/' | sort -u > pngList
This way your pngList file will contain lines like this:
\bpromo_icon\b
\bcapture-bg-1\b
so when you grep it against the list of references it will only match when the name of each PNG is the entire name in the image ref (and not a substring of a longer name).

This is the script that finds unreferenced images in an Xcode project. One gotcha is that people may use string formatting to construct references to images, which is unaccounted for here. Mac users will want to install findutils via brew to get a version of find that supports -printf:
#!/bin/sh
# Finds unreferenced PNG assets in an xcode project
# Get a list of png file stems, stripping out folder information, the 'png' extension
# and the '@2x' part of the filename.
# Loop through the stems and print the ones that are not referenced. Keep in mind
# that some files like 'asset-1' may be referred to in code like 'asset-%d', so be careful.
for png in `find . -name '*.png' -printf '%f\n' | sed -e 's/[.@].*$//' | sort -u`
do
    if ! grep -qRI --exclude project.pbxproj --exclude-dir Podfile "$png" . ; then
        echo "$png is not referenced"
    fi
done
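Since this greps the entire tree once per PNG, it can be slow on a workspace with several hundred images. A sketch of one speedup, combining this script with the pngRefs idea from the earlier answer (/tmp/refs is just an arbitrary scratch file): collect the candidate reference lines once, then grep that single file in the loop.
grep -RI --exclude project.pbxproj -e . . > /tmp/refs
for png in `find . -name '*.png' -printf '%f\n' | sed -e 's/[.@].*$//' | sort -u`
do
    grep -q "$png" /tmp/refs || echo "$png is not referenced"
done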

Related

bash script remove squares prefix when reading a file content [duplicate]

For debugging purposes, I need to recursively search a directory for all files which start with a UTF-8 byte order mark (BOM). My current solution is a simple shell script:
find -type f |
while read file
do
if [ "`head -c 3 -- "$file"`" == $'\xef\xbb\xbf' ]
then
echo "found BOM in: $file"
fi
done
Or, if you prefer short, unreadable one-liners:
find -type f|while read file;do [ "`head -c3 -- "$file"`" == $'\xef\xbb\xbf' ] && echo "found BOM in: $file";done
It doesn't work with filenames that contain a line break,
but such files are not to be expected anyway.
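(If such names ever do need to be handled, a null-delimited variant of the same loop is one option; a minimal sketch using bash's read -d '':)
find . -type f -print0 |
while IFS= read -r -d '' file
do
    if [ "`head -c 3 -- "$file"`" == $'\xef\xbb\xbf' ]
    then
        echo "found BOM in: $file"
    fi
done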
Is there any shorter or more elegant solution?
Are there any interesting text editors or macros for text editors?
What about this one simple command which not only finds but also clears the nasty BOM? :)
find . -type f -exec sed '1s/^\xEF\xBB\xBF//' -i {} \;
I love "find" :)
Warning: the above will modify binary files that contain those three bytes.
If you want just to show BOM files, use this one:
grep -rl $'\xEF\xBB\xBF' .
The best and easiest way to do this on Windows:
Total Commander → go to project's root dir → find files (Alt + F7) → file types *.* → Find text "EF BB BF" → check 'Hex' checkbox → search
And you get the list :)
find . -type f -print0 | xargs -0r awk '
/^\xEF\xBB\xBF/ {print FILENAME}
{nextfile}'
Most of the solutions given above test more than the first line of the file, even if some (such as Marcus's solution) then filter the results. This solution only tests the first line of each file so it should be a bit quicker.
If you accept some false positives (in case there are non-text files, or in the unlikely case there is a ZWNBSP in the middle of a file), you can use grep:
fgrep -rl `echo -ne '\xef\xbb\xbf'` .
You can use grep to find them and Perl to strip them out like so:
grep -rl $'\xEF\xBB\xBF' . | xargs perl -i -pe 's{\xEF\xBB\xBF}{}'
I would use something like:
grep -orHbm1 "^`echo -ne '\xef\xbb\xbf'`" . | sed '/:0:/!d;s/:0:.*//'
which ensures that the BOM occurs at the very first byte of the file: grep -b reports the byte offset of each match, so the sed keeps only matches at offset 0 and strips everything after the filename.
For a Windows user, see this (good PHP script for finding the BOM in your project).
An overkill solution to this is phptags (not the vi tool with the same name), which specifically looks for PHP scripts:
phptags --warn ./
Will output something like:
./invalid.php: TRAILING whitespace ("?>\n")
./invalid.php: UTF-8 BOM alone ("\xEF\xBB\xBF")
And the --whitespace mode will automatically fix such issues (recursively, but it asserts that it only rewrites .php scripts).
I used this to correct only JavaScript files:
find . -iname '*.js' -type f -exec sed 's/^\xEF\xBB\xBF//' -i.bak {} \; -exec rm {}.bak \;
find -type f -print0 | xargs -0 grep -l `printf '^\xef\xbb\xbf'` | sed 's/^/found BOM in: /'
find -print0 puts a null \0 between each file name instead of using new lines
xargs -0 expects null separated arguments instead of line separated
grep -l lists the files which match the regex
The regex ^\xef\xbb\xbf isn't entirely correct, as it will match non-BOMed UTF-8 files if they have zero-width no-break spaces at the start of a line
If you are looking for UTF files, the file command works. It will tell you what the encoding of the file is. If there are any non-ASCII characters in there, it will come up with UTF.
file *.php | grep UTF
That won't work recursively though. You can probably rig up some fancy command to make it recursive, but I just searched each level individually like the following, until I ran out of levels.
file */*.php | grep UTF
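As hinted above, find can supply the recursion (a sketch; the exact wording of file's output varies between platforms):
find . -iname '*.php' -exec file {} + | grep UTF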

Renaming multiple directories while keeping a part in the middle with varying suffix

I'm trying to change the names of multiple directories using bash, the names being structured like the following:
DRMAD_CA-12__MRBK01_237a8430
DRMAD_CA-17__MRBK10_766c3396
DRMAD_CA-103__MRBK100_c27a6c1c
The goal is to keep the MRBK as well as the number directly following it (MRBK###), but to get rid of the rest. The pattern of the prefix is always the same (DRMAD_CA-###__), while the suffix is '_' followed by a combination of exactly 8 letters and digits. I tried sed, but can't seem to figure out the right pattern.
Seeing other posts on Stack Overflow, I've tried variations of
ls | while read file; do
    new=$( echo $file | sed 's/[^0-9]*\([^ ]*\)[^.]*\(\..*\)*MRBK\1\2/' )
    mv "$file" "$new"
done
But since I don't really understand the syntax of sed, it doesn't produce a usable result.
Use the rename utility.
First, print the old and new names, but do not rename anything:
rename --dry-run 's/.*(MRBK\d+).*/$1/' *MRBK*
If OK, actually rename:
rename 's/.*(MRBK\d+).*/$1/' *MRBK*
Install rename, for example, using conda.
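If installing rename is not an option, plain bash parameter expansion can do the same job, assuming the names always follow the DRMAD_CA-###__MRBK###_xxxxxxxx pattern described above (a minimal sketch; the echo makes it a dry run, remove it to actually rename):
for d in *MRBK*; do
    new=${d#*__}     # DRMAD_CA-12__MRBK01_237a8430 -> MRBK01_237a8430
    new=${new%_*}    # MRBK01_237a8430 -> MRBK01
    echo mv -- "$d" "$new"
done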
Using find:
find . -type d -regextype posix-extended -regex "^.*MRBK[[:digit:]]+.*$" | while read -r line
do
    dir=$(dirname "$line")
    newfil=$(grep -Eo 'MRBK[[:digit:]]+' <<< "$line")
    mv "$line" "$dir/$newfil"
done
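One caveat: if a matching directory can be nested inside another matching directory, renaming the parent first invalidates the child paths that find has already printed. Processing depth-first avoids this (the same loop, sketched with -depth added):
find . -depth -type d -regextype posix-extended -regex "^.*MRBK[[:digit:]]+.*$" | while read -r line
do
    dir=$(dirname "$line")
    newfil=$(grep -Eo 'MRBK[[:digit:]]+' <<< "$line")
    mv "$line" "$dir/$newfil"
done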

Pass .txt list of .jpgs to convert (bash)

I'm currently working on an exercise that requires me to write a shell script whose function is to take a single command-line argument that is a directory. The script takes the given directory, and finds all the .jpgs in that directory and its sub-directories, and creates an image-strip of all the .jpgs in order of modification time (newest on bottom).
So far, I've written:
#!/bin/bash
dir=$1 #the first argument given will be saved as the dir variable
#find all .jpgs in the given directory
#then ls is run for the .jpgs, with the date format %s (in seconds)
#sed lets the 'cut' process ignore the spaces in the columns
#fields 6 and 7 (the name and the time stamp) are then cut and sorted by modification date
#then, field 2 (the file name) is selected from that input
#Finally, the entire sorted output is saved in a .txt file
find "$dir" -name "*.jpg" -exec ls -l --time-style=+%s {} + | sed 's/ */ /g' | cut -d' ' -f6,7 | sort -n | cut -d' ' -f2 > jgps.txt
The script correctly outputs the directory's .jpgs in order of modification time. The part that I am currently struggling with is how to give the list in the .txt file to the convert -append command that will create an image-strip for me. (For those who aren't aware of that command, the input would be: convert -append image1.jpg image2.jpg image3.jpg IMAGESTRIP.jpg, with IMAGESTRIP.jpg being the name of the completed image strip made up of the previous 3 images.)
I can't quite figure out how to pass the .txt list of files and their paths to this command. I've been scouring the man pages to find a possible solution but no viable ones have arisen.
xargs is your friend:
find "$dir" -name "*.jpg" -exec ls -l --time-style=+%s {} + | sed 's/ */ /g' | cut -d' ' -f6,7 | sort -n | cut -d' ' -f2 | xargs -I files convert -append files IMAGESTRIP.jpg
Explanation
The basic use of xargs is:
find . -type f | xargs rm
That is, you specify a command to xargs, it appends the arguments it receives from standard input, and then it executes the command. The above line would execute:
rm file1 file2 ...
But you also need to specify a final argument to the command, so you need to use the xargs -I parameter, which tells xargs the placeholder string that marks where the arguments read from standard input should be put.
So, we use the string files to indicate it. Then we write the command, putting the string files where the variable arguments will be, resulting in:
xargs -I files convert -append files IMAGESTRIP.jpg
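One caveat: with GNU xargs, -I implies one command invocation per input line, so the command above actually runs convert once per file, each run overwriting IMAGESTRIP.jpg, rather than building one strip from all of them. To pass every filename as a separate argument ahead of the final output name in a single invocation, a small sh -c wrapper is one option (a sketch; it assumes the whole list fits in a single xargs batch):
find "$dir" -name "*.jpg" -exec ls -l --time-style=+%s {} + | sed 's/  */ /g' | cut -d' ' -f6,7 | sort -n | cut -d' ' -f2 | xargs sh -c 'convert -append "$@" IMAGESTRIP.jpg' _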
Put the list of filenames in a file called filelist.txt and call convert with the filename preceded by an at sign (@):
convert @filelist.txt -append result.jpg
Here's a little example:
# Create three blocks of colour
convert xc:red[200x100] red.png
convert xc:lime[200x100] green.png
convert xc:blue[200x100] blue.png
# Put their names in a file called "filelist.txt"
echo "red.png green.png blue.png" > filelist.txt
# Tell ImageMagick to make a strip (+append joins images side-by-side; -append, as in the question, stacks them vertically)
convert @filelist.txt +append strip.png
As there's always some image with a pesky space in its name...
# Make the pesky one
convert -background black -pointsize 128 -fill white label:"Pesky" -resize x100 "image with pesky space.png"
# Whack it in the list for IM
echo "red.png green.png blue.png 'image with pesky space.png'" > filelist.txt
# IM do your stuff
convert @filelist.txt +append strip.png
By the way, it is generally poor practice to parse the output of ls in case there are spaces in your filenames. If you want to find a list of images, across directories and sort them by time, look at something like this:
# Find image files only - ignoring case, so "JPG", "jpg" both work
find . -type f -iname \*.jpg
# Now exec `stat` to get the file ages and quoted names
... -exec stat --format "%Y:%N" {} \;
# Now sort that, and strip the times and colon at the start
... | sort -n | sed 's/^.*://'
# Put it all together
find . -type f -iname \*.jpg -exec stat --format "%Y:%N" {} \; | sort -n | sed 's/^.*://'
Now you can either redirect all that to filelist.txt and call convert like this:
find ...as above... > filelist.txt
convert @filelist.txt +append strip.jpg
Or, if you want to avoid intermediate files and do it all in one go, you can make this monster where convert reads the filelist from its standard input stream:
find ...as above... | sed 's/^.*://' | convert @- +append strip.jpg

Visit all subdirectories and extract first page from every pdf

I have a few folders with E-Books and I want to extract the first page from every book. There are over two hundred books, so doing this manually would be a big pain and very time consuming.
I have a command that does the job for single file
pdftk TehInput.pdf cat 1 output cover_TehInput.pdf
How do I wrap this into a single script that visits everything and assigns the output a name like cover_whatever-the-original-name-is.pdf? The output files can go anywhere, e.g. in the directory where the script was started or next to the original file.
You want to use the find command for this. Something like:
find . -iname '*.pdf' -exec pdftk '{}' cat 1 output '{}'.cover.pdf ';'
This will find all PDFs from the current directory (.) downwards, and execute
pdftk filename.pdf cat 1 output filename.pdf.cover.pdf
on it. It's the whole path that will get passed to pdftk, so you'll end up with the cover PDFs in the same directory as the original files. (You could do something to get rid of the .pdf.cover.pdf extensions if you need to.)
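For example, if you prefer the cover_original-name.pdf naming from the question, a small sh -c wrapper lets you rewrite the output path (a minimal sketch; the ! -name filter is there so already-generated covers are skipped on a re-run):
find . -iname '*.pdf' ! -name 'cover_*' -exec sh -c '
    for f; do
        pdftk "$f" cat 1 output "$(dirname "$f")/cover_$(basename "$f")"
    done' _ {} +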
If your filenames contain no blanks or newlines:
find . -iname '*.pdf' -printf "%h %f\n" | sed -E 's|(.*) (.*)|echo pdftk \1/\2 cat 1 output \1/cover_\2|' | sh
If output is okay, remove "echo ".

Bash script to find files based on filename and do search replace on them

I have a bunch of files (with same name, say abc.txt) strewn all over the network filesystem.
I need to recursively search for each of those files and once I find each one, do a content search and replace on them.
After some research, I see that I can find the files using the find command (with -r to recurse, right?). Something like:
find . -r -type f abc.txt
And use sed to do find and replace on each one:
sed -ie 's/search/replace/g' abc.txt
But I'm not sure how to glue the two together so that I find all occurrences of abc.txt and then do a search/replace on each one.
My directory tree is really large so I know a recursive search through all the directories will take a long time but that's better than manually trying to find each file.
I'm on OSX 10.6 with Bash.
Thanks all!
Update: I think the answer posted below may work for other OSes (linux perhaps) but for OSX, I had to tweak it a bit.
find . -name abc.txt -exec sed -i '' 's/search/replace/g' {} +
The empty quotes seem to be required after -i in sed to indicate that we don't want to produce a backup file. The man page for this reads:
-i extension:
Edit files in-place, saving backups with the specified extension. If a zero-length extension is given, no backup will be saved.
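GNU sed, by contrast, takes -i with no argument (and misparses -i ''), so if you need one invocation that works with both GNU and BSD/macOS sed, the -i.bak form used for the JavaScript files in the BOM answer above is one portable option (a sketch):
find . -type f -name abc.txt -exec sed -i.bak 's/search/replace/g' {} \; -exec rm {}.bak \;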
find . -type f -name abc.txt -exec sed -i -e 's/search/replace/g' {} +
