renumbering image files to be contiguous in bash - bash

I have a directory with image files that follow a naming scheme and are not always contiguous, e.g.:
IMG_33.jpg
IMG_34.jpg
IMG_35.jpg
IMG_223.jpg
IMG_224.jpg
IMG_225.jpg
IMG_226.jpg
IMG_446.jpg
I would like to rename them so they go something like this, in the same order:
0001.jpg
0002.jpg
0003.jpg
0004.jpg
0005.jpg
0006.jpg
0007.jpg
0008.jpg
So far this is what I came up with, and while it does the four-digit padding, it doesn't sort by the number values in the filenames.
#!/bin/bash
X=1;
for i in *; do
mv $i $(printf %04d.%s ${X%.*} ${i##*.})
let X="$X+1"
done
result:
IMG_1009.JPG 0009.JPG
IMG_1010.JPG 0010.JPG
IMG_101.JPG 0011.JPG
IMG_102.JPG 0012.JPG

Update:
Try this. If output is okay remove echo.
X=1; find . -maxdepth 1 -type f -name "*.jpg" -print0 | sort -z -n -t _ -k2 | while read -d $'\0' -r line; do echo mv "$line" "$(printf "%04d%s" $X .jpg)"; ((X++)); done
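A find-free variant of the same idea, as a minimal sketch: collect the numeric part of each name, sort those numbers with sort -n, and rename in that order. The function name renumber_jpgs is made up for illustration, and the sketch assumes every file is named IMG_<digits>.jpg and that the directory does not already contain names like 0001.jpg.

```bash
#!/usr/bin/env bash
# renumber_jpgs DIR
# Rename DIR/IMG_<n>.jpg to DIR/0001.jpg, DIR/0002.jpg, ... in ascending
# numeric order of <n>. (Hypothetical helper, not taken from the answers.)
renumber_jpgs() {
    local dir=$1 f n i=1
    local -a nums=()
    for f in "$dir"/IMG_*.jpg; do
        [[ -e $f ]] || continue              # the glob matched nothing
        n=${f##*/IMG_}                       # strip directory and "IMG_"
        n=${n%.jpg}                          # strip the extension
        [[ $n =~ ^[0-9]+$ ]] && nums+=("$n") # keep purely numeric parts only
    done
    (( ${#nums[@]} )) || return 0            # nothing to rename
    while IFS= read -r n; do
        # mv -n refuses to overwrite an existing target
        mv -n -- "$dir/IMG_$n.jpg" "$dir/$(printf '%04d' "$i").jpg"
        ((i++))
    done < <(printf '%s\n' "${nums[@]}" | sort -n)
}
```

Calling renumber_jpgs . from inside the photo directory would apply it to the current directory; replacing mv with echo mv gives a dry run first.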

Using the super helpful rename. First, pads files with one digit to two digits; then pads files with two digits to three digits; etc.
rename IMG_ IMG_0 IMG_?.jpg
rename IMG_ IMG_0 IMG_??.jpg
rename IMG_ IMG_0 IMG_???.jpg
Then, your for-loop (or another similar one) that renames does the trick as the files are in both alphabetical and numerical order.

how about this :
while read f1;do
echo $f1
mv IMG_$f1 $f1
done< <(ls | cut -d '_' -f 2 | sort -n)

Related

bash iterate over a directory sorted by file size

As a webmaster, I generate a lot of junk files of code. Periodically I have to purge the unneeded files, filtered by extension. Example: "cleaner txt". Easy enough. But I want to sort the files by size and process them in that order in the "for" loop. How can I do that?
cleaner:
#!/bin/bash
if [ -z "$1" ]; then
echo "Please supply the filename suffixes to delete.";
exit;
fi;
filter=$1;
for FILE in *.$filter; do clear;
cat $FILE; printf '\n\n'; rm -i $FILE; done
You can use a mix of find (to print file sizes and names), sort (to sort the output of find) and cut (to remove the sizes). In case you have very unusual file names containing any possible character including newlines, it is safer to separate the files by a character that cannot be part of a name: NUL.
#!/bin/bash
if [ -z "$1" ]; then
echo "Please supply the filename suffixes to delete.";
exit;
fi;
filter=$1;
while IFS= read -r -d '' -u 3 FILE; do
clear
cat "$FILE"
printf '\n\n'
rm -i "$FILE"
done 3< <(find . -mindepth 1 -maxdepth 1 -type f -name "*.$filter" \
-printf '%s\t%p\0' | sort -zn | cut -zf 2-)
Note that we must use a different file descriptor than stdin (3 in this example) to pass the file names to the loop. Else, if we use stdin, it will also be used to provide the answers to rm -i.
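Where find's -printf is not available, the size-sorted list can be built from a glob instead. This is a minimal sketch using the same NUL-separator idea; files_by_size is a hypothetical helper name, and sort -z and cut -z are assumed to be the GNU versions already used above.

```bash
#!/usr/bin/env bash
# files_by_size SUFFIX
# Print the regular files ./*.SUFFIX, smallest first, each name NUL-terminated.
# (Hypothetical helper; relies on GNU sort -z and cut -z.)
files_by_size() {
    local suffix=$1 f size
    for f in ./*."$suffix"; do
        [[ -f $f ]] || continue             # skip directories and unmatched globs
        size=$(( $(wc -c < "$f") ))         # arithmetic strips any padding from wc
        printf '%s\t%s\0' "$size" "$f"      # record: size<TAB>name<NUL>
    done | sort -z -n | cut -z -f 2-
}

# Usage, mirroring the loop above (fd 3 keeps stdin free for rm -i):
#   while IFS= read -r -d '' -u 3 FILE; do
#       rm -i "$FILE"
#   done 3< <(files_by_size txt)
```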
Inspired by this answer, you could use the find command as follows:
find ./ -type f -name "*.yaml" -printf "%s %p\n" | sort -n
The find command prints the size and the path of each file, so the sort command orders the results from the smallest file to the largest.
To iterate over, say, the 5 biggest files, add the tail command (this relies on word splitting, so it only works for paths without whitespace):
for f in $(find ./ -type f -name "*.yaml" -printf "%s %p\n" |
sort -n |
tail -n 5 |
cut -d ' ' -f 2)
do
echo "### $f"
done
If the file names don't contain newlines or spaces:
while read filesize filename; do
printf "%-25s has size %10d\n" "$filename" "$filesize"
done < <(du -bs *."$filter"|sort -n)
while read filename; do
echo "$filename"
done < <(du -bs *."$filter"|sort -n|awk '{$0=$2}1')

Bash Script to Prepend a Single Random Character to All Files In a Folder

I have an audio sample library with thousands of files. I would like to shuffle/randomize the order of these files. Can someone provide me with a bash script/line that would prepend a single random character to all files in a folder (including files in sub-folders). I do not want to prepend a random character to any of the folder names though.
Example:
Kickdrum73.wav
Kickdrum SUB.wav
Kick808.mp3
Renamed to:
f_Kickdrum73.wav
!_Kickdrum SUB.wav
4_Kick808.mp3
If possible, I would like to be able to run this script more than once, but on subsequent runs, it just changes the randomly prepended character instead of prepending a new one.
Some of my attempts:
find ~/Desktop/test -type f -print0 | xargs -0 -n1 bash -c 'mv "$0" "a${0}"'
find ~/Desktop/test/ -type f -exec mv -v {} $(cat a {}) \;
find ~/Desktop/test/ -type f -exec echo -e "Z\n$(cat !)" > !Hat 15.wav
for file in *; do
mv -v "$file" $RANDOM_"$file"
done
Note: I am running on macOS.
Latest attempt using code from mr. fixit:
find . -type f -maxdepth 999 -not -name ".*" |
cut -c 3- - |
while read F; do
randomCharacter="${F:2:1}"
if [ $randomCharacter == '_' ]; then
new="${F:1}"
else
new="_$F"
fi
fileName="`basename $new`"
newFilename="`jot -r -c $fileName 1 A Z`"
filePath="`dirname $new`"
newFilePath="$filePath$newFilename"
mv -v "$F" "$newFilePath"
done
Here's my first answer, enhanced to do sub-directories.
Put the following in file randomize
if [[ $# != 1 || ! -d "$1" ]]; then
echo "usage: $0 <path>"
else
find "$1" -type f -not -name ".*" |
while IFS= read -r F; do
FDIR=`dirname "$F"`
FNAME=`basename "$F"`
char2="${FNAME:1:1}"
if [ "$char2" == '_' ]; then
new="${FNAME:1}"
else
new="_$FNAME"
fi
new=`jot -r -w "%c$new" 1 A Z`
echo mv "$F" "${FDIR}/${new}"
done
fi
Set the permissions with chmod a+x randomize.
Then call it with randomize your/path.
It'll echo the commands required to rename everything, so you can examine them to ensure they'll work for you. If they look right, you can remove the echo from the 3rd to last line and rerun the script.
For a single directory: cd ~/Desktop/test, then
find . -maxdepth 1 -type f -not -name ".*" |
cut -c 3- - |
while read F; do
char2="${F:1:1}"
if [ "$char2" == '_' ]; then
new="${F:1}"
else
new="_$F"
fi
new=`jot -r -w "%c$new" 1 A Z`
mv "$F" "$new"
done
find . -maxdepth 1 -type f -not -name ".*" gets all the files in the current directory, but not the hidden files (names starting with '.').
cut -c 3- - strips the first 2 characters from each path. find outputs paths, and the leading ./ gets in the way of processing prefixes.
while read VAR; do <stuff>; done processes one line at a time.
char2="${VAR:1:1}" sets the variable char2 to the 2nd character of VAR (substring offsets start at 0).
if - then - else sets new to the filename, either preceded by _ or with the previous random character stripped off.
jot -r -w "%c$new" 1 A Z puts 1 random character from A-Z in front of new.
mv old new renames the file.
You can also do it all in bash and there are several ways to approach it. The first is simply creating an array of letters containing whatever letters you want to use as a prefix and then generating a random number to use to choose the element of the array, e.g.
#!/bin/bash
letters=({0..9} {A..Z} {a..z}) ## array with [0-9] [A-Z] [a-z]
for i in *; do
num=$(($RANDOM % ${#letters[@]})) ## random index into the 62-element array
## remove echo to actually move file
echo "mv \"$i\" \"${letters[num]}_$i\"" ## move file
done
Example Use/Output
Currently the script only prints the changes it would make. To actually apply them, remove the echo "..." surrounding the mv command and unescape the quotes:
$ bash ../randprefix.sh
mv "Kick808.mp3" "4_Kick808.mp3"
mv "Kickdrum SUB.wav" "h_Kickdrum SUB.wav"
mv "Kickdrum73.wav" "l_Kickdrum73.wav"
You can also do it by generating a random number representing an ASCII character from 48 (character '0') through 126 (character '~'), excluding the backtick, then converting the random number to an ASCII character and prefixing the filename with it, e.g.
#!/bin/bash
for i in *; do
num=$((($RANDOM % 79) + 48)) ## generate number for '0' - '~' (48..126)
letter=$(printf "\\$(printf '%03o' "$num")") ## letter from number
while [ "$letter" = '`' ]; do ## exclude '`'
num=$((($RANDOM % 79) + 48)) ## generate number
letter=$(printf "\\$(printf '%03o' "$num")")
done
done
## remove echo to actually move file
echo "mv \"$i\" \"${letter}_$i\"" ## move file
done
(similar output; any character in that range other than the backtick is possible, including punctuation such as ':' or '|')
In each case you will want to place the script in your path or call it from within the directory whose files you want to rename. (To make the script callable with the directory to search as an argument, you would split each path with dirname and basename and join the parts back together; that is left to you.)
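The pieces above (recursing into sub-folders, leaving folder names alone, and replacing rather than stacking the prefix on later runs) can be combined in pure bash. A minimal sketch, assuming bash and a find that supports -print0; random_prefix_all is a hypothetical name. Like the jot-based answer, it treats any file whose second character is _ as already prefixed, so a file originally named something like a_song.wav would lose its first character.

```bash
#!/usr/bin/env bash
# random_prefix_all DIR
# Give every non-hidden regular file under DIR a random "<char>_" prefix,
# replacing the prefix left by an earlier run. Directories are not renamed.
# (Hypothetical helper name; the "<char>_" convention is from the question.)
random_prefix_all() {
    local dir=$1 path base new
    local -a letters files=()
    letters=({0..9} {A..Z} {a..z})                # 62 candidate characters
    # Collect the paths first so renames cannot confuse the running find.
    while IFS= read -r -d '' path; do
        files+=("$path")
    done < <(find "$dir" -type f ! -name '.*' -print0)
    for path in "${files[@]}"; do
        base=${path##*/}                          # name without directory
        [[ ${base:1:1} == _ ]] && base=${base:2}  # drop an existing "<char>_"
        new=${path%/*}/${letters[RANDOM % ${#letters[@]}]}_$base
        [[ $new == "$path" ]] && continue         # same character drawn again
        mv -n -- "$path" "$new"                   # -n: never overwrite
    done
}
```

Running random_prefix_all ~/Desktop/test a second time should only swap the leading character, since the old prefix is stripped before the new one is added.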

UNIX :: Padding for files containing string and multipleNumber

I have many files not having consistent filenames.
For example
IMG_20200823_1.jpg
IMG_20200823_10.jpg
IMG_20200823_12.jpg
IMG_20200823_9.jpg
I would like to rename all of them and ensure they all follow same naming convention
IMG_20200823_0001.jpg
IMG_20200823_0010.jpg
IMG_20200823_0012.jpg
IMG_20200823_0009.jpg
Found out it's possible to change for file having only a number using below
printf "%04d\n"
However, I am not able to do this with my files, since their names mix a string, "_", and different numbers.
Could anyone help me ?
Thanks !
With Perl's standalone rename or prename command:
rename -n 's/(\d+)(\.jpg$)/sprintf("%04d%s",$1,$2)/e' *.jpg
Output:
rename(IMG_20200823_10.jpg, IMG_20200823_0010.jpg)
rename(IMG_20200823_12.jpg, IMG_20200823_0012.jpg)
rename(IMG_20200823_1.jpg, IMG_20200823_0001.jpg)
rename(IMG_20200823_9.jpg, IMG_20200823_0009.jpg)
if everything looks fine, remove -n.
With Bash regular expressions:
re='(IMG_[[:digit:]]+)_([[:digit:]]+)'
for f in *.jpg; do
[[ $f =~ $re ]] || continue # skip names that don't match the pattern
mv "$f" "$(printf '%s_%04d.jpg' "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}")"
done
where BASH_REMATCH is an array containing the capture groups of the regular expression. At index 0 is the whole match; index 1 contains IMG_ and the first group of digits; index 2 contains the second group of digits. The printf command is used to format the second group with zero padding, four digits wide.
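One caveat worth sketching: printf's %d reads a number with a leading zero as octal, so re-running the loop on already padded names such as IMG_20200823_0009.jpg would fail on 0009. Forcing base 10 with bash's 10# arithmetic prefix avoids this. The helper name pad_name and its more general pattern (anything ending in _<digits>.<ext>) are illustrative assumptions, not part of the answer above.

```bash
#!/usr/bin/env bash
# pad_name NAME
# Print NAME with its trailing "_<digits>" zero-padded to four digits;
# names that do not end in "_<digits>.<ext>" are printed unchanged.
# (Hypothetical helper for illustration.)
pad_name() {
    local name=$1
    local re='^(.*_)([0-9]+)(\.[^.]+)$'
    if [[ $name =~ $re ]]; then
        # 10# forces decimal, so "0009" is not parsed as an octal literal
        printf '%s%04d%s\n' "${BASH_REMATCH[1]}" \
            "$((10#${BASH_REMATCH[2]}))" "${BASH_REMATCH[3]}"
    else
        printf '%s\n' "$name"
    fi
}

# Usage sketch:
#   for f in *.jpg; do
#       new=$(pad_name "$f")
#       [[ $new != "$f" ]] && mv -n -- "$f" "$new"
#   done
```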
Use a regex to extract the relevant sub-strings from the input and then pad it...
For each file.
Extract the prefix, number and suffix from the filename.
Pad the number with zeros.
Create the new filename.
Move files
The following code for bash:
echo 'IMG_20200823_1.jpg
IMG_20200823_10.jpg
IMG_20200823_12.jpg
IMG_20200823_9.jpg' |
while IFS= read -r file; do # foreach file
# Use GNU sed to extract parts on separate lines
tmp=$(<<<"$file" sed 's/\(.*_\)\([0-9]*\)\(\..*\)/\1\n\2\n\3\n/')
# Read the separate parts separated by newlines
{
IFS= read -r prefix
IFS= read -r number
IFS= read -r suffix
} <<<"$tmp"
# create new filename
newfilename="$prefix$(printf "%04d" "$number")$suffix"
# move the files
echo mv "$file" "$newfilename"
done
outputs:
mv IMG_20200823_1.jpg IMG_20200823_0001.jpg
mv IMG_20200823_10.jpg IMG_20200823_0010.jpg
mv IMG_20200823_12.jpg IMG_20200823_0012.jpg
mv IMG_20200823_9.jpg IMG_20200823_0009.jpg
Being puzzled by your hint at printf...
Current folder content:
$ ls -1 IMG_*
IMG_20200823_1.jpg
IMG_20200823_21.jpg
This is surely not a good solution, but with printf and sed we can do this:
$ printf "mv %3s_%8s_%d.%3s %3s_%8s_%04d.%3s\n" $(ls -1 IMG_* IMG_* | sed 's/_/ /g; s/\./ /')
mv IMG_20200823_1.jpg IMG_20200823_0001.jpg
mv IMG_20200823_21.jpg IMG_20200823_0021.jpg

How to get list of certain strings in a list of files using bash?

The title is maybe not really descriptive, but I couldn't find a more concise way to describe the problem.
I have a directory containing different files which have a name that e.g. looks like this:
{some text}2019Q2{some text}.pdf
So the filenames have somewhere in the name a year followed by a capital Q and then another number. The other text can be anything, but it won't contain anything matching the format year-Q-number. There will also be no numbers directly before or after this format.
I can work something out to get this from one filename, but I actually need a 'list' so I can do a for-loop over this in bash.
So, if my directory contains the files:
costumerA_2019Q2_something.pdf
costumerB_2019Q2_something.pdf
costumerA_2019Q3_something.pdf
costumerB_2019Q3_something.pdf
costumerC_2019Q3_something.pdf
costumerA_2020Q1_something.pdf
costumerD2020Q2something.pdf
I want a for loop that goes over 2019Q2, 2019Q3, 2020Q1, and 2020Q2.
EDIT:
This is what I have so far. It is able to extract the substrings, but it still has duplicates. Since I'm already inside the loop, I don't see how I can remove them.
find original/*.pdf -type f -print0 | while IFS= read -r -d '' line; do
echo $line | grep -oP '[0-9]{4}Q[0-9]'
done
# list all _filenames_ that end with .pdf from the folder original
find original -maxdepth 1 -name '*.pdf' -type f -printf "%f\n" |
# extract the pattern
sed -E 's/.*([0-9]{4}Q[0-9]).*/\1/' |
# iterate
while IFS= read -r file; do
echo "$file"
done
I used -printf "%f\n" to print just the filename instead of the full path, and sed -E so that {4} is treated as a repetition count. GNU sed has a -z option that you can use with -print0 (or -printf "%f\0").
Given how you wanted to do this: if your files have no newlines in their names, there is no need to loop over the list in bash (as a rule of thumb, try to avoid while read line; it's very slow):
find original -maxdepth 1 -name '*.pdf' -type f | grep -oP '[0-9]{4}Q[0-9]'
or with a zero-separated stream:
find original -maxdepth 1 -name '*.pdf' -type f -print0 |
grep -zoP '[0-9]{4}Q[0-9]' | tr '\0' '\n'
If you want to remove duplicate elements from the list, pipe it to sort -u.
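For the for-loop the question asks about, the extraction and sort -u can be wrapped in a small function. A minimal sketch in pure bash plus sort; unique_quarters is a hypothetical name, and it assumes, as the question states, at most one year-Q-number token per filename.

```bash
#!/usr/bin/env bash
# unique_quarters DIR
# Print each distinct <year>Q<n> token found in the names of DIR/*.pdf,
# one per line, sorted. (Hypothetical helper for illustration.)
unique_quarters() {
    local dir=$1 f
    for f in "$dir"/*.pdf; do
        # match against the file name only, not the directory part
        [[ ${f##*/} =~ [0-9]{4}Q[0-9] ]] && printf '%s\n' "${BASH_REMATCH[0]}"
    done | sort -u
}

# Usage sketch (the tokens contain no whitespace, so word splitting is safe):
#   for q in $(unique_quarters original); do
#       echo "processing $q"
#   done
```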
Try this, in bash:
~ > $ ls
costumerA_2019Q2_something.pdf costumerB_2019Q2_something.pdf
costumerA_2019Q3_something.pdf other.pdf
costumerA_2020Q1_something.pdf someother.file.txt
~ > $ for x in `(ls)`; do [[ ${x} =~ [0-9]Q[1-4] ]] && echo $x; done;
costumerA_2019Q2_something.pdf
costumerA_2019Q3_something.pdf
costumerA_2020Q1_something.pdf
costumerB_2019Q2_something.pdf
~ > $ (for x in *; do [[ ${x} =~ ([0-9]{4}Q[1-4]).+pdf ]] && echo ${BASH_REMATCH[1]}; done;) | sort -u
2019Q2
2019Q3
2020Q1

Find files which share part of a filename

In my current directory there are many files. Some of the files share part of their filename.
e.g.:
XGAE_537493_GSR.FITS
TGFE_537493_RRF.FITS
EGRE_537497_HDR.FITS
TRTE_537497_YUH.FITS
TRXX_537499_YDF.FITS
.
.
Files 1 & 2 would be a match, as would files 3 & 4. File 5 has no match. Therefore, files 1,2,3 and 4 would be moved.
I want to move the files which share part of their filename, in order to separate them from the ones that don't.
I was attempting to do this using bash. I googled but couldn't locate websites that were quite describing the process I need. So far in pseudo-code I have:
FOR F IN *
IF ${FILE:5:10} MATCHES ANY OTHER ${FILE:5:10}
MOVE ALL MATCHES TO ANOTHER DIRECTORY
Any information to help me move in the right direction would be appreciated.
Try this:
for f in ./*.FITS ; do
[ -e "$f" ] || continue # already moved by an earlier iteration
middleBit=$(echo "$f" | cut -d'_' -f 2)
count=$(ls ./*_"$middleBit"_*.FITS | wc -l)
if [ "$count" -gt 1 ]
then
for match in ./*_"$middleBit"_*.FITS ; do
mv "$match" ./somewhere/
done
fi
done
Using associative array in BASH 4 you can do it easily:
#!/bin/bash
declare -A arr
for f in *.FITS; do
k="${f:5:6}"
[[ ${arr[$k]} ]] && mv "$f" /dest/ || arr["$k"]=1
done
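Note that the loop above records a key the first time it is seen and only moves later files with that key, so the first member of each group stays behind. If every member should move, as the question describes, a two-pass version is one option. This is a minimal sketch assuming bash 4+ and names of the form <prefix>_<key>_<rest>.FITS; move_shared is a hypothetical name.

```bash
#!/usr/bin/env bash
# move_shared SRC DEST
# Move every SRC/<prefix>_<key>_<rest>.FITS whose <key> is shared by at
# least one other file into DEST. (Hypothetical helper; needs bash 4+.)
move_shared() {
    local src=$1 dest=$2 f key
    local -A count=()
    # Pass 1: count how many files carry each middle field.
    for f in "$src"/*_*_*.FITS; do
        [[ -f $f ]] || continue
        key=${f##*/}; key=${key#*_}; key=${key%%_*}
        [[ -n $key ]] || continue
        count[$key]=$(( ${count[$key]:-0} + 1 ))
    done
    # Pass 2: move the files whose middle field occurred more than once.
    for f in "$src"/*_*_*.FITS; do
        [[ -f $f ]] || continue
        key=${f##*/}; key=${key#*_}; key=${key%%_*}
        [[ -n $key ]] || continue
        if [[ ${count[$key]:-0} -gt 1 ]]; then
            mv -- "$f" "$dest"/
        fi
    done
}
```

With the five example files, move_shared . /dest would be expected to move the 537493 and 537497 pairs and leave TRXX_537499_YDF.FITS in place.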
If your file structure is fixed, you can scan the names and find duplicates in a subfield of the file name with awk.
For example:
$ ls -1 | awk -F_ 'NF==3{f[$2]=(a[$2]++?f[$2] ORS $0:$0)}
END{for(k in f) if(a[k]>1) print f[k]} '
TGFE_537493_RRF.FITS
XGAE_537493_GSR.FITS
you can then pipe the results to a cp command
$ ... | xargs -I file cp file file.DUP
adds suffix DUP to duplicate file names, or
$ ... | xargs -I file mv file anotherlocation/
moves to anotherlocation.
