rename images using terminal script in linux

In my /home/myself/Pictures/travels folder on Fedora 17 Linux I have the files IMG_2516.JPG, IMG_2519.JPG, IMG_2520.JPG, IMG_2525.JPG, IMG_2528.JPG.
I would like to rename them one by one from left to right such that IMG_2516.JPG becomes 01.JPG, IMG_2519.JPG - 02.JPG, IMG_2520.JPG - 03.JPG, IMG_2525.JPG - 04.JPG, IMG_2528.JPG - 05.JPG.
Notice that neighbouring numbers can be close (as 2519 and 2520) and distant (2516 and 2519), but always increase.
How can I write a terminal script to automate this routine? These numbers are just an example; there are many more files, and at the moment I can only rename them manually (very time-consuming).

If the images all have the same number of digits:
I=1
for F in IMG_*.JPG; do
    mv "$F" "IMG_$(printf '%02d' "$I").JPG"
    I=$(( I + 1 ))
done
Otherwise,
LIST=$(mktemp)
find . -maxdepth 1 -iname "*.jpg" > "$LIST"
sort -n -o "$LIST" "$LIST"
I=1
cat "$LIST" | while read -r F; do
    mv "$F" "IMG_$(printf '%02d' "$I").JPG"
    I=$(( I + 1 ))
done
rm "$LIST"
The first one means: for each image matching IMG_*.JPG, rename it to IMG_NN.JPG (where NN is the counter I padded to two digits), then increase I by 1.
The second one means:
make a temporary file
find all of the JPG files in the directory (and not subdirectories, case-insensitive),
save one per line in the temporary file.
sort them by their numerical ordering (-n) and write back to the temporary file (-o)
send the contents of the file to the following:
-- while there is a next line,
-- -- store it in F
-- -- move the file with that name to IMG_NN.JPG, where NN is I padded to two digits with leading zeros
-- -- increase I
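The question actually asks for plain 01.JPG, 02.JPG, … without the IMG_ prefix. A minimal variant of the first script for that, assuming all names share the IMG_ prefix and the same digit count (so the glob sorts in numeric order):
I=1
for F in IMG_*.JPG; do
    mv "$F" "$(printf '%02d' "$I").JPG"   # 01.JPG, 02.JPG, ...
    I=$(( I + 1 ))
done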

Related

Splitting multiple input files into multiple outputs using split function in linux

I have 8 files I would like to split into 5 chunks per file. I would normally do this individually, but I would like to run it as a loop. I work on an HPC.
I have created a list of the file names and labelled it "variantlist.txt". My code is:
for f in `cat variantlist.txt`; do split ${f} -n 5 -d; done
However, it only splits the final file in variantlist.txt, outputting 5 chunks from the final entry only.
Even if I list the files individually:
for f in chr001.vcf chr002 ...chr008.vcf ; do split ${f} -n 5 -d; done
It still only splits the final file into 5 chunks.
Not sure where I am going wrong here. The desired output would be 40 chunks, 5 per chromosome. Your help would be greatly appreciated.
Many thanks
split creates the same set of output file names each time (x00, x01, … with -d) and overwrites the previous ones. Here's one way to handle that -
for f in $(<variantlist.txt)   # don't use cat
do  mkdir -p "$f.split"        # make a subdir for the files
    ( cd "$f.split" &&         # change into the subdir only in a subshell
      split "../$f" -n 5 -d    # split from there
    )                          # close the subshell, parent still in base dir
done
Or you could just do this -
while read f                   # grab each filename
do  split "$f" -n 5 -d         # split it
    for x in x??               # for each split file
    do  mv "$x" "$f.$x"        # rename it to include the parent file name
    done
done < variantlist.txt         # take names from this file
This is a lot slower, but doesn't use subdirs.
My favorite, though -
xargs -I {} split {} -n 5 -d {} < variantlist.txt
The last arg becomes the PREFIX for split instead of the default of x.
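With GNU split's default suffix length of two and the -d flag, chr001.vcf then produces chr001.vcf00 through chr001.vcf04, so every input file gets its own distinctly named chunks and nothing is overwritten.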
EDIT -- with 2 billion lines per file, use this one:
for f in $(<variantlist.txt)
do  split "$f" -d -n 5 "$f" &  # run all in background at the same time
done
When using split, the -n switch determines the number of output files that the original is split into...
If you instead want a fixed number of lines per output file, 5 in your case, use -l:
split -l 5 "$f"

Renaming numbered files using names from a list in another file

I have a folder containing books, and a file with the real name of each one. I renamed the books so that I can easily see whether they are ordered, say "00.pdf", "01.pdf", and so on.
I want to know if there is a way, using the shell, to match each line of the file, say "names", with each book. That is, match line i of the file with the book in position i in sort order.
<name-of-the-book-in-the-1-line> -> <book-in-the-1-position>
<name-of-the-book-in-the-2-line> -> <book-in-the-2-position>
.
.
.
<name-of-the-book-in-the-i-line> -> <book-in-the-i-position>
.
.
.
I'm doing this in Windows, using Total Commander, but I want to do it in Ubuntu, so I don't have to reboot.
I know about mv and rename, but I'm not as good as I'd like to be with regular expressions...
renamer.sh:
#!/bin/bash
for i in `ls -v | grep -Ev '(renamer.sh|names.txt)'`; do
    read name
    mv "$i" "$name.pdf"
    echo "$i" renamed to "$name.pdf"
done < names.txt
names.txt: (the line count must exactly equal the number of numbered files)
name of first book
second-great-book
...
explanation:
ls -v returns a naturally sorted file list
grep excludes this script's name and the input file so they are not renamed
we cycle through the found file names, read a value from names.txt, and rename each target file to that value
For testing purposes, you can comment out the mv command:
#mv "$i" "$name.pdf"
And now, simply run the script:
bash renamer.sh
This loops through names.txt, creates a filename based on a counter (padding to two digits with printf, assigning to a variable using -v), then renames using mv. ((++i)) increases the counter for the next filename.
#!/bin/bash
i=0
while IFS= read -r line; do
    printf -v fname "%02d.pdf" "$i"
    mv "$fname" "$line"
    ((++i))
done < names.txt
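As with the first script, you can dry-run this one (my suggestion, not part of the original answer) by prefixing the mv with echo, so it only prints the planned renames:
echo mv "$fname" "$line"   # print instead of renaming; drop the echo when satisfied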

print recursively the number of files in folders

I struggled for hours to get this ugly line to work:
wcr() { find "$@" -type d | while read F; do find $F -maxdepth 0 && printf "%5d " $(ls $F | wc) && printf "$F"; done; echo; }
Here is the result
39 41 754 ../matlab.sh
1 1 19 ./matlab.sh./micmac
1 1 14 ./micmac
My first question is: how can I write it smarter?
Second question: I would like the names printed before the counts, but I don't know how to tabulate the outputs, so I cannot do better than this:
.
./matlab.sh 1 1 19
./matlab.sh./micmac 1 1 19
./micmac 1 1 14
I don't see what that find $F -maxdepth 0 is supposed to accomplish, so I would just strip it.
Also, if a filename contains a %, you are in trouble if you use it as the format string to printf, so I'd add an explicit format string. And I combined the two printfs. To switch the columns (see also below for more on this topic), just switch the arguments and adapt the format string accordingly.
You should use double quotes around variable expansions ("$F" instead of $F) to avoid problems with filenames containing spaces or other special characters.
Then, if a filename starts with spaces, read would strip them, rendering the resulting variable useless. To avoid that, set IFS to an empty string for the duration of the read.
To get only the number of directory entries, use option -l for wc so that it counts only lines (and not also words and characters).
Use option --sort=none to ls to speed up things by avoiding useless sorting.
Use option -b to ls to escape newline characters in file names and thus avoid breaking of counting.
Indent your code properly if you want others to read it.
This is the result:
wcr() {
    find "$@" -type d | while IFS='' read F
    do
        printf "%5d %s\n" "$(ls --sort=none -b "$F" | wc -l)" "$F"
    done
    echo
}
I'd object to switching the columns. The potentially widest column should be at the end (in this case the path to the file). Otherwise you will have to live with unreadable output. But if you really want to do this, you'd have to do two passes: One to determine the longest entry and a second to format the output accordingly.
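Here is a minimal two-pass sketch of that idea (wcr2 is a hypothetical name; assumes GNU ls/find and filenames without embedded newlines):
wcr2() {
    local w=0 F
    # pass 1: find the length of the longest directory path
    while IFS='' read -r F; do
        (( ${#F} > w )) && w=${#F}
    done < <(find "$@" -type d)
    # pass 2: print each path left-justified to that width, then its entry count
    while IFS='' read -r F; do
        printf "%-${w}s %5d\n" "$F" "$(ls --sort=none -b "$F" | wc -l)"
    done < <(find "$@" -type d)
}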
for i in $(find . -type d); do
    printf '%-10s %s\n' "$(ls "$i" | wc -l)" "$i"
done
You probably could pre-process the output and use column to make some fancier output with whatever order, but since the path can get big, doing this is probably simpler.

Bash script pdftk merge PDFs

I have a few thousand PDFs that I need merged based on filename.
Named like:
Lastname, Firstname_12345.pdf
Instead of overwriting or appending, our software appends a number/datetime to the pdf if there are additional pages like:
Lastname, Firstname_12345_201305160953344627.pdf
For all the ones that don't have a second (or third) pdf, the script doesn't need to do anything. But all the ones that have multiples need to be merged into a new file *_merged.pdf and the originals deleted.
I gave this my best effort and this is what I have so far.
#!/bin/bash
# list all pdfs to show shortest name first
LIST=$(ls -r *.pdf)
# Remove .pdf extension. merge pdfs. delete originals.
for x in "$LIST"; do
    y=${x%%.*}
    pdftk "$y"*.pdf cat output "$y"_merged.pdf
    find "$y"*.pdf -type f ! -iname "*_merged.pdf" -delete
done
This script works to a certain extent. It will merge and delete the originals, but it doesn't have anything in it to skip ones that don't need anything appended to them, and when I run it in a folder with several test files it stops after one file. Can anyone point me in the right direction?
Since your file names contain spaces the for loop won't work as is.
Once you have a list of file names, use a test on the number of files matching "$y"*.pdf to determine whether you need to merge the pdfs.
#!/bin/bash
LIST=( * )
# Remove .pdf extension. merge pdfs. delete originals.
for x in "${LIST[@]}" ; do
    y=${x%%.pdf}
    if [ "$(ls "$y"*.pdf 2>/dev/null | wc -l)" -gt 1 ]; then
        pdftk "$y"*.pdf cat output "$y"_merged.pdf
        find "$y"*.pdf -type f ! -iname "*_merged.pdf" -delete
    fi
done
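Before letting it delete anything, a cautious dry run (my suggestion, not part of the answer above) is to echo the commands instead of executing them:
#!/bin/bash
# Dry run: print what would be merged and deleted, without touching anything.
for x in * ; do
    y=${x%%.pdf}
    if [ "$(ls "$y"*.pdf 2>/dev/null | wc -l)" -gt 1 ]; then
        echo pdftk "$y"*.pdf cat output "$y"_merged.pdf
        echo find "$y"*.pdf -type f ! -iname "*_merged.pdf" -delete
    fi
done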

Grabbing every 4th file

I have 16,000 jpgs from a webcam screen grabber that I let run for a year pointing into the back yard. I want to find a way to grab every 4th image so that I can put them into another directory and later turn them into a movie. Is there a simple bash script or other way under linux to do this?
They are named like so......
frame-44558.jpg
frame-44559.jpg
frame-44560.jpg
frame-44561.jpg
Thanks from a newb needing help.
Seems to have worked.
A couple of errors in my original post: there were actually 280,000 images, and the naming was:
/home/baldy/Desktop/webcamimages/webcam_2007-05-29_163405.jpg
/home/baldy/Desktop/webcamimages/webcam_2007-05-29_163505.jpg
/home/baldy/Desktop/webcamimages/webcam_2007-05-29_163605.jpg
I ran:
cp $(ls | awk '{nr++; if (nr % 10 == 0) print $0}') ../newdirectory/
Which appears to have copied the images. 70-900 per day from the looks of it.
Now I'm running
mencoder mf://*.jpg -mf w=640:h=480:fps=30:type=jpg -ovc lavc -lavcopts vcodec=msmpeg4v2 -nosound -o ../output-msmpeg4v2.avi
I'll let you know how the movie works out.
UPDATE: Movie did not work.
Only has images from 2007 in it even though the directory has 2008 as well.
webcam_2008-02-17_101403.jpg webcam_2008-03-27_192205.jpg
webcam_2008-02-17_102403.jpg webcam_2008-03-27_193205.jpg
webcam_2008-02-17_103403.jpg webcam_2008-03-27_194205.jpg
webcam_2008-02-17_104403.jpg webcam_2008-03-27_195205.jpg
How can I modify my mencoder line so that it uses all the images?
One simple way is:
$ touch a b c d e f g h i j k l m n o p q r s t u v w x y z   # create 26 empty test files to demonstrate
$ mv $(ls | awk '{nr++; if (nr % 4 == 0) print $0}') destdir
Create a script move.sh which contains this:
#!/bin/sh
mv "$4" ../newdirectory/
Make it executable and then do this in the folder:
ls *.jpg | xargs -n 4 ./move.sh
This takes the list of filenames, passes four at a time into move.sh, which then ignores the first three and moves the fourth into a new folder.
This will work even if the numbers are not exactly in sequence (e.g. if some frame numbers are missing, then using mod 4 arithmetic won't work).
As suggested, you should use
seq -f 'frame-%g.jpg' 1 4 number-of-frames
to generate the list of filenames since 'ls' will fail on 280k files. So the final solution would be something like:
for f in `seq -f 'frame-%g.jpg' 1 4 number-of-frames` ; do
    mv "$f" destdir/
done
seq -f 'frame-%g.jpg' 1 4 number-of-frames
…will print the names of the files you need.
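For example, with a hypothetical count of 13 frames, seq -f 'frame-%g.jpg' 1 4 13 prints:
frame-1.jpg
frame-5.jpg
frame-9.jpg
frame-13.jpg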
An easy way in perl (probably easily adaptable to bash) is to glob the filenames into an array, then take the sequence number and remove those that are not divisible by 4.
Something like this will print the files you need:
ls -1 /path/to/files/ | perl -e 'while (<STDIN>) {($seq)=/(\d*)\.jpg$/; print $_ if $seq && $seq % 4 ==0}'
You can replace the print by a move...
This will work if the files are numbered in sequence, even if the number of digits is not constant (like file_9.jpg followed by file_10.jpg).
Given masto's caveats about sorting:
ls | sed -n '1~4 p' | xargs -i mv {} ../destdir/
The thing I like about this solution is that everything's doing what it was designed to do, so it feels unixy to me.
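(The 1~4 address is GNU sed's first~step form: select line 1 and then every 4th line after it.)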
Just iterate over a list of files:
files=( frame-*.jpg )
i=0
while [[ $i -lt ${#files[@]} ]] ; do
    cur_file=${files[$i]}
    mungle_frame "$cur_file"
    i=$(( i + 4 ))
done
This is pretty cheesy, but it should get the job done. Assuming you're currently cd'd into the directory containing all of your files:
mkdir ../outdir
ls | sort -n | while read fname; do mv "$fname" ../outdir/; read; read; read; done
The sort -n is there assuming your filenames don't all have the same number of digits; otherwise ls will sort in lexical order where frame-123.jpg comes before frame-4.jpg and I don't think that's what you want.
Please be careful, back up your files before trying my solution, etc. I don't want to be responsible for you losing a year's worth of data.
Note that this solution does handle files with spaces in the name, unlike most of the others. I know that wasn't part of the sample filenames, but it's easy to write shell commands that don't handle spaces safely, so I wanted to do that in this example.
Brace expansion {m..n..s} is more efficient than seq, and it allows a bit of output formatting:
$ echo {0000..0010..2}
0000 0002 0004 0006 0008 0010
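(Both the zero-padded endpoints and the ..step increment need bash 4 or later.)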
Postscript: with curl, if you only want every fourth (nth) numbered image, you can give curl a step counter too. This example range goes from 0 to 100 with an increment of 4 (n):
curl -O "http://example.com/[0-100:4].png"
