Incrementing number in filenames in bash

I'm trying to take a list of files and rename them, incrementing a number in their filenames. The directory contains a bunch of files named like:
senreg1.csv senreg2.csv senreg10.csv
senreg1.csv.1 senreg2.csv.1 senreg10.csv.1
senreg1.csv.2 senreg2.csv.2 senreg10.csv.2
senreg1.csv.3 senreg2.csv.3 ... senreg10.csv.3
senreg1.csv.4 senreg2.csv.4 senreg10.csv.4
... ... ...
senreg1.csv.10 senreg2.csv.10 senreg10.csv.10
senreg1.csv.11 senreg2.csv.11 senreg10.csv.11
I want to increment all of the files that end in 3 or higher so I can insert a new file with suffix 3, so I made a text file called 'renames.txt' containing all the filenames that I want to rename. Then, I tried using a for loop to do the actual renaming.
for f in `cat renames.txt`
do
newfile=`echo $f | awk 'BEGIN { FS = "."}; { printf $1 "." $2 "." $3+1 }'`
mv "$f" "$newfile"
done
I want to end up with something like:
senreg1.csv senreg2.csv senreg10.csv
senreg1.csv.1 senreg2.csv.1 senreg10.csv.1
senreg1.csv.2 senreg2.csv.2 senreg10.csv.2
senreg1.csv.4 senreg2.csv.4 ... senreg10.csv.4
senreg1.csv.5 senreg2.csv.5 senreg10.csv.5
... ... ...
senreg1.csv.11 senreg2.csv.11 senreg10.csv.11
senreg1.csv.12 senreg2.csv.12 senreg10.csv.12
But instead I get:
senreg1.csv senreg2.csv senreg10.csv
senreg1.csv.1 senreg2.csv.1 senreg10.csv.1
senreg1.csv.2 senreg2.csv.2 ... senreg10.csv.2
senreg1.csv.12 senreg2.csv.12 senreg10.csv.12
The contents of senregX.csv.12 are the same as the original senregX.csv.3. Hope this explanation made sense. Anybody know what's going on here?

You need to rename the files in reverse.
11 -> 12
10 -> 11
9 -> 10
and so on.
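A minimal sketch of that reverse-order rename, assuming the names follow the `BASE.N` pattern from the question. The function name `shift_suffixes` and its two arguments are hypothetical, chosen only for illustration:

```shell
#!/usr/bin/env bash
# shift_suffixes BASE START (hypothetical helper): rename BASE.N to BASE.(N+1)
# for every N >= START, highest N first, so no existing file is overwritten.
shift_suffixes() {
    local base=$1 start=$2 f n max=0
    # Find the highest numeric suffix currently in use.
    for f in "$base".*; do
        n=${f##*.}
        [[ $n =~ ^[0-9]+$ ]] || continue
        if (( 10#$n > max )); then max=$((10#$n)); fi
    done
    # Walk downwards: max -> max+1, ..., START -> START+1.
    for (( n = max; n >= start; n-- )); do
        if [[ -e $base.$n ]]; then
            mv -- "$base.$n" "$base.$((n + 1))"
        fi
    done
}
```

With the files from the question it could be called as `for f in senreg*.csv; do shift_suffixes "$f" 3; done`, which frees up suffix 3 for each base file without needing renames.txt.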

This script does what you want. Instead of renaming in reverse, it stages the renamed copies under a temporary "_" prefix. I'm posting it only for the sake of having a variety of solutions:
#!/bin/bash
for file in *[0-9] # files ending with a digit
do
    # split "name.id" at the last dot; skip names without a numeric suffix
    name=${file%.*}
    id=${file##*.}
    [[ $id =~ ^[0-9]+$ ]] || continue
    if (( 10#$id >= 3 ))
    then
        # Copy to a "_"-prefixed name first, because renaming in place could overwrite files
        cp -- "$file" "_$name.$((10#$id + 1))"
    fi
done
# remove old files
for file in [!_]*[0-9]
do
    id=${file##*.}
    [[ $id =~ ^[0-9]+$ ]] || continue
    if (( 10#$id >= 3 ))
    then
        rm -- "$file"
    fi
done
# finish: strip only the leading "_" from the staged copies
for file in _*[0-9]
do
    [[ -e $file ]] || continue
    mv -- "$file" "${file#_}"
done

Related

Rename files matching pattern in a loop - Bash

I have been trying to rename some specific files based on a table, but with no success. It either renames all files or gives an error.
The directory contains hundreds of files named with long barcodes and I want to rename only files containing the pattern _1_.
Example
barcode_1_barcode_SL484171.fastq.gz barcode_2_barcode_SL484171.fastq.gz barcode_1_barcode_SL484370.fastq.gz barcode_2_barcode_SL484370.fastq.gz
mytable.txt
oldname                    newname
barcode_1_barcode_SL484171 Description1
barcode_2_barcode_SL484171 Description1
barcode_1_barcode_SL484370 Description2
barcode_2_barcode_SL484370 Description2
Desired output:
Description1.R1.fastq.gz Description2.R1.fastq.gz
As you can see in the table there are two files per description but I only want to rename the ones with the _1_ pattern.
Code I have tried:
for i in *_1_*.fastq.gz; do read oldname newname; mv "$oldname" "$newname".R1.fastq.gz; done < mytable.txt
for i in $(grep '_1_' mytable.txt); do read -r oldname newname; mv ${oldname} ${newname}.R1.fastq.gz; done < mytable.txt
for i in $(grep '_1_' mytable.txt); do oldname=$(cut -f1 $i);newname=$(cut -f2 $i); ln -s ${oldname} ${newname}.R1.fastq.gz; done
while read -r oldname newname
do
if [[ $oldname =~ "_1_" ]]
then
mv $oldname $newname
fi
done < mytable.txt
Something like this.
#!/usr/bin/env bash
while IFS= read -r files; do ##: Outer loop: each file found by `find` whose name contains a first-column entry of table.txt
    while read -ru9 old_name prefix; do ##: Inner loop: each `barcode_1_barcode` line of table.txt, split into old name and new prefix
        if [[ $files == *"$old_name"* ]]; then ##: If the found filename contains the first field of table.txt (space delimited)
            old_filename="${files%.fastq.gz}" ##: The filename without the .fastq.gz extension
            extension="${files#"$old_filename"}" ##: The .fastq.gz extension without the filename
            # mv -v "$files" "$prefix.R1${extension}"
            printf '%s %s %s ==> %s\n' mv -v "$files" "$prefix.R1${extension}" ##: Dry run: print the rename that mv would perform
        fi
    done 9< <(grep 'barcode_1_barcode.*' table.txt)
done < <(find . -name 'barcode_1_barcode*.gz' | grep -f <(cut -d' ' -f1 table.txt) ) ##: Keep only the files that match the first column/field of table.txt
Output from the OP's sample data/files.
renamed './barcode_1_barcode_SL484370.fastq.gz' -> 'Description2.R1.fastq.gz'
renamed './barcode_1_barcode_SL484171.fastq.gz' -> 'Description1.R1.fastq.gz'
If you're satisfied with the output either move the # from the front of mv to the
front of printf or just delete the entire line with printf and remove the # from
mv in order for mv to actually rename the files.
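For comparison, a shorter sketch that reads the table directly instead of combining find and grep. It assumes the table has two whitespace-separated columns with a header row, and that each file is named exactly `oldname.fastq.gz`. The function name `rename_r1` is hypothetical:

```shell
#!/usr/bin/env bash
# rename_r1 TABLE (hypothetical helper): for every row of TABLE whose old name
# contains "_1_", rename OLD.fastq.gz to NEW.R1.fastq.gz.
# Assumes two whitespace-separated columns and a header row "oldname newname".
rename_r1() {
    local table=$1 oldname newname
    while read -r oldname newname; do
        [[ $oldname == *_1_* ]] || continue       # only the _1_ rows (this also skips the header)
        [[ -e $oldname.fastq.gz ]] || continue    # skip rows with no matching file
        mv -v -- "$oldname.fastq.gz" "$newname.R1.fastq.gz"
    done < "$table"
}
```

Run from the directory that holds the files, for example `rename_r1 mytable.txt`. Replacing `mv -v` with `echo mv` gives a dry run first.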

Sort files in a directory before renaming them

I am learning Bash and therefore I would like to write a script which runs over my files and names them after the current directory.
E.g. current_folder_1, current_folder_2, current_folder_3...
#!/bin/bash
# script to rename images, sorted by "type" and "date modified" and named by current folder
#get current folder which is also basename of files
basename=$(basename "$PWD");
echo "Current folder is: ${basename}";
echo '';
#set counter for iteration and variables
counter=1;
new_name="";
file_extension="";
#for each file in current folder
for f in *
do
#catch file name
echo "Current file is: ${f}"
#catch file extension
file_extension="${f##*.}";
echo "Current file extension is: ${file_extension}"
#create new name
new_name="${basename}_${counter}.${file_extension}"
echo "New name is: ${new_name}";
#mv $f "${new_name}";
echo "Counter is: ${counter}"
((counter++));
done
One of my two problems is I would like to sort them by first type and then date_modified before running the for-each-loop.
Something like
for f in * | sort -k "type" -k "date_modified"
[...]
I'd appreciate some help.
EDIT1: Solved the sorting by date problem with
for f in $(ls -1 -t -r)
This could be a start. I've commented in the code where I think it's needed but please ask if anything is unclear.
#!/bin/bash
#get current folder which is also basename of files
folder=$(basename "$PWD");
echo "Current folder is: ${folder}";
echo
#set counter for iteration
counter=1;
for f in *
do
file_extension="${f##*.}";
# replace all whitespaces with underscores
sort_extension=${file_extension//[[:space:]]/_}
# get modification time in seconds since epoch
sort_modtime=$(stat --format=%Y "$f")
# output fed to sort
echo $sort_extension $sort_modtime "/$f/"
# sort on extension first and modification time after
done | sort -k1,1 -k2,2n | while read -r dummy1 dummy2 file # -k1,1 limits the first key to the extension field
do
# remove the slashes we added above
file=${file:1:-1}
file_extension="${file##*.}";
new_name="${folder}_${counter}.${file_extension}"
echo "moving \"$file\" to \"$new_name\""
#mv "$file" "$new_name"
(( counter++ ))
done
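The same extension-then-time ordering can be sketched with tab-separated fields, which avoids wrapping the file name in slashes. This assumes GNU stat (as the script above does) and file names without tabs or newlines. The function name `list_by_ext_then_mtime` is hypothetical:

```shell
#!/usr/bin/env bash
# list_by_ext_then_mtime (hypothetical helper): print the regular files of the
# current directory ordered by extension, then by modification time, oldest first.
# Assumes GNU stat and file names that contain no tabs or newlines.
list_by_ext_then_mtime() {
    local f
    for f in *; do
        [[ -f $f ]] || continue
        # extension <TAB> mtime in seconds since epoch <TAB> file name
        printf '%s\t%s\t%s\n' "${f##*.}" "$(stat --format=%Y "$f")" "$f"
    done | sort -t $'\t' -k1,1 -k2,2n | cut -f3-
}

# Possible use (prints the planned renames only):
# counter=1
# list_by_ext_then_mtime | while IFS= read -r file; do
#     echo "would move \"$file\" to \"$(basename "$PWD")_${counter}.${file##*.}\""
#     (( counter++ ))
# done
```

Because the fields are separated by tabs, file names containing spaces pass through `read` intact.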
I decided to do a workaround with two loops, one for images and one for video.
#used for regex
shopt -s extglob
#used to get blanks in filenames
SAVEIFS=$IFS;
IFS=$(echo -en "\n\b");
[...]
image_extensions="(*.JPG|*.jpg|*.PNG|*.png|*.JPEG|*.jpeg)"
video_extensions="(*.mp4|*.gif)"
[...]
for f in $(ls -1 -t -r *${image_extensions})
[...]
for f in $(ls -1 -t -r *${video_extensions})
[...]
#restore IFS to its previous value
IFS=$SAVEIFS;
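ls -lt | sort
The -t parameter sorts by modification date and time; sort then orders the resulting lines by their leading file-type and permission column. Note that sort compares whole lines, so the date order produced by -t is not preserved.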

Add filename of each file as a separator row when merging into a single file Bash Script

I have the current script which combines all the CSV files in a folder into a single CSV file and it works great. I need to add functionality to add the filename of the original csv's as a header row for each data block so I know which section is which.
Can someone assist, as this is not my strong point and I am in over my head?
#!/bin/bash
OutFileName="./Data/all/all.csv" # Fix the output name
i=0 # Reset a counter
for filename in ./Data/all/*.csv; do
if [ "$filename" != "$OutFileName" ] ; # Avoid recursion
then
if [[ $i -eq 0 ]] ; then
head -1 $filename > $OutFileName # Copy header if it is the first file
fi
tail -n +2 $filename >> $OutFileName # Append from the 2nd line each file
i=$(( $i + 1 )) # Increase the counter
fi
done
I will be automating this with a Run Shell Script action in Apple Automator.
Thank you for any help.
This is an example of one of the imported files and of the output:
[image: example of current input file]
Once combined, I need the filename to appear where the headers are.
When you want to generate something like ...
Header1,Header2,Header3
file1.csv
a,b,c
x,y,z
file2.csv
1,2,3
9,9,9
file3.csv
...
... then you just have to insert an echo "$filename" >> "$OutFileName" in front of the tail command. Here is an updated version of your script with some minor improvements.
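#!/bin/bash
out="./Data/all/all.csv"
i=0
rm -f "$out"
for file in ./Data/all/*.csv; do
    # The redirection below creates $out before the glob expands, so skip it
    [[ $file == "$out" ]] && continue
    (( i++ == 0 )) && head -n 1 "$file"
    echo "$file"
    tail -n +2 "$file"
done > "$out"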
There is no concept of "header line" other than the first line of the CSV file. What you can do is add a new column.
I've switched to Awk because it simplifies the script considerably. Your original would be literally a one-liner.
awk -F , 'NR==1 { OFS=FS; $(NF+1) = "Filename" }
FNR==1 && NR>1 { next } # skip the repeated header of later files
FNR>1 { $(NF+1) = FILENAME }1' all/*.csv >all.csv
Not saving the output in the same directory as the inputs removes the pesky corner case handling.

Find files which share part of a filename

In my current directory there are many files. Some of the files share part of their filename.
e.g.:
XGAE_537493_GSR.FITS
TGFE_537493_RRF.FITS
EGRE_537497_HDR.FITS
TRTE_537497_YUH.FITS
TRXX_537499_YDF.FITS
.
.
Files 1 & 2 would be a match, as would files 3 & 4. File 5 has no match. Therefore, files 1,2,3 and 4 would be moved.
I want to move the files which share part of their filename, in order to separate them from the ones that don't.
I was attempting to do this using bash. I googled but couldn't locate websites that were quite describing the process I need. So far in pseudo-code I have:
FOR F IN *
IF ${FILE:5:10} MATCHES ANY OTHER ${FILE:5:10}
MOVE ALL MATCHES TO ANOTHER DIRECTORY
Any information to help me move in the right direction would be appreciated.
Try this:
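for f in ./*.FITS ; do
    [ -e "$f" ] || continue # already moved in an earlier iteration
    # the shared part is the second underscore-separated field
    middleBit=$(echo "$f" | cut -d'_' -f 2)
    count=$(ls ./*_"$middleBit"_*.FITS | wc -l)
    # a match means at least two files share the same middle part
    if [ "$count" -ge 2 ]
    then
        for match in ./*_"$middleBit"_*.FITS ; do
            mv "$match" ./somewhere
        done
    fi
done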
Using associative array in BASH 4 you can do it easily:
#!/bin/bash
declare -A arr
for f in *.FITS; do
k="${f:5:6}"
[[ ${arr[$k]} ]] && mv "$f" /dest/ || arr["$k"]=1
done
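Note that a single pass like the one above only moves the second and later files of each group, because the first file is seen before any match is known. A two-pass sketch that moves every member of a group, assuming the same fixed layout (ID in characters 5 to 10) and Bash 4 or later. The function name `move_shared_ids` is hypothetical:

```shell
#!/usr/bin/env bash
# move_shared_ids DEST (hypothetical helper, needs Bash 4+ for associative arrays):
# move every *.FITS file in the current directory whose 6-character ID
# (${f:5:6}, as in the loop above) occurs in more than one file name.
move_shared_ids() {
    local dest=$1 f k
    local -A count=()
    # Pass 1: count how many files carry each ID.
    for f in *.FITS; do
        [[ -e $f ]] || continue
        k=${f:5:6}
        count[$k]=$(( ${count[$k]:-0} + 1 ))
    done
    # Pass 2: move every file whose ID was seen at least twice.
    for f in *.FITS; do
        [[ -e $f ]] || continue
        k=${f:5:6}
        if (( ${count[$k]:-0} > 1 )); then
            mv -- "$f" "$dest"/
        fi
    done
}
```

The destination directory must already exist, for example `mkdir -p matched && move_shared_ids matched`.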
If your file name structure is fixed, you can scan the names and find duplicates in a sub-field of the file name with awk. ORS joins the names so that each one is printed on its own line.
For example:
$ ls -1 | awk -F_ 'NF==3{f[$2]=(a[$2]++?f[$2] ORS $0:$0)}
END{for(k in f) if(a[k]>1) print f[k]} '
TGFE_537493_RRF.FITS
XGAE_537493_GSR.FITS
you can then pipe the results to a cp command
$ ... | xargs -I file cp file file.DUP
adds suffix DUP to duplicate file names, or
$ ... | xargs -I file mv file anotherlocation/
moves to anotherlocation.

(Bash) rename files but give it a new extension that will count up.. (md5sum)

I need to rename all files in a folder and give it a new file extension. I know how I can rename files with bash. The problem I have is, I need to rename it to:
file.01 file.02 file.03 and counting up for all files found.
Can somebody provide me an example where to start?
This is what I need:
md5sum * | sed 's/^\(\w*\)\s*\(.*\)/\2 \1/' | while read LINE; do
mv $LINE
done
but that doesn't give it an extension that goes from file.01 to file.02, file.03, etc.
If one reads your requirements literally...
counter=1 # the question asks for numbering that starts at .01
for file in *; do
read sum _ <<<"$(md5sum "$file")"
printf -v file_new "%s.%02d" "$sum" "$counter"
mv -- "$file" "$file_new"
(( counter++ ))
done
This is less efficient than reading the filenames from md5sum's output, but more reliable, as globbing handles files with unusual names (newlines, special characters, etc) safely.
Something like this:
i=0
for f in *
do
    if [ -f "$f" ]; then
        i=`expr $i + 1`
        # pad single-digit counters with a leading zero
        if [ $i -lt 10 ]; then
            i=0$i
        fi
        sum=`md5sum "$f" | cut -d ' ' -f 1`
        mv -- "$f" "$sum.$i"
    fi
done
