create and rename multiple copies of files - bash

I have a file input.txt that looks as follows.
abas_1.txt
abas_2.txt
abas_3.txt
1fgh.txt
3ghl_1.txt
3ghl_2.txt
I have a folder ff. The filenames of this folder are abas.txt, 1fgh.txt, 3ghl.txt. Based on the input file, I would like to create and rename the multiple copies in ff folder.
For example in the input file, abas has three copies. In the ff folder, I need to create the three copies of abas.txt and rename it as abas_1.txt, abas_2.txt, abas_3.txt. No need to copy and rename 1fgh.txt in ff folder.
Your valuable suggestions would be appreciated.

You can try something like this (to be run from within your folder ff):
#!/bin/bash
while IFS= read -r fn; do
[[ $fn =~ ^(.+)_[[:digit:]]+\.([^\.]+)$ ]] || continue
fn_orig=${BASH_REMATCH[1]}.${BASH_REMATCH[2]}
echo cp -nv -- "$fn_orig" "$fn"
done < input.txt
Remove the echo if you're happy with it.
If you don't want to run from within the folder ff, just replace the line
echo cp -nv -- "$fn_orig" "$fn"
with
echo cp -nv -- "ff/$fn_orig" "ff/$fn"
The -n option tells cp not to overwrite existing files, and the -v option makes it verbose. The -- tells cp that there are no more options beyond this point, so that it will not be confused if one of the file names starts with a hyphen.
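To see how the regular expression splits each name before any file is touched, the matching step can be run on its own. This is only a minimal sketch: the helper name orig_name is made up for illustration and is not part of the answer above.

```bash
#!/usr/bin/env bash
# orig_name is a hypothetical helper: it prints the source file name that a
# numbered copy such as abas_2.txt would be created from, and returns 1 when
# the name carries no _<digits> suffix before the extension.
orig_name() {
    local fn=$1
    [[ $fn =~ ^(.+)_[[:digit:]]+\.([^.]+)$ ]] || return 1
    printf '%s.%s\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}"
}

for fn in abas_1.txt 3ghl_2.txt 1fgh.txt; do
    if src=$(orig_name "$fn"); then
        echo "$fn <- $src"
    else
        echo "$fn skipped"
    fi
done
```

Because (.+) is greedy, only the last _<digits> group is stripped, so a name like a_b_12.txt maps back to a_b.txt.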

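Using for and grep (run from within ff, with input.txt in the same folder):
#!/bin/bash
for i in *
do
# base name without extension, followed by an underscore, e.g. abas_
x="${i%.*}_"
for j in $(grep -- "^$x" input.txt)
do
cp -n -- "$i" "$j"
done
done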

Try this one
#!/bin/bash
while read newFileName;do
#split the string by _ delimiter
arr=(${newFileName//_/ })
extension="${newFileName##*.}"
fileToCopy="${arr[0]}.$extension"
#check for empty : '1fgh.txt' case
if [ -n "${arr[1]}" ]; then
#check if file exists
if [ -f "$fileToCopy" ]; then
echo "copying $fileToCopy -> $newFileName"
cp "$fileToCopy" "$newFileName"
#else
# echo "File $fileToCopy does not exist, so it can't be copied"
fi
fi
done
You can call your script like this:
cat input.txt | ./script.sh

If you could change the format of input.txt, I suggest you adjust it in order to make your task easier. If not, here is my solution:
#!/bin/bash
SRC_DIR=/path/to/ff
INPUT=/path/to/input.txt
BACKUP_DIR=/path/to/backup
for cand in $(ls "$SRC_DIR"); do
grep "^${cand%.*}_" "$INPUT" | while IFS= read -r new
do
cp -fv -- "$SRC_DIR/$cand" "$BACKUP_DIR/$new"
done
done

Related

Delete empty files - Improve performance of logic

I need to find and remove empty files. In my use case, an empty file is one that has zero lines.
I did try testing the file to see if it's empty. However, this behaves strangely: even though the file is empty, the test doesn't detect it as such.
Hence, the best thing I could write is the script below, which is way too slow given that it has to test several hundred thousand files.
#!/bin/bash
LOOKUP_DIR="/path/to/source/directory"
cd ${LOOKUP_DIR} || { echo "cd failed"; exit 0; }
for fname in $(realpath */*)
do
if [[ $(wc -l "${fname}" | awk '{print $1}') -eq 0 ]]
then
echo "${fname}" is empty
rm -f "${fname}"
fi
done
Is there a better way to do what I'm after or alternatively, can the above logic be re-written in a way that brings better performance please?
Your script is slow because wc reads every file to the end, which is not needed for your purpose. This might be what you're looking for:
#!/bin/bash
lookup_dir='/path/to/source/directory'
cd "$lookup_dir" || exit
for file in *; do
if [[ -f "$file" && -r "$file" && ! -L "$file" ]]; then
read < "$file" || echo rm -f -- "$file"
fi
done
Drop the echo after making sure it works as intended.
Another version, calling the rm only once, could be:
#!/bin/bash
lookup_dir='/path/to/source/directory'
cd "$lookup_dir" || exit
for file in *; do
if [[ -f "$file" && -r "$file" && ! -L "$file" ]]; then
read < "$file" || files_to_be_deleted+=("$file")
fi
done
rm -f -- "${files_to_be_deleted[@]}"
Explanation:
The core logic is in the line
read < "$file" || rm -f -- "$file"
The read < "$file" command attempts to read a line from the $file. If it succeeds, that is, a line is read, then the rm command on the right-hand side of the || won't be executed (that's how the || works). If it fails then the rm command will be executed. In any case, at most one line will be read. This has great advantage over the wc command because wc would read the whole file.
if ! read < "$file"; then rm -f -- "$file"; fi
could be used instead. The two lines are equivalent.
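The effect of read on different kinds of files can be checked in a scratch directory. The sketch below is illustrative only: the helper has_line and the three sample files are assumptions, not part of the answer. It also shows one subtlety: read returns a non-zero status when it reaches end of file before a newline, so a file holding text without a terminating newline counts as having zero lines, which matches what wc -l reports.

```bash
#!/usr/bin/env bash
# has_line (hypothetical helper) succeeds only if the file contains at least
# one newline-terminated line; at most one line is read.
has_line() {
    local line
    IFS= read -r line < "$1"
}

tmp=$(mktemp -d)
: > "$tmp/empty"                       # zero bytes
printf 'hello\n' > "$tmp/one_line"     # one complete line
printf 'hello' > "$tmp/no_newline"     # data, but no newline

for f in empty one_line no_newline; do
    if has_line "$tmp/$f"; then
        echo "$f: has a line, keep"
    else
        echo "$f: zero lines, would be removed"
    fi
done
rm -rf -- "$tmp"
```

Here only one_line is reported as kept. A size test such as [ -s file ] would also keep no_newline, since it looks at bytes rather than lines.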
To check whether "$fname" exists and is non-empty (has a size greater than zero), use [ -s "$fname" ]:
#!/usr/bin/env sh
LOOKUP_DIR="/path/to/source/directory"
for fname in "$LOOKUP_DIR"/*/*; do
if ! [ -s "$fname" ]; then
echo "${fname}" is empty
# remove echo when output is what you want
echo rm -f "${fname}"
fi
done
See: help test:
File operators:
...
-s FILE True if file exists and is not empty.
Yet another method
wc -l ~/tmp/* 2>/dev/null | awk '$1 == 0 {print $2}' | xargs echo rm
This will break if any of your files have whitespace in the name.
To work around that, with awk still
wc -l ~/tmp/* 2>/dev/null \
| awk 'sub(/^[[:blank:]]+0[[:blank:]]+/, "")' \
| xargs echo rm
This works because the sub function returns the number of substitutions made, which can be treated as a boolean zero/not-zero condition.
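The sub trick can be tried on canned input without touching any files. In this sketch the sample text stands in for wc -l output and the file names are made up; it assumes the count is padded with leading blanks, which is what the pattern requires.

```bash
#!/usr/bin/env bash
# Simulated `wc -l` output (made-up names): padded count, blanks, file name.
sample='      0 empty file.txt
     12 notes.txt
      0 blank.log
     12 total'

# sub() returns 1 when it stripped a leading zero count, so the pattern is
# true only for those lines; the default action then prints what is left,
# which is the file name.
zero_line_files=$(printf '%s\n' "$sample" |
    awk 'sub(/^[[:blank:]]+0[[:blank:]]+/, "")')
printf '%s\n' "$zero_line_files"
```

The awk step keeps a name with spaces on one line, but xargs still splits its input on blanks by default; GNU xargs accepts -d '\n' to split on newlines only.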
Remove the echo to actually delete the files.

Sequentially numbering of files in different folders while keeping the name after the number

I have a lot of ogg or wave files in different folders that I want to sequentially number while keeping everything that stands behind the prefixed number. The input may look like this
Folder1/01 Insbruck.ogg
02 From Milan to Rome.ogg
03 From Rome to Naples.ogg
Folder2/01 From Naples to Palermo.ogg
02 From Palermo to Syracrus.ogg
03 From Syracrus to Tropea
The output should be:
Folder1/01 Insbruck.ogg
02 From Milan to Rome.ogg
03 From Rome to Naples.ogg
Folder2/04 From Naples to Palermo.ogg
05 From Palermo to Syracrus.ogg
06 From Syracrus to Tropea.ogg
The sequential numbering across folders can be done with this BASH script that I found here:
find . | (i=0; while read f; do
let i+=1; mv "$f" "${f%/*}/$(printf %04d "$i").${f##*.}";
done)
But this script removes the title that I would like to keep.
TL;DR
Like this, using Perl's rename:
rename -n 's#/\d+#sprintf "/%0.2d", ++$::c#e' Folder*/*
Drop -n switch if the output looks good.
With -n, you only see the files that will really be renamed, so only 3 files from Folder2.
Going further
The variable $::c (shorthand for $main::c, a package variable) is a little hack that avoids more complex expressions such as:
rename -n 's#/\d+#sprintf "/%0.2d", ++our $c#e' Folder*/*
or
rename -n '{ no strict; s#/\d+#sprintf "/%0.2d", ++$c#e; }' Folder*/*
or
rename -n '
do {
use 5.012;
state $c = 0;
s#/\d+#sprintf "/%0.2d", ++$c#e
}
' Folder*/*
Thanks go|dfish & Grinnz on freenode
A bash script for this job would be:
#!/bin/bash
argc=$#
width=${#argc}
n=0
for src; do
base=$(basename "$src")
dir=$(dirname "$src")
if ! [[ $base =~ ^[0-9]+\ .*\.(ogg|wav)$ ]]; then
echo "$src: Unexpected file name. Skipping..." >&2
continue
fi
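# pass the names as arguments so that a % in a file name is not read as a format specifier
printf -v dest '%s/%0*d %s' "$dir" "$width" "$((++n))" "${base#* }"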
echo "moving '$src' to '$dest'"
# mv -n "$src" "$dest"
done
and could be run as
./renum Folder*/*
assuming the script is saved as renum. It will just print out source and destination file names. To do actual moving, you should drop the # at the beginning of the line # mv -n "$src" "$dest" after making sure it will work as expected. Note that the mv command will not overwrite an existing file due to the -n option. This may or may not be desirable. The script will print out a warning message and skip unexpected file names, that is, the file names not fitting the pattern specified in the question.
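Two pieces do the work in this script: the parameter expansion ${base#* }, which strips the old number up to the first space, and printf -v, which builds the new name with a zero-padded counter. Below is a small dry run on made-up paths. new_name is a hypothetical helper, and the width is fixed at 2 purely for illustration, whereas the script derives it from the argument count.

```bash
#!/usr/bin/env bash
# new_name N WIDTH PATH: print PATH with its leading number replaced by N,
# zero-padded to WIDTH digits. Nothing is renamed.
new_name() {
    local n=$1 width=$2 src=$3 base dir dest
    base=${src##*/}          # file name without the directory
    dir=${src%/*}            # directory part
    # Names are passed as arguments, so a % in a title is not treated
    # as a printf format specifier.
    printf -v dest '%s/%0*d %s' "$dir" "$width" "$n" "${base#* }"
    printf '%s\n' "$dest"
}

n=3   # pretend three files were already numbered in Folder1
for src in "Folder2/01 From Naples to Palermo.ogg" \
           "Folder2/02 From Palermo to Syracrus.ogg"; do
    n=$((n + 1))   # increment here, not inside $(...), which runs in a subshell
    echo "$src -> $(new_name "$n" 2 "$src")"
done
```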
This is not as robust as the accepted answer, but it is an improved version of your script, in case rename is not available.
#!/usr/bin/env bash
[[ -n $1 ]] || {
printf >&2 'Needs a directory as an argument!\n'
exit 1
}
n=1
directory=("$@")
while IFS= read -r files; do
if [[ $files =~ ^(.+)?\/([[:digit:]]+[^[:blank:]]+)(.+)$ ]]; then
printf -v int '%02d' "$((n++))"
[[ -e "${BASH_REMATCH[1]}/$int${BASH_REMATCH[3]}" ]] && {
printf '%s is already in sequential order, skipping!\n' "$files"
continue
}
echo mv -v "$files" "${BASH_REMATCH[1]}/$int${BASH_REMATCH[3]}"
fi
done < <(find "${directory[@]}" -type f | sort )
Now run the script with the directory in question as the argument.
./myscript Folder*/
or
./myscript Folder1/
or
./myscript Folder2/
or with a dot, which stands for the current directory:
./myscript .
and so on...
Remove the echo if you're satisfied with the output.

Renaming Multiples Files To delete first portion of name

I have a list of files like so :
10_I_am_here_001.jpg
20_I_am_here_003.jpg
30_I_am_here_008.jpg
40_I_am_here_004.jpg
50_I_am_here_009.jpg
60_I_am_here_002.jpg
70_I_am_here_005.jpg
80_I_am_here_006.jpg
How can I rename all the files in a directory, so that I can drop ^[0-9]+_ from the filename ?
Thank you
Using pure BASH:
s='10_I_am_here_001.jpg'
echo "${s#[0-9]*_}"
I_am_here_001.jpg
You can then write a simple for loop in that directory like this:
for s in *; do
f="${s#[0-9]*_}" && mv "$s" "$f"
done
Using rename :
rename 's/^[0-9]+_//' *
Here's another bash idea, based on files ending in .jpg as shown above (or whatever extension you have).
VonBell
#!/bin/bash
ls *.jpg |\
while read FileName
do
NewName=$(echo "$FileName" | cut -f2- -d "_")
mv -- "$FileName" "$NewName"
done
With bash extended globbing
shopt -s extglob
for f in *
do
[[ $f == +([0-9])_*.jpg ]] && mv "$f" "${f#+([0-9])_}"
done
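One difference between the pure-bash and extglob approaches is worth checking before renaming anything. In ${s#[0-9]*_} the * is a glob wildcard, not a regex quantifier, so the pattern removes one digit plus everything up to the first underscore, whatever those characters are. With extglob, +([0-9])_ removes only a run of digits followed by an underscore. A dry-run comparison on made-up names (the two helper functions are for illustration only):

```bash
#!/usr/bin/env bash
shopt -s extglob

# Shortest match of "one digit, anything, underscore" from the start.
strip_glob()    { printf '%s\n' "${1#[0-9]*_}"; }
# Shortest match of "one or more digits, underscore" from the start.
strip_extglob() { printf '%s\n' "${1#+([0-9])_}"; }

for name in 10_I_am_here_001.jpg 2nd_draft.jpg; do
    echo "$name: plain glob -> $(strip_glob "$name"), extglob -> $(strip_extglob "$name")"
done
```

For 2nd_draft.jpg the plain glob yields draft.jpg, while the extglob version leaves the name unchanged, which is closer to the ^[0-9]+_ pattern asked for in the question.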

(Bash) rename files but give it a new extension that will count up.. (md5sum)

I need to rename all files in a folder and give it a new file extension. I know how I can rename files with bash. The problem I have is, I need to rename it to:
file.01 file.02 file.03 and counting up for all files found.
Can somebody provide me an example where to start?
This is what i need:
md5sum * | sed 's/^\(\w*\)\s*\(.*\)/\2 \1/' | while read LINE; do
mv $LINE
done
but that doesn't give it an extension that counts up: file.01, file.02, file.03, etc.
If one reads your requirements literally...
counter=1
for file in *; do
read sum _ <<<"$(md5sum "$file")"
printf -v file_new "%s.%02d" "$sum" "$counter"
mv -- "$file" "$file_new"
(( counter++ ))
done
This is less efficient than reading the filenames from md5sum's output, but more reliable, as globbing handles files with unusual names (newlines, special characters, etc) safely.
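The counter is kept as a plain integer and padded only when the name is formatted. That matters in bash arithmetic, where a value with a leading zero such as 08 is read as octal and rejected. A small sketch of just the numbering, without md5sum (the base name file is a placeholder):

```bash
#!/usr/bin/env bash
names=()
counter=1
while (( counter <= 10 )); do
    printf -v padded '%02d' "$counter"   # pad for the name only
    names+=("file.$padded")
    (( counter++ ))                      # arithmetic on the unpadded integer
done
printf '%s\n' "${names[@]}"
```

This prints file.01 through file.10; with more than 99 files the format would need a wider field, for example %03d.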
Something like this:
i=0
for f in *
do
if [ -f "$f" ]; then
i=$((i + 1))
# zero-pad the counter to two digits for the extension
ext=$(printf '%02d' "$i")
sum=$(md5sum -- "$f" | cut -d ' ' -f 1)
mv -- "$f" "$sum.$ext"
fi
done

Bash "for file in" exception

I got this
for file in *; do
any command
done
What I want to do is add an exception to the "for file in *; do".
Any ideas?
If you wanted to skip files with a particular extension, for example, ".pl", you could do:
for file in *
do
[ "${file##*.}" != "pl" ] && echo $file
done
One way to do what I think you are asking is to loop through and check each file name with an if statement (or just grep -v the ls command):
for file in `ls`; do
if [ "$file" == "something" ]; then
# do something
else
# something else
fi
done
For example, you have files
$ ls -1 ./
./M2U0001.MPG
./M2U0180.MPG
./text
Exception file is M2U0180.MPG
$> filename="M2U0180.MPG"
And
$> for file in $(ls -1 ./* | grep --invert-match "${filename}" ); do echo $file; done
./M2U0001.MPG
./text
Weird solution done :)
use bash's extended globbing
shopt -s extglob
for f in !(excluded-file); do echo "$f"; done
A simple solution :
The following code snippet prints all the files in the current directory that have a .py extension:
for i in * ;do
if [[ $i == *.py ]];then
echo "$i"
fi
done
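Inside the loop itself, an exception is commonly written with continue, which skips straight to the next iteration. The sketch below wraps that in a hypothetical helper, list_except, and feeds it the file names from the earlier example instead of a real directory listing.

```bash
#!/usr/bin/env bash
# list_except EXCLUDED NAME...: print every NAME except those matching the
# glob pattern EXCLUDED (helper made up for illustration).
list_except() {
    local excluded=$1 file
    shift
    for file in "$@"; do
        # The unquoted right-hand side is matched as a glob pattern.
        if [[ $file == $excluded ]]; then
            continue
        fi
        printf '%s\n' "$file"
    done
}

list_except 'M2U0180.MPG' M2U0001.MPG M2U0180.MPG text
list_except '*.MPG' M2U0001.MPG M2U0180.MPG text
```

In a real script the loop would read for file in *; do ... done, with the command to run in place of printf.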

Resources