Rename all files in a directory by omitting last 3 characters - bash

I am trying to write a bash command that renames all the files in the current directory by dropping the last 3 characters of each name (before the extension). I am not sure whether this is possible, which is why I am asking here.
I have a lot of files named like this: 720-1458907789605.ts
I need to rename all of them so that 720-1458907789605.ts ---> 720-1458907789.ts, for every file in the current directory.
Is it possible using bash commands? I am new to bash scripts.
Thank you!

Native bash solution:
for f in *.ts; do
[[ -f "$f" ]] || continue # if you do not need to rename directories
mv "$f" "${f:: -6}.ts"
done
This solution gets slow when there are very many files: the shell expands *.ts into the full list of names before the loop starts, which costs memory and time.
Ref: bash substring extraction.
If you have a really large data set, a slightly more complex but faster solution is:
find . -maxdepth 1 -type f -name '*.ts' -print0 | while IFS= read -r -d '' f; do
mv "$f" "${f%???.ts}.ts"
done
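To see what the two expansions used above actually produce, here is a small sketch that only prints names and touches no files. The sample filename is the one from the question; the function names are made up for illustration. As far as I know, the negative length in ${f:: -6} needs bash 4.2 or newer, while ${f%???.ts} also works in plain POSIX shells.

```shell
#!/usr/bin/env bash
# Sketch: compare the two ways of dropping the 3 characters before ".ts".
# strip3_substring and strip3_suffix are illustrative names, not from the answer.

strip3_substring() {
    local f=$1
    # Drop the last 6 characters ("605.ts"), then put ".ts" back.
    # The space before -6 is required; negative lengths need bash >= 4.2.
    printf '%s\n' "${f:: -6}.ts"
}

strip3_suffix() {
    local f=$1
    # Remove the shortest suffix matching "any 3 characters, then .ts".
    printf '%s\n' "${f%???.ts}.ts"
}

strip3_substring "720-1458907789605.ts"
strip3_suffix "720-1458907789605.ts"
```

Both calls should print the same shortened name. The suffix form also leaves a leading ./ intact, which matters for the paths that find prints.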

With Larry Wall's rename:
rename -n 's/...\.ts$/.ts/' *.ts
If everything looks okay, remove the dry-run option -n.

Related

BASH one-line alphabetical mass file sort using for, mv, and grep

Problem
I've got thousands of files with the format "^[[:digit:]]\{4\} - [[:alpha:]].*", for example: 7958 - a3ykof zyimeo3.txt. I'm trying to simply move them into folders alphabetically, based on the first alphabetic character after the hyphen.
I feel like I'm so close to getting this to happen the way I want but there's a (hopefully simple) problem.
I tested the command with echo first to make sure it grabs the correct information. Then I tried to execute it for real with mv. I've included some examples below based on this list of files:
1439 - a74389 josifj3oj.txt
3589 - Bfoei 839982 3il.txt
4719 - an38n8f n839mm20 mi02.txt
6398 - b39ji oij3o8 j2o.txt
9287 - A2984 j289jj9 oiw.txt
.... several thousand more files
Examples
This works
This lists all the files starting with the letter "a" (after the 4 digits-space-hyphen-space pattern in the beginning):
for i in "$(ls | grep -i "^[[:digit:]]\{4\} - a")"; do echo "$i"; done
This fails
This doesn't put all the files starting with the letter "a" (after the 4 digits-space-hyphen-space pattern) in the "A" folder:
for i in "$(ls | grep -i "^[[:digit:]]\{4\} - a")"; do mv "$i" A; done
I expected this second command to move each file named "#### - a*" or "#### - A*" to the folder named A. But it sees it as one big string/filename joined by "\n".
Here's an example error message:
mv: cannot stat '1439 - a74389 josifj3oj.txt\n9287 - A2984 j289jj9 oiw.txt\n2719 - an38n8f n839mm20 mi02.txt': No such file or directory
Does anybody know what I'm missing?
Edit
Between @alvits's answer and the comments from @chepner and @courtlandj, what worked flawlessly for me was this:
for directory in {A..Z}; do
mkdir -p "$directory" &&
find . -iregex "./[0-9]* - ${directory}.*" -exec mv -t "$directory" {} +;
done
Here's the simplest way to do it.
for directory in {A..Z}; do
mkdir "$directory" &&
find . -iregex "./[0-9]* - ${directory}.*" -exec mv "{}" "$directory" \;
done
The for loop iterates over the target directories, one letter at a time.
The find command will find the files and move them to the directory.
Bash has regex-like globbing and sequence generation built in. You can make use of them like this:
for i in {{A..Z},{a..z}}; do
mkdir "${i}" && mv [0-9][0-9][0-9][0-9]" - ${i}"* "${i}"
done
You notice the four repetitions of the digits, and yeah it looks clumsier than a normal RE like [0-9]{4}.
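The question wanted both cases of a letter to end up in one folder. A minimal sketch of that using globs alone, assuming bash 4 or newer for the ${letter,,} lowercase expansion; sort_by_letter is a hypothetical helper name, and nullglob is switched on so that letters without matching files are skipped:

```shell
#!/usr/bin/env bash
# Sketch: move "NNNN - <letter>..." files into one upper-case folder per letter.
# sort_by_letter is an illustrative name. Requires bash >= 4 for ${var,,}.

sort_by_letter() {
    local letter lower files
    shopt -s nullglob                  # unmatched globs expand to nothing
    for letter in {A..Z}; do
        lower=${letter,,}
        # Four digits, " - ", then the letter in either case.
        # The variables hold single letters only, so leaving them unquoted is safe here.
        files=( [0-9][0-9][0-9][0-9]" - "[$letter$lower]* )
        (( ${#files[@]} )) || continue # nothing for this letter
        mkdir -p -- "$letter"
        mv -- "${files[@]}" "$letter"/
    done
    shopt -u nullglob                  # note: turns nullglob off even if the caller had it on
}

cd "$(mktemp -d)" || exit 1
touch "1439 - a74389 josifj3oj.txt" "9287 - A2984 j289jj9 oiw.txt" "6398 - b39ji oij3o8 j2o.txt"
sort_by_letter
ls A B
```

Collecting the matches in an array first means mkdir only runs for letters that actually have files, and mv is called once per letter rather than once per file.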

Bash script to concatenate text files with specific substrings in filenames

Within a certain directory I have many directories containing a bunch of text files. I’m trying to write a script that concatenates only those files in each directory that have the string ‘R1’ in their filename into one file within that specific directory, and those that have ‘R2’ into another. This is what I wrote, but it’s not working.
#!/bin/bash
for f in */*.fastq; do
if grep 'R1' $f ; then
cat "$f" >> R1.fastq
fi
if grep 'R2' $f ; then
cat "$f" >> R2.fastq
fi
done
I get no errors and the files are created as intended but they are empty files. Can anyone tell me what I’m doing wrong?
Thank you all for the fast and detailed responses! I think I wasn't very clear in my question: I need the script to concatenate only the files within each specific directory, so that each directory gets its own new files (R1 and R2). I tried doing
cat /*R1*.fastq >*/R1.fastq
but it gave me an ambiguous redirect error. I also tried Charles Duffy's for loop, but looping through the directories with a nested loop that runs through each file within a directory, like so
for f in */; do
for d in "$f"/*.fastq;do
case "$d" in
*R1*) cat "$d" >&3
*R2*) cat "$d" >&4
esac
done 3>R1.fastq 4>R2.fastq
done
but it was giving an unexpected token error regarding ')'.
Sorry in advance if I'm missing something elementary, I'm still very new to bash.
A Note To The Reader
Please review edit history on the question in considering this answer; several parts have been made less relevant by question edits.
One cat Per Output File
For the purpose at hand, you can probably just let shell globbing do all the work (if R1 or R2 will be in the filenames, as opposed to the directory names):
set -x # log what's happening!
cat */*R1*.fastq >R1.fastq
cat */*R2*.fastq >R2.fastq
One find Per Output File
If it's a really large number of files, by contrast, you might need find:
find . -mindepth 2 -maxdepth 2 -type f -name '*R1*.fastq' -exec cat '{}' + >R1.fastq
find . -mindepth 2 -maxdepth 2 -type f -name '*R2*.fastq' -exec cat '{}' + >R2.fastq
...this is because of the OS-dependent limit on command-line length; the find command given above will put as many arguments onto each cat command as possible for efficiency, but will still split them up into multiple invocations where otherwise the limit would be exceeded.
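To make the batching behaviour a bit more concrete, here is a sketch that builds a throwaway tree and concatenates with find ... -exec cat {} +. The directory names, file names, and the concat_by_tag helper are all invented for the example; getconf ARG_MAX just prints the limit being discussed.

```shell
#!/usr/bin/env bash
# Sketch: concatenate matching files one level down, letting find batch the
# arguments. concat_by_tag is an illustrative helper, not part of the answer.

concat_by_tag() {
    local tag=$1 out=$2
    # -mindepth 2 keeps the output file (written at depth 1) out of the input set.
    find . -mindepth 2 -maxdepth 2 -type f -name "*${tag}*.fastq" \
        -exec cat {} + >"$out"
}

cd "$(mktemp -d)" || exit 1
mkdir s1 s2
printf 'a\n' >s1/x_R1.fastq
printf 'b\n' >s2/y_R1.fastq
printf 'c\n' >s1/x_R2.fastq
concat_by_tag R1 R1.fastq
concat_by_tag R2 R2.fastq
wc -l <R1.fastq     # line count of the combined R1 file
getconf ARG_MAX     # the OS limit that makes batching necessary
```

One caveat: find makes no promise about the order in which it visits files, so if the order of concatenation matters, sort the list explicitly.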
Iterate-And-Test
If you really do want to iterate over everything, and then test the names, consider a case statement for the job, which is much more efficient than using grep to check just one line:
for f in */*.fastq; do
case $f in
*R1*) cat "$f" >&3 ;;
*R2*) cat "$f" >&4 ;;
esac
done 3>R1.fastq 4>R2.fastq
Note the use of file descriptors 3 and 4 to write to R1.fastq and R2.fastq respectively -- that way we're only opening the output files once (and thus truncating them exactly once) when the for loop starts, and reusing those file descriptors rather than re-opening the output files at the beginning of each cat. (That said, running cat once per file -- which find -exec {} + avoids -- is probably more overhead on balance).
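The truncation point is easy to miss, so here is a small sketch contrasting a redirection inside the loop body with one attached to the loop itself. The fd_demo function and the file names are illustrative only.

```shell
#!/usr/bin/env bash
# Sketch: redirecting inside the loop versus redirecting the whole loop.
# fd_demo and its file names are illustrative.

fd_demo() {
    local f
    printf 'one\n' >a.in
    printf 'two\n' >b.in

    # ">" inside the body reopens and truncates the file on every iteration,
    # so only the last input survives.
    for f in a.in b.in; do
        cat "$f" >inside.out
    done

    # Opening descriptor 3 on the loop itself truncates once; every cat then
    # writes through the same open descriptor.
    for f in a.in b.in; do
        cat "$f" >&3
    done 3>once.out
}

cd "$(mktemp -d)" || exit 1
fd_demo
cat inside.out    # holds only the contents of b.in
cat once.out      # holds a.in followed by b.in
```

Using >> inside the loop would also accumulate, but then stale output from a previous run is never cleared, which is what the single truncation at loop start avoids.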
Operating Per-Directory
All of the above can be updated to work on a per-directory basis quite trivially. For example:
for d in */; do
find "$d" -name R1.fastq -prune -o -name '*R1*.fastq' -exec cat '{}' + >"$d/R1.fastq"
find "$d" -name R2.fastq -prune -o -name '*R2*.fastq' -exec cat '{}' + >"$d/R2.fastq"
done
There are only two significant changes:
We no longer specify -mindepth, which previously ensured that our input files came only from subdirectories.
We're excluding R1.fastq and R2.fastq from our input files, so we never try to use the same file as both input and output. This is a consequence of the prior change: Previously, our output files couldn't be considered as input because they didn't meet the minimum depth.
Your grep is searching the file contents instead of the file name. You could rewrite it this way:
for f in */*.fastq; do
[[ -f $f ]] || continue
if [[ $f = *R1* ]]; then
cat "$f" >> R1.fastq
elif [[ $f = *R2* ]]; then
cat "$f" >> R2.fastq
fi
done
Find in a for loop might suit this:
for i in R1 R2
do
find . -type f -name "*${i}*.fastq" -exec cat '{}' + >"$i.txt"
done

Go into every subdirectory and mass rename files by stripping leading characters

From the current directory I have multiple sub directories:
subdir1/
001myfile001A.txt
002myfile002A.txt
subdir2/
001myfile001B.txt
002myfile002B.txt
where I want to strip every character from the filenames before myfile so I end up with
subdir1/
myfile001A.txt
myfile002A.txt
subdir2/
myfile001B.txt
myfile002B.txt
I have some code to do this...
#!/bin/bash
for d in `find . -type d -maxdepth 1`; do
cd "$d"
for f in `find . "*.txt"`; do
mv "$f" "$(echo "$f" | sed -r 's/^.*myfile/myfile/')"
done
done
however the newly renamed files end up in the parent directory
i.e.
myfile001A.txt
myfile002A.txt
myfile001B.txt
myfile002B.txt
subdir1/
subdir2/
In which the sub-directories are now empty.
How do I alter my script to rename the files and keep them in their respective sub-directories? As you can see the first loop changes directory to the sub directory so not sure why the files end up getting sent up a directory...
Your script has multiple problems. In the first place, your outer find command doesn't do quite what you expect: it outputs not only each of the subdirectories, but also the search root, ., which is itself a directory. You could have discovered this by running the command manually, among other ways. You don't really need to use find for this, but supposing that you do use it, this would be better:
for d in $(find * -maxdepth 0 -type d); do
Moreover, . is the first result of your original find command, and your problems continue there. Your initial cd is without meaningful effect, because you're just changing to the same directory you're already in. The find command in the inner loop is rooted there, and descends into both subdirectories. The path information for each file you choose to rename is therefore stripped by sed, which is why the results end up in the initial working directory (./subdir1/001myfile001A.txt --> myfile001A.txt). By the time you process the subdirectories, there are no files left in them to rename.
But that's not all: the find command in your inner loop is incorrect. Because you do not specify an option before it, find interprets "*.txt" as designating a second search root, in addition to .. You presumably wanted to use -name "*.txt" to filter the find results; without it, find outputs the name of every file in the tree. Presumably you're suppressing or ignoring the error messages that result.
But supposing that your subdirectories have no subdirectories of their own, as shown, and that you aren't concerned with dotfiles, even this corrected version ...
for f in `find . -name "*.txt"`;
... is an awfully heavyweight way of saying this ...
for f in *.txt;
... or even this ...
for f in *?myfile*.txt;
... the latter of which will avoid attempts to rename any files whose names do not, in fact, change.
Furthermore, launching a sed process for each file name is pretty wasteful and expensive when you could just use bash's built-in substitution feature:
mv "$f" "${f/#*myfile/myfile}"
And you will find also that your working directory gets messed up. The working directory is a characteristic of the overall shell environment, so it does not automatically reset on each loop iteration. You'll need to handle that manually in some way. pushd / popd would do that, as would running the outer loop's body in a subshell.
Overall, this will do the trick:
#!/bin/bash
for d in $(find * -maxdepth 0 -type d); do
pushd "$d" || continue
for f in *.txt; do
mv "$f" "${f/#*myfile/myfile}"
done
popd
done
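The subshell alternative mentioned above could look roughly like this. It is a sketch, not the answer's own code: strip_prefix_in_subdirs is a made-up name, and it assumes the one-level layout shown in the question.

```shell
#!/usr/bin/env bash
# Sketch: run each directory's renames in a subshell so the cd never leaks
# into the next iteration. strip_prefix_in_subdirs is an illustrative name.

strip_prefix_in_subdirs() {
    local d
    for d in */; do
        [[ -d $d ]] || continue            # no subdirectories at all
        (
            cd "$d" || exit                # exits only the subshell
            for f in *?myfile*.txt; do
                [[ -e $f ]] || continue    # pattern matched nothing
                mv -- "$f" "${f/#*myfile/myfile}"
            done
        )
    done
}

cd "$(mktemp -d)" || exit 1
mkdir subdir1 subdir2
touch subdir1/001myfile001A.txt subdir1/002myfile002A.txt subdir2/001myfile001B.txt
strip_prefix_in_subdirs
ls subdir1 subdir2
```

Because the parentheses start a child shell, there is nothing to undo after each directory, and a failed cd cannot cause renames in the wrong place.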
You can do it without find and sed:
$ for f in */*.txt; do echo mv "$f" "${f/\/*myfile/\/myfile}"; done
mv subdir1/001myfile001A.txt subdir1/myfile001A.txt
mv subdir1/002myfile002A.txt subdir1/myfile002A.txt
mv subdir2/001myfile001B.txt subdir2/myfile001B.txt
mv subdir2/002myfile002B.txt subdir2/myfile002B.txt
If you remove the echo, it'll actually rename the files.
This uses shell parameter expansion to replace a slash and anything up to myfile with just a slash and myfile.
Notice that this breaks if there is more than one level of subdirectories. In that case, you could use extended pattern matching (enabled with shopt -s extglob) and the globstar shell option (shopt -s globstar):
$ for f in **/*.txt; do echo mv "$f" "${f/\/*([!\/])myfile/\/myfile}"; done
mv subdir1/001myfile001A.txt subdir1/myfile001A.txt
mv subdir1/002myfile002A.txt subdir1/myfile002A.txt
mv subdir1/subdir3/001myfile001A.txt subdir1/subdir3/myfile001A.txt
mv subdir1/subdir3/002myfile002A.txt subdir1/subdir3/myfile002A.txt
mv subdir2/001myfile001B.txt subdir2/myfile001B.txt
mv subdir2/002myfile002B.txt subdir2/myfile002B.txt
This uses the *([!\/]) pattern ("zero or more characters that are not a forward slash"). The slash has to be escaped in the bracket expression because we're still inside of the pattern part of the ${parameter/pattern/string} expansion.
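For anyone who wants to compare the two patterns without touching files, here is a sketch that applies them to plain strings. The function names are invented, and the replacement is kept in a variable purely so that no backslash is needed in that part; the patterns themselves are the ones from this answer.

```shell
#!/usr/bin/env bash
# Sketch: compare the plain and extglob patterns on path strings only.
# strip_plain and strip_extglob are illustrative names.
shopt -s extglob    # needed for the *( ... ) pattern below

strip_plain() {
    local f=$1 rep=/myfile
    # "/*myfile": the * may run across further slashes.
    printf '%s\n' "${f/\/*myfile/$rep}"
}

strip_extglob() {
    local f=$1 rep=/myfile
    # "/*([!/])myfile": only non-slash characters between the slash and myfile.
    printf '%s\n' "${f/\/*([!\/])myfile/$rep}"
}

strip_plain   "subdir1/subdir3/001myfile001A.txt"   # loses the subdir3 component
strip_extglob "subdir1/subdir3/001myfile001A.txt"   # keeps both directory levels
```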
Maybe you want to use the following command instead:
rename 's#(.*/).*(myfile.*)#$1$2#' subdir*/*
You can use rename -n ... to check the outcome without actually renaming anything.
Regarding your actual question:
The find command from the outer loop returns 3 (!) directories:
.
./subdir1
./subdir2
The unwanted . is the reason why all files end up in the parent directory (that is .). You can exclude . by using the option -mindepth 1.
Unfortunately, this only explains why the files landed in the wrong place; it is not the only problem with the script. Since you already accepted one of the answers, there is no need to list them all.
a slight modification should fix your problem:
#!/bin/bash
for f in `find . -maxdepth 2 -name "*.txt"`; do
mv "$f" "$(echo "$f" | sed -r 's,[^/]+(myfile),\1,')"
done
note: this sed uses , instead of / as the delimiter.
however, there are much faster ways.
here it is with the rename utility, available or easily installed wherever there is bash and perl:
find . -maxdepth 2 -name "*.txt" | rename 's,[^/]+(myfile),$1,'
here are tests on 1000 files:
for `find`; do mv 9.176s
rename 0.099s
that's roughly 90x as fast.
John Bollinger's accepted answer is about twice as fast as the OP's, but still over 40x slower than this rename solution:
for|for|mv "$f" "${f//}" 4.316s
also, it won't work if there is a directory with too many items for a shell glob. likewise any answers that use for f in *.txt or for f in */*.txt or find * or rename ... subdir*/*. answers that begin with find ., on the other hand, will also work on directories with any number of items.

Bash scripting, loop through files in folder fails

I'm looping through certain files (all files starting with MOVIE) in a folder with this bash script code:
for i in MY-FOLDER/MOVIE*
do
which works fine when there are files in the folder. But when there aren't any, it somehow goes on with one file which it thinks is named MY-FOLDER/MOVIE*.
How can I keep it from entering the block after
do
if there aren't any files in the folder?
With the nullglob option.
$ shopt -s nullglob
$ for i in zzz* ; do echo "$i" ; done
$
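A small sketch of the difference, counting loop iterations over an empty directory with nullglob off and then on. count_matches is a hypothetical helper, and the MOVIE prefix is taken from the question.

```shell
#!/usr/bin/env bash
# Sketch: count loop iterations over a glob that matches nothing,
# with and without nullglob. count_matches is an illustrative name.

count_matches() {
    local n=0 i
    for i in "$1"/MOVIE*; do
        n=$((n + 1))
    done
    printf '%s\n' "$n"
}

empty_dir=$(mktemp -d)

shopt -u nullglob
count_matches "$empty_dir"    # the unmatched pattern is kept as a literal word

shopt -s nullglob
count_matches "$empty_dir"    # the pattern expands to nothing, so the body never runs
shopt -u nullglob
```

Since nullglob changes how every later glob in the script behaves, it may be worth switching it back off after the loop, as done here.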
for i in $(find MY-FOLDER -type f -name 'MOVIE*'); do
echo "$i"
done
The find utility is one of the Swiss Army knives of Linux. It starts at the directory you give it and finds all files in all subdirectories, according to the options you give it.
-type f will find only regular files (not directories).
As I wrote it, the command will find files in subdirectories as well; you can prevent that by adding -maxdepth 1
Edit, 8 years later (thanks for the comment, @tadman!)
You can avoid the loop altogether with
find . -type f -exec echo "{}" \;
This tells find to echo the name of each file by substituting its name for {}. The escaped semicolon is necessary to terminate the command that's passed to -exec.
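As a side note, -exec can also be terminated with + instead of the escaped semicolon, in which case find passes many names to a single invocation. A sketch of the difference, with made-up helper names; the inline sh script just reports how many arguments it received:

```shell
#!/usr/bin/env bash
# Sketch: how often find runs the command with "\;" versus "+".
# show_runs_each and show_runs_batch are illustrative names; each prints
# one line per invocation of the inline script.

show_runs_each() {
    find "$1" -type f -exec sh -c 'echo "one run, $# file(s)"' sh {} \;
}

show_runs_batch() {
    find "$1" -type f -exec sh -c 'echo "one run, $# file(s)"' sh {} +
}

demo_dir=$(mktemp -d)
touch "$demo_dir/a" "$demo_dir/b" "$demo_dir/c"
show_runs_each "$demo_dir"     # one line per file
show_runs_batch "$demo_dir"    # a single line covering all three files
```

For a cheap command like echo the difference hardly matters, but with thousands of files the + form avoids starting a new process per file.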
for file in MY-FOLDER/MOVIE*
do
# Skip if not a file
test -f "$file" || continue
# Now you know it's a file.
...
done

bash: how to change the basename only of a list of files [duplicate]

Closed 10 years ago.
Possible Duplicate:
makefile: how to add a prefix to the basename?
I have a list of files (which I get from find bla -name "*.so") such as:
/bla/a1.so
/bla/a2.so
/bla/blo/a3.so
/bla/blo/a4.so
/bla/blo/bli/a5.so
and I want to rename them such as it becomes:
/bla/liba1.so
/bla/liba2.so
/bla/blo/liba3.so
/bla/blo/liba4.so
/bla/blo/bli/liba5.so
... i.e. add the prefix 'lib' to the basename
any idea on how to do that in bash ?
Something along the lines of:
for a in /bla/a1.so /bla/a2.so /bla/blo/a4.so
do
dn=$(dirname "$a")
fn=$(basename "$a")
mv "$a" "${dn}/lib${fn}"
done
should do it. You might want to add code to read the list of filenames from a file, rather than listing them verbatim in the script, of course.
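Reading the names from a file could look something like this. It is only a sketch: add_lib_prefix and the list file are hypothetical, and it assumes one path per line, so names containing newlines are not supported.

```shell
#!/usr/bin/env bash
# Sketch: read one path per line from a list file and add "lib" to each basename.
# add_lib_prefix and the list file are illustrative, not from the answer.

add_lib_prefix() {
    local list=$1 path dn fn
    while IFS= read -r path; do
        [[ -f $path ]] || continue     # skip blank or stale entries
        dn=$(dirname -- "$path")
        fn=$(basename -- "$path")
        mv -- "$path" "$dn/lib$fn"
    done <"$list"
}

work=$(mktemp -d)
mkdir -p "$work/bla/blo"
touch "$work/bla/a1.so" "$work/bla/blo/a3.so"
find "$work/bla" -name '*.so' >"$work/list.txt"
add_lib_prefix "$work/list.txt"
find "$work/bla" -name '*.so'
```

Writing the list to a file first also means find has finished walking the tree before any rename happens.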
find . -name "*.so" -printf "mv '%h/%f' '%h/lib%f'\n" | bash
The code renames files in the current directory and its subdirectories, prepending "lib" to the names of the .so files.
No looping needed, as find already does its recursive work to list the files. The code builds the "mv" commands one by one and executes them. To see the "mv" commands without executing them, simply remove the piping to shell part "| bash".
find's -printf action understands many format directives, which makes it quite flexible. I only needed two here:
%h: directory
%f: filename
How to test it:
Run this first (will perform nothing yet, only print lines on the screen):
find . -name "*.so" -printf "mv '%h/%f' '%h/lib%f'\n" | less -S
This will show you all the commands that your script will execute. If you're satisfied with the result, simply execute it afterwards by piping it into bash instead of less.
find . -name "*.so" -printf "mv '%h/%f' '%h/lib%f'\n" | bash
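One thing to keep in mind with generated mv lines is that a filename containing a single quote would break the quoting. A possible variant, sketched under the assumption that the search root is a directory: let find pass the paths to a small inline script instead. prefix_lib is a made-up name, and the two parameter expansions play the roles of %h and %f.

```shell
#!/usr/bin/env bash
# Sketch: let find hand the paths to a small inline script instead of
# generating shell text. prefix_lib is an illustrative name.

prefix_lib() {
    local root=$1
    # "! -name lib*" skips files that already carry the prefix.
    find "$root" -type f -name '*.so' ! -name 'lib*' -exec sh -c '
        for p in "$@"; do
            dir=${p%/*}        # like %h: everything before the last slash
            base=${p##*/}      # like %f: everything after the last slash
            mv -- "$p" "$dir/lib$base"
        done
    ' sh {} +
}

work=$(mktemp -d)
mkdir -p "$work/blo/bli"
touch "$work/a1.so" "$work/blo/a3.so" "$work/blo/bli/a5.so"
prefix_lib "$work"
find "$work" -name '*.so'
```

Unlike -printf, this should not depend on GNU find, since -exec ... {} + is part of POSIX.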
while multiliner
A slightly more robust and generalized solution based on $nfm (maybe more than you really need) would be
while IFS= read -r -u3 -d $'\0' FILE; do
DIR=$(dirname "$FILE")
FILENAME=$(basename "$FILE")
mv "$FILE" "${DIR}/lib${FILENAME}"
done 3< <(find bla -name '*.so' -print0 | sort -rz)
This is quite robust:
read -u3 and 3< does not interfere with stdin
-print0 + IFS= + -d $'\0' allows for newlines in filenames
sort -rz renames deeper paths first, so that you can even rename directories and the files inside them at once
find -execdir + rename
This would be perfect if it weren't for the PATH annoyances, see: Find multiple files and rename them in Linux
Try mmv:
cd /bla/
mmv "*.so" "lib#1.so"
(mmv "*" "lib#1" would also work but it's less safe).
If you don't have mmv installed, get it.
basename and dirname are your friends :)
You want something like this (excuse my bash syntax - it's a little rusty):
for FILE in `find bla -name '*.so'`; do
DIR=`dirname "$FILE"`
FILENAME=`basename "$FILE"`
mv "$FILE" "${DIR}/lib${FILENAME}"
done
Beaten to the punch!
Note I've commented out the mv command to prevent any accidental mayhem
for f in *
do
dir=`dirname "$f"`
fname=`basename "$f"`
new="$dir/lib$fname"
echo "new name is $new"
# only uncomment this if you know what you are doing
# mv "$f" "$new"
done

Resources