rename long file names while keeping a part of the name - bash

I have a lot of files which have a certain pattern:
some123_name4.with5.number01-02_and6-other7.stuff.txt
some123_name4.with5.number05-06_and6-other7.stuff.txt
some123_name4.with5.number11-12_and6-other7.stuff.txt
and I would like to rename them keeping the part in the middle number??-??. For example like:
different45_start.keep76.number01-02_but.change34_rest.txt
different45_start.keep76.number05-06_but.change34_rest.txt
different45_start.keep76.number11-12_but.change34_rest.txt
I have played around with expr, %% and ? but I didn't even manage to extract the number??-?? part of the filename.

This ought to do it (replace with actual patterns)
#!/bin/bash
for f in some123* ; do
mv "$f" "$(echo "$f" | sed -e 's/some123_name4\.with5/different45_start.keep76/' -e 's/and6-other7\.stuff/but.change34_rest/')"
done

May I suggest you use regular expressions to carry the numbers from the old name over into the new name? Then it's just a matter of
creating a new subdirectory (just in case you make a mistake)
using "ls" to list the file names (with options for 1 (one) name per line, not following down into subdirs)
iterating over the file names
In each iteration,
set the new name
run the copy command "cp" using the old and the new names (but as a trick, copy into your new subdirectory)
All in all, something like this:
mkdir NEW
ls -1d some* \
| while read FILE; do
NEWFILE=`echo "$FILE" \
| sed 's|^some12\\([0-9]\\)_name\\([0-9]\\)[.]with\\([0-9]\\)[.]number\\([0-9][0-9]-[0-9][0-9]\\)_and\\([0-9]\\)-other\\([0-9]\\)[.]stuff[.]txt$|different\\2\\3_start.keep\\6\\5.number\\4_but.change\\1\\2_rest.txt|'`
cp "$FILE" NEW/"$NEWFILE"
done
As you can see, due to the backticks (`) you have to use extra backslashes in the regexp.
Does this help you, as a start?
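The same capture-group idea can also be written with $(...) instead of backticks, in which case single backslashes are enough. The following is only a sketch under that assumption; the helper name new_name_for and the scratch directory demo_dir are made up for illustration and are not part of the answer above.

```shell
#!/bin/bash
# Hypothetical helper: map one old name to its new name with a single sed expression.
# Inside $(...) the backslashes do not need to be doubled.
new_name_for() {
    printf '%s\n' "$1" | sed 's|^some12\([0-9]\)_name\([0-9]\)[.]with\([0-9]\)[.]number\([0-9][0-9]-[0-9][0-9]\)_and\([0-9]\)-other\([0-9]\)[.]stuff[.]txt$|different\2\3_start.keep\6\5.number\4_but.change\1\2_rest.txt|'
}

# Demo in a scratch directory (assumption: you would use your real directory instead).
demo_dir=$(mktemp -d)
touch "$demo_dir/some123_name4.with5.number01-02_and6-other7.stuff.txt"
mkdir "$demo_dir/NEW"
for path in "$demo_dir"/some*; do
    [ -f "$path" ] || continue          # skip if the glob matched nothing
    old=${path##*/}                     # file name without the directory part
    cp "$path" "$demo_dir/NEW/$(new_name_for "$old")"
done
```

Iterating over a glob rather than over the output of ls also keeps names with spaces intact.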

a possible solution using expr looks like the following:
for f in *number??-??*; do
fixedPart=$(expr "$f" : '.*\(number[0-9][0-9]-[0-9][0-9]\).*')
newName="different45_start.keep76.${fixedPart}_but.change34_rest.txt"
mv "$f" "$newName"
done
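If you would rather stay inside bash, the same extraction can probably be done with the [[ =~ ]] operator and BASH_REMATCH. This is a minimal sketch, not a tested replacement for the expr version; the function name keep_number_part is hypothetical, and the loop only prints the mv commands so you can check them first.

```shell
#!/bin/bash
# Hypothetical helper: build the new name around the preserved number??-?? part.
keep_number_part() {
    local name=$1
    local re='(number[0-9]{2}-[0-9]{2})'
    if [[ $name =~ $re ]]; then
        printf '%s\n' "different45_start.keep76.${BASH_REMATCH[1]}_but.change34_rest.txt"
    else
        return 1                        # no number??-?? part found
    fi
}

# Dry run: print the mv commands instead of executing them.
for f in *number??-??*; do
    [ -e "$f" ] || continue             # glob matched nothing
    new=$(keep_number_part "$f") || continue
    printf 'mv -- %q %q\n' "$f" "$new"
done
```

Once the printed commands look right, the printf line can be swapped for mv -- "$f" "$new".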

Related

batch rename matching files using 1st field to replace and 2nd as search criteria

I have a very large selection of files eg.
foo_de.vtt, foo_en.vtt, foo_es.vtt, foo_fr.vtt, foo_pt.vtt, baa_de.vtt, baa_en.vtt, baa_es.vtt, baa_fr.vtt, baa_pt.vtt... etc.
I have created a tab separated file, filenames.txt containing the current string and replacement string eg.
foo 1000
baa 1016
...etc
I want to rename all of the files to get the following:
1000_de.vtt, 1000_en.vtt, 1000_es.vtt, 1000_fr.vtt, 1000_pt.vtt, 1016_de.vtt, 1016_en.vtt, 1016_es.vtt, 1016_fr.vtt, 1016_pt.vtt
I know I can use a utility like rename to do it manually term by term eg:
rename 's/foo/1000/g' *.vtt
could i chain this into an awk command so that it could run through the filenames.txt?
or is there an easier way to do it just in awk? I know I can rename with awk such as:
find . -type f | awk -v mvCmd='mv "%s" "%s"\n' \
'{ old=$0;
gsub(/foo/,"1000");
printf mvCmd,old,$0;
}' | sh
How can I get awk to process filenames.txt and do all of this in one go?
This question is similar but uses sed. I feel that being tab separated this should be quite easy in awk?
First ever post so please be gentle!
Solution
Thanks for all your help. Ultimately I was able to solve by adapting your answers to the following:
while IFS=$'\t' read -r old new; do
rename "s/$old/$new/g" *.vtt;
done < filenames.txt
I'm assuming that the strings in the TSV file are literals (not regexes nor globs) and that the part to be replaced can be located anywhere in the filenames.
With that said, you can use mv with shell globs and bash parameter expansion:
#!/bin/bash
while IFS=$'\t' read -r old new
do
for f in *"$old"*.vtt
do
mv "$f" "${f/"$old"/$new}"
done
done < file.tsv
Or with the util-linux rename, which takes the search string and the replacement as plain arguments (more performant):
while IFS=$'\t' read -r old new
do
rename "$old" "$new" *"$old"*.vtt
done < file.tsv
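To try the mv-based loop safely, it can be wrapped in a function and run against a scratch directory first. The sketch below assumes literal strings in a two-column, tab-separated file, as stated above; the function name rename_from_tsv and the directory demo_dir are invented for this example.

```shell
#!/bin/bash
# Hypothetical helper: rename *.vtt files in directory $1 using the
# tab-separated "old<TAB>new" pairs found in file $2.
rename_from_tsv() {
    local dir=$1 map=$2 old new f base
    while IFS=$'\t' read -r old new; do
        [ -n "$old" ] || continue            # skip blank lines
        for f in "$dir"/*"$old"*.vtt; do
            [ -e "$f" ] || continue          # glob matched nothing
            base=${f##*/}
            # -n refuses to overwrite an existing target
            mv -n -- "$f" "$dir/${base/"$old"/$new}"
        done
    done < "$map"
}

# Demo with throwaway files.
demo_dir=$(mktemp -d)
touch "$demo_dir"/{foo,baa}_{de,en}.vtt
printf 'foo\t1000\nbaa\t1016\n' > "$demo_dir/filenames.txt"
rename_from_tsv "$demo_dir" "$demo_dir/filenames.txt"
```

Only the first occurrence of the old string in each name is replaced, which matches the ${f/"$old"/$new} expansion used above.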
This might work for you (GNU sed and rename):
sed -E 's#(.*)\t(.*)#rename -n '\''s/\1/\2/'\'' \1*#e' ../file
This builds a script which renames the files in the current directory using file to match and replace parts of the filenames.
Once you are happy with the results, remove the -n and the renaming will be enacted.

bash change absolute path in file line by line for script creation

I'm trying to create a bash script based on an input file (list.txt). The input file contains a list of files with absolute paths. The output should be a bash script (move.sh) which moves the files to another location, preserving the folder structure but slightly changing the target folder name.
An example of the input file list.txt looks like this:
/In/Folder_1/SomeFoldername1/somefilename_x.mp3
/In/Folder_2/SomeFoldername2/somefilename_y.mp3
/In/Folder_3/SomeFoldername3/somefilename_z.mp3
The output file (move.sh) should look like this after creation:
mv "/In/Folder_1/SomeFoldername1/somefilename_x.mp3" /gain/Folder_1/
mv "/In/Folder_2/SomeFoldername2/somefilename_y.mp3" /gain/Folder_2/
mv "/In/Folder_3/SomeFoldername3/somefilename_z.mp3" /gain/Folder_3/
The folder structure should be preserved, more or less.
After executing the created bash script (move.sh), the result should look like this:
/gain/Folder_1/somefilename_x.mp3
/gain/Folder_2/somefilename_y.mp3
/gain/Folder_3/somefilename_z.mp3
What I've done so far.
1. create a list of files with absolute path
find /In/ -iname "*.mp3" -type f > /home/maars/mp3/list.txt
2. create the move.sh script
cp -a /home/maars/mp3/list.txt /home/maars/mp3/move.sh
# read the list and split the absolute path into fields
while IFS= read -r line;do
fields=($(printf "%s" "$line"|cut -d'/' --output-delimiter=' ' -f1-))
done < /home/maars/mp3/move.sh
# add the target path based on variables at the end of the line
sed -i -E "s|\.mp3|\.mp3"\"" /gain/"${fields[1]}"/|g" /home/maars/mp3/move.sh
sed -i "s|/In/|mv "\""/In/|g" /home/maars/mp3/move.sh
The script just use the value of ${fields[1]}, which is Folder_1 and put this in all lines at the end. Instead of Folder_2 and Folder_3.
The current result looks like
mv "/In/Folder_1/SomeFoldername1/somefilename_x.mp3" /gain/Folder_1/
mv "/In/Folder_2/SomeFoldername2/somefilename_y.mp3" /gain/Folder_1/
mv "/In/Folder_3/SomeFoldername3/somefilename_z.mp3" /gain/Folder_1/
rsync is not an option since I need the full control of files to be moved.
What could I do better to solve this issue ?
EDIT: @Socowi helped me a lot by pointing me in the right direction. After a deep dive into the world of regex, I could solve my issue. Thank you very much.
The script just use the value of ${fields[1]}, which is Folder_1 and put this in all lines at the end. Instead of Folder_2 and Folder_3.
You iterate over all lines and update fields for every line. After you finished the loop, fields retains its value (from the last line). You would have to move the sed commands into your loop and make sure that only the current line is replaced by sed. However, there's a better way – see down below.
What could I do better
There are a lot of things you could improve, for instance
Creating the array fields with mapfile -d/ fields instead of printf+cut+($()). That way, you also wouldn't have problems with spaces in paths.
Use sed only once instead of creating the array fields and using multiple sed commands. You can replace step 2 with this small script:
cp -a /home/maars/mp3/list.txt /home/maars/mp3/move.sh
sed -i -E 's|^/[^/]*/([^/]*).*$|mv "&" "/gain/\1"|' /home/maars/mp3/move.sh
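The per-line idea (working out the folder name for each line inside the loop, rather than once after it) can also be sketched in plain bash with parameter expansion. This is an illustration only: make_move_lines is a hypothetical name, and the sketch assumes the paths contain no double quotes, dollar signs, backticks or backslashes, since those would need extra escaping in the generated script.

```shell
#!/bin/bash
# Hypothetical helper: read absolute paths on stdin and print one mv command per line.
make_move_lines() {
    local line rest top
    while IFS= read -r line; do
        rest=${line#/*/}        # drop the first path component, e.g. "/In/"
        top=${rest%%/*}         # first remaining component, e.g. Folder_1
        printf 'mv "%s" /gain/%s/\n' "$line" "$top"
    done
}

# Demo with two of the paths from the question.
list=$(mktemp)
printf '%s\n' \
    /In/Folder_1/SomeFoldername1/somefilename_x.mp3 \
    /In/Folder_2/SomeFoldername2/somefilename_y.mp3 > "$list"
make_move_lines < "$list"
```

Redirecting the output, for example make_move_lines < list.txt > move.sh, would produce the script without editing a copy of the list in place.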
However, the best optimization would be to drop that three step approach and use only one script to find and move the files:
find /In/ -iname "*.mp3" -type f -exec rename -n 's|^/.*?/(.*?)/.*/(.*)$|/gain/$1/$2|' {} +
The -n option will print what will be renamed without actually renaming anything. Remove the -n when you are happy with the result. Here is the output:
rename(/In/Folder_1/SomeFoldername1/somefilename_x.mp3, /gain/Folder_1/somefilename_x.mp3)
rename(/In/Folder_2/SomeFoldername2/somefilename_y.mp3, /gain/Folder_2/somefilename_y.mp3)
rename(/In/Folder_3/SomeFoldername3/somefilename_z.mp3, /gain/Folder_3/somefilename_z.mp3)
It's not builtin to bash, but the mmv command is nice for this kind of mv where you need to use wildcards in paths. Something like the following should work:
mmv "in/*/*/*" "#1/#3"
Note that this won't create the directories for you - but in your example above it looks like these already exist?

substitute file names using rename

I want to rename files by removing everything from the "_" that is followed by eight capital letters onwards, keeping only the extension.
4585_10_148_H2A119Ub_GTCTGTCA_S51_mcdf_mdup_ngsFlt.fm
4585_10_148_H3K27me3_TCTTCACA_S51_mcdf_mdup_ngsFlt.fm
4585_27_128_Bap1_Bethyl_ACAGATTC_S61_mcdf_mdup_ngsFlt.fw
4585_32_148_1_INPUT_previous_AGAGTCAA_S72_mcdf_mdup_ngsFlt.bw
expected output
4585_10_148_H2A119Ub.fm
4585_10_148_H3K27me3.fm
4585_27_128_Bap1_Bethyl.fm
4585_32_148_1_INPUT_previous.fm
Try this:
for f in *; do
target=$(echo "${f}" | sed -E 's/_[[:upper:]]{8}.*\././')
mv "${f}" "${target}"
done
The key thing is the -E argument to sed, since it enables extended regular expressions (needed here for the {8} repetition).
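A variant that avoids calling sed once per file could use bash's own regex matching. This is a rough sketch under the assumption that every name has exactly one extension after the barcode; strip_after_barcode is a made-up function name, and the loop only prints the mv commands.

```shell
#!/bin/bash
# Hypothetical helper: cut everything from the first "_" + eight capitals
# up to (but not including) the final extension.
strip_after_barcode() {
    local name=$1 re='_[[:upper:]]{8}'
    if [[ $name =~ $re && $name == *.* ]]; then
        # keep the text before the first match, then re-attach the extension
        printf '%s.%s\n' "${name%%"${BASH_REMATCH[0]}"*}" "${name##*.}"
    else
        printf '%s\n' "$name"           # nothing to strip, return unchanged
    fi
}

# Dry run: show what would be renamed.
for f in *; do
    [ -f "$f" ] || continue
    target=$(strip_after_barcode "$f")
    [ "$target" = "$f" ] || printf 'mv -- %q %q\n' "$f" "$target"
done
```

Names without a matching barcode are left alone, so the loop can be run over a mixed directory.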
You can also use rename (a.k.a. prename or Perl rename) like this:
rename --dry-run 's|_[[:upper:]]{8}.*\.|.|' *
Sample Output
'4585_10_148_H2A119Ub_GTCTGTCA_S51_mcdf_mdup_ngsFlt.fm' would be renamed to '4585_10_148_H2A119Ub.fm'
'4585_32_148_1_INPUT_previous_AGAGTCAA_S72_mcdf_mdup_ngsFlt.bw' would be renamed to '4585_32_148_1_INPUT_previous.bw'
Remove the --dry-run and run again for real, if the output looks good.
This has several added benefits:
that it will warn and avoid any conflicts if two files rename to the same thing,
that it can rename across directories, creating any necessary intermediate directories on the way,
that you can do a dry run first to test it,
that you can use arbitrarily complex Perl code to specify the new name.
On a Mac, install it with homebrew using:
brew install rename
You may try this.
for i in *.fm; do mv "$i" "$(echo "$i" | sed 's/_GTCTGTCA_S51_mcdf_mdup_ngsFlt//g')"; done;
for i in *.fm; do mv "$i" "$(echo "$i" | sed 's/_TCTTCACA_S51_mcdf_mdup_ngsFlt//g')"; done;

I want to copy a file's name into the document using bash

I'm fairly new to using bash.
I have several hundred documents, each named QP1172, QP1474, QP9926, etc. I need the name of the file to be in the first row of the document (so QP1172 for example would be in row 1 of the document QP1172.txt).
Does anyone know how I could do this? Thank you!
You could do something like
for f in QP????.txt; do echo "$f" | cat - "$f" > "$f.withname"; done
to create new files QP1172.txt.withname etc., and then replace the old ones with them after checking that everything looks ok.
(cat here concatenates the name (given on standard input) with the file contents of each file.)
ADDED: To make it easier to let the new versions get the right name afterwards it might be easier to let them have the same name, but in another directory.
mkdir withname
for f in QP????.txt; do echo "$f" | cat - "$f" > "withname/$f"; done
You could use sed to insert a line at the beginning:
for f in QP*; do
sed -i "1i$f" "$f"
done
1i$f means "insert a line containing the value of $f before line 1".
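Since the question asks for QP1172 rather than QP1172.txt in the first row, the extension may need to be stripped before inserting. The sketch below does that with basename and a temporary file instead of sed -i, which should also work where sed has no in-place option; prepend_basename and demo_dir are hypothetical names used only here.

```shell
#!/bin/bash
# Hypothetical helper: put the file's name (without .txt) on the first line.
prepend_basename() {
    local file=$1 tmp
    tmp=$(mktemp) || return 1
    # write name + original contents to a temp file, then copy back
    { printf '%s\n' "$(basename "$file" .txt)"; cat -- "$file"; } > "$tmp" &&
        cat -- "$tmp" > "$file"
    rm -f -- "$tmp"
}

# Demo with a throwaway file.
demo_dir=$(mktemp -d)
printf 'first line\n' > "$demo_dir/QP1172.txt"
for f in "$demo_dir"/QP????.txt; do
    [ -f "$f" ] || continue             # glob matched nothing
    prepend_basename "$f"
done
```

Running it twice adds the name twice, so it is worth trying on a copy of the documents first.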

How to copy multiple files and rename them at once by appending a string in between the file names in Unix?

I have a few files that I want to copy and rename with the new file names generated by adding a fixed string to each of them.
E.g:
ls -ltr | tail -3
games.txt
files.sh
system.pl
Output should be:
games_my.txt
files_my.sh
system_my.pl
I am able to append at the end of file names but not before *.txt.
for i in `ls -ltr | tail -10`; do cp $i `echo $i\_my`;done
I am thinking if I am able to save the extension of each file by a simple cut as follows,
ext=cut -d'.' -f2
then I can append the same in the above for loop.
do cp $i `echo $i$ext\_my`;done
How do I achieve this?
You can use the following:
for file in *
do
name="${file%.*}"
extension="${file##*.}"
cp "$file" "${name}_my.${extension}"
done
Note that ${file%.*} returns the file name without extension, so that from hello.txt you get hello. By doing ${file%.*}_my.txt you then get from hello.txt -> hello_my.txt.
Regarding the extension, extension="${file##*.}" gets it. It is based on the question Extract filename and extension in bash.
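The two expansions can be combined into a small function that also copes with names that have no extension. This is a sketch under the assumption that it receives bare file names rather than paths; add_suffix and demo_dir are illustrative names, not something from the answer above.

```shell
#!/bin/bash
# Hypothetical helper: insert a suffix before the last extension, if there is one.
add_suffix() {
    local file=$1 suffix=$2
    case $file in
        ?*.*) printf '%s\n' "${file%.*}${suffix}.${file##*.}" ;;  # name.ext
        *)    printf '%s\n' "${file}${suffix}" ;;                 # no extension, or dotfile
    esac
}

# Demo with the file names from the question, in a scratch directory.
demo_dir=$(mktemp -d)
touch "$demo_dir/games.txt" "$demo_dir/files.sh" "$demo_dir/system.pl"
for path in "$demo_dir"/*; do
    name=${path##*/}
    cp -i -- "$path" "$demo_dir/$(add_suffix "$name" _my)"
done
```

For a name such as archive.tar.gz only the last extension is treated as the extension, so the result would be archive.tar_my.gz.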
If the shell variable expansion mechanisms provided by fedorqui's answer look too unreadable to you, you also can use the unix tool basename with a second argument to strip off the suffix:
for file in *.txt
do
cp -i "$file" "$(basename "$file" .txt)_my.txt"
done
Btw, in such cases I always propose to apply the -i option for cp to prevent any unwanted overwrites due to typing errors or similar.
It's also possible to use a direct replacement with shell methods:
cp -i "$file" "${file/.txt/_my.txt}"
The ways are numerous :)

Resources