Remove middle of filenames - bash

I have a list of filenames like this in bash
UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R1.fq.gz
UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R2.fq.gz
UTSHoS11_Other_AGGCCT-TTAGGA_R_160418.R2.fq.gz
UTSHoS11_Other_AGGCCT-TTAGGA_R_160418.R2.fq.gz
UTSHoS12_Other_GGCAAG-TTAGGA_R_160418.R1.fq.gz
UTSHoS12_Other_GGCAAG-TTAGGA_R_160418.R2.fq.gz
And I want them to look like this
UTSHoS10_R1.fq.gz
UTSHoS10_R2.fq.gz
UTSHoS11_R1.fq.gz
UTSHoS11_R2.fq.gz
UTSHoS12_R1.fq.gz
UTSHoS12_R2.fq.gz
I do not have the perl rename command and sed 's/_Other*160418./_/' *.gz
is not doing anything. I've tried other rename scripts on here but either nothing occurs or my shell starts printing huge amounts of code to the console and freezes.
This post (Removing Middle of Filename) is similar however the answers given do not explain what specific parts of the command are doing so I could not apply it to my problem.

Parameter expansions in bash can perform string substitutions based on glob-like patterns, which allows for a more efficient solution than calling an extra external utility such as sed in each loop iteration:
for f in *.gz; do echo mv "$f" "${f/_Other_*-TTAGGA_R_160418./_}"; done
Remove the echo before mv to perform actual renaming.
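Since the question asks what each part of such a command does, here is a minimal sketch that applies the same expansion to a single string rather than to real files. The variable names f and new are only illustrative.

```bash
# Sample name copied from the question; no files are touched.
f='UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R1.fq.gz'

# ${f/pattern/replacement} replaces the first match of a glob pattern in $f.
#   _Other_              literal start of the part to drop
#   *                    any run of characters (the barcode that varies)
#   -TTAGGA_R_160418.    literal end of the part to drop (a dot is literal in a glob)
#   _                    the replacement text
new=${f/_Other_*-TTAGGA_R_160418./_}

echo "$new"   # prints UTSHoS10_R1.fq.gz
```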

You can do something like this in the directory which contains the files to be renamed:
for file_name in *.gz
do
new_file_name=$(sed 's/_[^.]*\./_/g' <<< "$file_name");
mv "$file_name" "$new_file_name";
done
The pattern (_[^.]*\.) starts matching from the FIRST _ till the FIRST . (both inclusive). [^.]* means 0 or more non-dot (or non-period) characters.
Example:
AMD$ ls
UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R1.fq.gz UTSHoS12_Other_GGCAAG-TTAGGA_R_160418.R1.fq.gz
UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R2.fq.gz UTSHoS12_Other_GGCAAG-TTAGGA_R_160418.R2.fq.gz
UTSHoS11_Other_AGGCCT-TTAGGA_R_160418.R2.fq.gz
AMD$ for file_name in *.gz
> do new_file_name=$(sed 's/_[^.]*\./_/g' <<< "$file_name")
> mv "$file_name" "$new_file_name"
> done
AMD$ ls
UTSHoS10_R1.fq.gz UTSHoS10_R2.fq.gz UTSHoS11_R2.fq.gz UTSHoS12_R1.fq.gz UTSHoS12_R2.fq.gz
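To check the pattern against a name before renaming anything, a small sketch like the following may help. The helper name preview_name is made up for illustration; it only prints the would-be new name.

```bash
# preview_name (hypothetical helper): print the new name for one file name
# without renaming anything.
preview_name() {
    # _[^.]*\.  matches from the first "_" through the first "." after it
    sed 's/_[^.]*\./_/' <<< "$1"
}

preview_name 'UTSHoS11_Other_AGGCCT-TTAGGA_R_160418.R2.fq.gz'   # prints UTSHoS11_R2.fq.gz
```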

Pure Bash, using substring operation and assuming that all file names have the same length:
for file in UTS*.gz; do
echo mv -i "$file" "${file:0:9}${file:38:8}"
done
Outputs:
mv -i UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R1.fq.gz UTSHoS10_R1.fq.gz
mv -i UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R2.fq.gz UTSHoS10_R2.fq.gz
mv -i UTSHoS11_Other_AGGCCT-TTAGGA_R_160418.R2.fq.gz UTSHoS11_R2.fq.gz
mv -i UTSHoS11_Other_AGGCCT-TTAGGA_R_160418.R2.fq.gz UTSHoS11_R2.fq.gz
mv -i UTSHoS12_Other_GGCAAG-TTAGGA_R_160418.R1.fq.gz UTSHoS12_R1.fq.gz
mv -i UTSHoS12_Other_GGCAAG-TTAGGA_R_160418.R2.fq.gz UTSHoS12_R2.fq.gz
Once verified, remove echo from the line inside the loop and run again.
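If the names do not all have the same length, a variant that does not depend on fixed offsets is possible. This is only a sketch, and it assumes the sample ID never contains an underscore and that the first dot in the name comes right before the R1/R2 tag. The names sample and reads are illustrative.

```bash
file='UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R1.fq.gz'

sample=${file%%_*}   # drop the longest suffix that starts at a "_"  -> UTSHoS10
reads=${file#*.}     # drop the shortest prefix that ends at a "."   -> R1.fq.gz

# Still a dry run: echo prints the mv command instead of running it.
echo mv -i "$file" "${sample}_${reads}"
```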

Going with your sed command, this can work as a bash one-liner:
for name in UTSH*fq.gz; do newname=$(echo "$name" | sed 's/_Other.*160418\./_/'); echo mv "$name" "$newname"; done
Notes:
I've adjusted your sed command: it had an * without a preceding . (sed takes a regular expression, not a globbing pattern). Similarly, the dot needs escaping.
To see if it works, without actually renaming the files, I've left the echo command in. Easy to remove just that to make it functional.
It doesn't have to be a one-liner, obviously. But sometimes, that makes editing and browsing your command-line history easier.
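The difference between the two expressions can be seen on a single string, without touching any files. This is a small sketch; the variable names are illustrative. Note also that the command in the question passed *.gz to sed as input files, so sed read the compressed file contents rather than the names, which would explain the garbage printed to the console.

```bash
name='UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R1.fq.gz'

# Original expression: in a regular expression "r*" means zero or more "r",
# so "_Other" would have to be followed directly by "160418". It is not,
# so the name comes back unchanged.
unchanged=$(printf '%s\n' "$name" | sed 's/_Other*160418./_/')

# Corrected expression: ".*" is any run of characters, "\." a literal dot.
fixed=$(printf '%s\n' "$name" | sed 's/_Other.*160418\./_/')

echo "$unchanged"
echo "$fixed"      # prints UTSHoS10_R1.fq.gz
```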

Related

Keep 9 characters intact and rename all files in a folder

I am new with Bash, and trying to rename files in my folder keeping the first 9 characters intact and get rid of anything that comes after.
abc123456olda.jpg > abc123456.jpg
I wrote this;
for file in *
do
echo mv "$file" `echo "$file" | sed -e 's/(.{9}).*(\.jpg)$/$1$2/' *.jpg
done
Did not get it to work. Can someone guide what am I doing wrong?
You're not far off, try this:
for file in *.jpg; do
echo mv "$file" "$(echo "$file" | sed -E -e 's/(.{9}).*(\.jpg)$/\1\2/')"
done
There are some corrections. An important one is that $1$2 should be \1\2, and you need the -E flag to sed so that it understands the grouping with parentheses.
Once you see the command is alright, remove the echo from the second line so mv actually gets executed.
Use bash's built-in parameter expansion operator rather than sed.
Also, you should put *.jpg in the for statement, not the sed argument; what you're doing is processing the contents of the files, not the filenames.
for file in *.jpg
do
mv "$file" "${file:0:9}.jpg"
done
${file:0:9} means the substring of $file starting from index 0 and having 9 characters.
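As a quick illustration on a single string, the following sketch shows the substring expansion together with one way to keep whatever extension the file already has instead of hard-coding .jpg. The names stem and ext are made up for the example.

```bash
file='abc123456olda.jpg'

stem=${file:0:9}    # 9 characters starting at offset 0 -> abc123456
ext=${file##*.}     # everything after the last dot     -> jpg

# Dry run: echo prints the mv command instead of running it.
echo mv "$file" "$stem.$ext"   # prints: mv abc123456olda.jpg abc123456.jpg
```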

How can I remove characters in parentheses from file names?

I have a list of file names of the form:
Filename (region).gba
And I would like to rename them all without the (region) tag.
How can I do this using standard command line tools?
Try:
for f in *'('*')'*; do mv -i "$f" "${f/(*)/}"; done
Or, for those who prefer their commands spread out over multiple lines:
for f in *'('*')'*
do
mv -i "$f" "${f/(*)/}"
done
How it works
for f in *'('*')'*; do
This starts a loop over all files whose names contain ( followed by ).
mv -i "$f" "${f/(*)/}"
This renames those files removing the parens and everything between the parens.
"${f/(*)/}" is an example of a shell feature called pattern substitution. It looks for an occurrence of the glob (*) and replaces it with an empty string. See man bash for more details.
The -i option tells mv not to overwrite a target file without asking. This is optional. You may prefer to make a backup copy instead. See man mv for more options.
done
This signals the end of the loop.
Example
Let's start in a directory with these files:
$ ls -1
Filename (region) 2.gba
Filename (region).gba
Now, let's run our command:
$ for f in *'('*')'*; do mv -i "$f" "${f/(*)/}"; done
After our command, the files have these names:
$ ls -1
Filename 2.gba
Filename .gba
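As the listing shows, the space that preceded the parenthesis is left behind. If every tag is preceded by exactly one space (an assumption about the file names, not something the question states), the space can be included in the pattern. A sketch on a plain string:

```bash
f='Filename (region).gba'

kept_space="${f/(*)/}"     # only the parenthesized tag is removed
no_space="${f/ (*)/}"      # the space before "(" is part of the pattern

# Brackets make the leftover space visible.
printf '[%s]\n' "$kept_space" "$no_space"
# prints:
# [Filename .gba]
# [Filename.gba]
```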
You can use sed to solve this problem.
ls * | sed 's/\(.*\) \([(].*[)]\).*/mv "\1 \2.gba" "\1.gba"/g'
This will list the mv commands to move the files. Pipe through sh or bash to actually execute.
To explain:
ls * lists the files in the directory
sed will edit the incoming strings.
's/ begins a substitution
\(.*\) matches the non-region part of the file name, in capture group 1
\([(].*[)]\).* matches the remainder of the file name, except the extension
/mv "\1 \2.gba" "\1.gba"/g' composes the mv command and ends the substitution.

bash removing part of a file name

I have the following files in the following format:
$ ls CombinedReports_LLL-*'('*.csv
CombinedReports_LLL-20140211144020(Untitled_1).csv
CombinedReports_LLL-20140211144020(Untitled_11).csv
CombinedReports_LLL-20140211144020(Untitled_110).csv
CombinedReports_LLL-20140211144020(Untitled_111).csv
CombinedReports_LLL-20140211144020(Untitled_12).csv
CombinedReports_LLL-20140211144020(Untitled_13).csv
CombinedReports_LLL-20140211144020(Untitled_14).csv
CombinedReports_LLL-20140211144020(Untitled_15).csv
CombinedReports_LLL-20140211144020(Untitled_16).csv
CombinedReports_LLL-20140211144020(Untitled_17).csv
CombinedReports_LLL-20140211144020(Untitled_18).csv
CombinedReports_LLL-20140211144020(Untitled_19).csv
I would like this part removed:
20140211144020 (this is the timestamp the reports were run so this will vary)
and end up with something like:
CombinedReports_LLL-(Untitled_1).csv
CombinedReports_LLL-(Untitled_11).csv
CombinedReports_LLL-(Untitled_110).csv
CombinedReports_LLL-(Untitled_111).csv
CombinedReports_LLL-(Untitled_12).csv
CombinedReports_LLL-(Untitled_13).csv
CombinedReports_LLL-(Untitled_14).csv
CombinedReports_LLL-(Untitled_15).csv
CombinedReports_LLL-(Untitled_16).csv
CombinedReports_LLL-(Untitled_17).csv
CombinedReports_LLL-(Untitled_18).csv
CombinedReports_LLL-(Untitled_19).csv
I was thinking simply along the lines of the mv command, maybe something like this:
$ ls CombinedReports_LLL-*'('*.csv
but maybe a sed command or other would be better
rename is part of the perl package. It renames files according to perl-style regular expressions. To remove the dates from your file names:
rename 's/[0-9]{14}//' CombinedReports_LLL-*.csv
If rename is not available, sed+shell can be used:
for fname in Combined*.csv ; do mv "$fname" "$(echo "$fname" | sed -r 's/[0-9]{14}//')" ; done
The above loops over each of your files. For each file, it performs a mv command: mv "$fname" "$(echo "$fname" | sed -r 's/[0-9]{14}//')" where, in this case, sed is able to use the same regular expression as the rename command above. s/[0-9]{14}// tells sed to look for 14 digits in a row and replace them with an empty string.
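To see the substitution in isolation, here is a sketch that runs it on one name only. It uses -E, which current GNU and BSD versions of sed both accept, in place of -r; the variable names are illustrative.

```bash
fname='CombinedReports_LLL-20140211144020(Untitled_1).csv'

# -E enables extended regular expressions, so {14} needs no backslashes.
# [0-9]{14} matches 14 consecutive digits; the replacement is empty.
newname=$(printf '%s\n' "$fname" | sed -E 's/[0-9]{14}//')

echo "$newname"   # prints CombinedReports_LLL-(Untitled_1).csv
```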
Without using any other tools like rename or sed and sticking strictly to bash alone:
for f in CombinedReports_LLL-*.csv
do
newName=${f/LLL-*\(/LLL-(}
mv -i "$f" "$newName"
done
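The glob above removes whatever sits between LLL- and the parenthesis. If you would rather rename only when that part really is a 14-digit timestamp, bash's regular expression operator is one option. This is a sketch under that assumption, shown as a dry run on one name; re and newName are illustrative names.

```bash
f='CombinedReports_LLL-20140211144020(Untitled_11).csv'

# Keep the regex in a variable so the shell does not reinterpret it.
#   ^(.*-)      group 1: everything up to the hyphen before the timestamp
#   [0-9]{14}   the 14-digit timestamp to drop
#   (\(.*)$     group 2: the opening parenthesis and the rest of the name
re='^(.*-)[0-9]{14}(\(.*)$'

if [[ $f =~ $re ]]; then
    newName="${BASH_REMATCH[1]}${BASH_REMATCH[2]}"
    echo mv -i "$f" "$newName"   # dry run: remove echo to rename
fi
```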
for f in CombinedReports_LLL-* ; do
b=${f:0:20}${f:34:500}
mv "$f" "$b"
done
You can try line by line on shell:
f="CombinedReports_LLL-20140211144020(Untitled_11).csv"
b=${f:0:20}${f:34:500}
echo $b
You can use the rename utility for this. It uses syntax much like sed to change filenames. The following example (from the rename man-page) shows how to remove the trailing '.bak' extension from a list of backup files in the local directory:
rename 's/\.bak$//' *.bak
I'm using the advice given in the top response and have put the following line into a shell script:
ls *.nii | xargs rename 's/[f_]{2}//' f_0*.nii
In terminal, this line works perfectly, but in my script it will not execute and reads * as a literal part of the file name.

Remove hyphens from filename with Bash

I am trying to create a small Bash script to remove hyphens from a filename. For example, I want to rename:
CropDamageVO-041412.mpg
to
CropDamageVO041412.mpg
I'm new to Bash, so be gentle :] Thank you for any help
Try this:
for file in $(find dirWithDashedFiles -type f -iname '*-*'); do
mv $file ${file//-/}
done
That's assuming that your directories don't have dashes in the name. That would break this.
The ${varname//pattern/replacementText} syntax is explained here. Just search for substring replacement. Note that the pattern is a glob-style pattern, not a regular expression.
Also, this would break if your directories or filenames have spaces in them. If you have spaces in your filenames, you should use this:
for file in *-*; do
mv "$file" "${file//-/}"
done
This has the disadvantage of having to be run in every directory that contains files you want to change, but, like I said, it's a little more robust.
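Both caveats (dashes in directory names, spaces in names) can be avoided while still descending into sub-directories. The following is only a sketch: strip_hyphens_under is a made-up helper name, and it assumes a find that supports -print0 (GNU and BSD find do). It changes the file name part only and skips a file if the target name already exists.

```bash
# strip_hyphens_under (hypothetical helper): remove hyphens from the names of
# regular files below directory $1, leaving directory names untouched.
strip_hyphens_under() {
    find "$1" -type f -name '*-*' -print0 |
    while IFS= read -r -d '' path; do
        dir=${path%/*}                # directory part, kept as is
        base=${path##*/}              # file name part
        target=$dir/${base//-/}       # same directory, hyphens removed
        if [ -e "$target" ]; then
            echo "skipping $path: $target already exists" >&2
        else
            mv -- "$path" "$target"
        fi
    done
}

# Example call, using the directory name from the answer above:
# strip_hyphens_under dirWithDashedFiles
```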
FN=CropDamageVO-041412.mpg
mv $FN `echo $FN | sed -e 's/-//g'`
The backticks (``) tell bash to run the command inside them and use the output of that command in the expression. The sed part applies a regular expression to remove the hyphens from the filename.
Or to do this to all files in the current directory matching a certain pattern:
for i in *VO-*.mpg
do
mv $i `echo $i | sed -e 's/-//g'`
done
A general solution for removing hyphens from any string:
$ echo "remove-all-hyphens" | tr -d '-'
removeallhyphens
$
f=CropDamageVO-041412.mpg
echo "${f//-}"
or, of course,
mv "$f" "${f//-}"

Redirect output from sed 's/c/d/' myFile to myFile

I am using sed in a script to do a replace and I want to have the replaced file overwrite the file. Normally I think that you would use this:
% sed -i 's/cat/dog/' manipulate
sed: illegal option -- i
However as you can see my sed does not have that command.
I tried this:
% sed 's/cat/dog/' manipulate > manipulate
But this just turns manipulate into an empty file (makes sense).
This works:
% sed 's/cat/dog/' manipulate > tmp; mv tmp manipulate
But I was wondering if there was a standard way to redirect output into the same file that input was taken from.
I commonly use the 3rd way, but with an important change:
$ sed 's/cat/dog/' manipulate > tmp && mv tmp manipulate
I.e. change ; to && so the move only happens if sed is successful; otherwise you'll lose your original file as soon as you make a typo in your sed syntax.
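The same idea can be wrapped in a small function so the temporary name cannot clash with an existing file. This is a sketch, not a standard tool: edit_in_place is a made-up name, and it assumes mktemp is available. Note that the replaced file ends up with the temporary file's permissions.

```bash
# edit_in_place (hypothetical helper): edit_in_place FILE SED_EXPRESSION
edit_in_place() {
    local file=$1 expr=$2 tmp
    # Create the temporary file next to the target so mv stays on one filesystem.
    tmp=$(mktemp "$file.XXXXXX") || return 1
    if sed "$expr" "$file" > "$tmp"; then
        mv "$tmp" "$file"     # replace the original only after sed succeeded
    else
        rm -f "$tmp"          # sed failed: the original is left untouched
        return 1
    fi
}

# Example call:
# edit_in_place manipulate 's/cat/dog/'
```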
Note! For those reading the title and missing the OP's constraint "my sed doesn't support -i": For most people, sed will support -i, so the best way to do this is:
$ sed -i 's/cat/dog/' manipulate
Yes, -i is also supported in FreeBSD/MacOSX sed, but needs the empty string as an argument to edit a file in-place.
sed -i "" 's/old/new/g' file # FreeBSD sed
If you don't want to move copies around, you could use ed:
ed file.txt <<EOF
%s/cat/dog/
wq
EOF
Kernighan and Pike discuss this issue in The Unix Programming Environment. Their solution is to write a script called overwrite, which allows one to do such things.
The usage is: overwrite file cmd file.
# overwrite: copy standard input to output after EOF
opath=$PATH
PATH=/bin:/usr/bin
case $# in
0|1) echo 'Usage: overwrite file cmd [args]' 1>&2; exit 2
esac
file=$1; shift
new=/tmp/overwr1.$$; old=/tmp/overwr2.$$
trap 'rm -f $new $old; exit 1' 1 2 15 # clean up
if PATH=$opath "$@" >$new
then
cp $file $old # save original
trap '' 1 2 15 # we are committed
cp $new $file
else
echo "overwrite: $1 failed, $file unchanged" 1>&2
exit 1
fi
rm -f $new $old
Once you have the above script in your $PATH, you can do:
overwrite manipulate sed 's/cat/dog/' manipulate
To make your life easier, you can use replace script from the same book:
# replace: replace str1 in files with str2 in place
PATH=/bin:/usr/bin
case $# in
0|1|2) echo 'Usage: replace str1 str2 files' 1>&2; exit 1
esac
left="$1"; right="$2"; shift; shift
for i
do
overwrite $i sed "s#$left#$right#g" $i
done
Having replace in your $PATH too will allow you to say:
replace cat dog manipulate
You can use sponge from the moreutils.
sed "s/cat/dog/" manipulate | sponge manipulate
Perhaps -i is gnu sed, or just an old version of sed, but anyways. You're on the right track. The first option is probably the most common one, the third option is if you want it to work everywhere (including solaris machines)... :) These are the 'standard' ways of doing it.
To change multiple files (saving a backup of each as *.bak):
perl -p -i.bak -e "s/oldtext/newtext/g" *
This replaces any occurrence of oldtext by newtext in all files in the current folder. However, you will have to escape all perl special characters within oldtext and newtext using the backslash.
This is called a “Perl pie” (mnemonic: easy as a pie)
The -i flag tells it to do in-place replacement, and it should be ok to use single (') as well as double (") quotes.
Using ./* instead of just * matches the same files (it does not descend into sub-directories), but it protects against file names that begin with a dash.
See man perlrun for more details, including how to take a backup file of the original.
using sed:
sed -i 's/old/new/g' ./* (used in GNU)
sed -i '' 's/old/new/g' ./* (used in FreeBSD)
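If a script has to run on both kinds of system, one common heuristic is to test whether sed accepts --version, which GNU sed does and FreeBSD sed does not. The wrapper below is only a sketch built on that assumption; sed_inplace is a made-up name.

```bash
# sed_inplace (hypothetical wrapper): sed_inplace EXPRESSION FILE...
sed_inplace() {
    local expr=$1
    shift
    if sed --version >/dev/null 2>&1; then
        sed -i "$expr" "$@"        # GNU sed: -i needs no separate argument
    else
        sed -i '' "$expr" "$@"     # FreeBSD sed: -i takes a (here empty) backup suffix
    fi
}

# Example call:
# sed_inplace 's/old/new/g' ./*
```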
-i option is not available in standard sed.
Your alternatives are your third way or perl.
A lot of answers, but none of them is correct. Here is the correct and simplest one:
$ echo "111 222 333" > file.txt
$ sed -i -s s/222/444/ file.txt
$ cat file.txt
111 444 333
$
Workaround using open file handles:
exec 3<manipulate
Prevent open file from being truncated:
rm manipulate
sed 's/cat/dog/' <&3 > manipulate

Resources