How can I remove characters in parentheses from file names? - bash

I have a list of file names of the form:
Filename (region).gba
And I would like to rename them all without the (region) tag.
How can I do this using standard command line tools?

Try:
for f in *'('*')'*; do mv -i "$f" "${f/(*)/}"; done
Or, for those who prefer their commands spread out over multiple lines:
for f in *'('*')'*
do
mv -i "$f" "${f/(*)/}"
done
How it works
for f in *'('*')'*; do
This starts a loop over all files whose names contain ( followed by ).
mv -i "$f" "${f/(*)/}"
This renames those files removing the parens and everything between the parens.
"${f/(*)/}" is an example of a shell feature called pattern substitution. It looks for an occurrence of the glob (*) and replaces it with an empty string. See man bash for more details.
The -i option tells mv not to overwrite a target file without asking. This is optional. You may prefer to make a backup copy instead. See man mv for more options.
done
This signals the end of the loop.
Example
Let's start in a directory with these files:
$ ls -1
Filename (region) 2.gba
Filename (region).gba
Now, let's run our command:
$ for f in *'('*')'*; do mv -i "$f" "${f/(*)/}"; done
After our command, the files have these names:
$ ls -1
Filename  2.gba
Filename .gba
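The names above still contain the space that preceded the parenthesis. A minimal sketch of one way to drop it as well, assuming every tag is preceded by exactly one space; strip_region is a hypothetical helper name, not part of the answer:

```shell
#!/usr/bin/env bash
# Hypothetical helper: remove " (...)" together with the space before it.
strip_region() {
    local name=$1
    # ' (*)' is a glob: a space, '(', anything, up to the last ')'.
    printf '%s\n' "${name/ (*)/}"
}

strip_region 'Filename (region).gba'      # prints: Filename.gba
strip_region 'Filename (region) 2.gba'    # prints: Filename 2.gba
```

Inside the loop this would become mv -i "$f" "${f/ (*)/}".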

You can use sed to solve this problem.
ls * | sed 's/\(.*\) \([(].*[)]\).*/mv "\1 \2.gba" "\1.gba"/g'
This will list the mv commands to move the files. Pipe through sh or bash to actually execute.
To explain:
ls * lists the files in the directory
sed will edit the incoming strings.
's/ begins a substitution
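\(.*\) followed by a space captures the part of the file name before the region tag, in capture group 1
\([(].*[)]\) captures the parenthesized region tag in capture group 2, and the trailing .* matches the rest of the name (the extension)
/mv "\1 \2.gba" "\1.gba"/g' composes the mv command and ends the substitution. Note that the .gba extension is hard-coded in the replacement.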
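To see what each capture group holds, the same expression can be run against a single name. This is only an illustration; the variable names are made up for the sketch:

```shell
name='Filename (region).gba'
pattern='\(.*\) \([(].*[)]\).*'

# Replace the whole name with group 1, then with group 2.
group1=$(printf '%s\n' "$name" | sed "s/$pattern/\1/")
group2=$(printf '%s\n' "$name" | sed "s/$pattern/\2/")

printf 'group 1: %s\ngroup 2: %s\n' "$group1" "$group2"
# prints:
# group 1: Filename
# group 2: (region)
```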

Related

sed can't read executable files when iterating in for loop

In a terminal shell, I am trying to loop over a set of Python files and perform find and replace with sed, e.g.:
$ for f in `ls *.py`; do sed -i 's|foo|bar|g' $f; done;
However, for some of the files (in particular just those Python scripts that I've changed to be executable), it gives the error:
sed: can't read example_script.py: No such file or directory
Why might it not be working for executable files, but working for other files?
The reason that the executable files are not being read by sed is because I have ls aliased to ls --color=auto. Therefore, the filenames returned by ls in the for loop are not just ascii strings with the filename, they also contain the colour information, e.g.,:
''$'\033''[01;32mexample_script.py'$'\033''[0m'
so sed can't find this weird file!
As pointed out in the comments, in this case there's actually no need to use ls to create my iterable list and I could instead do:
$ for f in *.py; do sed -i 's|foo|bar|g' "$f"; done;
or even without the for loop:
$ sed -i 's|foo|bar|g' *.py
Original answer: here's my original answer for posterity, kept here because it's what I did myself, before the help from the commenters.
The solution for me (given that this alias is set) is to instead run my for loop making sure to specify ls --color=none, i.e.,:
$ for f in `ls --color=none *.py`; do sed -i 's|foo|bar|g' $f; done;
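A small sketch of why the glob is the more robust choice: the shell expands *.py itself, so aliases on ls and spaces in file names do not matter. The directory and file names are invented for the demonstration, and -i.bak (which keeps a backup copy) is used here only because that spelling is accepted by both GNU and BSD sed:

```shell
workdir=$(mktemp -d)
printf 'foo = 1\n' > "$workdir/plain.py"
printf 'foo = 2\n' > "$workdir/tool script.py"
chmod +x "$workdir/tool script.py"

processed=0
for f in "$workdir"/*.py; do
    [ -e "$f" ] || continue          # skip the literal pattern if nothing matched
    sed -i.bak 's|foo|bar|g' "$f"    # edits in place, leaves "$f.bak" behind
    processed=$((processed + 1))
done

printf 'processed %s files\n' "$processed"
```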

Remove middle of filenames

I have a list of filenames like this in bash
UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R1.fq.gz
UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R2.fq.gz
UTSHoS11_Other_AGGCCT-TTAGGA_R_160418.R2.fq.gz
UTSHoS11_Other_AGGCCT-TTAGGA_R_160418.R2.fq.gz
UTSHoS12_Other_GGCAAG-TTAGGA_R_160418.R1.fq.gz
UTSHoS12_Other_GGCAAG-TTAGGA_R_160418.R2.fq.gz
And I want them to look like this
UTSHoS10_R1.fq.gz
UTSHoS10_R2.fq.gz
UTSHoS11_R1.fq.gz
UTSHoS11_R2.fq.gz
UTSHoS12_R1.fq.gz
UTSHoS12_R2.fq.gz
I do not have the perl rename command, and sed 's/_Other*160418./_/' *.gz is not doing anything. I've tried other rename scripts on here, but either nothing occurs or my shell starts printing huge amounts of code to the console and freezes.
This post (Removing Middle of Filename) is similar however the answers given do not explain what specific parts of the command are doing so I could not apply it to my problem.
Parameter expansions in bash can perform string substitutions based on glob-like patterns, which allows for a more efficient solution than calling an extra external utility such as sed in each loop iteration:
for f in *.gz; do echo mv "$f" "${f/_Other_*-TTAGGA_R_160418./_}"; done
Remove the echo before mv to perform actual renaming.
You can do something like this in the directory which contains the files to be renamed:
for file_name in *.gz
do
new_file_name=$(sed 's/_[^.]*\./_/g' <<< "$file_name");
mv "$file_name" "$new_file_name";
done
The pattern (_[^.]*\.) starts matching from the FIRST _ till the FIRST . (both inclusive). [^.]* means 0 or more non-dot (or non-period) characters.
Example:
AMD$ ls
UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R1.fq.gz UTSHoS12_Other_GGCAAG-TTAGGA_R_160418.R1.fq.gz
UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R2.fq.gz UTSHoS12_Other_GGCAAG-TTAGGA_R_160418.R2.fq.gz
UTSHoS11_Other_AGGCCT-TTAGGA_R_160418.R2.fq.gz
AMD$ for file_name in *.gz
> do new_file_name=$(sed 's/_[^.]*\./_/g' <<< "$file_name")
> mv "$file_name" "$new_file_name"
> done
AMD$ ls
UTSHoS10_R1.fq.gz UTSHoS10_R2.fq.gz UTSHoS11_R2.fq.gz UTSHoS12_R1.fq.gz UTSHoS12_R2.fq.gz
Pure Bash, using substring operation and assuming that all file names have the same length:
for file in UTS*.gz; do
echo mv -i "$file" "${file:0:9}${file:38:8}"
done
Outputs:
mv -i UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R1.fq.gz UTSHoS10_R1.fq.gz
mv -i UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R2.fq.gz UTSHoS10_R2.fq.gz
mv -i UTSHoS11_Other_AGGCCT-TTAGGA_R_160418.R2.fq.gz UTSHoS11_R2.fq.gz
mv -i UTSHoS11_Other_AGGCCT-TTAGGA_R_160418.R2.fq.gz UTSHoS11_R2.fq.gz
mv -i UTSHoS12_Other_GGCAAG-TTAGGA_R_160418.R1.fq.gz UTSHoS12_R1.fq.gz
mv -i UTSHoS12_Other_GGCAAG-TTAGGA_R_160418.R2.fq.gz UTSHoS12_R2.fq.gz
Once verified, remove echo from the line inside the loop and run again.
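If the names do not all have the same length, fixed offsets break. A sketch of an alternative that is not taken from the answers above: keep everything before the first _ and everything after the first dot, using only prefix and suffix removal. short_name is a hypothetical helper, and the approach assumes the sample ID itself contains no underscore:

```shell
short_name() {
    # ${1%%_*} drops the longest suffix starting at the first "_"  -> UTSHoS10
    # ${1#*.}  drops the shortest prefix ending at the first "."   -> R1.fq.gz
    printf '%s_%s\n' "${1%%_*}" "${1#*.}"
}

short_name 'UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R1.fq.gz'   # prints: UTSHoS10_R1.fq.gz
```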
Going with your sed command, this can work as a bash one-liner:
for name in UTSH*fq.gz; do newname=$(echo "$name" | sed 's/_Other.*160418\./_/'); echo mv "$name" "$newname"; done
Notes:
I've adjusted your sed command: it had a * without a preceding . (sed takes a regular expression, not a globbing pattern). Similarly, the dot needs escaping so that it matches a literal dot.
To see if it works, without actually renaming the files, I've left the echo command in. Easy to remove just that to make it functional.
It doesn't have to be a one-liner, obviously. But sometimes, that makes editing and browsing your command-line history easier.
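The difference between the two expressions can be checked on a single name without touching any files. In a regular expression, r* means "zero or more r", so the original pattern looks for _Othe, optional r's, and then 160418 immediately afterwards, which never occurs. The variable names below are chosen only for this sketch:

```shell
name='UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R1.fq.gz'

# Glob-style pattern from the question: no match, name comes back unchanged.
glob_style=$(printf '%s\n' "$name" | sed 's/_Other*160418./_/')

# Regular expression: ".*" spans the middle part, "\." is a literal dot.
regex_style=$(printf '%s\n' "$name" | sed 's/_Other.*160418\./_/')

printf '%s\n%s\n' "$glob_style" "$regex_style"
# prints:
# UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R1.fq.gz
# UTSHoS10_R1.fq.gz
```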

Iterating through files in a folder with sed

I have a list of CSV files and would like to use a for loop to edit the content of each file with sed. I have this sed command, which works fine when I test it on one file:
sed 's/[ "-]//g'
So now I want to execute this command for each file in a folder. I've tried this but so far no luck:
for i in *.csv; do sed 's/[ "-]//g' > $i.csv; done
I would like each file to be overwritten with the result of the sed edit. The sed command removes all spaces, the " character, and the - character.
Small changes,
for i in *.csv
do
sed -i 's/[ "-]//g' "$i"
done
Changes
When you iterate with for, $i holds each file name in turn (one.csv, two.csv, and so on), so you can pass it directly to sed as the input file.
-i is for in-place editing: sed performs the substitution and updates the file for you, so no output redirection is required.
The code you wrote gave sed no input file, so it waited for data on standard input; it also redirected to $i.csv, which would have created files such as one.csv.csv.
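Before editing files in place, the expression can be tried on a single line of text. A quick sketch with an invented sample line; inside the bracket expression the - comes last so that it is taken literally rather than as a range:

```shell
line='"2014-02-11", "some value"'

# Delete every space, double quote and hyphen on the line.
cleaned=$(printf '%s\n' "$line" | sed 's/[ "-]//g')

printf '%s\n' "$cleaned"    # prints: 20140211,somevalue
```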
In my case I wanted to replace only the first occurrence of a particular string on each line, across several text files. I used the following:
# replace 16 with 1, first occurrence on each line only, in every .txt file
sed -i 's/16/1/' *.txt
In your case, you can try this in a terminal (it prints the result to standard output; add -i to change the files themselves):
sed 's/[ "-]//g' *.csv
In certain scenarios it might be worth finding the files and executing a command on them, as explained in this answer (as stated there, make sure echo $PATH doesn't contain .):
find /path/to/csv/ -type f -name '*.csv' -execdir sed -i 's/[ "-]//g' {} \;
Here we:
find all regular files (-type f) whose names end with .csv (-name '*.csv') under the folder /path/to/csv/
run sed on the found files in place, i.e. we replace the original files with the changed version instead of creating additional files such as $i.csv
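The same idea can be tried safely in a scratch directory first. This sketch uses -exec ... {} + instead of -execdir so that it does not depend on the contents of PATH, and -i.bak because that form works with both GNU and BSD sed; the directory layout and file contents are invented:

```shell
csvdir=$(mktemp -d)
mkdir "$csvdir/sub"
printf '"a - b"\n' > "$csvdir/one.csv"
printf '"c - d"\n' > "$csvdir/sub/two.csv"
printf 'keep - me\n' > "$csvdir/notes.txt"

# -name selects by file name; "{} +" passes many files to one sed invocation.
find "$csvdir" -type f -name '*.csv' -exec sed -i.bak 's/[ "-]//g' {} +

cat "$csvdir/one.csv" "$csvdir/sub/two.csv" "$csvdir/notes.txt"
# prints:
# ab
# cd
# keep - me
```

The .bak copies left next to each file can be deleted once the result has been checked.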

Add suffix to all files in the directory with an extension

How to add a suffix to all files in the current directory in bash?
Here is what I've tried, but it keeps adding an extra .png to the filename.
for file in *.png; do mv "$file" "${file}_3.6.14.png"; done
for file in *.png; do
mv "$file" "${file%.png}_3.6.14.png"
done
${file%.png} expands to ${file} with the .png suffix removed.
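The two expansions can be compared side by side without renaming anything. The file name is just an example:

```shell
file='photo.png'

wrong="${file}_3.6.14.png"        # extension kept, so .png appears twice
right="${file%.png}_3.6.14.png"   # trailing .png removed before the suffix is added

printf '%s\n%s\n' "$wrong" "$right"
# prints:
# photo.png_3.6.14.png
# photo_3.6.14.png
```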
You could do this with the rename command:
rename 's/\.png$/_3.6.14.png/' *.png
Or in bash:
for i in *.png; do mv "$i" "${i%.*}_3.6.14.png"; done
It replaces .png in all the .png files with _3.6.14.png.
${i%.*} removes everything from the last dot onward, so the .png part is cut off from the file name.
mv "$i" "${i%.*}_3.6.14.png" renames each original .png file to its base name plus _3.6.14.png.
If you are familiar with regular expressions, sed is quite nice.
a) Modify the regular expression to your liking and inspect the output:
ls | sed -E 's/(.*)\.png$/\1_foo.png/'
b) Add the p command so that sed prints both the old and the new name. Feed this to xargs with -n2, which makes it pass the names to mv in pairs. This breaks on file names that contain whitespace.
ls | sed -E 'p;s/(.*)\.png$/\1_foo.png/' | xargs -n2 mv
If you know how to rename a single file to your liking programmatically:
fname=myfile.png
mv "$fname" "${fname%.png}_extended.png"
you can batch-apply this command with xargs:
find . -name "*.png" | xargs -n1 bash -c 'mv "$0" "${0%.png}_extended.png"'
Explanation
We pipe the list of files to xargs and tell it to process one line at a time with the -n1 flag. We then tell xargs to call bash on each instance and provide it with the code to execute via the -c flag.
$0 refers to the first argument that bash receives after the command string.
If you need string substitutions other than ${0%.png}, there are many cheat sheets, such as https://devhints.io/bash.
For more complex substitutions you can pass multiple arguments per call using -n2; these are available as $0, $1, etc.
This use of piping + xargs + bash -c is fairly general.
In the short example above, beware that I assumed proper file names (without special characters).
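For names that do contain spaces, a variant with NUL-separated names is a common workaround. The following is a sketch under the assumption that find supports -print0 and xargs supports -0 (both GNU and BSD versions do); the scratch directory and file names are invented:

```shell
workdir=$(mktemp -d)
: > "$workdir/my pic.png"
: > "$workdir/plain.png"

# -print0 / -0 separate names with NUL bytes, so spaces survive.
# The trailing "sh" fills $0; each file name then arrives as $1.
# Already-renamed files are excluded so they cannot be picked up twice.
find "$workdir" -name '*.png' ! -name '*_extended.png' -print0 |
    xargs -0 -n1 sh -c 'mv -- "$1" "${1%.png}_extended.png"' sh

ls "$workdir"
```

Passing a placeholder as $0 keeps the file name in $1, which is the more conventional position for the first real argument.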

bash removing part of a file name

I have the following files in the following format:
$ ls CombinedReports_LLL-*'('*.csv
CombinedReports_LLL-20140211144020(Untitled_1).csv
CombinedReports_LLL-20140211144020(Untitled_11).csv
CombinedReports_LLL-20140211144020(Untitled_110).csv
CombinedReports_LLL-20140211144020(Untitled_111).csv
CombinedReports_LLL-20140211144020(Untitled_12).csv
CombinedReports_LLL-20140211144020(Untitled_13).csv
CombinedReports_LLL-20140211144020(Untitled_14).csv
CombinedReports_LLL-20140211144020(Untitled_15).csv
CombinedReports_LLL-20140211144020(Untitled_16).csv
CombinedReports_LLL-20140211144020(Untitled_17).csv
CombinedReports_LLL-20140211144020(Untitled_18).csv
CombinedReports_LLL-20140211144020(Untitled_19).csv
I would like this part removed:
20140211144020 (this is the timestamp the reports were run so this will vary)
and end up with something like:
CombinedReports_LLL-(Untitled_1).csv
CombinedReports_LLL-(Untitled_11).csv
CombinedReports_LLL-(Untitled_110).csv
CombinedReports_LLL-(Untitled_111).csv
CombinedReports_LLL-(Untitled_12).csv
CombinedReports_LLL-(Untitled_13).csv
CombinedReports_LLL-(Untitled_14).csv
CombinedReports_LLL-(Untitled_15).csv
CombinedReports_LLL-(Untitled_16).csv
CombinedReports_LLL-(Untitled_17).csv
CombinedReports_LLL-(Untitled_18).csv
CombinedReports_LLL-(Untitled_19).csv
I was thinking simply along the lines of the mv command, maybe something like this:
$ ls CombinedReports_LLL-*'('*.csv
but maybe a sed command or other would be better
rename is part of the perl package. It renames files according to perl-style regular expressions. To remove the dates from your file names:
rename 's/[0-9]{14}//' CombinedReports_LLL-*.csv
If rename is not available, sed+shell can be used:
for fname in Combined*.csv ; do mv "$fname" "$(echo "$fname" | sed -r 's/[0-9]{14}//')" ; done
The above loops over each of your files. For each file, it performs a mv command: mv "$fname" "$(echo "$fname" | sed -r 's/[0-9]{14}//')" where, in this case, sed is able to use the same regular expression as the rename command above. s/[0-9]{14}// tells sed to look for 14 digits in a row and replace them with an empty string.
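The substitution can be previewed on one name before any file is moved. This sketch uses -E, which enables the same extended syntax as -r in GNU sed and is also accepted by BSD sed; the variable names are only for illustration:

```shell
fname='CombinedReports_LLL-20140211144020(Untitled_1).csv'

# {14} needs extended regular expressions; only the 14-digit run is removed.
newname=$(printf '%s\n' "$fname" | sed -E 's/[0-9]{14}//')

printf 'mv -i %s %s\n' "$fname" "$newname"
# prints: mv -i CombinedReports_LLL-20140211144020(Untitled_1).csv CombinedReports_LLL-(Untitled_1).csv
```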
Without using any other tools like rename or sed, sticking strictly to bash:
for f in CombinedReports_LLL-*.csv
do
newName=${f/LLL-*\(/LLL-(}
mv -i "$f" "$newName"
done
for f in CombinedReports_LLL-* ; do
b=${f:0:20}${f:34:500}
mv "$f" "$b"
done
You can try it line by line in the shell:
f="CombinedReports_LLL-20140211144020(Untitled_11).csv"
b=${f:0:20}${f:34:500}
echo "$b"
You can use the rename utility for this. It uses syntax much like sed to change filenames. The following example (from the rename man-page) shows how to remove the trailing '.bak' extension from a list of backup files in the local directory:
rename 's/\.bak$//' *.bak
I'm using the advice given in the top response and have put the following line into a shell script:
ls *.nii | xargs rename 's/[f_]{2}//' f_0*.nii
In terminal, this line works perfectly, but in my script it will not execute and reads * as a literal part of the file name.

Resources