I have been using the rename command to batch rename files. Up to now, I have had files like:
2010.306.18.08.11.0000.BO.ADM..BHZ.SAC
2010.306.18.08.11.0000.BO.AMM..BHZ.SAC
2010.306.18.08.11.0000.BO.ASI..BHE.SAC
2010.306.18.08.11.0000.BO.ASI..BHZ.SAC
and using rename 2010.306.18.08.11.0000.BO. "" * and rename .. _. * I have reduced them to:
ADM_.BHZ.SAC
AMM_.BHZ.SAC
ASI_.BHE.SAC
ASI_.BHZ.SAC
which is exactly what I want. A bit clumsy, I guess, but it works. The problem occurs now that I have files like:
2010.306.18.06.12.8195.TW.MASB..BHE.SAC
2010.306.18.06.14.7695.TW.CHGB..BHN.SAC
2010.306.18.06.24.4195.TW.NNSB..BHZ.SAC
2010.306.18.06.25.0695.TW.SSLB..BHZ.SAC
which exist in the same folder. I have been trying to get similar results using wildcards in the rename command, e.g. rename 2010.306.18.*.*.*.*. "", but this prepends the first file name matching 2010.306.18.*.*.*.*. to all the other file names, which is clearly not what I'm after. I get:
2010.306.18.06.12.8195.TW.MASB..BHE.SAC
2010.306.18.06.12.8195.TW.MASB..BHE.SAC2010.306.18.06.14.7695.TW.CHGB..BHN.SAC
2010.306.18.06.12.8195.TW.MASB..BHE.SAC2010.306.18.06.24.4195.TW.NNSB..BHZ.SAC
2010.306.18.06.12.8195.TW.MASB..BHE.SAC2010.306.18.06.25.0695.TW.SSLB..BHZ.SAC
I guess I am not understanding a fairly fundamental principle of wildcards here, so can someone please explain why this doesn't work and what I can do to get the desired result (preferably using rename)?
N.B.
To clarify, the desired output is:
ADM_.BHZ.SAC
AMM_.BHZ.SAC
ASI_.BHE.SAC
ASI_.BHZ.SAC
MASB.BHE.SAC
CHGB.BHN.SAC
NNSB.BHZ.SAC
SSLB.BHZ.SAC
You can try this first to see what commands would be executed:
for f in *; do echo mv "$f" "$(echo "$f" | sed 's/2010.*\.TW\.//')"; done
If it's what you expect, you can remove the first echo from the command to execute it:
for f in *; do mv "$f" "$(echo "$f" | sed 's/2010.*\.TW\.//')"; done
rename does not allow wildcards in the from and to strings. When you run rename 2010.306.18.*.*.*.*. "" * it is actually your shell which first expands the wildcard and then passes the result of the expansion to rename, hence why it does not work.
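One way to see this for yourself is to capture the arguments a command would receive. The sketch below runs in a scratch directory (the demo and args names are only for illustration) and shows that the shell has already replaced the pattern with the matching file names before any command starts:

```bash
# Work in a throwaway directory so nothing real is touched.
demo=$(mktemp -d) && cd "$demo" || exit 1
touch 2010.306.18.06.12.8195.TW.MASB..BHE.SAC \
      2010.306.18.06.14.7695.TW.CHGB..BHN.SAC

# Collect exactly what a command would be handed as arguments.
args=( 2010.306.18.*.*.*.* "" )

# Print one argument per line, bracketed so the empty string is visible.
printf '[%s]\n' "${args[@]}"
```

The command only ever sees the two expanded file names and the empty string, never the pattern itself.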
Instead of using rename, use a loop as follows:
for file in *
do
    tmp="${file##2010*TW.}"     # remove the file prefix
    mv "$file" "${tmp/../_}"    # replace the first ".." with "_"
done
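If the prefix differs between files (BO. in one batch, TW. in another), a possible variation is to strip a fixed number of dot-separated fields instead of naming the network code. This is only a sketch: it assumes every name starts with exactly seven such fields, and it applies the same ".." to "_." rule to every file, as in the original two rename calls. It runs in a scratch directory with two of the sample names:

```bash
demo=$(mktemp -d) && cd "$demo" || exit 1
touch 2010.306.18.08.11.0000.BO.ADM..BHZ.SAC \
      2010.306.18.06.12.8195.TW.MASB..BHE.SAC

for file in 2010.*.SAC; do
    # Drop the shortest prefix made of seven dot-terminated fields
    # (everything up to and including the network code).
    tmp="${file#*.*.*.*.*.*.*.}"
    # Turn the first ".." into "_." as in the original rename calls.
    new="${tmp/../_.}"
    mv -- "$file" "$new"
done
ls
```

Putting echo in front of mv turns this into a dry run that only prints the commands.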
I have a folder structure where two files are in each folder. The files have long names but are distinguished by R1 and R2. Note that I am running this over many folders using the for loop, but I am keeping it simple for this example. I am wondering how to correctly refer to the files with a star (*) wildcard so that I do not have to type the whole file name. My attempt is below:
#!/bin/bash
for item in Folder_Directory:
do
forward=$item/*R1*
reverse=$item/*R2*
bbmap.sh ref=reference.fna in1=$forward in2=$reverse outu=Unmapped.fasta
done
The output I am getting is an error because the variable is not identifying the desired file:
Error:
align2.BBMap build=1 overwrite=true fastareadlen=500 ref=reference.fna
in1=Folder_Dictory/*R1* in2=Folder_Dictory/*R2* outu=Folder_Dictory/Unmapped.fastq
In this example I could autocomplete the files, however, when I expand this loop to include multiple folders that is no longer ideal. Autocompleting using (*) characters was my first approach, any other suggestions or fixes to my issue are greatly appreciated.
The problem is that the shell sees in1=Folder_Dictory/*R1* and notices that there are no files which match the glob with the literal in1= prefix, and so the wildcard does not get expanded at all.
You probably want to evaluate the wildcard before passing it to the command, like for instance
for item in Folder_Directory
do
    forward=$item/*R1*
    reverse=$item/*R2*
    bbmap.sh ref=reference.fna in1="$(echo $forward)" in2="$(echo $reverse)" outu=Unmapped.fasta
done
This will of course still be erratic if the wildcard expands to more than one file.
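A possible way to guard against that is to expand each glob into a bash array and check how many files matched before building the command. The following is a sketch, not a tested pipeline: the folder and file names are made up, and the bbmap.sh call is only recorded and printed rather than executed.

```bash
shopt -s nullglob              # unmatched globs expand to nothing

# Scratch layout standing in for the real folders (names are made up).
root=$(mktemp -d)
mkdir "$root/sampleA" "$root/sampleB"
touch "$root/sampleA/x_R1_001.fastq" "$root/sampleA/x_R2_001.fastq"
touch "$root/sampleB/y_R1_001.fastq"          # R2 deliberately missing

cmds=()
for item in "$root"/*/; do
    forward=( "$item"*R1* )    # array assignment does expand the glob
    reverse=( "$item"*R2* )
    if (( ${#forward[@]} != 1 || ${#reverse[@]} != 1 )); then
        echo "skipping $item: expected one R1 and one R2 file" >&2
        continue
    fi
    # Dry run: record the command instead of running bbmap.sh.
    cmds+=( "bbmap.sh ref=reference.fna in1=${forward[0]} in2=${reverse[0]} outu=${item}Unmapped.fasta" )
done
printf '%s\n' "${cmds[@]}"
```

Only sampleA yields a command here; sampleB is reported on stderr and skipped because its R2 file is missing.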
If you want only two files from your folder structure, then I believe it would be better to use find to search for the files and assign them to separate variables as required. I don't see the need for a for loop here.
forward=$(find Folder_Directory -type f -name "*R1*")
reverse=$(find Folder_Directory -type f -name "*R2*")
bbmap.sh ref=reference.fna in1="$forward" in2="$reverse" outu=Unmapped.fasta
It works like this:
$ test=f*
$ echo $test
file
But
$ echo "$test"
f*
And
$ test2=$test
$ echo "$test" $test2
f* file
$ echo "$test" "$test2"
f* f*
To make it work, you have to do something like this:
$ test3="$(echo $test)"
$ echo "$test" "$test2" "$test3"
f* f* file
Trying to remove a string that is located after the file name extension, on multiple files at once. I do not know where the files will be, just that they will reside in a subfolder of the one I am in.
Need to remove the last string, everything after the file extension. File name is:
something-unknown.js?ver=12234.... (last bit is unknown too)
This one (below) I found in this thread:
for nam in *sqlite3_done
do
newname=${nam%_done}
mv $nam $newname
done
I know that I have to use % to remove the bit from the end, but how do I use wildcards in the last bit, when I already have it as the "for any file" selector?
Have tried with a modified bit of the above:
for nam in *.js*
do
newname=${ nam .js% } // removing all after .js
mv $nam $newname
done
I'm on MacOS Yosemite, with the bash shell and sed. I know of rename and sed, but I've only seen topics dealing with specific strings, none using wildcards for this issue, except these:
How to rename files using wildcard in bash?
https://unix.stackexchange.com/questions/227640/rename-first-part-of-multiple-files-with-mv
I think this is what you are looking for in terms of parameter substitution:
$ ls -C1
first-unknown.js?ver=111
second-unknown.js?ver=222
third-unknown.js?ver=333
$ for f in *.js\?ver=*; do echo ${f%\?*}; done
first-unknown.js
second-unknown.js
third-unknown.js
Note that we escape the ? as \? to say that we want to match the literal question mark, distinguishing it from the special glob symbol that matches any single character.
Renaming the files would then be something like:
$ for f in *.js\?ver=*; do echo "mv $f ${f%\?*}"; done
mv first-unknown.js?ver=111 first-unknown.js
mv second-unknown.js?ver=222 second-unknown.js
mv third-unknown.js?ver=333 third-unknown.js
Personally I like to output the commands, save it to a file, verify it's what I want, and then execute the file as a shell script.
If it needs to be fully automated you can remove the echo and do the mv directly.
for x in $(find . -type f -name '*.js*');do mv $x $(echo $x | sed 's/\.js.*/.js/'); done
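Because the output of find is word-split by the loop above, names containing spaces would be broken apart. A variant that hands the names to a small inline script via -exec avoids that. This is a sketch using a scratch tree with made-up names; the [?] bracket is used to match a literal question mark both in the -name pattern and in the parameter expansion.

```bash
root=$(mktemp -d)
mkdir -p "$root/sub dir"
touch "$root/sub dir/first-unknown.js?ver=111" "$root/second-unknown.js?ver=222"

# find passes the matching paths as arguments to the inline bash script.
find "$root" -type f -name '*.js[?]*' -exec bash -c '
    for f in "$@"; do
        mv -- "$f" "${f%%[?]*}"   # cut from the first "?" to the end
    done
' _ {} +

find "$root" -type f
```

Note that the expansion cuts at the first question mark anywhere in the path, so this assumes no directory name contains one.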
I have a bunch of files (more than 1000) like the following:
$ ls
org.allenai.ari.solvers.termselector.BaselineLearnersurfaceForm-dev.lc
org.allenai.ari.solvers.termselector.BaselineLearnersurfaceForm-dev.lex
org.allenai.ari.solvers.termselector.BaselineLearnersurfaceForm-train.lc
org.allenai.ari.solvers.termselector.BaselineLearnersurfaceForm-train.lex
org.allenai.ari.solvers.termselector.BaselineLearnersurfaceForm.lc
org.allenai.ari.solvers.termselector.BaselineLearnersurfaceForm.lex
org.allenai.ari.solvers.termselector.ExpandedLearner.lc
org.allenai.ari.solvers.termselector.ExpandedLearner.lex
org.allenai.ari.solvers.termselector.ExpandedLearnerSVM.lc
org.allenai.ari.solvers.termselector.ExpandedLearnerSVM.lex
....
I have to rename these files by adding learners right before the capitalized name. For example
org.allenai.ari.solvers.termselector.BaselineLearnersurfaceForm.lex
would change to
org.allenai.ari.solvers.termselector.learners.BaselineLearnersurfaceForm.lex
and this one
org.allenai.ari.solvers.termselector.ExpandedLearner.lc
would change to
org.allenai.ari.solvers.termselector.learners.ExpandedLearner.lc
Any ideas how to do this automatically?
for f in org.*; do
    echo mv "$f" "$( sed 's/\.\([A-Z]\)/.learners.\1/' <<< "$f" )"
done
This short loop outputs an mv command that renames the files in the manner that you wanted. Run it as-is first, and when you are certain it's doing what you want, remove the echo and run again.
The sed bit in the middle takes a filename ($f, via a here-string, so this requires bash) and replaces the first occurrence of a capital letter after a dot with .learners. followed by that same capital letter.
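To check the substitution on a single name before looping over a thousand files, it can be tried in isolation. This small sketch uses one of the sample names from the question and the word learners as requested there; the variable names are only for illustration.

```bash
name=org.allenai.ari.solvers.termselector.ExpandedLearnerSVM.lc

# Only the first ".<Capital>" is rewritten, so later capitals are untouched.
new=$(sed 's/\.\([A-Z]\)/.learners.\1/' <<< "$name")

echo "$new"
# prints org.allenai.ari.solvers.termselector.learners.ExpandedLearnerSVM.lc
```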
There is a tool called perl-rename, sometimes installed simply as rename. It is not to be confused with the rename from util-linux.
It's very good for tasks like this as it takes a perl expression and renames accordingly:
perl-rename 's/(?=\.[A-Z])/.learners/' *
Alternatively, you can use a for loop and $BASH_REMATCH:
for file in *; do
[ -e "$file" ] || continue
[[ "$file" =~ ^([^A-Z]*)(.*)$ ]]
mv -- "$file" "${BASH_REMATCH[1]}learners.${BASH_REMATCH[2]}"
done
A very simple approach (useful if you only need to do this once) is to list the files into a text file with ls >dummy, and then use find/replace in a text editor to turn each line into the form mv xxx.yyy xxx.learners.yyy (delete the line for dummy itself, which ls will have listed too). Then you can simply execute the resulting file with sh dummy.
The exact find/replace commands depend on the text editor you use, but something like:
replace org. with mv org. to get the mv at the beginning of each line.
replace mv org.allenai.ari.solvers.termselector.$1 with mv org.allenai.ari.solvers.termselector.$1 org.allenai.ari.solvers.termselector.learners.$1 (where $1 stands for the rest of the line) to duplicate the file name and insert learners.
A for loop can probably do the same in one (long) line, but I cannot explain it; try help for if you want to learn about it.
I have a directory with more than 500 files, here's a sample of the files:
random-code_aa.log
random-code_aa_r-13.log
random-code_ab.log
random-code_ae.log
random-code_ag.log
random-code_ag_r-397.log
random-code_ah.log
random-code_ac.log
random-code_ac_r-41.log
random-code_ax.log
random-code_ax_r-273.log
random-code_az.log
What I would like to do, preferably using a bash loop, is look in the directory for the *_r-*.log files and, for each one found, check whether a matching .log file exists without the _r-* part. If it does, rename that plain .log file to the name of its corresponding _r-*.log file, but with the r changed to i.
Better demonstrate with an example from the files sample above:
if "random-code_aa_r-13.log" and "random-code_aa.log" exist then
rename "random-code_aa.log" to "random-code_aa_i-13.log"
I've tried with mv and rename but nothing worked.
This simple BASH script should take care of that:
for f in *_r-*.log; do
rf="${f/_r-*log/.log}"
[[ -f "$rf" ]] && mv "$rf" "${f/_r-/_i-}"
done
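To see what the two substitutions produce, the same loop can be run in a scratch directory with a few of the sample names. This is a sketch: the intermediate variable new and the echo line are additions for illustration only.

```bash
demo=$(mktemp -d) && cd "$demo" || exit 1
touch random-code_aa.log random-code_aa_r-13.log random-code_ab.log

for f in *_r-*.log; do
    rf="${f/_r-*log/.log}"     # name of the plain file: random-code_aa.log
    new="${f/_r-/_i-}"         # target name: random-code_aa_i-13.log
    echo "$f: plain=$rf target=$new"
    [[ -f "$rf" ]] && mv -- "$rf" "$new"
done
ls
```

Afterwards random-code_aa.log has become random-code_aa_i-13.log, while random-code_ab.log, which has no _r- counterpart, is left alone.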
You can use sed:
for file in *_r-*.log ; do
    barename=$(echo "$file" | sed 's/_r-.*/.log/')
    newname=$(echo "$file" | sed 's/_r-\(.*\)/_i-\1/')
    if [ -f "$barename" ] ; then
        mv "$barename" "$newname"
    fi
done
You can try to improve the regexes, as it is not safe for some file names. But it should work for file names that contain the minus sign only as the separator character.
You should be able to do that with a parameter substitution:
for f in *_r-*.log
do
    stem="${f%_r-*.log}"                        # e.g. random-code_aa
    num="${f%.log}"; num="${num##*_r-}"         # e.g. 13
    if test -e "${stem}.log"
    then mv "${stem}.log" "${stem}_i-${num}.log"
    fi
done
I need to rename 45 files, and I don't want to do it one by one. These are the file names:
chr10.fasta chr13_random.fasta chr17.fasta chr1.fasta chr22_random.fasta chr4_random.fasta chr7_random.fasta chrX.fasta
chr10_random.fasta chr14.fasta chr17_random.fasta chr1_random.fasta chr2.fasta chr5.fasta chr8.fasta chrX_random.fasta
chr11.fasta chr15.fasta chr18.fasta chr20.fasta chr2_random.fasta chr5_random.fasta chr8_random.fasta chrY.fasta
chr11_random.fasta chr15_random.fasta chr18_random.fasta chr21.fasta chr3.fasta chr6.fasta chr9.fasta
chr12.fasta chr16.fasta chr19.fasta chr21_random.fasta chr3_random.fasta chr6_random.fasta chr9_random.fasta
chr13.fasta chr16_random.fasta chr19_random.fasta chr22.fasta chr4.fasta chr7.fasta chrM.fasta
I need to change the extension ".fasta" to ".fa". I'm trying to write a bash script to do it:
for i in $(ls chr*)
do
NEWNAME = `echo $i | sed 's/sta//g'`
mv $i $NEWNAME
done
But it doesn't work. Can you tell me why, or give another quick solution?
Thanks!
Several mistakes here:
NEWNAME = should be written without spaces around the =. As written, bash looks for a command named NEWNAME, and that fails.
You parse the output of ls, which breaks if any file name contains spaces. Bash can build the list of files itself with the glob operator *.
You don't quote "$i" and "$NEWNAME". If either of them contains a space, mv receives it as two arguments.
If a file name begins with a dash, mv will treat it as an option. Use -- to stop option processing.
Try:
for i in chr*
do
mv -- "$i" "${i/%.fasta/.fa}"
done
or
for i in chr*
do
NEWNAME="${i/%.fasta/.fa}"
mv -- "$i" "$NEWNAME"
done
The "${var/%pat/replacement}" expansion looks for pat only at the end of the variable and replaces it with replacement.
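The difference between an end-anchored replacement and removing every occurrence of a substring (roughly what the sed 's/sta//g' attempt does) can be seen on a single name. This is a small sketch; station.fasta is a hypothetical file name chosen because it contains "sta" twice.

```bash
i=station.fasta                 # hypothetical name containing "sta" twice

anchored="${i/%.fasta/.fa}"     # replace only a trailing ".fasta"
posix="${i%.fasta}.fa"          # same result with POSIX-only syntax
greedy="${i//sta/}"             # what removing every "sta" would do

printf '%s\n' "$anchored" "$posix" "$greedy"
# prints:
# station.fa
# station.fa
# tion.fa
```

For the chr*.fasta names in the question both approaches happen to give the same result, but the anchored form does not depend on that.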
for f in chr*.fasta; do mv "$f" "${f/%.fasta/.fa}"; done
If you have the rename command from util-linux, you can do:
rename .fasta .fa chr*.fasta