I have a bunch of files (more than 1000) named like the following:
$ ls
org.allenai.ari.solvers.termselector.BaselineLearnersurfaceForm-dev.lc
org.allenai.ari.solvers.termselector.BaselineLearnersurfaceForm-dev.lex
org.allenai.ari.solvers.termselector.BaselineLearnersurfaceForm-train.lc
org.allenai.ari.solvers.termselector.BaselineLearnersurfaceForm-train.lex
org.allenai.ari.solvers.termselector.BaselineLearnersurfaceForm.lc
org.allenai.ari.solvers.termselector.BaselineLearnersurfaceForm.lex
org.allenai.ari.solvers.termselector.ExpandedLearner.lc
org.allenai.ari.solvers.termselector.ExpandedLearner.lex
org.allenai.ari.solvers.termselector.ExpandedLearnerSVM.lc
org.allenai.ari.solvers.termselector.ExpandedLearnerSVM.lex
....
I have to rename these files by adding learners right before the capitalized name. For example
org.allenai.ari.solvers.termselector.BaselineLearnersurfaceForm.lex
would change to
org.allenai.ari.solvers.termselector.learners.BaselineLearnersurfaceForm.lex
and this one
org.allenai.ari.solvers.termselector.ExpandedLearner.lc
would change to
org.allenai.ari.solvers.termselector.learners.ExpandedLearner.lc
Any ideas how to do this automatically?
for f in org.*; do
echo mv "$f" "$( sed 's/\.\([A-Z]\)/.learners.\1/' <<< "$f" )"
done
This short loop outputs an mv command for each file, renaming it in the way you described. Run it as-is first, and when you are certain it does what you want, remove the echo and run it again.
The sed bit in the middle takes a filename ($f, via a here-string, so this requires bash) and replaces the first occurrence of a dot followed by a capital letter with .learners. followed by that same capital letter.
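To see what the substitution does without touching any files, here is a minimal sketch that applies it to a single name. The function name add_learners is made up for illustration, and I use [[:upper:]] instead of [A-Z] on the assumption that you want it to behave the same in every locale.

```bash
#!/usr/bin/env bash
# add_learners is a hypothetical helper name, not part of the loop above.
add_learners() {
    # Replace the first "dot + capital letter" with ".learners." + that letter.
    printf '%s\n' "$1" | sed 's/\.\([[:upper:]]\)/.learners.\1/'
}

add_learners org.allenai.ari.solvers.termselector.ExpandedLearnerSVM.lc
# prints org.allenai.ari.solvers.termselector.learners.ExpandedLearnerSVM.lc
```

A name with no capitalized component comes back unchanged, so running the real loop over such files would just try to move them onto themselves.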
There is a tool called perl-rename, sometimes installed as rename. It is not to be confused with the rename from util-linux.
It's very good for tasks like this as it takes a perl expression and renames accordingly:
perl-rename 's/(?=\.[A-Z])/.learners/' *
You can play with the regex online
Alternatively, you can use a for loop and $BASH_REMATCH:
for file in *; do
[ -e "$file" ] || continue
[[ "$file" =~ ^([^A-Z]*)(.*)$ ]]
mv -- "$file" "${BASH_REMATCH[1]}learners.${BASH_REMATCH[2]}"
done
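One thing to watch with the $BASH_REMATCH loop: the pattern matches every name, so a file without any capital letter would get learners. appended at the end. A possible variant that skips such files is sketched below. The helper name insert_learners and the stricter regex are my own assumptions, not something the question asked for.

```bash
#!/usr/bin/env bash
# insert_learners is a hypothetical helper: it prints the new name, or
# returns 1 when the name has no ".<Capital>" component to anchor on.
insert_learners() {
    local re='^([^[:upper:]]*\.)([[:upper:]].*)$'
    [[ $1 =~ $re ]] || return 1
    printf '%s\n' "${BASH_REMATCH[1]}learners.${BASH_REMATCH[2]}"
}

for file in org.*; do
    [ -e "$file" ] || continue
    new=$(insert_learners "$file") || continue   # leave non-matching names alone
    echo mv -- "$file" "$new"                    # drop the echo after checking
done
```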
A very simple approach (useful if you only need to do this once) is to list the files into a text file with ls >dummy, and then use find/replace in a text editor to turn each line into the form mv xxx.yyy xxx.learners.yyy. Then you can simply execute the resulting file with sh dummy (or make it executable and run ./dummy).
The exact find/replace commands depend on the text editor you use, but something like:
replace org. with mv org. That gets you the mv at the beginning.
replace mv org.allenai.ari.solvers.termselector.$1 with mv org.allenai.ari.solvers.termselector.$1 org.allenai.ari.solvers.termselector.learners.$1 to duplicate the filename and insert learners.
There is also a for-loop syntax that can probably do this in one (long) line, but I cannot explain it; try help for if you want to learn about it.
Trying to remove a string that is located after the file name extension, on multiple files at once. I do not know where the files will be, just that they will reside in a subfolder of the one I am in.
Need to remove the last string, everything after the file extension. File name is:
something-unknown.js?ver=12234.... (last bit is unknown too)
This one (below) I found in this thread:
for nam in *sqlite3_done
do
newname=${nam%_done}
mv $nam $newname
done
I know that I have to use % to remove the bit from the end, but how do I use wildcards in the last bit, when I already have it as the "for any file" selector?
I have tried with a modified version of the above:
for nam in *.js*
do
newname=${ nam .js% } // removing all after .js
mv $nam $newname
done
I'm on macOS Yosemite, with the bash shell and sed. I know of rename and sed, but I've only seen topics dealing with specific strings, no wildcards for this issue except these:
How to rename files using wildcard in bash?
https://unix.stackexchange.com/questions/227640/rename-first-part-of-multiple-files-with-mv
I think this is what you are looking for in terms of parameter substitution:
$ ls -C1
first-unknown.js?ver=111
second-unknown.js?ver=222
third-unknown.js?ver=333
$ for f in *.js\?ver=*; do echo ${f%\?*}; done
first-unknown.js
second-unknown.js
third-unknown.js
Note that we escape the ? as \? to say that we want to match the literal question mark, distinguishing it from the special glob symbol that matches any single character.
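If you want to try the expansion on a single name first, something like this should do. strip_query is just a name I picked for the illustration.

```bash
# strip_query is a hypothetical helper name.
strip_query() {
    # % removes the shortest suffix matching "?*"; the backslash keeps ? literal.
    printf '%s\n' "${1%\?*}"
}

strip_query 'first-unknown.js?ver=111'   # prints first-unknown.js
strip_query 'plain.js'                   # prints plain.js (no ?, nothing removed)
```

With % the cut happens at the last question mark in the name. If a name could contain more than one and you want to cut at the first, %% (longest match) would be the operator to use instead.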
Renaming the files would then be something like:
$ for f in *.js\?ver=*; do echo "mv $f ${f%\?*}"; done
mv first-unknown.js?ver=111 first-unknown.js
mv second-unknown.js?ver=222 second-unknown.js
mv third-unknown.js?ver=333 third-unknown.js
Personally I like to output the commands, save it to a file, verify it's what I want, and then execute the file as a shell script.
If it needs to be fully automated you can remove the echo and do the mv directly.
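For the "save it to a file, verify, then execute" workflow, a rough sketch follows. It assumes bash (printf %q is a bash feature), and the helper name write_rename_script is hypothetical. The point of %q is that names with spaces or other special characters are written in a form the shell can read back safely.

```bash
#!/usr/bin/env bash
# write_rename_script is a hypothetical helper: it writes one mv line per
# matching file in the current directory to the script named in $1.
write_rename_script() {
    local out=$1 f
    : > "$out"                       # start with an empty script
    for f in *.js\?*; do
        [ -e "$f" ] || continue      # glob matched nothing
        # %q quotes each name so it survives being re-read by the shell
        printf 'mv -- %q %q\n' "$f" "${f%\?*}" >> "$out"
    done
}
```

You would call it as write_rename_script /tmp/renames.sh, read through /tmp/renames.sh, and then run bash /tmp/renames.sh from the same directory.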
# recurse into subfolders; names are quoted so spaces survive; cut at the last "?"
find . -type f -name '*.js\?*' -exec sh -c 'for x; do mv -- "$x" "${x%\?*}"; done' sh {} +
I want to take a group of files with names like 123456_1_2.mpg and turn it into 123456.mpg how can I do this using terminal commands?
To loop over all the available files you can use a for loop over the file names of the form ??????_?_?.mpg.
To rename the files you can remove the longest match of a pattern from the end of the string using ${MYVAR%%pattern}, without using any external command.
This said, your code should look like:
#!/bin/bash
shopt -s nullglob # do nothing if no matches found
for file in ??????_?_?.mpg; do
[[ -f $file ]] || continue # skip if not a regular file
new_file="${file%%_*}.mpg" # compose the new file name
echo mv "$file" "$new_file" # remove echo after testing
done
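Since the four trimming operators are easy to mix up, here is a small side-by-side on one of the file names from the question. Nothing is renamed, it only prints.

```bash
file=123456_1_2.mpg

echo "${file%%_*}"   # longest suffix matching _* removed:  123456
echo "${file%_*}"    # shortest suffix matching _* removed: 123456_1
echo "${file##*_}"   # longest prefix matching *_ removed:  2.mpg
echo "${file#*_}"    # shortest prefix matching *_ removed: 1_2.mpg
```

The script above needs the first form, because everything from the first underscore onwards has to go before .mpg is appended again.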
rename 's/_.*/.mpg/' *mpg
This replaces everything from the first underscore to the end of the name with .mpg, for all files ending in mpg. It needs the Perl version of rename.
We can use grep to strip out everything but the first sequence of numbers. The --interactive flag will ask you if you're sure for each move, so you can make sure it's not doing anything you don't expect.
for file in *.mpg; do
mv --interactive "$file" "$(grep -o '^[0-9]\+' <<< "$file")".mpg
done
The regex ^[0-9]\+ translates to "one or more digits at the start of the line".
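The same pattern can be matched in bash itself, which also makes it easy to skip files that do not start with a digit (with grep, such a file would produce an empty string and be moved to .mpg). A minimal sketch, with leading_digits as a made-up helper name:

```bash
#!/usr/bin/env bash
# leading_digits is a hypothetical helper name.
leading_digits() {
    # ERE spelling of the same pattern: one or more digits at the start.
    [[ $1 =~ ^[0-9]+ ]] || return 1
    printf '%s\n' "${BASH_REMATCH[0]}"
}

leading_digits 123456_1_2.mpg   # prints 123456
```

In a loop you could then write new=$(leading_digits "$file") || continue before the mv.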
I have many files and all of them have the word text in them.
like test text22 test.mp3
"test" can include all kinds of characters -> -/()(&%0-9...
Now I want to rename every file so that an underscore is added before every "text" like test_text22 test.mp4. Is there a straightforward way to do this?
With Perl's standalone rename command:
rename -n 's/ (text[0-9]{1,2})/_$1/' *text*
If everything looks okay, remove option -n.
Another way to do it which illustrates the (often very useful) use of regular expression matching in Bash.
#!/bin/bash
for file in *
do
if [[ $file =~ (.*)text(.*) ]] ; then
mv -- "$file" "${BASH_REMATCH[1]}_text${BASH_REMATCH[2]}"
fi
done
This would be especially useful if you want to do something other than just rename the files.
I assume you don't want to add the underscore but replace leading space with it like in your example (but then again it had mp3 -> mp4 so just making sure):
$ ls
test text22 test.mp3
text22 test.mp3
$ for f in *text*; do echo "${f/ text/_text}" ; done
test_text22 test.mp3
text22 test.mp3
To actually rename, replace the echo with mv "$f" "${f/ text/_text}".
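The question says "before every text", so in case a name can contain the word more than once, here is a sketch of the same substitution with a doubled slash, which replaces all occurrences instead of only the first. underscore_text is a hypothetical name, and this assumes bash.

```bash
#!/usr/bin/env bash
# underscore_text is a hypothetical helper name.
underscore_text() {
    # // replaces every " text"; a single / would replace only the first one.
    printf '%s\n' "${1// text/_text}"
}

underscore_text 'a text1 b text2.mp3'   # prints a_text1 b_text2.mp3
```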
I have been using the rename command to batch rename files. Up to now, I have had files like:
2010.306.18.08.11.0000.BO.ADM..BHZ.SAC
2010.306.18.08.11.0000.BO.AMM..BHZ.SAC
2010.306.18.08.11.0000.BO.ASI..BHE.SAC
2010.306.18.08.11.0000.BO.ASI..BHZ.SAC
and using rename 2010.306.18.08.11.0000.BO. "" * and rename .. _. * I have reduced them to:
ADM_.BHZ.SAC
AMM_.BHZ.SAC
ASI_.BHE.SAC
ASI_.BHZ.SAC
which is exactly what I want. A bit clumsy, I guess, but it works. The problem occurs now that I have files like:
2010.306.18.06.12.8195.TW.MASB..BHE.SAC
2010.306.18.06.14.7695.TW.CHGB..BHN.SAC
2010.306.18.06.24.4195.TW.NNSB..BHZ.SAC
2010.306.18.06.25.0695.TW.SSLB..BHZ.SAC
which exist in the same folder. I have been trying to get the similar results to above using wildcards in the rename command eg. rename 2010.306.18.*.*.*.*. "" but this appends the first appearance of 2010.306.18.*.*.*.*. to the beginning of all the other files - clearly not what I'm after, such that I get:
2010.306.18.06.12.8195.TW.MASB..BHE.SAC
2010.306.18.06.12.8195.TW.MASB..BHE.SAC2010.306.18.06.14.7695.TW.CHGB..BHN.SAC
2010.306.18.06.12.8195.TW.MASB..BHE.SAC2010.306.18.06.24.4195.TW.NNSB..BHZ.SAC
2010.306.18.06.12.8195.TW.MASB..BHE.SAC2010.306.18.06.25.0695.TW.SSLB..BHZ.SAC
I guess I am not understanding a fairly fundamental principle of wildcards here, so can someone please explain why this doesn't work and what I can do to get the desired result (preferably using rename)?
N.B.
To clarify, the output wants to be:
ADM_.BHZ.SAC
AMM_.BHZ.SAC
ASI_.BHE.SAC
ASI_.BHZ.SAC
MASB.BHE.SAC
CHGB.BHN.SAC
NNSB.BHZ.SAC
SSLB.BHZ.SAC
You can try this first to see what commands would be executed:
for f in *.TW.*; do echo mv -- "$f" "$(echo "$f" | sed 's/^2010\..*\.TW\.//')"; done
If it's what you expect, remove echo from the command to execute it:
for f in *.TW.*; do mv -- "$f" "$(echo "$f" | sed 's/^2010\..*\.TW\.//')"; done
rename does not allow wildcards in the from and to strings; it treats them as literal text. When you run rename 2010.306.18.*.*.*.*. "" * it is actually your shell which first expands the wildcards and then passes the result of the expansion to rename, which is why it does not work.
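You can watch the expansion happen by putting a harmless command in place of rename. The sketch below works in a scratch directory with two made-up file names. show_args and demo_expansion are hypothetical helpers, and the pattern is simplified for the demonstration.

```bash
show_args() { printf '<%s>\n' "$@"; }   # hypothetical helper: one line per argument

demo_expansion() (
    # Runs in a subshell inside a scratch directory, so nothing real is touched.
    cd "$(mktemp -d)" || exit 1
    touch 2010.1.TW.A.SAC 2010.2.TW.B.SAC
    # The shell replaces both patterns with file names before show_args starts.
    show_args 2010.*.TW.* "" *
)

demo_expansion
```

This prints five arguments: the two file names from the first pattern, the empty string, and the two file names again from the *. A command in that position never sees the wildcard itself, only whatever names it expanded to.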
Instead of using rename, use a loop as follows:
for file in *
do
tmp="${file##2010*TW.}" # remove the prefix up to and including "TW."
mv "$file" "${tmp/../_}" # replace the first ".." with "_"
done
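If the BO and TW files should be handled in one pass, a possible sketch is to strip a fixed number of dot-separated fields instead of naming the network code. Two assumptions here: every name starts with 2010 followed by six more fields before the station name, and the double dot should become _. as your two rename calls produced (your clarified listing shows the TW names without the underscore, so adjust the replacement if that is what you want). short_name is a hypothetical helper.

```bash
#!/usr/bin/env bash
# short_name is a hypothetical helper that only computes the new name.
short_name() {
    # Shortest prefix made of "2010." plus six more dot-terminated fields,
    # i.e. everything up to and including the network code (BO., TW., ...).
    local rest=${1#2010.*.*.*.*.*.*.}
    # Replace the first ".." with "_." as the two rename calls did.
    printf '%s\n' "${rest/../_.}"
}

short_name 2010.306.18.08.11.0000.BO.ADM..BHZ.SAC    # prints ADM_.BHZ.SAC
short_name 2010.306.18.06.12.8195.TW.MASB..BHE.SAC   # prints MASB_.BHE.SAC
```

In the loop this would become mv -- "$file" "$(short_name "$file")", ideally with an echo in front for a first dry run.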
I need to rename 45 files, and I don't want to do it one by one. These are the file names:
chr10.fasta chr13_random.fasta chr17.fasta chr1.fasta chr22_random.fasta chr4_random.fasta chr7_random.fasta chrX.fasta
chr10_random.fasta chr14.fasta chr17_random.fasta chr1_random.fasta chr2.fasta chr5.fasta chr8.fasta chrX_random.fasta
chr11.fasta chr15.fasta chr18.fasta chr20.fasta chr2_random.fasta chr5_random.fasta chr8_random.fasta chrY.fasta
chr11_random.fasta chr15_random.fasta chr18_random.fasta chr21.fasta chr3.fasta chr6.fasta chr9.fasta
chr12.fasta chr16.fasta chr19.fasta chr21_random.fasta chr3_random.fasta chr6_random.fasta chr9_random.fasta
chr13.fasta chr16_random.fasta chr19_random.fasta chr22.fasta chr4.fasta chr7.fasta chrM.fasta
I need to change the extension ".fasta" to ".fa". I'm trying to write a bash script to do it:
for i in $(ls chr*)
do
NEWNAME = `echo $i | sed 's/sta//g'`
mv $i $NEWNAME
done
But it doesn't work. Can you tell me why, or give another quick solution?
Thanks!
Several mistakes here:
NEWNAME = must be written without spaces around the =. As written, bash looks for a command named NEWNAME, and that fails.
You parse the output of ls. This breaks on file names containing spaces. Bash can build the list of files itself with the glob operator *.
You don't quote $i and $NEWNAME. If either of them contains a space, mv receives it as two arguments.
If a file name begins with a dash, mv will treat it as an option. Use -- to stop option processing.
Try:
for i in chr*
do
mv -- "$i" "${i/%.fasta/.fa}"
done
or
for i in chr*
do
NEWNAME="${i/%.fasta/.fa}"
mv -- "$i" "$NEWNAME"
done
The "${var/%pat/replacement}" form looks for pat only at the end of the variable and replaces it with replacement.
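The /% form is a bash extension. If the script might run under a plain POSIX sh, the same result can be had with the % operator, roughly like this (to_fa is a name I made up for the sketch):

```bash
# to_fa is a hypothetical helper name; it only computes the new name.
to_fa() {
    case $1 in
        *.fasta) printf '%s\n' "${1%.fasta}.fa" ;;  # strip the suffix, append .fa
        *)       printf '%s\n' "$1" ;;              # not a .fasta name: unchanged
    esac
}

to_fa chr10_random.fasta   # prints chr10_random.fa
```

Unlike the sed 's/sta//g' from the question, this only touches the extension, so a name that happens to contain sta elsewhere is not mangled.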
for f in chr*.fasta; do mv "$f" "${f/%.fasta/.fa}"; done
If you have the rename command, you can do:
rename .fasta .fa chr*.fasta