Rename files to drop a date stamp

I have files in an input directory, named as follows:
SEMAPHOREINPUT_10-06-2015.xlsx
WRAPPERINPUT_10-06-2015.xlsx
These files are updated daily, so tomorrow the input will be:
SEMAPHOREINPUT_11-06-2015.xlsx
WRAPPERINPUT_11-06-2015.xlsx
I need to rename these files to the filenames below:
SEMAPHOREINPUT.xlsx
WRAPPERINPUT.xlsx
I tried using the shell script below, but it is not working.
#!/bin/bash
ls | while read FILES
do
newfile = ${FILES/\SEMAPHOREINPUT_.*.xlsx/}
mv $newfile /home/test
done

Three critical errors:
When assigning a value to a variable, there must not be spaces before and after the = sign.
Your string substitution is wrong. To drop the date stamp portion of the filename, use
${FILES/_[0-9][0-9]-[0-9][0-9]-[0-9][0-9][0-9][0-9]/}
… or better yet, since the pattern to be dropped must occur at the end of the string,
${FILES%_[0-9][0-9]-[0-9][0-9]-[0-9][0-9][0-9][0-9].xlsx}.xlsx
You seem to have confused the initial and final filenames.
There are other problems as well:
Parsing the output of ls is more complex and less reliable than a loop with a glob:
for file in *; do
…
done
Better yet, be explicit to avoid surprises:
for file in SEMAPHOREINPUT_[0-9][0-9]-[0-9][0-9]-[0-9][0-9][0-9][0-9].xlsx \
WRAPPERINPUT_[0-9][0-9]-[0-9][0-9]-[0-9][0-9][0-9][0-9].xlsx; do
mv "$file" "/home/test/${file%_[0-9][0-9]-[0-9][0-9]-[0-9][0-9][0-9][0-9].xlsx}.xlsx"
done
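To check what the % expansion does before touching real files, here is a minimal sketch that runs in a scratch directory. The directory created by mktemp, the dest subdirectory (standing in for /home/test), and the stamp variable are assumptions made for this illustration only.

```shell
#!/bin/bash
# Work in a throwaway directory so no real files are touched.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
mkdir dest   # stands in for /home/test (assumption for this demo)
touch SEMAPHOREINPUT_10-06-2015.xlsx WRAPPERINPUT_10-06-2015.xlsx

# Glob pattern for the date stamp plus extension, e.g. _10-06-2015.xlsx
stamp='_[0-9][0-9]-[0-9][0-9]-[0-9][0-9][0-9][0-9].xlsx'

for file in *$stamp; do
    [ -e "$file" ] || continue          # skip if the glob matched nothing
    # ${file%$stamp} removes the shortest suffix matching the pattern
    mv -- "$file" "dest/${file%$stamp}.xlsx"
done

ls dest
```

The final ls should list SEMAPHOREINPUT.xlsx and WRAPPERINPUT.xlsx. Leaving $stamp unquoted inside the expansion is deliberate: quoting it would make the brackets literal instead of a pattern.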

You can do it like this:
for i in *.xlsx; do
# keep everything before the first underscore
n1=$(echo "$i" | cut -d '_' -f1)
t="$n1.xlsx"
mv "$i" "$t"
mv "$t" /home/test
done

Related

How can I create a rename script using multiple rules?

I constantly get a bunch of files named "Unknown.png" in a folder, and they often get renamed to "unknown (1).png", "unknown (2).png", etc. This is a problem because when I clean up files and move them somewhere else, I sometimes get asked whether I want to replace or rename them.
So I decided to create a crontab task that renames the files to CB_RANDOM, so that I don't have to worry about two files with the same name overwriting each other.
So far I have figured out how to find the files, replace the name unknown with CB_, and add a random number.
The problem is the (x) at the end of the filename. I also figured out how to solve that: I just strip away any parentheses and digits.
What I can't figure out is how to make rename follow both rules.
for u in (find -name unknown*); do
rCode = random
rename -v 's/unknown/CB_$rCode' $u
rename -v 's/[ ()0123456789]//g' $u
Ideally I'd like to apply both rules in a single line of code, especially since once the first line has run, $u no longer points to the file for the second step.

No need for a loop:
find -name 'unknown*' -exec rename 's/unknown \([0-9]+\)\.(.*)$/"CB_".sprintf("%04s",int(rand(10000))).".".$1/e' {} \;
find all the files, starting in the current directory, recursively, with names similar to "unknown (1).png"
rename them with a resulting filename similar to "CB_0135.png"
This produces an error message if a filename already exists.

Your code should first be changed into
# find is a subcommand, use $()
# find a file with wildcard, use quotes
for u in $(find -name "unknown*"); do
# Is random a command? Use $()
rCode=$(random)
# Debug with echo, will show other problem
echo "File $u"
# $rCode will not be replaced by its value in single quotes
# Write a filename in double quotes, so it will not be split by a space
rename -v "s/unknown/CB_$rCode/" "$u"
rename -v 's/[ ()0123456789]//g' "$u"
done
The new line with echo shows that the loop is breaking up the filenames at the spaces. You can change this to:
while IFS= read -r u; do
# Use unique timestamp, not random value
rCode=$(date '+%Y%m%d_%H%M')
echo "File $u"
rename -v "s/unknown/CB_$rCode/" "$u"
rename -v 's/[ ()0123456789]//g' "$u"
done < <(find -name "unknown*")
I never use rename and would use
while IFS= read -r u; do
# Use unique timestamp, not random value
rCode=$(date '+%Y%m%d_%H%M')
# construct new filename.
# Restriction: Path to file is without newlines, spaces or parentheses
newfile=$(sed 's/[ ()]//g; s/.*unknown/&_'"${rCode}"'_/' <<< "$u")
echo "Moving file $u to ${newfile}"
mv "$u" "${newfile}"
done < <(find -name "unknown*")
EDIT:
I removed a sed command for renaming files with (something) in it:
# Removed command
newfile=$(sed 's/\(.*\)(\(.*\))/\1'"${rCode}"'_\2/' <<< "$u")
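Both rules can also be applied in one step by building the new name directly instead of editing the old one twice. The following is a minimal sketch under a few assumptions: it uses bash's $RANDOM as the random source, it runs in a scratch directory from mktemp, and the sample file names are made up for the demo. It retries until the new name is unused, so nothing gets overwritten.

```shell
#!/bin/bash
# Work in a throwaway directory with sample files (assumed names).
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch "unknown.png" "unknown (1).png" "unknown (2).png"

for f in unknown*.png; do
    [ -e "$f" ] || continue             # skip if the glob matched nothing
    ext=${f##*.}                        # text after the last dot, e.g. png
    # Draw random names until one is free.
    while :; do
        new="CB_$(printf '%05d' "$RANDOM").$ext"
        [ -e "$new" ] || break
    done
    mv -- "$f" "$new"
done

ls
```

Because the whole new name is generated, the " (1)" part never needs to be stripped separately, and the quoted "$f" keeps names with spaces intact.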

automatically renaming files

I have a bunch of files (more than 1000) like the following:
$ ls
org.allenai.ari.solvers.termselector.BaselineLearnersurfaceForm-dev.lc
org.allenai.ari.solvers.termselector.BaselineLearnersurfaceForm-dev.lex
org.allenai.ari.solvers.termselector.BaselineLearnersurfaceForm-train.lc
org.allenai.ari.solvers.termselector.BaselineLearnersurfaceForm-train.lex
org.allenai.ari.solvers.termselector.BaselineLearnersurfaceForm.lc
org.allenai.ari.solvers.termselector.BaselineLearnersurfaceForm.lex
org.allenai.ari.solvers.termselector.ExpandedLearner.lc
org.allenai.ari.solvers.termselector.ExpandedLearner.lex
org.allenai.ari.solvers.termselector.ExpandedLearnerSVM.lc
org.allenai.ari.solvers.termselector.ExpandedLearnerSVM.lex
....
I have to rename these files by adding learners right before the capitalized name. For example
org.allenai.ari.solvers.termselector.BaselineLearnersurfaceForm.lex
would change to
org.allenai.ari.solvers.termselector.learners.BaselineLearnersurfaceForm.lex
and this one
org.allenai.ari.solvers.termselector.ExpandedLearner.lc
would change to
org.allenai.ari.solvers.termselector.learners.ExpandedLearner.lc
Any ideas how to do this automatically?

for f in org.*; do
echo mv "$f" "$( sed 's/\.\([A-Z]\)/.learners.\1/' <<< "$f" )"
done
This short loop outputs an mv command for each file, renaming it in the manner that you wanted. Run it as-is first, and when you are certain it's doing what you want, remove the echo and run again.
The sed bit in the middle takes a filename ($f, via a here-string, so this requires bash) and replaces the first occurrence of a capital letter after a dot with .learners. followed by that same capital letter.
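The substitution can be tried on plain strings before any file is renamed. This sketch wraps it in a helper named add_learners, which is a hypothetical name chosen for the illustration, and inserts learners as the question asks. It pipes the name into sed instead of using a here-string, so it does not depend on bash.

```shell
#!/bin/bash
# Hypothetical helper: print the new name for one file name.
# Without the g flag, sed only replaces the first ".<Capital>" it finds.
add_learners() {
    printf '%s\n' "$1" | sed 's/\.\([A-Z]\)/.learners.\1/'
}

add_learners org.allenai.ari.solvers.termselector.ExpandedLearner.lc
# prints org.allenai.ari.solvers.termselector.learners.ExpandedLearner.lc

add_learners org.allenai.ari.solvers.termselector.BaselineLearnersurfaceForm-dev.lex
# prints org.allenai.ari.solvers.termselector.learners.BaselineLearnersurfaceForm-dev.lex
```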

There is a tool called perl-rename, sometimes rename. Not to be confused with rename from util-linux.
It's very good for tasks like this as it takes a perl expression and renames accordingly:
perl-rename 's/(?=\.[A-Z])/.learners/' *
You can play with the regex online
Alternatively, you can use a for loop and $BASH_REMATCH:
for file in *; do
[ -e "$file" ] || continue
[[ "$file" =~ ^([^A-Z]*)(.*)$ ]]
mv -- "$file" "${BASH_REMATCH[1]}learners.${BASH_REMATCH[2]}"
done

A very simple approach (useful if you only need to do this once) is to list the files into a text file with ls >dummy, and then use find/replace in a text editor to turn each line into the form mv xxx.yyy xxx.learners.yyy. Then you can simply execute the resulting file with sh dummy.
The exact find/replace commands depend on the text editor you use, but something like this should work:
replace org. with mv org. to get the mv at the beginning.
replace mv org.allenai.ari.solvers.termselector.$1 with mv org.allenai.ari.solvers.termselector.$1 org.allenai.ari.solvers.termselector.learners.$1 to duplicate the filename and insert learners.
There is also a for loop syntax that can probably do it in one (long) line, but I cannot explain it; try help for if you want to learn about it.

bash loop to match and rename multiple files with multiple variables within the filenames

I have a directory with more than 500 files, here's a sample of the files:
random-code_aa.log
random-code_aa_r-13.log
random-code_ab.log
random-code_ae.log
random-code_ag.log
random-code_ag_r-397.log
random-code_ah.log
random-code_ac.log
random-code_ac_r-41.log
random-code_ax.log
random-code_ax_r-273.log
random-code_az.log
What I would like to do, preferably using a bash loop, is look in the directory for the *_r-*.log files. For each one found, check whether a matching .log file exists without the _r-* part; if it does, rename that .log file to the name of its corresponding _r-*.log file, but with the r changed to i.
To demonstrate with an example from the sample above:
if "random-code_aa_r-13.log" and "random-code_aa.log" exist then
rename "random-code_aa.log" to "random-code_aa_i-13.log"
I've tried with mv and rename but nothing worked.

This simple BASH script should take care of that:
for f in *_r-*.log; do
rf="${f/_r-*log/.log}"
[[ -f "$rf" ]] && mv "$rf" "${f/_r-/_i-}"
done

You can use sed:
for file in *_r-*.log ; do
barename=`echo $file | sed 's/_r-.*/.log/'`
newname=`echo $file | sed 's/_r-\(.*\)/_i-\1/'`
if [ -f $barename ] ; then
mv $barename $newname
fi
done
You can try to improve the regexes, as it is not safe for some file names. But it should work for file names that contain the minus sign only as the separator character.

You should be able to do that with a parameter substitution:
for f in *_r-*.log
do
stem="${f%_r-*.log}"
num="${f%.log}"; num="${num##*_r-}"
if test -e "${stem}.log"
then mv "${stem}.log" "${stem}_i-${num}.log"
fi
done
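To see how the stem and the number are extracted, here is a minimal sketch of the parameter-expansion approach, run in a scratch directory from mktemp. The sample files are a subset of the listing in the question.

```shell
#!/bin/bash
# Work in a throwaway directory with a few of the sample files.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch random-code_aa.log random-code_aa_r-13.log random-code_ab.log \
      random-code_ag.log random-code_ag_r-397.log

for f in *_r-*.log; do
    [ -e "$f" ] || continue     # skip if the glob matched nothing
    stem=${f%_r-*.log}          # random-code_aa_r-13.log -> random-code_aa
    num=${f##*_r-}              # random-code_aa_r-13.log -> 13.log
    num=${num%.log}             # 13.log -> 13
    if [ -f "$stem.log" ]; then
        mv -- "$stem.log" "${stem}_i-$num.log"
    fi
done

ls
```

After the loop, random-code_aa.log has become random-code_aa_i-13.log, random-code_ag.log has become random-code_ag_i-397.log, and random-code_ab.log is untouched because it has no _r- counterpart.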

In shell, how do I delete numbered duplicate files?

I've got a directory with a few thousand files in it, named things like:
filename.ext
filename (1).ext
filename (2).ext
otherfile.ext
otherfile (1).ext
etc.
Most of the files with bracketed numbers are duplicates of the original, but in some cases they're not.
How can I keep my original files, delete the duplicates, but not lose the files that are different?
I know that I could rm *\).ext, but that obviously doesn't make sure that files match the original.
I'm using OS X, so I have a md5 program that functions sort of like md5sum in Linux, though it puts the hash at the end of the line instead of the beginning. I was thinking I could use an awk script to take the output of md5 *.ext | awk 'some script', find duplicates by md5, and delete them, but the command line is too long (bash: /sbin/md5: Argument list too long).
And I don't know what to write in the script. I was thinking of storing things in an array with this:
awk '{a[$NF]++} a[$NF]>1{sub(/).*/,""); sub(/.*(/,""); system("rm " $0);}'
But that always seems to delete my original.
What am I doing wrong? How do I do it right?
Thanks.

Your awk script deletes original files because when you sort your files, . (period) sorts after (space). So the first file that's seen is numbered, not the original, and subsequent checks (including the one against the original) compare files to the first numbered one.
Not only does rm *\).ext fail to check against the original, it also deletes files that may not have an original in the first place.
I wouldn't do this quite this way. Rather than checking every numbered file and verifying whether it matches an original, you can go through your list of originals, then delete the numbered files that match them.
Instead:
$ for file in *[^\)].ext; do echo "-- Found: $file"; rm -v "$(basename "$file" .ext)"\ \(*\).ext; done
You can expand this to check MD5's along the way. But it's more code, so I'll break it into multiple lines, in a script:
#!/bin/bash
shopt -s nullglob # Show nothing if a fileglob matches no files
for file in *[^\)].ext; do
md5=$(md5 -q "$file") # The -q option gives you only the message digest
echo "-- Found: $file ($md5)"
for duplicate in "$(basename "$file" .ext)"\ \(*\).ext; do
if [[ "$md5" = "$(md5 -q "$duplicate")" ]]; then
rm -v "$duplicate"
fi
done
done
As an alternative, you can probably get away with doing this a little more simply, with less CPU overhead than calculating MD5 digests. Unix and Linux have a shell tool called cmp, which is like diff without the output. So:
#!/bin/bash
shopt -s nullglob
for file in *[^\)].ext; do
for duplicate in "$(basename "$file" .ext)"\ \(*\).ext; do
# cmp -s is silent; it succeeds only if the two files are identical
if cmp -s "$file" "$duplicate"; then
rm -v "$duplicate"
fi
done
done
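The cmp approach can be tried safely on made-up files first. This is a minimal sketch in a scratch directory from mktemp; the three sample files and their contents are assumptions for the demo. It uses an explicit existence check instead of nullglob, and it removes only numbered copies that are byte-identical to the original.

```shell
#!/bin/bash
# Work in a throwaway directory with one original and two numbered copies.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
printf 'same\n'      > 'filename.ext'
printf 'same\n'      > 'filename (1).ext'   # true duplicate
printf 'different\n' > 'filename (2).ext'   # numbered, but different content

for file in *[!\)].ext; do                  # originals: no ")" before .ext
    [ -e "$file" ] || continue
    base=${file%.ext}
    for duplicate in "$base"\ \(*\).ext; do
        [ -e "$duplicate" ] || continue     # glob matched nothing
        # cmp -s prints nothing and succeeds only for identical files
        if cmp -s "$file" "$duplicate"; then
            rm "$duplicate"
        fi
    done
done

ls
```

The listing at the end should show filename (2).ext and filename.ext: the identical copy is gone, the differing one is kept.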

If you don't need to use AWK, you could maybe do something simpler in bash:
for file in *\([0-9]*\)*; do
[ -e "$(echo "$file" | sed -E 's/ \([0-9]+\)//')" ] && rm "$file"
done
Hope this helps a little =)

Renaming multiples files with a bash loop

I need to rename 45 files, and I don't want to do it one by one. These are the file names:
chr10.fasta chr13_random.fasta chr17.fasta chr1.fasta chr22_random.fasta chr4_random.fasta chr7_random.fasta chrX.fasta
chr10_random.fasta chr14.fasta chr17_random.fasta chr1_random.fasta chr2.fasta chr5.fasta chr8.fasta chrX_random.fasta
chr11.fasta chr15.fasta chr18.fasta chr20.fasta chr2_random.fasta chr5_random.fasta chr8_random.fasta chrY.fasta
chr11_random.fasta chr15_random.fasta chr18_random.fasta chr21.fasta chr3.fasta chr6.fasta chr9.fasta
chr12.fasta chr16.fasta chr19.fasta chr21_random.fasta chr3_random.fasta chr6_random.fasta chr9_random.fasta
chr13.fasta chr16_random.fasta chr19_random.fasta chr22.fasta chr4.fasta chr7.fasta chrM.fasta
I need to change the extension ".fasta" to ".fa". I'm trying to write a bash script to do it:
for i in $(ls chr*)
do
NEWNAME = `echo $i | sed 's/sta//g'`
mv $i $NEWNAME
done
But it doesn't work. Can you tell me why, or give another quick solution?
Thanks!

Several mistakes here:
NEWNAME = should be written without spaces. As written, bash looks for a command named NEWNAME, and that fails.
You parse the output of ls. This breaks if any file name contains spaces. Bash can build the list of files itself with the glob operator *.
You don't quote "$i" and "$NEWNAME". If either of them contains a space, mv receives two arguments instead of one.
If a file name begins with a dash, mv will treat it as an option. Use -- to stop option processing.
Try:
for i in chr*
do
mv -- "$i" "${i/%.fasta/.fa}"
done
or
for i in chr*
do
NEWNAME="${i/%.fasta/.fa}"
mv -- "$i" "$NEWNAME"
done
The "${var/%pat/replacement}" form looks for pat only at the end of the variable and replaces it with replacement.
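The difference between the anchored and unanchored forms shows up when the pattern occurs more than once. A small sketch, using a hypothetical file name that contains fasta in two places:

```shell
#!/bin/bash
# Hypothetical name with "fasta" both at the start and in the extension.
i='fasta_chr1.fasta'

first="${i/fasta/fa}"        # replaces the first match anywhere
anchored="${i/%.fasta/.fa}"  # replaces only a match at the end
posix="${i%.fasta}.fa"       # strip the suffix, then append (POSIX sh)

echo "$first"      # prints fa_chr1.fasta
echo "$anchored"   # prints fasta_chr1.fa
echo "$posix"      # prints fasta_chr1.fa
```

The last form avoids the bash-only replacement syntax, which may matter if the script has to run under a plain POSIX shell.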

for f in chr*.fasta; do mv "$f" "${f/%.fasta/.fa}"; done

If you have the rename command, you can do:
rename .fasta .fa chr*.fasta
