bash: using rename to left pad filenames with a zero when their prefix is too short

I'm using a naming convention with number prefixes to track some files. But I am running out of numbers with a 2-digit prefix. So, instead of 11.abc 12.def I want to move to 011.abc 012.def. I already have some 013.xxx 014.yyy.
Trying this in an empty directory:
touch 11.abc 12.def 013.xxx 014.yyy
ls -1 gives:
013.xxx
014.yyy
11.abc
12.def
Try #1:
This should match anything that starts with 2 digits, but not 3.
rename -n 's/^\d\d[^\d]/0$1/' *
Now I was kind of hoping that $1 would hold the match, like 11, with 0$1 giving me 011.
No such luck:
Use of uninitialized value $1 in concatenation (.) or string at (eval 2) line 1.
'11.abc' would be renamed to '0abc'
Use of uninitialized value $1 in concatenation (.) or string at (eval 2) line 1.
'12.def' would be renamed to '0def'
On the positive side, it's willing to leave 013 and 014 alone.
Try #2:
rename -n 's/^\d\d[^\d]/0/' *
'11.abc' would be renamed to '0abc'
'12.def' would be renamed to '0def'
Since this is regex based, can I somehow save the match group 11 and 12?
If I can't use rename I'll probably write a quick Python script. Don't want to loop with mv on it.
And, actually, my naming convention is 2-3 digits followed by a dot, so this is a good match too.
rename -n 's/^\d\d\./<whatever needs to go here>/' *
For what it's worth, I am using the Homebrew version of rename, as I am on a mac.

try this:
rename 's/^(\d{2}\..*)/0$1/' *
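The two attempts in the question failed because $1 is only set when the pattern contains a parenthesised capture group. As a minimal sketch of the same substitution without rename, the snippet below previews the new names with sed. The name preview_padded is made up for this example, and sed -E is assumed to be available (GNU and BSD sed both have it).

```shell
# preview_padded is a hypothetical helper, not part of rename.
# It prints each name given as an argument, adding a leading zero when the
# name starts with exactly two digits followed by a dot.
preview_padded() {
    # The parentheses capture "NN." so that \1 can put it back after the 0.
    printf '%s\n' "$@" | sed -E 's/^([0-9]{2}\.)/0\1/'
}

preview_padded 11.abc 12.def 013.xxx 014.yyy
```

Names that already have three digits pass through unchanged, because the third character is not a dot. Nothing is renamed here; this only prints what the new names would be.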

rename is problematic because it's not part of POSIX (so it isn't normally available on many Unix-like systems), and there are two very different forms of it in widespread use. See Why is the rename utility on Debian/Ubuntu different than the one on other distributions, like CentOS? for more information.
This Bash code does the renaming with mv (which is part of POSIX):
#! /bin/bash -p
shopt -s nullglob # Patterns that match nothing expand to nothing.
for f in [0-9][0-9].* ; do
mv "$f" "0$f"
done
shopt -s nullglob is to prevent problems if the code is run in a directory that has no files that need to be renamed. If nullglob isn't enabled the code would try to rename a file called '[0-9][0-9].*', which would have unwanted consequences whether or not such a file existed.
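If the script has to run under plain sh, where nullglob does not exist, a guard inside the loop gives similar protection. The following is only a sketch under that assumption: pad_prefixes is a hypothetical function name, and the check for an already existing target is an extra precaution that the loop above does not have.

```shell
# pad_prefixes is a hypothetical helper: inside the directory given as $1
# (default: current directory) it renames NN.* to 0NN.*.
# The body runs in a subshell, so its variables do not leak into the caller.
pad_prefixes() (
    dir=${1:-.}
    for f in "$dir"/[0-9][0-9].*; do
        # Without nullglob an unmatched pattern stays literal; skip it.
        [ -e "$f" ] || continue
        target=$dir/0${f##*/}
        if [ -e "$target" ]; then
            printf 'skipping %s: %s already exists\n' "$f" "$target" >&2
            continue
        fi
        mv -- "$f" "$target"
    done
)
```

Calling pad_prefixes with no argument works on the current directory; calling it in a directory with nothing to rename does nothing.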

Related

BASH Shell Find Multiple Files with Wildcard and Perform Loop with Action

I have a script that I call with an application, I can't run it from command line. I derive the directory where the script is called and in the next variable go up 1 level where my files are stored. From there I have 3 variables with the full path and file names (with wildcard), which I will refer to as "masks".
I need to find and "do something with" (copy/write their names to a new file, whatever else) to each of these masks. The do something part isn't my obstacle as I've done this fine when I'm working with a single mask, but I would like to do it cleanly in a single loop instead of duplicating loop and just referencing each mask separately if possible.
Assume in my $FILESFOLDER directory below that I have 2 existing files, aaa0.csv & bbb0.csv, but no file matching the ccc*.csv mask.
#!/bin/bash
SCRIPTFOLDER=${0%/*}
FILESFOLDER="$(dirname "$SCRIPTFOLDER")"
ARCHIVEFOLDER="$FILESFOLDER"/archive
LOGFILE="$SCRIPTFOLDER"/log.txt
FILES1="$FILESFOLDER"/"aaa*.csv"
FILES2="$FILESFOLDER"/"bbb*.csv"
FILES3="$FILESFOLDER"/"ccc*.csv"
ALLFILES="$FILES1
$FILES2
$FILES3"
#here as an example I would like to do a loop through $ALLFILES and copy anything that matches to $ARCHIVEFOLDER.
for f in $ALLFILES; do
cp -v "$f" "$ARCHIVEFOLDER" > "$LOGFILE"
done
echo "$ALLFILES" >> "$LOGFILE"
The thing that really spins my head is when I run something like this (I haven't done it with the copy command in place) that log file at the end shows:
filesfolder/aaa0.csv filesfolder/bbb0.csv filesfolder/ccc*.csv
Where I would expect echoing $ALLFILES just to show me the masks
filesfolder/aaa*.csv filesfolder/bbb*.csv filesfolder/ccc*.csv
In my "do something" area, I need to be able to use whatever method to find the files by their full path/name with the wildcard if at all possible. Sometimes my network is down for maintenance and I don't want to risk failing a change directory. I rarely work in linux (primarily SQL background) so feel free to poke holes in everything I've done wrong. Thanks in advance!

Here's a light refactoring with significantly fewer distracting variables.
#!/bin/bash
script=${0%/*}
folder="$(dirname "$script")"
archive="$folder"/archive
log="$folder"/log.txt # you would certainly want this in the folder, not $script/log.txt
shopt -s nullglob
all=()
for prefix in aaa bbb ccc; do
cp -v "$folder/$prefix"*.csv "$archive" >>"$log" # append, don't overwrite
all+=("$folder/$prefix"*.csv)
done
echo "${all[@]}" >> "$log"
The change in the loop to append the output of cp -v instead of overwriting it is a bug fix; otherwise the log would only contain the output from the last loop iteration.
I would probably prefer to have the files echoed from inside the loop as well, one per line, instead of collecting them all on one humongous line. Then you can remove the array all and instead simply use
printf '%s\n' "$folder/$prefix"*.csv >>"$log"
shopt -s nullglob is a Bash extension (so won't work with sh) which says to discard any wildcard which doesn't match any files (the default behavior is to leave globs unexpanded if they don't match anything). If you want a different solution, perhaps see Test whether a glob has any matches in Bash
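To see the difference nullglob makes, here is a small Bash-only sketch. The helper count_matches and the scratch directory are made up for the demonstration; the ccc*.csv pattern is the one from the question that matches nothing.

```shell
#!/bin/bash
# count_matches is a hypothetical helper: it expands DIR/ccc*.csv into an
# array and prints how many words the expansion produced.
count_matches() {
    local matches=("$1"/ccc*.csv)
    echo "${#matches[@]}"
}

empty=$(mktemp -d)                  # scratch directory with no matching files

shopt -u nullglob
without=$(count_matches "$empty")   # the unmatched pattern survives as one literal word

shopt -s nullglob
with=$(count_matches "$empty")      # the unmatched pattern expands to nothing

echo "without nullglob: $without, with nullglob: $with"
rmdir "$empty"
```

That one leftover literal word is what cp would otherwise be asked to copy, which is why the unmatched mask showed up unexpanded in the log.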
You should use lower case for your private variables so I changed that, too. Notice also how the script variable doesn't actually contain a folder name (or "directory" as we adults prefer to call it); fixing that uncovered a bug in your attempt.
If your wildcards are more complex, you might want to create an array for each pattern.
tmpspaces=(/tmp/*\ *)
homequest=($HOME/*\?*)
for file in "${tmpspaces[@]}" "${homequest[@]}"; do
: stuff with "$file", with proper quoting
done
The only robust way to handle file names which could contain shell metacharacters is to use an array variable; using string variables for file names is notoriously brittle.
Perhaps see also https://mywiki.wooledge.org/BashFAQ/020

How to delete files like 'Incoming11781rKD'

I have a programme that is generating files named like "Incoming11781Arp". The name always starts with Incoming, followed by 5 digits, followed by 3 characters that can be upper-case letters, lower-case letters, digits or _ in any combination, like Incoming11781_pi or Incoming11781rKD.
How can I delete them using a script run from a cron job please? I've tried -
#!/bin/bash
file=~/Mail/Incoming******
rm "$file";
but it failed saying that there was no matching file or directory.

You mustn't double-quote the variable reference for pathname expansion to occur - if you do, the wildcard characters are treated as literals.
Thus:
rm $file
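A quick way to convince yourself of the difference is to count the words the shell produces in each case. This is just a sketch in a scratch directory: count_args and the demo files are invented for the illustration, and it assumes the temporary path contains no whitespace.

```shell
# count_args is a hypothetical helper that reports how many words it received.
count_args() { echo "$#"; }

demo=$(mktemp -d)                       # scratch directory for the demo
touch "$demo/Incoming11781rKD" "$demo/Incoming11781_pi"
file="$demo/Incoming*"                  # the * is stored literally in the variable

quoted=$(count_args "$file")            # passed on as one literal word
unquoted=$(count_args $file)            # expanded to the matching file names

echo "quoted: $quoted, unquoted: $unquoted"
rm -r "$demo"
```

With the quotes, rm would receive a single name containing literal asterisks, which is why it reported that no such file exists.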
Caveat: ~/Mail/Incoming****** doesn't work the way you think it does and will potentially match more files than intended, as it is equivalent to ~/Mail/Incoming*, meaning that any file that starts with Incoming will match.
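To only match files starting with Incoming that are followed by exactly 6 characters, use ~/Mail/Incoming??????, as @Jidder suggests in a comment. (The sample names in the question have 8 characters after Incoming, five digits plus three more, so they would need 8 question marks.)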
Note that you could make your glob (pattern) even more specific:
file=~/Mail/Incoming[0-9][0-9][0-9][0-9][0-9][[:alnum:]_][[:alnum:]_][[:alnum:]_]
See the bash manual for a description of pathname expansion and pattern syntax: http://www.gnu.org/software/bash/manual/bashref.html#index-pathname-expansion.
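Before letting a cron job delete anything, it may be worth listing what a pattern matches. Here is a sketch of such a dry run: list_incoming is a hypothetical helper, and the pattern assumes, as the question describes, that the last three characters can be letters, digits or an underscore.

```shell
# list_incoming is a hypothetical helper: it prints the files in directory $1
# named "Incoming" + 5 digits + 3 characters from [A-Za-z0-9_].
list_incoming() {
    for f in "$1"/Incoming[0-9][0-9][0-9][0-9][0-9][[:alnum:]_][[:alnum:]_][[:alnum:]_]; do
        # An unmatched pattern is left as a literal string; skip it.
        [ -e "$f" ] || continue
        printf '%s\n' "$f"
    done
}
```

Running list_incoming "$HOME/Mail" by hand shows exactly which files the same pattern would hand to rm.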

You can achieve the same effect with the find command...
$ directory="$HOME/Mail/"
$ file_pattern='Incoming*'
$ find "${directory}" -name "${file_pattern}" -delete
The first two lines define the directory and the file pattern separately, the find command will then proceed to delete any matching files inside that directory.
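A slightly more cautious variant, as a sketch: purge_incoming is a made-up name, and -maxdepth and -delete are not in POSIX but are assumed here because both GNU and BSD find provide them. The pattern matches Incoming plus exactly 8 more characters, which is what the sample names look like. Note that a tilde inside quotes is not expanded, so a cron script is safer passing "$HOME/Mail".

```shell
# purge_incoming is a hypothetical helper. It deletes regular files directly
# inside the directory given as $1 whose names are "Incoming" followed by
# exactly 8 characters, printing each path as it is removed.
purge_incoming() {
    # The pattern is quoted so that find, not the shell, does the matching.
    find "$1" -maxdepth 1 -type f -name 'Incoming????????' -print -delete
}
```

Dropping -delete turns it into a dry run that only prints the candidates; subdirectories are left alone because of -maxdepth 1.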

Rename a file in a directory without retyping the directory name

Say we have a file test.txt in my_directory that I want to rename to yeah.txt.
Is there a way with zsh (or even just bash, just to know) to avoid retyping my_directory?
I find the following a bit long:
mv my_directory/test.txt my_directory/yeah.txt
Thanks.

I'd do it with brace expansion:
mv my_directory/{test,yeah}.txt
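If you are unsure what the braces turn into, putting echo in front shows the expanded command without running it. This works in bash and zsh; plain sh implementations such as dash do not do brace expansion.

```shell
#!/bin/bash
# echo prints the words after expansion instead of running mv.
expanded=$(echo mv my_directory/{test,yeah}.txt)
echo "$expanded"
```

The shell rewrites the single braced word into two words before mv ever sees them, so no files need to exist for the preview.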

I have copy-prev-shell-word assigned to ^P
% cp my_directory/test.txt [^P] # expands to....
% cp my_directory/test.txt my_directory/test.txt
Then I just manually edit the last argument. For me this is a better solution than brace expansion, but I reckon it is just a preference.
If interested, you should look at one of these functions (man zshzle):
copy-prev-word (ESC-^_) (unbound) (unbound)
Duplicate the word to the left of the cursor.
copy-prev-shell-word
Like copy-prev-word, but the word is found by using shell parsing,
whereas copy-prev-word looks for blanks. This makes a difference when the
word is quoted and contains spaces.
I use this to bind the function bindkey -M emacs "^p" copy-prev-shell-word

Create a new sequence of files from an existing sequence, along with numbering

I know this question has been asked, but I can't find more than one solution, and it does not work for me. Essentially, I'm looking for a bash script that will take a file list that looks like this:
image1.jpg
image2.jpg
image3.jpg
And then make a copy of each one, but number it sequentially backwards. So, the sequence would have three new files created, being:
image4.jpg
image5.jpg
image6.jpg
And yet, image4.jpg would have been an untouched copy of image3.jpg, and image5.jpg an untouched copy of image2.jpg, and so on. I have already tried the solution outlined in this stackoverflow question with no luck. I am admittedly not very far down the bash scripting path, and if I take the chunk of code in the first listed answer and make a script, I always get "2: Syntax error: "(" unexpected" over and over. I've tried changing the syntax with the ( around a bit, but no success ever. So, either I am doing something wrong or there's a better script around.
Sorry for not posting this earlier, but the code I'm using is:
image=( image*.jpg )
MAX=${#image[*]}
for i in ${image[*]}
do
num=${i:5:3} # grab the digits
compliment=$(printf '%03d' $(echo $MAX-$num | bc))
ln $i copy_of_image$compliment.jpg
done
And I'm taking this code and pasting it into a file with nano, and adding !#/bin/bash as the first line, then chmod +x script and executing in bash via sh script. Of course, in my test runs, I'm using files appropriately titled image1.jpg - but I was also wondering about a way to apply this script to a directory of jpegs, not necessarily titled image(integer).jpg - in my file keeping structure, most of these are a single word, followed by a number, then .jpg, and it would be nice to not have to rewrite the script for each use.

Perhaps something like this. It will work well for something like script image*.jpg where the wildcard matches a set of files which match a regular pattern with monotonically increasing numbers of the same length, and less ideally with a less regular subset of the files in the current directory. It simply assumes that the last file's digit index plus one through the total number of file names is the range of digits to loop over.
#!/bin/sh
# Extract number from final file name
eval lastidx=\${$#} # braces are needed once there are more than 9 arguments
tmp=${lastidx#*[!0-9][0-9]}
lastidx=${lastidx#${lastidx%[0-9]$tmp}}
tmp=${lastidx%[0-9][!0-9]*}
lastidx=${lastidx%${lastidx#$tmp[0-9]}}
num=$(expr $lastidx + $#)
width=${#lastidx}
for f; do
pref=${f%%[0-9]*}
suff=${f##*[0-9]}
# Maybe show a warning if pref, suff, or width changed since the previous file
printf "cp '$f' '$pref%0${width}i$suff'\\n" $num
num=$(expr $num - 1)
done |
sh
This is sh-compatible; the expr stuff and the substring extraction up front is ugly but Bourne-compatible. If you are fine with the built-in arithmetic and string manipulation constructs of Bash, converting to that form should be trivial.
(To be explicit, ${var%foo} returns the value of $var with foo trimmed off the end, and ${var#foo} does similar trimming from the beginning of the value. Regular shell wildcard matching operators are available in the expression for what to trim. ${#var} returns the length of the value of $var.)
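As a small worked example of those operators (the file name image042.jpg is chosen only for illustration), this splits a name into prefix, digits and suffix in the same way the script does:

```shell
f=image042.jpg            # sample name, made up for this illustration

pref=${f%%[0-9]*}         # strip the longest suffix that starts at a digit
suff=${f##*[0-9]}         # strip the longest prefix that ends in a digit
digits=${f#"$pref"}       # drop the prefix ...
digits=${digits%"$suff"}  # ... then the suffix, leaving the digit run
width=${#digits}          # number of characters in the digit run

echo "$pref | $digits | $suff | $width"
```

The doubled %% and ## ask for the longest match; a single % or # would trim the shortest match instead.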

Maybe your real test data runs from 001 to 300, but here you have image1, 2, 3, and therefore you should extract one digit, not three, from the filename: num=${i:5:1}
Integer arithmetic can be done in bash without calling bc.
${#image[@]} is more robust than ${#image[*]}, but it shouldn't make a difference here.
I didn't consult a dictionary, but isn't compliment something for your girl friend? The opposite is complement, isn't it? :)
The other command made links; to make copies, call cp.
Code:
#!/bin/bash
image=( image*.jpg )
MAX=${#image[@]}
for i in "${image[@]}"
do
num=${i:5:1}
complement=$((2*$MAX-$num+1))
cp "$i" "image$complement.jpg"
done
Most important: If it is bash, call it with bash. Best: do a shebang (as you did), make it executable and call it by ./name . Calling it with sh name will force the wrong interpreter. If you don't make it executable, call it bash name.
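For the second part of the question (names that are not necessarily image1.jpg), here is one possible generalisation, offered as a sketch only. mirror_sequence is a hypothetical helper, and it assumes the files are numbered 1..n without zero padding and without gaps.

```shell
# mirror_sequence is a hypothetical helper: given a prefix and a suffix, it
# copies PREFIXk SUFFIX to PREFIX(2n+1-k) SUFFIX for k = 1..n.
# The body runs in a subshell, so n and k do not leak into the caller.
mirror_sequence() (
    prefix=$1
    suffix=$2
    # Count the existing files first, so the new copies do not affect n.
    n=0
    while [ -e "$prefix$((n + 1))$suffix" ]; do
        n=$((n + 1))
    done
    k=1
    while [ "$k" -le "$n" ]; do
        cp -- "$prefix$k$suffix" "$prefix$((2 * n + 1 - k))$suffix"
        k=$((k + 1))
    done
)
```

For the example in the question you would call it as mirror_sequence image .jpg, which makes image4.jpg a copy of image3.jpg and image6.jpg a copy of image1.jpg. It sticks to POSIX arithmetic, so it also runs when started with sh.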

Shell script to iterate through files with only one '4' in the file name

I am trying to iterate through files in the same directory with only one 4 in them.
Here is what I have so far. The problem with my current script is that files with any number of 4's get selected, not files with only one 4.
for i in *4*.cpp;
do
...
Sort of like [!4] but for any number of non 4 characters.
*http://www.tuxfiles.org/linuxhelp/wildcards.html
I want to iterate through file names such as me4.cpp, 4.cpp, and hi4hi.cpp
I want to ignore file names such as lala.cpp, 44.cpp, 4hi4.cpp
Thank you!
Figured it out. I tried [!4]* on a whim.
Oops turned out I didn't. That is interpreted as ([!4]) then (*)

The grep style regex you need is:
^[^4]*4[^4]*$
A bunch of not-4's after the start of the line, a 4, and another bunch of not-4's to the end of the line.
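To try the regex on the names from the question, something like the following should do. only_one_4 is a made-up wrapper name; the list of names is fed in with printf rather than read from a directory.

```shell
# only_one_4 is a hypothetical filter that keeps the input lines
# containing exactly one 4.
only_one_4() {
    grep '^[^4]*4[^4]*$'
}

printf '%s\n' me4.cpp 4.cpp hi4hi.cpp lala.cpp 44.cpp 4hi4.cpp | only_one_4
```

Only me4.cpp, 4.cpp and hi4hi.cpp get through. The regex itself does not check the .cpp extension, so that would have to be added to the pattern or handled by the glob that produces the names.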
In pure shell, consider using a case statement:
for file in *4*.cpp
do
case "$file" in
(*4*4*) : Ignore;;
(*) : Process;;
esac
done
That looks for names containing 4's, and then ignores those containing 2 or more 4's.
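The same case logic can be wrapped in a small test function, sketched here with a hypothetical name, so that it can be reused outside the loop:

```shell
# has_exactly_one_4 is a hypothetical helper: it succeeds (exit status 0)
# only when its argument contains exactly one 4.
has_exactly_one_4() {
    case $1 in
        *4*4*) return 1 ;;   # two or more 4s
        *4*)   return 0 ;;   # at least one, and not two: exactly one
        *)     return 1 ;;   # no 4 at all
    esac
}

for name in me4.cpp 4.cpp hi4hi.cpp lala.cpp 44.cpp 4hi4.cpp; do
    if has_exactly_one_4 "$name"; then
        echo "process $name"
    fi
done
```

The order of the case branches matters: the two-4 pattern has to come first, because *4* would also match those names.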

How about using find
find ./ -regex "<regular expression>"

Assuming bash:
shopt -s extglob
for file in *([^4])4*([^4]).cpp; ...
where *([^4]) means zero or more characters that are not "4"
