Bash change filenames from month name to month number - bash

I have been trying to do this all afternoon and cannot figure it out. I'm running MX Linux and am trying (unsuccessfully) to batch-rename a bunch of files from the command line (I have about 500, so I don't want to do this by hand), from names such as 2020-August-15.pdf to names such as 2020-08-15.pdf.
I cannot find anything that does this (in a way I understand) so am wondering. Is this possible or am I to do this by hand?
Admittedly I'm not very good with Bash but I can use sed, awk, rename, date, etc. I just can't seem to find a way to combine them to rename my files.
I cannot find anything on here that has been of any help in doing this.
Many thanks.
I'm looking for a way to combine commands and ideally not have to overtly for-loop through the files and the months. What I mean is I would prefer, and was trying to, pipe ls into a command combination to convert as specified above. Sorry for the confusion.
Thank you to everyone who came up with answers, and for your patience with my lack of ability. I don't think I'm qualified to decide which answer is best, but for my use case I have settled on the following:
declare -A months=( [January]=01 [February]=02 [March]=03 [April]=04 [May]=05
[June]=06 [July]=07 [August]=08 [September]=09 [October]=10 [November]=11 [December]=12 )
for oldname in 202[01]-[A-Za-z]*-15.pdf; do
  IFS=-. read -r y m d ext <<< "${oldname}"
  mv -- "$oldname" "$y-${months[$m]}-$d.$ext"
done
I think this offers the best flexibility. I would have liked to use the date command, but I don't know how to avoid hard-coding the file extension with it. I was unaware of the read command, and that you could use patterns in the for-loop.
I have learned a lot from this thread so again thank you all. Really my solution is a cross of most of the solutions below as I've taken from them all.
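The approach above can be wrapped in a small function that only computes the new name, which makes it easy to preview the result before anything is renamed. This is a minimal sketch, assuming bash 4 or later (for associative arrays); the function name iso_name is made up for illustration.

```bash
# Sketch only: requires bash 4+ for associative arrays.
declare -A months=( [January]=01 [February]=02 [March]=03 [April]=04
  [May]=05 [June]=06 [July]=07 [August]=08 [September]=09
  [October]=10 [November]=11 [December]=12 )

# iso_name (hypothetical helper): print the numeric-month version of a
# name shaped like YYYY-Month-DD.ext, or fail if the month is unknown.
iso_name() {
  local y m d ext
  IFS=-. read -r y m d ext <<< "$1"
  [[ -n $m ]] || return 1
  [[ -n ${months[$m]:-} ]] || return 1
  printf '%s-%s-%s.%s\n' "$y" "${months[$m]}" "$d" "$ext"
}

# Preview: print the mv commands without running them.
for oldname in 20[0-9][0-9]-[A-Za-z]*-[0-9][0-9].*; do
  [[ -e $oldname ]] || continue
  newname=$(iso_name "$oldname") && echo mv -- "$oldname" "$newname"
done
```

Once the printed commands look right, dropping the echo performs the renames. Names whose month is not in the table are skipped rather than renamed to something with an empty month.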

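With just Bash built-ins, try
months=(
  January February March April May June
  July August September October November December)
for file in ./*; do
  dst=$file
  for ((i=1; i<=${#months[@]}; ++i)); do
    printf -v num '%02d' "$i"
    dst=${dst//${months[i-1]}/$num}
  done
  [[ $file == "$dst" ]] || mv -- "$file" "$dst"
done
This builds up an array of month names, and loops over it to find the correct substitution.
The printf -v num '%02d' "$i" line adds zero padding for single-digit month numbers; substitute $i instead of $num if that's undesired. The padded value goes into a separate variable, num, so that the loop counter i stays a plain number.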
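Zero padding interacts badly with Bash arithmetic, because a leading zero marks a number as octal and 08 and 09 are then rejected. A hedged sketch of one way around this, using a hypothetical helper named pad2:

```bash
# pad2 (hypothetical helper): print a month number as two digits.
# The 10# prefix forces base 10, so inputs such as 08 or 09 are accepted.
pad2() {
  printf '%02d\n' "$((10#$1))"
}

months=(January February March April May June
        July August September October November December)

# Keep the loop counter unpadded and pad only the copy used for output.
for ((i = 1; i <= ${#months[@]}; ++i)); do
  printf '%s -> %s\n' "${months[i-1]}" "$(pad2 "$i")"
done
```

The loop prints each month name next to its padded number without renaming anything, so it can be used to check the mapping first.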
As an aside, you should basically never use ls in scripts.
The explicit loop could be avoided if you had a command which already knows how to rename files, but this implements that command. If you want to save it in a file, replace the hard-coded ./* with "$@", add a #!/bin/bash shebang up top, save it as monthrenamer somewhere in your PATH, and chmod a+x monthrenamer. Then you can run it like
monthrenamer ./*
to rename all the files in the current directory without an explicit loop, or a more restricted wildcard argument to only select a smaller number of files, like
monthrenamer /path/to/files/2020*.pdf
You could run date twelve times to populate the array, but it's not like hard-coding the month names is going to be a problem. We don't expect them to change (and calling twelve subprocesses at startup just to avoid that seems quite excessive in this context).
As an aside, probably try to fix the process which creates these files to produce machine-readable file names. It's fairly obvious to a human, too, that 2021-07 refers to the month of July, whereas going the other way is always cumbersome (you will need to work around it in every tool or piece of code which wants to order the files by name).

Assuming you have the GNU version of date(1), you could use date -d to map the month names to numbers:
for f in *.pdf; do
  IFS=- read y m d <<<"${f%.pdf}"
  mv "$f" "$(date -d "$m $d, $y" +%F.pdf)"
done
I doubt it's any more efficient than your sed -e 's/January/01/' -e 's/February/02/' etc, but it does feel less tedious to type. :)
1. Loop over the .pdf files, setting f to each filename in turn.
2. The read line is best explained right to left:
a. "${f%.pdf}" expands to the filename without the .pdf part, e.g. "2020-August-15".
b. <<< turns that value into a here-string, which is a mechanism for feeding a string as standard input to some command. Essentially, x <<<y does the same thing as echo y | x, with the important difference that the x command is run in the current shell instead of a subshell, so it can have side effects like setting variables.
c. read is a shell builtin that by default reads a single line of input and assigns it to one or more shell variables.
d. IFS is a parameter that tells the shell how to split lines up into words. Here we're setting it – only for the duration of the read command – to -. That tells read to split the line it reads on hyphens instead of whitespace; IFS=- read y m d <<<"2020-August-15" assigns "2020" to y, "August" to m, and "15" to d.
3. The GNU version of date(1) has a -d parameter that tells it to display another date instead of the current one. It accepts a number of different formats itself, sadly not including "yyyy-Mon-dd", which is why I had to split the filename up with read. But it does accept "Mon dd, yyyy", so that's what I pass to it. +%F.pdf tells it that when it prints the date back out it should do so ISO-style as "yyyy-mm-dd", and append ".pdf" to the result. ("%F" is short for "%Y-%m-%d"; I could also have used -I instead of +anything and moved the .pdf outside the command expansion.)
4. The call to date is wrapped in $(...) to capture its output, and that result is used as the second parameter to mv to rename the files.
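The difference between a here-string and a pipe described above can be observed directly. A small sketch, assuming bash with its default settings (in particular, the lastpipe option is off); the variable names are arbitrary:

```bash
name=2020-August-15.pdf

# Pipeline: read runs in a subshell, so its variables vanish afterwards.
unset py pm pd
printf '%s\n' "${name%.pdf}" | IFS=- read -r py pm pd

# Here-string: read runs in the current shell, so y, m and d survive.
IFS=- read -r y m d <<< "${name%.pdf}"

printf 'pipe: [%s] here-string: [%s]\n' "${pm:-}" "$m"
```

The first pair of brackets comes out empty because pm was only ever set inside the subshell; the second shows the month name captured by the here-string version.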

Another way with POSIX shell (the date --date option still requires GNU date):
# Iterate over pattern that will exclude already renamed pdf files
for file in [0-9][0-9][0-9][0-9]-[!0-9]*.pdf
do
  # Remove echo if result matches expectations
  echo mv -- "$file" "$(
    # Set field separator to - or . to split filename components
    IFS=-.
    # Transfer filename components into arguments using IFS
    set -- $file
    # Format numeric date string
    date --date "$3 $2 $1" '+%Y-%m-%d.pdf'
  )"
done
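The IFS plus set -- idiom used in this answer can be tried in isolation, without touching any files or calling date. A minimal sketch that should work in any POSIX shell; the function name split_name is invented here:

```bash
# split_name (hypothetical helper): split NAME on '-' and '.' and print
# the first three fields in the order "day month year".
split_name() (
  # The ( ) function body runs in a subshell, so IFS, set -f and the
  # positional parameters do not leak into the caller.
  IFS=-.
  set -f            # disable globbing while $1 is expanded unquoted
  set -- $1
  printf '%s %s %s\n' "$3" "$2" "$1"
)

split_name 2020-August-15.pdf   # prints: 15 August 2020
```

The reordered string is what date --date receives in the loop, which is why the month name can be parsed even though the original filename layout cannot.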

If you are using GNU utilities and the Perl version of rename (not the util-linux version), you can build a one-liner quite easily:
rename "$(
seq -w 1 12 |
LC_ALL=C xargs -I# date -d 1970-#-01 +'s/^(\d{4}-)%B(-\d{2}\.pdf)$/$1%m$2/;'
)" *.pdf
You can shorten if you don't care about safety (or legibility)... :-)
rename "$(seq -f%.f/1 12|date -f- +'s/%B/%m/;')" *.pdf
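To see what the command substitution hands to rename, the generator can be run on its own. This sketch assumes GNU date and seq; the function name month_rules is made up, and it uses the simple unanchored s/Month/NN/ form rather than the anchored expression. It prints one substitution per month and renames nothing.

```bash
# month_rules (hypothetical helper): emit one Perl substitution per month,
# such as s/January/01/; LC_ALL=C keeps the month names in English.
month_rules() {
  seq -w 1 12 |
    LC_ALL=C xargs -I# date -d 1970-#-01 +'s/%B/%m/;'
}

month_rules
```

The Perl rename also accepts -n, which reports what would be renamed without doing it, so the generated expression can be tested on real files before committing.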

What I mean is I would prefer, and was trying to, pipe ls into a command combination to convert as specified above.
Well, you may need to implement that command combination then. Here’s one consisting of a single “command” and in pure Bash without external processes. Pipe your ls output into that and, once satisfied with the output, remove the final echo…
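declare -Ar MONTHS=(
  [January]=01 [February]=02 [March]=03 [April]=04 [May]=05 [June]=06
  [July]=07 [August]=08 [September]=09 [October]=10 [November]=11 [December]=12
)
while IFS= read -r path; do
  IFS=- read -ra segments <<<"$path"
  segments[1]=${MONTHS[${segments[1]}]}  # replace the month name with its number
  IFS=- new_path="${segments[*]}"
  echo mv "$path" "$new_path"
done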

What is working for me in Mac OS 12.5 with GNU bash, version 3.2.57(1)-release (arm64-apple-darwin21)
is the following :
for f in *.pdf; do mv "$f" "$(echo "$f" | sed -e 's/Jan/-01-/gi' -e 's/Feb/-02-/gi' -e 's/Mar/-03-/gi' -e 's/Apr/-04-/gi' -e 's/May/-05-/gi' -e 's/jun/-06-/gi' -e 's/Jul/-07-/gi' -e 's/Aug/-08-/gi' -e 's/Sep/-09-/gi' -e 's/Oct/-10-/gi' -e 's/Nov/-11-/gi' -e 's/Dec/-12-/gi' )"; done
Note that in my case the original file names had the month expressed in three letters:


bash: rename files dropping a specific delimited part of the filename

I've been trying to find an efficient way to rename lots of files, by removing a specific component of the filename, in bash shell in linux. Filenames are like:
I want to remove the 2nd to last element entirely, resulting in:
I've seen suggestions to use perl-rename, that might handle this (I'm not clear), but this system does not have perl-rename available. (Has GNU bash 4.2, and rename from util-linux 2.23)
I like extended globbing and parameter parsing for things like this.
$: shopt -s extglob
$: n=DATA_X3.A2022086.40e50s.231.2022087023101.csv
$: echo ${n/.+([0-9]).csv/.csv}
So ...
for f in *.csv; do mv "$f" "${f/.+([0-9]).csv/.csv}"; done
This assumes all the files in the local directory, and no other CSV files with similar formatting you don't want to rename, etc.
In the more general case where the .csv does not immediately follow the component to be removed, is there a way to drop the nth dot-separated component of the filename (without a more complicated sequence to string-split in bash, which always seems cumbersome, and rebuild the filename)?
There is usually a way. If you know which field needs to be removed -
$: ( IFS=. read -ra line <<< "$n"; unset line[4]; IFS=".$IFS"; echo "${line[*]}" )
Breaking that out:
( # open a subshell to localize IFS
IFS=. read -ra line <<< "$n"; # inline set IFS to . to parse to fields
unset line[4]; # unset the desired field from the array
IFS=".$IFS"; # prepend . as the OUTPUT separator
echo "${line[*]}" # reference with * to reinsert
) # closing the subshell restores IFS
I will confess I am not certain why the inline setting of IFS doesn't work on the reassembly. /shrug
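A likely explanation for the reassembly behaviour: in a command such as IFS=. echo "${line[*]}", the shell expands "${line[*]}" before the temporary assignment takes effect, so the join still uses the old IFS. With read, the builtin itself consults IFS while it runs, which is why the prefix form works there. One way to avoid both the subshell and the global change is a function with a local IFS. A sketch, with the hypothetical name drop_field:

```bash
# drop_field (hypothetical helper): print NAME without its dot-separated
# field number INDEX (counting from 0).
drop_field() {
  local name=$1 index=$2
  local IFS=.                  # local: the caller's IFS is untouched
  local -a parts
  read -ra parts <<< "$name"   # split on dots
  unset "parts[$index]"        # quoted so the brackets are not globbed
  printf '%s\n' "${parts[*]}"  # [*] joins with the first character of IFS
}

drop_field DATA_X3.A2022086.40e50s.231.2022087023101.csv 4
```

In a rename loop this could be used as mv -- "$f" "$(drop_field "$f" 4)", ideally with an echo in front for a first dry run.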
This is a simple split/drop-field/reassemble, but I think it may be an X/Y Problem
If what you are doing is dropping the one field that has the date/timestamp info, then as long as the format of that field is consistent and unique, it's probably easier to use a version of the first approach.
Is it possible you meant for DATA_X3.A2022086.40e50s.231.2022087023101.csv's 5th field to be 20220807023101? i.e., August 7th of 2022 @ 02:31:01 AM? Because if that's what you mean, and it's supposed to be 14 digits instead of 13, and that is the only field that is always supposed to be exactly 14 digits, then you don't need shopt and can leave the field position floating -
$: n=DATA_X3.A2022086.40e50s.231.20220807023101.csv
$: echo ${n/.[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]./.}

Combining variable concatenation and for loops in bash

I have this function in R, which I use to produce a list of dates:
#!/usr/bin/env Rscript
library(lubridate)  # provides ymd()
date_seq = function(){
  args = commandArgs(trailingOnly = TRUE)
  days = seq(ymd(args[1]), ymd(args[2]), 1)
  days = format(days, "%Y%m%d")
  print(days)
}
date_seq()
I call this function in a bash script to create a vector of dates:
Rscript date_seq.R 20160730 20160801 > dates
I define a couple of other string variables in the bash script:
The final goal is to create a vector of download links, that incorporates the three variables home_url, date and file_name, like so:
I tried a few lines in bash script:
for date in $dates; do download_url=$home_url$date$hrrr_file; cat $download_url; done
for date in $dates; do download_url="${home_url}${date}${hrrr_file}"; cat $download_url; done
for date in $dates; do download_url="$home_url"; download_url+="$date"; download_url+="$hrrr_file"; cat $download_url; done
None of these produce the output I expect. I am not sure if the download_url variable is not being produced, or is being produced and stored somewhere, and I am not able to reproduce it. Can anyone please help me understand?
Results of trying the suggestions below:
@triplee suggested using
sed "s#.*#$home_url&$hrrr_file#" "dates"
while read -r date; do printf '%s%s%s\n' "$home_url" "$date" "$hrrr_file"; done <dates
Both of these produce this output: [1] "20160730" "20160731" "20160801"/hrrr.t{00-23}z.wrfsfcf00.grib2
@xdhmoore suggested using
for date in $(cat dates); do echo "${home_url}${date}${hrrr_file}"; done
which produces this output: [1]/hrrr.t{00-23}z.wrfsfcf00.grib2"20160730"/hrrr.t{00-23}z.wrfsfcf00.grib2"20160731"/hrrr.t{00-23}z.wrfsfcf00.grib2"20160801"/hrrr.t{00-23}z.wrfsfcf00.grib2
Neither is the output I am expecting, though the solution by @xdhmoore is closer. But I see another problem in @xdhmoore's solution: the quotation marks around the dates in the output. The output of cat dates looks like this: "20160730" "20160731" "20160801", so I think I have to rework the function, or the way I call it in the bash script, as well.
I'll keep updating the question to reflect the output of all suggestions, since it is simpler to do so than trying to answer each comment. As always, thanks a lot!
The for statement loops over the tokens you give it as arguments, not the contents of files.
You seem to be looking for
sed "s#.*#$home_url&$hrrr_file#" "dates"
The token & recalls the text which was matched by the regex in a sed substitution.
The same thing could be done vastly more slowly with a shell loop;
while read -r date; do
printf '%s%s%s\n' "$home_url" "$date" "$hrrr_file"
done <dates
which illustrates how to (slowly) iterate over the lines in a file without the use of external utilities.
Either of these can be piped to xargs curl (or perhaps xargs -n 1 curl); or you could refactor the while loop:
while read -r date; do
curl "$home_url$date$hrrr_file"
done <dates
As noted in comments, cat is a command for copying files, not echoing text; for the latter, use echo or (for any nontrivial formatting) printf.
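The read loop can also be packaged as a filter that takes the two fixed parts as arguments, which makes it easy to test with made-up input before pointing it at real data. A minimal sketch; build_urls is a hypothetical name and the URL below is a placeholder, not the real download location:

```bash
# build_urls (hypothetical helper): read one date per line on stdin and
# print PREFIX + date + SUFFIX for each.
build_urls() {
  prefix=$1 suffix=$2
  while IFS= read -r date; do
    [ -n "$date" ] || continue        # skip blank lines
    printf '%s%s%s\n' "$prefix" "$date" "$suffix"
  done
}

# Placeholder values; the real home_url and hrrr_file come from your script.
home_url='https://example.invalid/data/'
hrrr_file='/hrrr.t00z.wrfsfcf00.grib2'

printf '%s\n' 20160730 20160731 20160801 | build_urls "$home_url" "$hrrr_file"
```

With a real dates file the last line would become build_urls "$home_url" "$hrrr_file" <dates, and the output could then be piped to xargs -n 1 curl as suggested.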
Update: The above assumes your R output generated one date per line. To split the file into lines and remove quotes around the values, you can preprocess with sed 's/"\([^"]*\)" */\1\n/g' "dates" (provided your sed dialect supports \n as an escape for newline); or perhaps do
sed "s#\"\([^\"]*\)\" *#$home_url\\1$hrrr_file\\
#g" "dates"
again with some reservation for differences between sed dialects. In the worst case, maybe switch to Perl, which actually brings some relief to the backslashitis, but requires new backslashes in other places:
perl -pe "s#\"(\d+)\" *#$home_url\$1$hrrr_file\n#g" "dates"
But probably a better solution is to change your R script so it doesn't produce wacky output. Or just don't use R in the first place; Perl can generate the dates too. Or if you have GNU date, try
start=$(date -d "$1" +%s)
end=$(date -d "$2" +%s)
for ((i=start; i<=end; i+=60*60*24)); do
  date -d "@$i" +%Y%m%d
done
(If you are on a Mac or similar, the date program won't accept a date as an argument to -d and you will have to use slightly different syntax. It's not hard to do but this answer has too many speculations already.)
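If the dates file has to be used as it is, the quoted single-line R output can be turned into one bare date per line with standard tools only. A hedged sketch that avoids the \n escape in sed; unquote_dates is a hypothetical name and it assumes the dates are always eight digits:

```bash
# unquote_dates (hypothetical helper): turn R's printed vector, e.g.
#   [1] "20160730" "20160731" "20160801"
# into one bare date per line, dropping the [1] index marker.
unquote_dates() {
  tr -s ' ' '\n' | sed -n 's/^"\([0-9]\{8\}\)"$/\1/p'
}

printf '%s\n' '[1] "20160730" "20160731" "20160801"' | unquote_dates
```

On the R side, printing with cat(days, sep = "\n") instead of the default print should produce the bare one-per-line form directly, which removes the need for any post-processing.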

How do I use `sed` to alter a variable in a bash script?

I'm trying to use enscript to print PDFs from Mutt, and hitting character encoding issues. One way around them seems to be to just use sed to replace the problem characters: sed -ir 's/[“”]/"/g' {input}
My test input file is this:
“very dirty”
I'm hoping to get "very dirty" and we're, but instead I'm still getting
â\200\234very dirtyâ\200\235
I found a nice little post on printing to PDFs from Mutt that I used as a starting point. I have a bash script that I point to from my .muttrc with set print_command="$HOME/.mutt/" -- the script currently reads about like this:
input="$1" pdir="$HOME/Desktop" open_pdf=evince
# Straighten out curly quotes
sed -ir 's/[“”]/"/g' $input
sed -ir "s/[’]/'/g" $input
tmpfile="`mktemp $pdir/mutt_XXXXXXXX.pdf`"
enscript --font=Courier8 $input -2r --word-wrap --fancy-header=mutt -p - 2>/dev/null | ps2pdf - $tmpfile
$open_pdf $tmpfile >/dev/null 2>&1 &
sleep 1
rm $tmpfile
It does a fine job of creating a PDF (and works fine if you give it a file as an argument) but I can't figure out how to fix the curly quotes.
I've tried a bunch of variations on the sed line:
input=sed -r 's/[“”]/"/g' $input
$input=sed -ir "s/[’]/'/g" $input
Per the suggestion at Can I use sed to manipulate a variable in bash? I also tried input=$(sed -r 's/[“”]/"/g' <<< $input) and I get an error: "Syntax error: redirection unexpected"
But none manages to actually change $input -- what is the correct syntax to change $input with sed?
Note: I accepted an answer that resolved the question I asked, but as you can see from the comments there are a couple of other issues here. enscript is taking in a whole file as a variable, not just the text of the file. So trying to tweak the text inside the file is going to take a few extra steps. I'm still learning.
On Editing Variables In General
BashFAQ #21 is a comprehensive reference on performing search-and-replace operations in bash, including within variables, and is thus recommended reading. On this particular case:
Use the shell's native string manipulation instead; this is far higher performance than forking off a subshell, launching an external process inside it, and reading that external process's output. BashFAQ #100 covers this topic in detail, and is well worth reading.
Depending on your version of bash and configured locale, it might be possible to use a bracket expression (ie. [“”], as your original code did). However, the most portable thing is to treat “ and ” separately, which will work even without multi-byte character support available.
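input='“hello ’cruel’ world”'
input=${input//“/\"}   # left curly double quote
input=${input//”/\"}   # right curly double quote
input=${input//’/\'}   # curly apostrophe
printf '%s\n' "$input"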
...correctly outputs:
"hello 'cruel' world"
On Using sed
To provide a literal answer -- you almost had a working sed-based approach in your question.
input=$(sed -r 's/[“”]/"/g' <<<"$input")
...adds the missing syntactic double quotes around the parameter expansion of $input, ensuring that it's treated as a single token regardless of how it might be string-split or glob-expanded.
But All That May Not Help...
The below is mentioned because your test script is manipulating content passed on the command line; if that's not the case in production, you can probably disregard the below.
If your script is invoked as ./yourscript “hello * ’cruel’ * world”, then information about exactly what the user entered is lost before the script is started, and nothing you can do here will fix that.
This is because $1, in that scenario, will only contain “hello; ’cruel’ and world” are in their own argv locations, and the *s will have been replaced with lists of files in the current directory (each such file substituted as a separate argument) before the script was even started. Because the shell responsible for parsing the user's command line (which is not the same shell running your script!) did not recognize the quotes as valid at the time when it ran this parsing, by the time the script is running, there's nothing you can do to recover the original data.
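The argument splitting described above is easy to confirm, because the shell treats curly quotes as ordinary characters rather than as quoting. A tiny sketch; count_args is a made-up helper:

```bash
# count_args (hypothetical helper): report how many arguments arrived.
count_args() {
  printf '%s\n' "$#"
}

count_args “hello ’cruel’ world”      # curly quotes do not group words: prints 3
count_args '“hello ’cruel’ world”'    # real single quotes keep it together: prints 1
```

Only the second call delivers the whole string as $1, which is the form a script needs if it is going to clean the text itself.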
Abstract: The way to use sed to change a variable is explored, but what you really need is a way to use and edit a file. It is covered ahead.
The (two) sed line(s) could be solved with this (note that -i is not used, it is not a file but a value):
input='“very dirty”'
sed 's/[“”]/\"/g;s/’/'\''/g' <<<"$input"
But it should be faster (for small strings) to use the internals of the shell:
input='“very dirty”'
input=${input//[“”]/\"}
input=${input//’/\'}
printf '%s\n' "$input"
But there is an underlying problem with your script, you are trying to clean an input received from the command line. You are using $1 as the source of the string. Once somebody writes:
./script “very dirty”
That input is lost. It is broken into shell's tokens and "$1" will be “very only.
But I do not believe that is what you really have.
However, you are also saying that the input comes from a file. If that is the case, then read it in with:
input="$(<infile)" # not $1
sed 's/[“”]/\"/g;s/’/'\''/g' <<<"$input"
Or, if you don't mind editing (changing) the file, do this instead:
sed -i 's/[“”]/\"/g;s/’/'\''/g' infile
Or, if you are clear and certain that what is being given to the script is a filename, like:
./script infile
You can use:
sed -i 's/[“”]/\"/g;s/’/'\''/g' "$infile"
Other comments:
Quote your variables.
Do not use the very old `…` syntax, use $(…) instead.
Do not use variables in UPPER case, those are reserved for environment variables.
And (unless you actually meant sh) use a shebang (first line) that targets bash.
The command enscript most definitely requires a file, not a variable.
Maybe you should use evince to open the PS file; there is no need for the step that makes a PDF, unless you know you really need it.
I believe it is better to use a file to store the output of enscript and ps2pdf.
Do not hide the errors printed by the commands until everything is working as desired, then, just call the script as:
./script infile 2>/dev/null
Or as required to make it less verbose.
Final script.
If you call the script with the name of the file that enscript is going to use, something like:
./script infile
Then, the whole script will look like this (runs in both bash and sh):
#!/usr/bin/env bash
Usage(){ echo "$0: This script requires a source file"; exit 1; }
[ $# -lt 1 ] && Usage
[ ! -e "$1" ] && Usage
infile="$1"
pdir="$HOME/Desktop"
open_pdf=evince
# Straighten out curly quotes
sed -i 's/[“”]/\"/g;s/’/'\''/g' "$infile"
tmpfile="$(mktemp "$pdir"/mutt_XXXXXXXX.pdf)"
outfile="${tmpfile%.pdf}.ps"   # assumed name for the intermediate PostScript file
enscript --font=Courier10 "$infile" -2r \
--word-wrap --fancy-header=mutt -p "$outfile"
ps2pdf "$outfile" "$tmpfile"
"$open_pdf" "$tmpfile" >/dev/null 2>&1 &
sleep 5
rm "$tmpfile" "$outfile"

automatically renaming files

I have a bunch of files (more than 1000) like the following:
$ ls
I have to rename these files by adding learners right before the capitalized name. For example
would change to
and this one
would change to
Any ideas how to do this automatically?
for f in org.*; do
  echo mv "$f" "$( sed 's/\.\([A-Z]\)/.learner.\1/' <<< "$f" )"
done
This short loop outputs an mv command that renames the files in the manner that you wanted. Run it as-is first, and when you are certain it's doing what you want, remove the echo and run again.
The sed bit in the middle takes a filename ($f, via a here-string, so this requires bash) and replaces the first occurrence of a capital letter after a dot with .learner. followed by that same capital letter.
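The substitution can be checked on a single name before it is used in the loop. A minimal sketch that feeds the name through a pipe instead of a here-string, so it also works in plain sh; add_learner is a hypothetical helper and the file name is invented for illustration:

```bash
# add_learner (hypothetical helper): print NAME with ".learner" inserted
# before the first dot-separated component that starts with a capital.
add_learner() {
  printf '%s\n' "$1" | sed 's/\.\([A-Z]\)/.learner.\1/'
}

add_learner org.allenai.ari.solvers.termselector.Foo.lc
```

Only the first capitalized component is affected, because the substitution has no g flag; a name without any capitalized component passes through unchanged.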
There is a tool called perl-rename, sometimes rename. Not to be confused with rename from util-linux.
It's very good for tasks like this as it takes a perl expression and renames accordingly:
perl-rename 's/(?=\.[A-Z])/.learners/' *
You can play with the regex online
Alternatively, you can use a for loop and $BASH_REMATCH:
for file in *; do
  [ -e "$file" ] || continue
  [[ "$file" =~ ^([^A-Z]*)(.*)$ ]]
  mv -- "$file" "${BASH_REMATCH[1]}learners.${BASH_REMATCH[2]}"
done
A very simple approach (useful if you only need to do this one time) is to list the files into a text file with ls >dummy, and then use find/replace in a text editor to make lines of the form mv xxx.yyy xxx.learners.yyy. Then you can simply execute the resulting file with sh ./dummy.
The exact find/replace commands depend on the text editor you use, but something like
replace org. with mv org.. That gets you the mv in the beginning.
replace mv org.allenai.ari.solvers.termselector.$1 with mv org.allenai.ari.solvers.termselector.$1 org.allenai.ari.solvers.termselector.learner.$1 to duplicate the filename and insert the learner.
There is also syntax with a for, which can do it probably in one line, (long) but I cannot explain it - try help for if you want to learn about it.

How do I list files in a folder showing name, time and number of rows in a line with bash?

How do I merge ls with wc -l to get the name of a file, modification time and number of rows in a file?
There are a number of ways you can approach this from the shell or your programming language of choice, but there's really no "right" way to do this, since you need to both stat and read each file in order to form your custom output. You can do this without pipelines inside a basic for-loop by using command substitution:
custom_ls () {
  for file in "$@"; do
    echo "$file, $(date -r "$file" '+%T'), $(wc -l < "$file")"
  done
}
This will generate output like this:
$ custom_ls .git*
.gitconfig, 14:02:56, 44
.gitignore, 17:07:13, 21
There are certainly other ways to do it, but command substitution allows the intent of the format string to remain short and clear, without complex pipelines or temporary variables.
