How to remove suffix from file in Bash? - bash

I found out that you can apparently add a suffix on the fly; this is useful for me to disable a service that relies on the existence of a file extension.
$ mv -v file.txt{,.bak}
renamed 'file.txt' -> 'file.txt.bak'
But how can one do the reverse? What command do I have to use?
$ mv -v file.txt.bak{??}
renamed 'file.txt.bak' -> 'file.txt'

Your command (mv -v file.txt{,.bak}) relies on Bash brace expansion, which translates file.txt{,.bak} to file.txt file.txt.bak. Check out the Bash manpage section for "Brace Expansion". In this case ({,.bak}), there are two strings in the comma-separated list: an empty string and .bak. As such, you can add extensions and the like.
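To see what the shell actually passes to mv, you can echo the same brace expression:
$ echo file.txt{,.bak}
file.txt file.txt.bak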
Several examples come to mind for removing extensions using brace expansions (all expanding to mv -iv file.txt.bak file.txt):
mv -iv file.txt{.bak,}
f=file.txt.bak; mv -iv ${f%.bak}{.bak,}, which doesn't presuppose the filename preceding the ".bak" extension (in the Bash manpage, see "Remove matching suffix pattern")
f=file.txt.bak; mv -iv ${f%.*}{.${f##*.},}, which doesn't presuppose any specific extension (in the Bash manpage, additionally see "Remove matching prefix pattern")
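As a quick sanity check of those expansions, echo shows the words they produce without touching any files:
$ f=file.txt.bak
$ echo "${f%.bak}" "${f%.*}" "${f##*.}"
file.txt file.txt bak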
As an alternative, and as another contributor suggests, rename is a useful and powerful utility (e.g., rename 's/\.bak$//' *.bak).
P.S. Whenever documenting or scripting variabilized examples, I generally recommend quoting for whitespace safety (e.g., mv -iv "${f%.*}"{."${f##*.}",}).

Why not use the rename command?
rename '.bak' '' fileNames

You can use:
file.txt{.bak,}
In this way the .bak before the comma is replaced by the empty string after the comma.

Related

How to remove all file extensions in bash?

x=./gandalf.tar.gz
noext=${x%.*}
echo $noext
This prints ./gandalf.tar, but I need just ./gandalf.
I might have even files like ./gandalf.tar.a.b.c which have many more extensions.
I just need the part before the first .
If you want to give sed a chance then:
x='./gandalf.tar.a.b.c'
sed -E 's~(.)\..*~\1~g' <<< "$x"
./gandalf
Or a 2-step process in bash:
y="${x#./}"
echo "./${y%%.*}"
./gandalf
Using extglob shell option of bash:
shopt -s extglob
x=./gandalf.tar.a.b.c
noext=${x%%.*([!/])}
echo "$noext"
This deletes the substring not containing a / character, after and including the first . character. Also works for x=/pq.12/r/gandalf.tar.a.b.c
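To double-check the path case mentioned above, a small sketch:
shopt -s extglob
x=/pq.12/r/gandalf.tar.a.b.c
echo "${x%%.*([!/])}"   # prints /pq.12/r/gandalf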
Perhaps a regexp is the best way to go if your bash version supports it, as it doesn't fork new processes.
This regexp works with any prefix path and takes into account files with a dot as first char in the name (hidden files):
[[ "$x" =~ ^(.*/|)(.[^.]*).*$ ]] && \
noext="${BASH_REMATCH[1]}${BASH_REMATCH[2]}"
Regexp explained
The first group captures everything up to and including the last / (regexps are greedy in bash), or nothing if there is no / in the string.
Then the second group captures everything up to the first ., excluded.
The rest of the string is not captured, as we want to get rid of it.
Finally, we concatenate the path and the stripped name.
Note
It's not clear what you want to do with files beginning with a . (hidden files). I modified the regexp to preserve that . if present, as it seemed the most reasonable thing to do. E.g.
x="/foo/bar/.myinitfile.sh"
becomes /foo/bar/.myinitfile.
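A quick way to verify that behaviour with the same regex and the hidden-file example:
x="/foo/bar/.myinitfile.sh"
[[ "$x" =~ ^(.*/|)(.[^.]*).*$ ]] && echo "${BASH_REMATCH[1]}${BASH_REMATCH[2]}"
# prints /foo/bar/.myinitfile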
If performance is not an issue, for instance something like this:
fil=$(basename "$x")
noext="$(dirname "$x")"/${fil%%.*}

Downloading a list of files with WGET - rename files up to .jpg, i.e. get rid of extraneous text

My problem is pretty straightforward to understand.
I have images.txt, which is a list of line-separated URLs pointing to .jpg files, formatted as follows:
https://region.URL.com/files/2/2f/dir/2533x1946_IMG.jpg?Tag=2&Policy=BLAH__&Signature=BLAH7-BLAH-BLAH__&Key-Pair-Id=BLAH
I'm able to successfully download with wget -i, but the files are named like 2533x1946_IMG.jpg?BLAH_BLAH_BLAH_BLAH when I need them named like this instead: 2533x1946_IMG.jpg
Note that I've already tried the popular solutions to no avail (see below), so I'm thinking more along the lines of a solution that would involve sed, grep and awk
wget --content-disposition -i images.txt
wget --trust-server-names -i images.txt
wget --metalink-over-http --trust-server-names --content-disposition -i images.txt
wget --trust-server-names --content-disposition -i images.txt
and more iterations like this based on those three flags....
I'd ideally like to do it with one command, but even if it's a matter of downloading the files as-is and later doing a recursive command that renames them to the 2533x1946_IMG.jpg format is acceptable too.
1) You can use rename in a one-liner to rename all the files:
rename -n 's/[?].*//' *_BLAH
rename uses the following syntax: 's/selectedString/whatYouChange/'
rename applies the regex to each filename it is given and renames them in a loop. Because your names are very specific, the part to select is easy to pick out: you select the char ?, and because ? has a special meaning in regex, you put it in brackets [ ]; end result: [?].
The -n argument shows you what is going to change without actually changing anything. Delete -n and the changes will be applied.
.* selects everything after the char ?, so BLAH_BLAH_BLAH_BLAH.
// removes what you selected, because there is nothing between the slashes to replace it with.
*_BLAH selects all files that end with _BLAH; you could use *, but maybe you have other files or folders in that same place, so it's safer this way.
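For context, a minimal end-to-end sketch of that approach; the *.jpg\?* glob is an assumption about how the saved names look, based on the question's example:
# download the files as-is, then strip everything from the first '?' onward
wget -i images.txt
rename -n 's/[?].*//' *.jpg\?*   # preview only; drop -n to actually rename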
Alternatively, you can rename the downloaded files with find and a small bash loop:
find . \
-name '*[?]*' \
-exec bash -c $'for f; do mv -- "$f" "${f%%\'?\'*}"; done' _ {} +
Why *[?]*? That prevents the ? from being treated as a single-character wildcard, and instead ensures that it only matches itself.
Why $'...\'?\'...'? The $'...' ANSI-C-style string quoting form allows backslash escapes to be able to specify literal ' characters even inside a single-quoted string.
Why bash -c '...' _ {} +? Unlike approaches that substitute the filenames that were found into the code to be executed, this keeps those names out-of-band from the code, preventing shell injection attacks via hostile filenames. The _ placeholder fills in $0, so subsequent arguments become $1 and onward; and the for loop iterates over them (for f; do is the same as for f in "$@"; do).
What does ${f%%'?'*} do? This parameter expansion expands $f with the longest possible string matching the glob-style/fnmatch pattern '?'* removed from the end.
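To see what that expansion does in isolation, here's a tiny check (the filename is shortened from the question's example):
f='2533x1946_IMG.jpg?Tag=2&Policy=BLAH'
echo "${f%%\?*}"   # prints 2533x1946_IMG.jpg -- everything from the first ? onward is removed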

Replacing multiple preceding numbers from files

Good day,
I have a bunch of files that need to be batch renamed like so:
01-filename1.txt > filename1.txt
02-filename2.txt > filename2.txt
32-filename3.txt > filename3.txt
322-filename4.txt > filename4.txt
31112-filename5.txt > filename5.txt
I ran into an example of achieving this using the bash ${string#substring} string operation, so this almost works:
for i in `ls`; do mv $i ${i#[0-9]}; done
However, this removes only a single digit, and adding the regex '+' does not seem to work. Is there a way to strip ALL preceding digit characters?
Thank you!
With Perl's standalone rename command:
rename -n 's/.*?-//' *.txt
If output looks okay, remove -n.
See: The Stack Overflow Regular Expressions FAQ
If you have a single character that always marks the end of the prefix, Pattern Matching makes it very simple.
for f in *; do
mv -nv "$f" "${f#*-}";
done;
Things worth noting:
In your case, the use of ls does not cause problems, but for a more generalized solution, certain filenames would break it. Additionally, the lack of quotes around parameter expansions would cause issues for files with newlines, spaces or tabs in them.
The pattern *- matches any string ending with -. Combined with lazy prefix removal (one # instead of two), this leads to ${f#*-} evaluating to "$f" with the shortest prefix ending in - removed (if one exists).
Bash's pattern matching is different from and inferior to RegEx, but you can get a little more power by enabling extended pattern matching with shopt -s extglob. Some distributions have this enabled by default.
Also, I threw the -nv flags in mv to ensure no mishaps when playing around with parameter expansion.
More Pattern Matching tricks I often use:
If you want to remove all leading digits and don't always have a single character terminating the prefix, extended pattern matching is helpful: "${f##+([0-9])}"
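A small check of both expansions on one of the question's example names:
shopt -s extglob                # needed for the +([0-9]) pattern
f=31112-filename5.txt
echo "${f#*-}"                  # filename5.txt  (shortest prefix ending in - removed)
echo "${f##+([0-9])}"           # -filename5.txt (leading digits removed, dash kept)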
for i in *
do
name=$( echo "$i" | cut -d "-" -f 2 )
mv "$i" "$name" 2>/dev/null
done

Add string to each member of variable in Bash

I have the following command to gather all files in a folder and concatenate them....but what is held in the variable is only the file names and not the directory. How can I add 'colid-data/' to each of the files for cat to use?
cat $(ls -t colid-data) > catfiles.txt
List the filenames, not the directory.
cat $(ls -t colid-data/*) > catfiles.txt
Note that this will not work if any of the filenames contain whitespace. See Why not parse ls? for better alternatives.
If you want to concatenate them in date order, consider using zsh:
cat colid-data/*(.om) >catfiles.txt
That would concatenate all regular files only, in order of most recently modified first.
From bash, you could do this with
zsh -c 'cat colid-data/*(.om)' >catfiles.txt
If the ordering of the files is not important (and if there's only regular files in the directory, no subdirectories), just use
cat colid-data/* >catfiles.txt
All of these variations would work with filenames containing spaces, tabs and newlines, since the list of pathnames returned by a filename globbing pattern is not split into further words (which the result of an unquoted command substitution is).
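A minimal illustration of that difference, assuming colid-data contains a file named 'a b.txt':
printf '%s\n' colid-data/*           # glob: 'colid-data/a b.txt' stays one argument
printf '%s\n' $(ls colid-data)       # unquoted substitution: split into 'a' and 'b.txt'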

Mass renaming of files in folder

I need to rename all the files below in the folder in such a way that the trailing _2.txt stays the same and apac, emea, mds stay the same in all files, but logs_date needs to be added before _XXX_2.txt in all the files.
ABC_xyz_123_apac_2.txt
POR5_emea_2.txt
qw_1_0_122_mds_2.txt
to
logs_date_apac_2.txt
logs_date_emea_2.txt
logs_date_mds_2.txt
I'm not sure but maybe this is what you want:
#!/bin/bash
for file in *_2.txt;do
# remove echo to rename the files once you check it does what you expect
echo mv -v "$file" "$(sed 's/.*\(_.*_2\.txt\)$/logs_date\1/' <<<"$file")"
done
Do you have to use bash?
Bulk Rename Utility is an awesome tool that can easily rename multiple files in an intuitive way.
http://www.bulkrenameutility.co.uk/Main_Intro.php
Using the mmv command should be easy:
mmv '*_*_2.txt' 'logs_date_#2_2.txt'
Here #2 refers to whatever the second wildcard matched (apac, emea or mds).
You could also use the rename tool:
rename 's/.+(_[a-z]+_[0-9].)/logs_date$1/' files
This will give you the desired output.
If you don't want to or can't use sed, you can also try this, which might even run faster. No matter what solution you use, be sure to back up beforehand if possible.
shopt -s extglob # turn on the extglob shell option, which enables several extended pattern matching operators
set +H # turn off ! style history substitution
for file in *_2.txt;do
# remove echo to rename the files once you check it does what you expect
echo mv -v "$file" "${file/?(*_)!(*apac*|*emea*|*mds*)_/logs_date_}"
done
${parameter/pattern/string} performs pattern substitution. First, an optional run of characters ending with an underscore is matched; then a following run of characters not containing apac, emea or mds, ending with an underscore, is matched; then the match is replaced with "logs_date_".
Copied from the bash man page:
?(pattern-list)
Matches zero or one occurrence of the given patterns
*(pattern-list)
Matches zero or more occurrences of the given patterns
+(pattern-list)
Matches one or more occurrences of the given patterns
@(pattern-list)
Matches one of the given patterns
!(pattern-list)
Matches anything except one of the given patterns
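Putting that together, a rough check of the substitution on one of the example names (with extglob enabled) looks like:
shopt -s extglob
file=ABC_xyz_123_apac_2.txt
echo "${file/?(*_)!(*apac*|*emea*|*mds*)_/logs_date_}"   # logs_date_apac_2.txt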
