How to remove all file extensions in bash?

x=./gandalf.tar.gz
noext=${x%.*}
echo $noext
This prints ./gandalf.tar, but I need just ./gandalf.
I might even have files like ./gandalf.tar.a.b.c, which have many more extensions.
I just need the part before the first . in the file name.

If you want to give sed a chance then:
x='./gandalf.tar.a.b.c'
sed -E 's~(.)\..*~\1~g' <<< "$x"
./gandalf
Or as a two-step process in pure bash:
y="${x#./}"
echo "./${y%%.*}"
./gandalf

Using extglob shell option of bash:
shopt -s extglob
x=./gandalf.tar.a.b.c
noext=${x%%.*([!/])}
echo "$noext"
This deletes the longest suffix that starts with a . and contains no / character, i.e. everything from the first . in the final path component onward. It also works for x=/pq.12/r/gandalf.tar.a.b.c
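A minimal sketch of the extglob approach wrapped in a function, so the behaviour on a few paths can be checked side by side. The name strip_all_ext and the README example path are assumptions made for illustration; only the pattern comes from the answer above.

```bash
#!/usr/bin/env bash
shopt -s extglob   # must be enabled before the pattern below is expanded

# strip_all_ext is a hypothetical helper name used only for this sketch.
strip_all_ext() {
    # Remove the longest suffix that starts with "." and contains no "/".
    printf '%s\n' "${1%%.*([!/])}"
}

strip_all_ext ./gandalf.tar.a.b.c          # prints ./gandalf
strip_all_ext /pq.12/r/gandalf.tar.a.b.c   # prints /pq.12/r/gandalf
strip_all_ext /pq.12/r/README              # prints /pq.12/r/README (nothing to strip)
```

Note that a hidden file such as /foo/.bashrc is reduced to /foo/ by this pattern, because the leading dot counts as the start of an extension.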

Perhaps a regexp is the best way to go if your bash version supports it, as it doesn't fork new processes.
This regexp works with any prefix path and takes into account files with a dot as first char in the name (hidden files):
[[ "$x" =~ ^(.*/|)(.[^.]*).*$ ]] && \
noext="${BASH_REMATCH[1]}${BASH_REMATCH[2]}"
Regexp explained
The first group captures everything up to the last / included (regexp are greedy in bash), or nothing if there are no / in the string.
Then the second group captures everything up to the first ., excluded.
The rest of the string is not captured, as we want to get rid of it.
Finally, we concatenate the path and the stripped name.
Note
It's not clear what you want to do with files beginning with a . (hidden files). I modified the regexp to preserve that . if present, as it seemed the most reasonable thing to do. E.g.
x="/foo/bar/.myinitfile.sh"
becomes /foo/bar/.myinitfile.
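As a sketch, the same idea can be put in a function. This is a close variant rather than the exact expression above: the optional directory group is written as (.*/)? and the trailing .*$ is dropped, relying on the regex engine returning the longest match. The function name strip_ext_regex is hypothetical.

```bash
#!/usr/bin/env bash

# strip_ext_regex is a hypothetical helper name used only for this sketch.
strip_ext_regex() {
    # Group 1: optional directory part, up to and including a slash.
    # Group 2: one character (possibly a leading dot) plus following non-dots.
    local re='^(.*/)?(.[^.]*)'
    if [[ $1 =~ $re ]]; then
        printf '%s\n' "${BASH_REMATCH[1]}${BASH_REMATCH[2]}"
    else
        printf '%s\n' "$1"   # e.g. an empty string: nothing to strip
    fi
}

strip_ext_regex ./gandalf.tar.gz            # prints ./gandalf
strip_ext_regex /foo/bar/.myinitfile.sh     # prints /foo/bar/.myinitfile
strip_ext_regex /pq.12/r/gandalf.tar.a.b.c  # prints /pq.12/r/gandalf
```

Keeping the pattern in a variable avoids quoting surprises on the right-hand side of =~.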

If performance is not an issue, you could use basename and dirname, for instance:
fil=$(basename "$x")
noext="$(dirname "$x")"/${fil%%.*}

Related

what happens in bash 'echo ${full_path##/*}'

I found this easy filename-printing trick on the internet, but I can't find an explanation of what ##*/ means. It doesn't look like a regex. Moreover, could it be used with the result of readlink in one line?

From Manipulating String, Advanced Bash-Scripting Guide
${string##substring}
Deletes longest match of substring from front of $string.
So in your case, the * in the substring indicates: match everything.
The command echo ${full_path##/*} will print $full_path unless it starts with a forward slash (/); in that case an empty string is printed.
Example cases:
$ test_1='/foo/bar'
$ test_2='foo/bar'
$
$ echo "${test_1##/*}"
$ echo "${test_2##/*}"
foo/bar
$
Regarding your second question:
Moreover, could it be used with the result of readlink in one line?
Please take a look at Can command substitution be nested in variable substitution?.
If you're using bash I'd recommend keeping it simple, by assigning the result of readlink to a variable, then using the regular variable substitution to get the desired output. Linking both actions could be done using the && syntax.
A one-liner could look something like:
tmp="$(readlink -f file_a)" && echo "${tmp##/*}"
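For comparison, here is a small sketch of how the position of the * changes the result. The path is an example value chosen for illustration, not one from the question; the brackets only make empty results visible.

```bash
#!/usr/bin/env bash
full_path='/usr/local/bin/tool.sh'   # example value, assumed for this sketch

echo "[${full_path##/*}]"   # longest prefix matching "/*" is the whole string: []
echo "[${full_path##*/}]"   # longest prefix matching "*/" ends at the last slash: [tool.sh]
echo "[${full_path#*/}]"    # shortest prefix matching "*/" is the leading slash: [usr/local/bin/tool.sh]
echo "[${full_path%/*}]"    # shortest suffix matching "/*" is the last component: [/usr/local/bin]
```

The ##*/ form is the one that behaves like basename, which may be what the original snippet intended.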

Replacing multiple preceding numbers from files

Good day,
I have a bunch of files that need to be batch renamed like so:
01-filename1.txt > filename1.txt
02-filename2.txt > filename2.txt
32-filename3.txt > filename3.txt
322-filename4.txt > filename4.txt
31112-filename5.txt > filename5.txt
I ran into an example of achieving this using bash's ${string#substring} string operation, and this almost works:
for i in `ls`; do mv $i ${i#[0-9]}; done
However, this removes only a single digit, and adding the regex '+' does not seem to work. Is there a way to strip ALL leading digit characters?
Thank you!

With Perl's standalone rename command:
rename -n 's/.*?-//' *.txt
If output looks okay, remove -n.
See: The Stack Overflow Regular Expressions FAQ

If you have a single character that always marks the end of the prefix, Pattern Matching makes it very simple.
for f in *; do
mv -nv "$f" "${f#*-}";
done;
Things worth noting:
In your case, the use of ls does not cause problems, but for a more generalized solution, certain filenames would break it. Additionally, the lack of quotes around parameter expansions would cause issues for files with newlines, spaces or tabs in them.
The pattern *- matches any string ending in -. Combined with shortest-match prefix removal (one # instead of two), ${f#*-} evaluates to "$f" with the shortest prefix ending in - removed (if one exists).
Bash's pattern matching is different from and inferior to RegEx, but you can get a little more power by enabling extended pattern matching with shopt -s extglob. Some distributions have this enabled by default.
Also, I threw the -nv flags in mv to ensure no mishaps when playing around with parameter expansion.
More Pattern Matching tricks I often use:
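If you want to remove all leading digits and don't always have a single character terminating the prefix, extended pattern matching (with extglob enabled) is helpful: "${f##+([0-9])}". Note that this leaves the - in place; use "${f##+([0-9])-}" to drop the separator as well.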
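The extended-pattern form can be tried without renaming anything. The following is a dry-run sketch: the helper name strip_numeric_prefix and the notes-2.txt file name are made up for illustration, and echo prints the mv command instead of running it.

```bash
#!/usr/bin/env bash
shopt -s extglob   # enables +(...) in patterns

# strip_numeric_prefix is a hypothetical helper name used only for this sketch.
strip_numeric_prefix() {
    # Remove the longest prefix made of one or more digits followed by "-".
    printf '%s\n' "${1##+([0-9])-}"
}

for f in 01-filename1.txt 31112-filename5.txt notes-2.txt; do
    new=$(strip_numeric_prefix "$f")
    if [[ $new != "$f" ]]; then
        echo mv -nv -- "$f" "$new"   # dry run: drop "echo" to actually rename
    fi
done
```

Names without a leading digit run, such as notes-2.txt, come back unchanged and are skipped.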

for i in *
do
name=$( echo "$i" | cut -d "-" -f 2 )
mv "$i" "$name" 2>/dev/null
done

Using egrep and regular expression together

I want to search the text file below for words that end in _letter, and get the whole portion back to the "::". There are no spaces between any of the letters.
blahblah:/blahblah::abc_letter:/blahblah/blahblah
blahblah:/blahblah::cd_123_letter:/blahblah/blahblah
blahblah:::/blahblah::24_cde_letter:/blahblah/blahblah
blahblah::/blahblah::45a6_letter:/blahblah/blahblah
blahblah:/blahblah::fgh_letter:/blahblah/blahblah
blahblah:/blahblah::789_letter:/blahblah/blahblah
I tried
egrep -o '*_letter'
and
egrep -o "*_letter"
But it only returns the word _letter.
Then I want to feed the output into a for loop in a shell script, so the script will look like the following:
for i in [grep command]
mkdir $i
end
It will create the following directories
abc_letter/
cd_123_letter/
24_cde_letter/
45a6_letter/
fgh_letter/
789_letter/
ps: The result between :: and _letter doesn't contain any special character, only alphanumeric character
also my system doesn't have perl

Assuming no spaces or new-lines:
for i in $(sed 's/^.*:\([^/]*_letter\):.*$/\1/g' infile); do
mkdir $i
done

To extract the strings running from after a : up to _letter from file.txt and use them in your for loop, you can use the following egrep and revise your script.sh like this:
#!/bin/bash
for i in $(egrep -o "[^:]+_letter" file.txt); do
mkdir -p "$i"
done
Then you run ./script.sh, and later you check with ls, you see:
$ ls -1
24_cde_letter
45a6_letter
789_letter
abc_letter
cd_123_letter
fgh_letter
file.txt
script.sh
Explanation
Your original egrep -o '*_letter' probably mixes up bash filename expansion (globbing) with regular expressions.
In a bash glob, *something means "anything, followed by something".
In a regular expression, however, * means "the preceding item, zero or more times". Since * is the first thing in your pattern, there is nothing before it to repeat, so it matches nothing there.
The only thing egrep can then match is _letter, and because -o prints only the matched text, one match per line, you saw nothing but a column of _letter.
Our new changes:
The egrep pattern starts with [^...], a negated bracket expression, which matches any character except those listed inside it. We put : inside.
The + says to match the preceding item one or more times.
Combined, [^:]+ means "one or more characters that are not :".
So it starts matching right after a : and keeps matching until the next part of the pattern.
The next part of the pattern is just the literal _letter.
egrep -o prints only the matched text, one match per line.
So in this way, from lines such as:
blahblah:/blahblah::abc_letter:/blahblah/blahblah
It successfully extracts:
abc_letter
Then, the changes to your bash script:
Bash command substitution, $(...), sends the results of the egrep command to the for loop.
The loop uses the for i in value...; do ... done syntax.
mkdir -p is just a convenience in case you are re-testing: it does not report an error if the directory already exists.
So altogether it helps to extract the pattern you wanted and generate directories with those names.
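A variation on the same extraction, sketched with a while read loop so that each match is handled as one line rather than being word-split. The function name make_letter_dirs and its second parameter (a target directory) are assumptions for illustration; grep -oE is used as the equivalent of egrep -o.

```bash
#!/usr/bin/env bash

# make_letter_dirs is a hypothetical helper name used only for this sketch.
# $1: input file, $2: directory to create the results in (defaults to ".").
make_letter_dirs() {
    local infile=$1 target=${2:-.}
    local name
    # -o prints only the matched text, one match per line.
    grep -oE '[^:]+_letter' "$infile" | while IFS= read -r name; do
        mkdir -p -- "$target/$name"
    done
}

# Example call (file.txt as in the answer above):
#   make_letter_dirs file.txt
```

Quoting "$target/$name" and passing -- keeps unusual names from being split or read as options.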

How to get a vim-tab-like short path in bash script?

I'd like to create a bash prompt that includes a shortened path to the current working directory, so
~/folder/directory/foo
would become
~/f/d/foo
I got this idea from a custom bash prompt described here (http://engineerwithoutacause.com/show-current-virtualenv-on-bash-prompt.html) which includes code that's supposed to do exactly that (according to the comment) but doesn't. I don't know anything about bash scripting, but I bet this would be an easy one to answer.
What line or lines of code in a bash script will let me generate that shortened version of the working directory?

You can pipe the value of $PWD (a string holding your current directory) through a sed command, as in:
echo "$PWD" | sed 's:\([^/]\)[^/]*/:\1/:g'
Basically, this sed finds every (g) directory name that is followed by a /, and replaces it with just its first character (the [^/] enclosed by \( and \), referenced as \1 later) followed by the / again. I'm using : as the delimiter because the pattern itself contains /.
If you set this in your PS1 variable, you can change your bash prompt as requested.
Hope that helps.

Using sed is easiest because of regex backreference support, but for fun and profit a pure bash solution:
path="$(while read -rd/; do echo -n "${REPLY::1}/"; done <<< "$PWD"; echo "${PWD##*/}")"
The value of $PWD is fed into the while loop via the herestring syntax <<<, then split on slashes by read -rd/. Conveniently, the last component is ignored because it doesn't end in a slash, so read exits with a nonzero status and terminates the loop.
Inside the loop, ${REPLY::1} takes only the first character of the path component, and echo -n prints it without a newline.
Finally, we print the last pathname component in full using ${PWD##*/}, which strips the longest prefix that matches */.
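The same component-by-component idea can also be sketched as a reusable function that uses only parameter expansion. The name shorten_path and the optional second argument (the home directory to abbreviate, defaulting to $HOME) are assumptions for illustration.

```bash
#!/usr/bin/env bash

# shorten_path is a hypothetical helper name used only for this sketch.
# $1: path to shorten, $2: home directory to show as "~" (defaults to $HOME).
shorten_path() {
    local path=$1 home=${2:-$HOME} out=
    # Replace a leading home directory with a literal "~".
    if [[ $path == "$home" || $path == "$home"/* ]]; then
        path="~${path#"$home"}"
    fi
    # Consume one "component/" at a time; the last component has no slash.
    while [[ $path == */* ]]; do
        local part=${path%%/*}   # text before the first slash (may be empty)
        out+="${part:0:1}/"      # keep only its first character
        path=${path#*/}          # drop that component and its slash
    done
    printf '%s\n' "$out$path"
}

shorten_path /home/joe/folder/directory/foo /home/joe   # prints ~/f/d/foo
shorten_path /etc/nginx/conf.d /home/joe                # prints /e/n/conf.d
```

In a prompt it could be called as PS1='$(shorten_path "$PWD") \$ ', assuming the function is defined in the same shell.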

Combining the information at the link I referenced in the question with Fernando's answer and some research of my own into regex, this is the code that provides the path I want:
pwd | sed -e "s:$HOME:~:" -e "s:\(.\)[^/]*/:\1/:g"
The first sed pattern replaces my home directory /home/joe with a ~ and the second one replaces every multi-character directory name with its first character.
If anyone's interested, the complete code for my bashprompt is here: https://gist.github.com/joeclark-phd/d6be2dca717788e6a872. The part you helped me with is in line 39.

Bash foreach on cronjob

I am trying to create a "watch" folder into which I can copy two sets of files with the same names but different file extensions. I have a program that needs to reference both files, and since they differ only by extension, I figure I might be able to do something like this with a cron job:
cronjob.sh:
#/bin/bash
ls *.txt > processlist.txt
for filename in 'cat processlist.txt'; do
/usr/local/bin/runcommand -input1=/home/user/process/$filename \
-input2=/home/user/process/strsub($filename, -4)_2.stl \
-output /home/user/process/done/strsub($filename, -4)_2.final;
echo "$filename finished processing"
done
but substr is a php command, not bash. What would be the right way of doing this?

strsub($filename, -4)
in Bash is
${filename:(-4)}
See Shell Parameter Expansion.
Your command can look like
/usr/local/bin/runcommand "-input1=/home/user/process/$filename" \
"-input2=/home/user/process/${filename:(-4)}_2.stl" \
-output "/home/user/process/done/${filename:(-4)}_2.final"
Note: Prefer wrapping arguments that contain variables in double quotes to prevent word splitting and possible pathname expansion. This helps with filenames that contain spaces.
It would also be better to pass your glob pattern directly to for, so that each filename stays a single token instead of being broken up by word splitting:
for filename in *.txt; do

So Konsolebox's solution was almost right, but the issue was that ${filename:(-4)} returns only the last 4 characters of the variable instead of trimming the last 4 off. What I did was change it to ${filename%.txt}, where %.txt matches the text I want to find and remove, and then just tagged .mp3 on at the end to change the extension.
His other suggestion of using this for loop was also much better than mine:
for filename in *.txt; do
The only other modification was putting the full command all on one line in the end. I divided it up here to make sure it was all easily visible.
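Put together, the loop might look roughly like the sketch below. It is a dry run that prints the command instead of executing it; runcommand and its options are taken from the question as written, while the helper name build_command, the global array cmd, and the .stl/.final suffixes from the question (rather than .mp3) are choices made for illustration.

```bash
#!/usr/bin/env bash
shopt -s nullglob   # with no matching files the loop runs zero times

# build_command is a hypothetical helper; it fills the global array "cmd".
build_command() {
    local dir=$1 filename=$2
    local stem=${filename%.txt}   # "part.txt" becomes "part"
    cmd=(
        /usr/local/bin/runcommand
        "-input1=$dir/$filename"
        "-input2=$dir/${stem}_2.stl"
        -output "$dir/done/${stem}_2.final"
    )
}

dir=/home/user/process
for path in "$dir"/*.txt; do
    build_command "$dir" "${path##*/}"
    printf '%q ' "${cmd[@]}"; echo   # dry run: print the command line
    # "${cmd[@]}" && echo "${path##*/} finished processing"
done
```

Building the command in an array keeps each argument intact even when a filename contains spaces.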
