Checking the last character of a filename in bash

I am trying to write a script that checks whether the user accidentally added an extra "/" to the end of a filepath (e.g. MY_PATH) and, if they did, removes the last character. My script (see below) does successfully remove the last character of a path, but for some reason it also sometimes removes the last character even if it isn't "/". Does anyone know why it is doing this or how to fix it? I am open to alternative solutions.
MY_PATH="~/directory/Rscript.R"
#MY_PATH="~/directory/Rscript.R/"
if [ "${MY_PATH:$((${#MY_PATH}-1)):${#MY_PATH}}"=="/" ]
then MY_PATH=${MY_PATH:0:$((${#MY_PATH}-1))}; fi
echo ${MY_PATH}

The last character of a string is accessed by ${MY_PATH: -1}. You can test it as
if test "${MY_PATH: -1}" = / ; then
or
if [ "${MY_PATH: -1}" = / ]; then
or
if [[ ${MY_PATH: -1} = / ]]; then
or
if [[ ${MY_PATH: -1} == / ]]; then
or
if [[ ${MY_PATH: -1} =~ / ]]; then
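The first two alternatives use string comparison, the third and fourth use pattern (wildcard) matching (= and == are equivalent inside [[ ]], and with a literal / on the right-hand side the result is the same as a string comparison), and the last one uses regex matching. Most of the spaces in those alternatives matter, so be sure that you get the spaces right. With test and [, also quote the expansion so that an empty MY_PATH does not break the test.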
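As for why the original script strips a character every time: without spaces around ==, the test [ "x"=="/" ] receives a single non-empty word, and test treats any non-empty word as true. Below is a minimal sketch of a corrected check. The function name strip_trailing_slash is made up for illustration; the ${MY_PATH%/} form at the end is an alternative that removes one trailing slash only if it is present.

```bash
#!/usr/bin/env bash
# strip_trailing_slash is a hypothetical helper name, not part of the original script.
strip_trailing_slash() {
    local path=$1
    # Spaces around = are required; quoting keeps the test valid for empty input.
    if [ "${path: -1}" = "/" ]; then
        path=${path:0:${#path}-1}
    fi
    printf '%s\n' "$path"
}

strip_trailing_slash "~/directory/Rscript.R/"   # prints ~/directory/Rscript.R
strip_trailing_slash "~/directory/Rscript.R"    # prints ~/directory/Rscript.R

# Alternative: % removes the shortest matching suffix,
# so this drops a single trailing "/" and leaves other paths untouched.
MY_PATH="~/directory/Rscript.R/"
MY_PATH=${MY_PATH%/}
echo "$MY_PATH"
```

Note that both forms turn a lone "/" into an empty string, which may or may not be what you want for the root directory.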

You can use sed
MY_PATH=$(sed 's#\/$##g' <<< ${MY_PATH})
Demo:
$ MY_PATH="~/directory/Rscript.R/"
$ MY_PATH=$(sed 's#\/$##g' <<< ${MY_PATH})
$ echo $MY_PATH
~/directory/Rscript.R
$ MY_PATH="~/directory/Rscript.R"
$ MY_PATH=$(sed 's#\/$##g' <<< ${MY_PATH})
$ echo $MY_PATH
~/directory/Rscript.R

You can try checking the last character by
if [[ ${MY_PATH:(-1)} = '/' ]]; then
...
fi

Related

BASH regex syntax for replacing a sub-string

I'm working in bash and I want to remove a substring from a string. I use grep to detect the substring, and that works as I want: my if conditions are true, and when I test the expression in other tools it selects exactly the part of the string I want.
When it comes to removing the element from the string I'm having difficulty.
I want to remove something like ": Series 1", where there could be different numbers including 0 padded, a lower case s or extra spaces.
temp='Testing: This is a test: Series 1'
echo "A. "$temp
if echo "$temp" | grep -q -i ":[ ]*[S|s]eries[ ]*[0-9]*" && [ "$temp" != "" ]; then
title=$temp
echo "B. "$title
temp=${title//:[ ]*[S|s]eries[ ]*[0-9]*/ }
echo "C. "$temp
fi
# I trim temp for spaces here
series_title=${temp// /_}
echo "D. "$series_title
The problem is that points C and D give me:
C. Testing
D. Testing_
You can perform regex matching in bash alone, without using external tools.
Your requirement isn't entirely clear, but judging from your code, the following should help.
temp='Testing: This is a test: Series 1'
# Following will do a regex match and extract necessary parts
# i.e. extract everything before `:` if the entire pattern is matched
[[ $temp =~ (.*):\ *[Ss]eries\ *[0-9]* ]] || { echo "regex match failed"; exit; }
# now you can use the extracted groups as follows
echo "${BASH_REMATCH[1]}" # Output = Testing: This is a test
As mentioned in the comments, if you need to extract parts both before and after the removed section,
temp='Testing: This is a test: Series 1 <keep this>'
[[ $temp =~ (.*):\ *[Ss]eries\ *[0-9]*\ *(.*) ]] || { echo "invalid"; exit; }
echo "${BASH_REMATCH[1]} ${BASH_REMATCH[2]}" # Output = Testing: This is a test <keep this>
Keep in mind that [0-9]* also matches zero digits. If you need to require at least one digit, use [0-9]+ instead. The same goes for <space here>* (i.e. zero or more spaces) and the other starred parts.
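The original ${title//.../ } attempt fails because the pattern in a parameter expansion is a glob, not a regex: there, * matches any run of characters, so everything from the first colon onward is removed. The sketch below ties the regex approach back to the original goal. It assumes the series suffix is always at the end of the string and must contain at least one digit; the function name strip_series is hypothetical.

```bash
#!/usr/bin/env bash
# strip_series is a hypothetical helper; it assumes the ": Series N" suffix ends the string.
strip_series() {
    local temp=$1
    # Keeping the regex in a variable avoids escaping spaces; + requires at least one digit.
    local re='^(.*):[[:space:]]*[Ss]eries[[:space:]]*[0-9]+[[:space:]]*$'
    if [[ $temp =~ $re ]]; then
        temp=${BASH_REMATCH[1]}
    fi
    # Replace every space with an underscore, as in the question.
    printf '%s\n' "${temp// /_}"
}

strip_series 'Testing: This is a test: Series 1'    # prints Testing:_This_is_a_test
strip_series 'Testing: This is a test: series 007'  # prints Testing:_This_is_a_test
strip_series 'No suffix here'                       # prints No_suffix_here
```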

Detecting when a string exists but doesn't start with - in bash

I am trying to make a bash program that saves results to a file with the name of the user's choosing if the program is supplied the --file argument followed by an option, in which the option should not start with a dash. So I used the following conditional:
if [[ -n $2 && !($2="[^-]") ]]
But that didn't work. It still saves the output to a file even if the second argument starts with a dash. I also tried using this:
1) if ! [[ -z $2 && ($2="[^-]") ]]
It behaved the same as the previous one. What's the problem? Thanks in advance!
As a pattern match, this might look like:
[[ $2 ]] && [[ $2 != -* ]]
Note:
Moving && outside of [[ ]] isn't mandatory, but it is good form: It ensures that your code can be rewritten to work with the POSIX test command without either using obsolescent functionality (-a and -o) or needing to restructure.
Whitespace is mandatory. In !($2="[^-]"), neither the ! nor the ( and ) nor the = are parsed as separate operators.
= and != check for pattern matches, not regular expressions. The regular expression operator in [[ ]] is =~. Among the differences, anchors (^ to match at the beginning of a string, or $ to match at the end) are implicit in a pattern whereas they need to be explicit in a regex, and * has a very different meaning (* in a pattern means the same thing as .* in a regex).
The ^ in [^-] already negates the -, so by using ! in addition, you're making your code only match when there is a dash in the second argument.
To test this yourself:
$ check_args() { [[ $2 ]] && [[ $2 != -* ]]; echo $?; }
$ check_args one --two
1
$ check_args one two
0
$ check_args one
1
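Putting the check into the context of the question, a sketch of the --file handling might look like the following. The function name parse_file_arg and the sample file name are assumptions for illustration; only the two [[ ]] tests on $2 come from the answer above.

```bash
#!/usr/bin/env bash
# parse_file_arg is a hypothetical helper: it prints the output file name when
# called as "--file NAME" and NAME is non-empty and does not start with "-".
parse_file_arg() {
    if [[ $1 == --file ]] && [[ $2 ]] && [[ $2 != -* ]]; then
        printf '%s\n' "$2"
    else
        return 1
    fi
}

parse_file_arg --file results.txt                            # prints results.txt
parse_file_arg --file --verbose || echo "no valid file name"
parse_file_arg --file           || echo "no valid file name"
```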

check for string format in bash script

I am attempting to check for proper formatting at the start of a string in a bash script.
The expected format is shown below: the string must always begin with "ABCDEFG-" (exact letters and order), followed by at least 3 digits, which may vary. Anything after the third digit does not matter.
Expected start of string: "ABCDEFG-1234"
I am using the below code snippet.
[ $(echo "$str" | grep -E "ABCDEFG-[0-9][0-9][0-9]") ] && echo "yes"
str1="ABCDEFG-1234"
str2="ABCDEFG-1234 - Some more text"
When I use str1 in place of str everything works ok and yes is printed.
When I use str2 in place of str I get the below error
[: ABCDEFG-1234: unary operator expected
I am pretty new to working with bash scripts so any help would be appreciated.
If this is bash, you have no reason to use grep for this at all; the shell has built-in regular expression support.
re="ABCDEFG-[0-9][0-9][0-9]"
[[ $str =~ $re ]] && echo "yes"
That said, you might want your regex to be anchored if you want a match in the beginning rather than anywhere in the content:
re="^ABCDEFG-[0-9][0-9][0-9]"
[[ $str =~ $re ]] && echo "yes"
In fact, this doesn't need to be an ERE at all -- a glob-style pattern match would also be adequate:
if [[ $str = ABCDEFG-[0-9][0-9][0-9]* ]]; then echo "yes"; fi
Try grep -E "ABCDEFG-[0-9][0-9][0-9].*"
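The "unary operator expected" error most likely comes from the unquoted $( ... ): grep prints the whole matching line, and for str2 that output is split into several words inside [ ]. The sketch below shows two ways around it that rely on the exit status rather than on the output. The function names are illustrative only, and the {3} interval is simply shorthand for the three [0-9] classes.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for illustration.
has_prefix_grep() {
    # -q suppresses output; the exit status alone says whether the line matched.
    printf '%s\n' "$1" | grep -qE '^ABCDEFG-[0-9]{3}'
}

has_prefix_builtin() {
    # Same check using bash's built-in regex operator, no external process.
    local re='^ABCDEFG-[0-9]{3}'
    [[ $1 =~ $re ]]
}

for str in "ABCDEFG-1234" "ABCDEFG-1234 - Some more text" "ABCDEFG-12" "xABCDEFG-123"; do
    if has_prefix_grep "$str" && has_prefix_builtin "$str"; then
        echo "yes: $str"
    else
        echo "no: $str"
    fi
done
```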

BASH: Everything but not slash? IF STATEMENT (STRING COMPARISON)

I'm trying to match any string that starts with /John/ but does not contain a / after /John/
if
[ $string == /John/[!/]+ ]; then ....
fi
This is what I got and it doesn't seem to be working.
So I tried
if
[[ $string =~ ^/John/[!/]+$ ]]; then ....
fi
It still didn't work, and so I changed it to
if
[[ $string =~ /John/[^/] ]]; then ....
fi
It worked, but it also matches all strings that have a / after /John/.
For bash you want [[ $string =~ ^/John/[^/]*$ ]] -- the ^ anchors the match at the start of the string, and the end-of-line anchor ensures there are no slashes after the last acceptable slash.
How about "the string starts with '/John/' and doesn't contain any slashes after '/John/'"?
[[ $string = /John/* && $string != /John/*/* ]]
Or you could compare against a parameter expansion that only expands if the conditions are met. This says "after stripping off everything including and after the last slash, the string is /John":
[[ ${string%/*} = /John ]]
In fact, this last solution is the only entirely POSIXLY_STRICT one I can come up with without multiple test expressions.
[ "${string%/*}" = /John ]
By the way, your problem is probably simply that you are using double-equals inside a single-bracket test expression. bash actually does accept them inside double-bracket test expressions, but a single equals is a better idea.
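The approaches above can be compared side by side on a few sample strings. This is only a sketch: the function names and the sample inputs are made up, and the regex variant assumes the match should be anchored at the start of the string. All three should agree on these inputs, accepting /John/file and /John/ and rejecting the rest.

```bash
#!/usr/bin/env bash
# Three hypothetical wrappers around the tests discussed above.
match_glob()  { [[ $1 = /John/* && $1 != /John/*/* ]]; }
match_regex() { local re='^/John/[^/]*$'; [[ $1 =~ $re ]]; }
match_posix() { [ "${1%/*}" = /John ]; }

for s in /John/file /John/ /John/a/b /Johnny/file /x/John/file; do
    for f in match_glob match_regex match_posix; do
        if "$f" "$s"; then r=yes; else r=no; fi
        printf '%-12s %-14s %s\n' "$f" "$s" "$r"
    done
done
```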
You can also use plain old grep:
string='/John Lennon/Yoko Ono'
if echo "$string" | grep -q "/John[^/]" ; then
echo "matched"
else
echo "no match found"
fi
This only fails if /John is at the very end of the string... if that's a possibility then you can tweak to handle that case, for instance:
string='/John Lennon/Yoko Ono'
if echo "$string" | grep -qP "(/John[^/])|(/John$)" ; then
echo "matched"
else
echo "no match found"
fi
Not sure what language you're using, but normal negative character classes are prefixed with a ^
e.g.
[^/]
You can also put in start/end qualifiers (clojure example, so Java's regex engine). Usually ^ at beginning and $ at end.
user => (re-matches #"^/[a-zA-Z]+[^/]$" "/John/")
nil

multiline regexp matching in bash

I would like to do some multiline matching with bash's =~
#!/bin/bash
str='foo = 1 2 3
bar = what about 42?
boo = more words
'
re='bar = (.*)'
if [[ "$str" =~ $re ]]; then
echo "${BASH_REMATCH[1]}"
else
echo no match
fi
Almost there, but if I use ^ or $, it will not match, and if I don't use them, . eats newlines too.
EDIT:
Sorry, values after = could be multi-word values.
I could be wrong, but after a quick read from here, especially Note 2 at the end of the page, bash can sometimes include the newline character when matching with the dot operator. Therefore, a quick solution would be:
#!/bin/bash
str='foo = 1
bar = 2
boo = 3
'
re=$'bar = ([^\n]*)'   # $'...' turns \n into a real newline inside the bracket expression
if [[ "$str" =~ $re ]]; then
echo "${BASH_REMATCH[1]}"
else
echo no match
fi
Notice that I now ask it to match anything except newlines. Hope this helps =)
Edit: Also, if I understood correctly, the ^ or $ will actually match the start or the end (respectively) of the string, and not the line. It would be better if someone else could confirm this, but if that is the case and you do want to match by line, you'll need to write a while loop to read each line individually.
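A minimal sketch of that while loop, reusing the multi-word sample from the question. Because each line is matched on its own, ^ and $ behave as line anchors here. The variable names line and found are arbitrary.

```bash
#!/usr/bin/env bash
str='foo = 1 2 3
bar = what about 42?
boo = more words
'
re='^bar = (.*)$'
found=""
# Feed the string to the loop one line at a time; the regex only ever sees a single line.
while IFS= read -r line; do
    if [[ $line =~ $re ]]; then
        found=${BASH_REMATCH[1]}
        break
    fi
done <<< "$str"

echo "${found:-no match}"   # prints: what about 42?
```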
