Operations that can be performed on Bash shell variables - bash

I know of several operations we can perform on variables in the shell, e.g.:
1) "#" & "##" operation
With ${var#pattern}, we remove a match of "pattern" from the head of ${var}. "*" can be used in the pattern to match anything. The difference between "#" and "##" is that "##" removes the longest matching substring while "#" removes the shortest. For example,
var=brbread
${var##*br} // ead
${var#*br} // bread
2) "%" & "%%" operation
with ${var%pattern}, we remove "pattern" at the end of ${var}. Of course, "%%" indicates longest match while "%" means the shortest. For example,
var=eadbreadbread
${var%%*br} // eadbreadbread
${var%%br*} // ead
${var%br*} // eadbread
3) "/" operation
With ${var/haha/heihei}, we replace "haha" in $var with "heihei". For example,
var=ihahai
${var/haha/heihei} // iheiheii
I'm just curious whether there are more operations we can perform on variables besides the ones above?
Thanks.

Yes, there are a lot of other operations on variables in bash, such as case modification, array key listing, name expansion, etc.
You should check the Parameter Expansion section of the manual page.

In one of your examples, you could do a global replacement with two slashes:
${var//ha/hei} # the result would be the same
(Note that in Bash, the comment character is "#".)
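To see how the single-slash and double-slash forms differ, and what a trailing slash does to the replacement, here is a minimal sketch. The variable names first, all and slash are only for illustration.

```bash
var=ihahai
first=${var/ha/hei}      # one slash: only the first "ha" is replaced
all=${var//ha/hei}       # two slashes: every "ha" is replaced
slash=${var//ha/hei/}    # a trailing slash becomes part of the replacement
echo "$first"   # iheihai
echo "$all"     # iheiheii
echo "$slash"   # ihei/hei/i
```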
Here are some examples of Parameter Expansion variable operations:
Provide a default:
$ unset foo
$ bar="hello"
$ echo ${foo:-$bar} # if $foo had a value, it would be output
hello
Alternate value:
$ echo ${bar:+"goodbye"}
goodbye
$ echo ${foo:+"goodbye"} # no substitution
Substrings:
$ echo ${bar:1:2}
el
$ echo ${bar: -4:2} # from the end (note the space before the minus)
el
List of array keys:
$ array=(123 456)
$ array[12]=7890
$ echo ${!array[@]}
0 1 12
Parameter Length:
$ echo ${#bar}
5
$ echo ${#array[@]} # number of elements in an array
3
$ echo ${#array[12]} # length of an array element
4
Modify Case (Bash 4):
$ greeting="hello jim"
$ echo ${greeting^}
Hello jim
$ echo ${greeting^^}
HELLO JIM
$ greeting=($greeting)
$ echo ${greeting[@]^}
Hello Jim
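A few more expansions in the same spirit, as a rough sketch: assigning a default, listing variable names by prefix, lowercasing, and indirect expansion. The names color, demo_one, demo_two, shout and ref are made up for this example.

```bash
unset color
echo "${color:=blue}"    # assigns blue to color, then expands to it
demo_one=1
demo_two=2
names=${!demo_*}         # names of all variables starting with demo_
echo "$names"
shout="HELLO JIM"
echo "${shout,,}"        # hello jim (Bash 4)
ref=shout
echo "${!ref}"           # indirect expansion: HELLO JIM
```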

Related

Parse filename string and extract parent at specific level using shell

I have a filename as a string, say filename="a/b/c/d.png".
Is there a general method to extract the parent directory at a given level using ONLY shell parameter expansion?
I.e. I would like to extract "level 1" and return c or "level 2" and return b.
Explicitly, I DO NOT want to get the entire parent path (i.e. a/b/c, which is the result of ${filename%/*}).
Using just shell parameter expansion, assuming bash, you can first transform the path into an array (splitting on /) and then ask for specific array indexes:
filename=a/b/c/d.png
IFS=/
filename_array=( $filename )
unset IFS
echo "0 = ${filename_array[0]}"
echo "1 = ${filename_array[1]}"
echo "2 = ${filename_array[2]}"
echo "3 = ${filename_array[3]}"
Running the above produces:
0 = a
1 = b
2 = c
3 = d.png
These indexes are the reverse of what you want, but a little arithmetic should fix that.
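The index arithmetic could look like the sketch below. It assumes bash and uses read -a to split the path, so the global IFS is left alone; parts and idx are illustrative names.

```bash
filename=a/b/c/d.png
IFS=/ read -r -a parts <<< "$filename"   # split on / for this one command only
level=1
idx=$(( ${#parts[@]} - 1 - level ))      # count back from the last element
echo "${parts[idx]}"                     # c
```

With level=2 the same arithmetic selects b.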
Using zsh, the :h modifier trims the final component off a path in variable expansion.
The (s:...:) parameter expansion flag can be used to split the contents of a variable. Combine those with normal array indexing where a negative index goes from the end of the array, and...
$ filename=a/b/c/d.png
$ print $filename:h
a/b/c
$ level=1
$ print ${${(s:/:)filename:h}[-level]}
c
$ level=2
$ print ${${(s:/:)filename:h}[-level]}
b
You could also use array subscript flags instead to avoid the nested expansion:
$ level=1
$ print ${filename[(ws:/:)-level-1]}
c
$ level=2
$ print ${filename[(ws:/:)-level-1]}
b
w makes the index of a scalar split on words instead of by character, and s:...: has the same meaning as before: it says what to split on. You have to subtract one more from the index to skip over the trailing d.png, since it is not stripped off first as it is in the nested-expansion version.
The :h (head) and :t (tail) expansion modifiers in zsh accept digits to specify a level; they can be combined to get a subset of the path:
> filname="a/b/c/d.png"
> print ${filname:t2}
c/d.png
> print ${filname:t2:h1}
c
> print ${filname:t3:h1}
b
If the level is in a variable, then the F modifier can be used to repeat the h modifier a specific number of times:
> for i in 1 2 3; printf '%s: %s\n' $i ${filname:F(i)h:t}
1: c
2: b
3: a
If using printf (a shell builtin) is allowed then this will do the trick in bash:
filename='a/b/c/d.png'
level=2
printf -v spaces '%*s' $level
pattern=${spaces//?/'/*'}
component=${filename%$pattern}
component=${component##*/}
echo $component
prints out
b
You can assign different values to the variable level.
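To try several levels at once, the same steps can be wrapped in a function. This is only a sketch of the approach above; parent_at_level and its local variable names are hypothetical.

```bash
parent_at_level() {
  local path=$1 level=$2 spaces unit='/*' pattern component
  printf -v spaces '%*s' "$level" ''   # a string of $level spaces
  pattern=${spaces// /$unit}           # one "/*" per level
  component=${path%$pattern}           # drop the last $level components
  printf '%s\n' "${component##*/}"     # keep what is now the last component
}

for level in 1 2 3; do
  printf '%s: %s\n' "$level" "$(parent_at_level a/b/c/d.png "$level")"
done
```

For a/b/c/d.png this prints c, b and a for levels 1, 2 and 3.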

Bash math expression

I need help with this; it's been busting my mind.
I read a line of integers into a variable, e.g. 10 20 -30.
They are all separated by white space. I try to change the minus to a plus and save the result in another variable, but it's not saving. If I can't change it to a plus, I would like to remove it so that I can do:
var=$((${num// /+/}))
So it can add all integers.
This is what I have:
read num
echo $num
sum=$num | sed -e 's/-/+/g'
echo $sum
Using standard POSIX variable expansion and arithmetic:
#!/usr/bin/env sh
# Computes the sum of all arguments
sum () {
# Save the IFS value
_OIFS=$IFS
# Set the IFS to + sign
IFS=+
# Expand the arguments with the IFS + sign
# inside an arithmetic expression to get
# the sum of all arguments.
echo "$(($*))"
# Restore the original IFS
IFS=$_OIFS
}
num='10 20 -30'
# shellcheck disable=SC2086 # Intended word splitting of string into arguments
sum $num
More featured version with a join function:
#!/usr/bin/env sh
# Join arguments with the provided delimiter character into a string
# $1: The delimiter character
# $@: The arguments to join
join () {
# Save the IFS value
_OIFS=$IFS
# Set the IFS to the delimiter
IFS=$1
# Shift out the delimiter from the arguments
shift
# Output the joined string
echo "$*"
# Restore the original IFS
IFS=$_OIFS
}
# Computes the sum of all arguments
sum () {
# Join the arguments with a + sign to form a sum expression
sum_expr=$(join + "$@")
# Submit the sum expression to a shell's arithmetic expression
# shellcheck disable=SC2004 # $sum_expr needs its $ to correctly split terms
echo "$(($sum_expr))"
}
num='10 20 -30'
# shellcheck disable=SC2086 # Intended word splitting of string into arguments
sum $num
Simply wipe the last slash:
num="10 20 -30"
echo $((${num// /+}))
0
Some details
Bash pattern substitution has nothing in common with so-called regular expressions. The correct syntax is:
${parameter/pattern/string}
... If pattern begins with /, all matches of pattern are replaced with string.
Normally only the first match is replaced. ...
See: man -Pless\ +/parameter.pattern.string bash
If you try your syntax:
echo ${num// /+/}
10+/20+/-30
Then
echo ${num// /+}
10+20+-30
Or even, to make this pretty:
echo ${num// / + }
10 + 20 + -30
But the result stays the same:
echo $(( ${num// / + } ))
0
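If you would rather not build an expression string at all, a plain loop is an alternative. This is a sketch that relies on word splitting of the unquoted $num; total and n are illustrative names.

```bash
num='10 20 -30'
total=0
# Unquoted $num is split on whitespace into one word per number.
for n in $num; do
  total=$(( total + n ))
done
echo "$total"   # 0
```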
sum=$num | sed -e 's/-/+/g'
As written above, sum=$num and sed are two separate commands joined by a pipe. They are not grouped as you intended, so sed receives no input and its output is never assigned.
Also, you'd need to echo $num
The solution is to group them together in a command substitution, like:
sum=`echo $num | sed -e 's/-/+/g'`
OR
sum=$(echo $num | sed -e 's/-/+/g')
OR Rather, an alternate approach
sum=${num//-/+}
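The difference between the piped assignment and a command substitution can be checked directly. This is a sketch, not the asker's exact script; note that it replaces the spaces rather than the minus signs, because that is what produces a valid sum expression.

```bash
num='10 20 -30'
unset sum
sum=$num | sed -e 's/-/+/g'             # assignment runs in a pipeline subshell; sed reads nothing
echo "after pipeline: ${sum-unset}"     # after pipeline: unset
sum=$(echo "$num" | sed -e 's/ /+/g')   # capture the pipeline's output instead
echo "$sum"                             # 10+20+-30
echo "$(( sum ))"                       # 0
```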

In bash how can I get the last part of a string after the last hyphen [duplicate]

I have this variable:
A="Some variable has value abc.123"
I need to extract this value i.e abc.123. Is this possible in bash?
Simplest is
echo "$A" | awk '{print $NF}'
Edit: explanation of how this works...
awk breaks the input into different fields, using whitespace as the separator by default. Hardcoding 5 in place of NF prints out the 5th field in the input:
echo "$A" | awk '{print $5}'
NF is a built-in awk variable that gives the total number of fields in the current record. The following returns the number 5 because there are 5 fields in the string "Some variable has value abc.123":
echo "$A" | awk '{print NF}'
Combining $ with NF outputs the last field in the string, no matter how many fields your string contains.
Yes; this:
A="Some variable has value abc.123"
echo "${A##* }"
will print this:
abc.123
(The ${parameter##word} notation is explained in §3.5.3 "Shell Parameter Expansion" of the Bash Reference Manual.)
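Some examples using parameter expansion
A="Some variable has value abc.123"
Remove the longest leading match of "* " (everything up to the last space):
echo "${A##* }"
abc.123
Remove the shortest trailing match of " *" (the last space onward):
echo "${A% *}"
Some variable has value
Remove the shortest trailing match of ".*" (the last dot onward):
echo "${A%.*}"
Some variable has value abc
Remove the longest trailing match of " *" (the first space onward):
echo "${A%% *}"
Some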
Read more Shell-Parameter-Expansion
The documentation is a bit painful to read, so I've summarised it in a simpler way.
Note that the '*' needs to swap places with the ' ' depending on whether you use # or %. (The * is just a wildcard, so you may need to take off your "regex hat" while reading.)
${A% *} - remove shortest trailing * (strip the last word)
${A%% *} - remove longest trailing * (strip the last words)
${A#* } - remove shortest leading * (strip the first word)
${A##* } - remove longest leading * (strip the first words)
Of course a "word" here may contain any character that isn't a literal space.
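Applied to the string from the question, the four forms in the summary give the following. This is a small sketch to paste into a terminal.

```bash
A="Some variable has value abc.123"
echo "${A% *}"    # Some variable has value
echo "${A%% *}"   # Some
echo "${A#* }"    # variable has value abc.123
echo "${A##* }"   # abc.123
```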
You might commonly use this syntax to trim filenames:
${A##*/} removes all containing folders, if any, from the start of the path, e.g.
/usr/bin/git -> git
/usr/bin/ -> (empty string)
${A%/*} removes the last file/folder/trailing slash, if any, from the end:
/usr/bin/git -> /usr/bin
/usr/bin/ -> /usr/bin
${A%.*} removes the last extension, if any (just be wary of things like my.path/noext):
archive.tar.gz -> archive.tar
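The filename cases above can be tried as follows. The variable names path, file and odd are only for illustration, and the last line shows the my.path/noext caveat.

```bash
path=/usr/bin/git
echo "${path##*/}"   # git
echo "${path%/*}"    # /usr/bin
file=archive.tar.gz
echo "${file%.*}"    # archive.tar
echo "${file%%.*}"   # archive
odd=my.path/noext
echo "${odd%.*}"     # my  (the dot in the directory name matched)
```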
How do you know where the value begins? If it's always the 5th and 6th words, you could use e.g.:
B=$(echo "$A" | cut -d ' ' -f 5-)
This uses the cut command to slice out part of the line, using a simple space as the word delimiter.
As pointed out by Zedfoxus here. A very clean method that works on all Unix-based systems. Besides, you don't need to know the exact position of the substring.
A="Some variable has value abc.123"
echo "$A" | rev | cut -d ' ' -f 1 | rev
# abc.123
More ways to do this:
(Run each of these commands in your terminal to test this live.)
For all answers below, start by typing this in your terminal:
A="Some variable has value abc.123"
The array example (#3 below) is a really useful pattern, and depending on what you are trying to do, sometimes the best.
1. with awk, as the main answer shows
echo "$A" | awk '{print $NF}'
2. with grep:
echo "$A" | grep -o '[^ ]*$'
the -o says to only retain the matching portion of the string
the [^ ] part says "don't match spaces"; ie: "not the space char"
the * means "match 0 or more instances of the preceding match pattern" (which is [^ ]), and the $ means "match the end of the line." So, this matches the last word after the last space through to the end of the line; ie: abc.123 in this case.
3. via regular bash "indexed" arrays and array indexing
Convert A to an array, with elements being separated by the default IFS (Internal Field Separator) char, which is space:
Option 1 (will "break in mysterious ways", as @tripleee put it in a comment here, if the string stored in the A variable contains certain special shell characters, so Option 2 below is recommended instead!):
# Capture space-separated words as separate elements in array A_array
A_array=($A)
Option 2 [RECOMMENDED!]. Use the read command, as I explain in my answer here, and as is recommended by the bash shellcheck static code analyzer tool for shell scripts, in ShellCheck rule SC2206, here.
# Capture space-separated words as separate elements in array A_array, using
# a "herestring".
# See my answer here: https://stackoverflow.com/a/71575442/4561887
IFS=" " read -r -a A_array <<< "$A"
Then, print only the last element in the array:
# Print only the last element via bash array right-hand-side indexing syntax
echo "${A_array[-1]}" # last element only
Output:
abc.123
Going further:
What also makes this pattern useful is that it lets you easily do the opposite: obtain all words except the last one, like this:
array_len="${#A_array[@]}"
array_len_minus_one=$((array_len - 1))
echo "${A_array[@]:0:$array_len_minus_one}"
Output:
Some variable has value
For more on the ${array[#]:start:length} array slicing syntax above, see my answer here: Unix & Linux: Bash: slice of positional parameters, and for more info. on the bash "Arithmetic Expansion" syntax, see here:
https://www.gnu.org/savannah-checkouts/gnu/bash/manual/bash.html#Arithmetic-Expansion
https://www.gnu.org/savannah-checkouts/gnu/bash/manual/bash.html#Shell-Arithmetic
You can use a Bash regex:
A="Some variable has value abc.123"
[[ $A =~ [[:blank:]]([^[:blank:]]+)$ ]] && echo "${BASH_REMATCH[1]}" || echo "no match"
Prints:
abc.123
That works with any [:blank:] delimiter in the current locale (usually [ \t]). If you want to be more specific:
A="Some variable has value abc.123"
pat='[ ]([^ ]+)$'
[[ $A =~ $pat ]] && echo "${BASH_REMATCH[1]}" || echo "no match"
echo "Some variable has value abc.123"| perl -nE'say $1 if /(\S+)$/'

The semantics of arrays in bash

Check out the following transcript. With all possible rigor and formality, what is going on at each step?
$> ls -1 #This command prints 3 items. no explanation required.
a
b
c
$> X=$(ls -1) #Capture the output (as what? a string?)
$> Y=($(ls -1)) #Capture it again (as an array now?)
$> echo ${#X[@]} #Why is the length 1?
1
$> echo ${#Y[@]} #This works because Y is an array of the 3 items?
3
$> echo $X #Why are the linefeeds now spaces?
a b c
$> echo $Y #Why does the array echo as its first element
a
$> for x in $X;do echo $x; done #iterate over $X
a
b
c
$> for y in $Y;do echo $y; done #iterating over y doesn't work
a
$> echo ${X[2]} #I can loop over $X but not index into it?
$> echo ${Y[2]} #Why does this work if I can't loop over $Y?
c
I assume bash has well established semantics about how arrays and text variables (if that's even what they're called) work, but the user manual is not organized in an optimal fashion for someone who wants to reason about scripts based on whatever small set of underlying principles the language designer intended.
Let me preface the following with the very strong suggestion that you never use ls to populate an array. The correct code would be
Z=( * )
to create an array with each (non-hidden) file in the current directory as a distinct array element.
$> ls -1 #This command prints 3 items. no explanation required.
a
b
c
Correct. Each file name is printed on a separate line (although, beware of file names containing newlines; the parts before and after each newline would appear as separate file names.)
$> X=$(ls -1) #Capture the output (as what? a string?)
Yes. The command substitution captures the output of ls as a single string; trailing newlines are stripped, but the newlines between the lines are kept. (The command substitution would be subject to word-splitting if it weren't the right-hand side of an assignment; word-splitting will come up below.)
$> Y=($(ls -1)) #Capture it again (as an array now?)
Same as with X, but now each of the words in the result of the command substitution is treated as a separate array element. As long as none of the output lines contain any characters in the value of IFS, each file name is one word and will be treated as a separate array element.
$> echo ${#X[@]} #Why is the length 1?
1
X, not being a real array, is treated as an array with a single element, namely the value of $X.
$> echo ${#Y[@]} #This works because Y is an array of the 3 items?
3
Correct.
$> echo $X #Why are the linefeeds now spaces?
a b c
When $X is unquoted, the resulting expansion is subject to word-splitting. In this case, the newlines are simply treated the same as any other whitespace, separating the result into a sequence of words that are passed to echo as distinct arguments, which are then displayed separated by a single space each.
$> echo $Y #Why does the array echo as its first element
a
For a true array, $Y is equivalent to ${Y[0]}.
$> for x in $X;do echo $x; done #iterate over $X
a
b
c
This works, but has caveats.
$> for y in $Y;do echo $y; done #iterating over y doesn't work
a
See above; $Y only expands to the first element. You want for y in "${Y[@]}"; do to iterate over all the elements.
$> echo ${X[2]} #I can loop over $X but not index into it?
Correct. X is not an array, but $X expanded to a space-separated list which the for loop could iterate over.
$> echo ${Y[2]} #Why does this work if I can't loop over $Y?
c
Indexing and iteration are two completely different things in shell. You don't actually iterate over an array; you iterate over the resulting sequence of words of a properly expanded array.
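One way to see which words each expansion actually produces is to count the arguments a command receives. In this sketch, count is a hypothetical helper, and the array deliberately contains an element with a space in it.

```bash
Y=(a "b c" d)            # the second element contains a space
count() { echo "$#"; }   # hypothetical helper: prints its argument count
count "${Y[@]}"          # 3: one word per element
count ${Y[@]}            # 4: "b c" is split again
count "${Y[*]}"          # 1: all elements joined into one word
count "$Y"               # 1: same as "${Y[0]}"
X=$'a\nb\nc'
count $X                 # 3: split on the newlines
count "$X"               # 1: newlines preserved inside one word
```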

how to chop last n bytes of a string in bash string choping?

for example qa_sharutils-2009-04-22-15-20-39, want chop last 20 bytes, and get 'qa_sharutils'.
I know how to do it in sed, but why $A=${A/.\{20\}$/} does not work?
Thanks!
If your string is stored in a variable called str, then this will give you the substring without the last 20 characters in bash:
${str:0:${#str} - 20}
basically, string slicing can be done using
${[variableName]:[startIndex]:[length]}
and the length of a string is
${#[variableName]}
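Applied to the string from the question, the slicing syntax above works out as in this sketch. The last line uses a negative length, which as far as I know needs bash 4.2 or later.

```bash
str="qa_sharutils-2009-04-22-15-20-39"
echo "${str:0:${#str}-20}"   # qa_sharutils
echo "${str: -20}"           # -2009-04-22-15-20-39 (note the space before the minus)
echo "${str:0:-20}"          # qa_sharutils, bash 4.2 or later
```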
EDIT:
solution using sed that works on files:
sed 's/.\{20\}$//' < inputFile
similar to substr('abcdefg', 2-1, 3) in php:
echo 'abcdefg'|tail -c +2|head -c 3
using awk:
echo $str | awk '{print substr($0,1,length($0)-20)}'
or using strings manipulation - echo ${string:position:length}:
echo ${str:0:$((${#str}-20))}
In the ${parameter/pattern/string} syntax in bash, pattern is a path wildcard-style pattern, not a regular expression. In wildcard syntax a dot . is just a literal dot and \{20\} is a literal {20} rather than a repetition count, so that expression only matches the literal text ".{20}$", which does not occur in the string.
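Since the pattern is a wildcard, twenty ? characters do what .{20} was meant to do, and a real regular expression is available through [[ =~ ]]. Here is a sketch of both; q20 and re are illustrative names.

```bash
str="qa_sharutils-2009-04-22-15-20-39"
printf -v q20 '%20s' ''       # twenty spaces
q20=${q20// /?}               # turned into twenty ? wildcards
echo "${str%$q20}"            # qa_sharutils
re='^(.*).{20}$'              # ERE: capture everything before the last 20 characters
if [[ $str =~ $re ]]; then
  echo "${BASH_REMATCH[1]}"   # qa_sharutils
fi
```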
There are several ways to accomplish the basic task.
$ str="qa_sharutils-2009-04-22-15-20-39"
If you want to strip the last 20 characters, use substring expansion, which is zero-based:
$ echo ${str::${#str}-20}
qa_sharutils
Use "%" and "%%" to strip from the right-hand side of the string. For instance, if you want the basename, minus anything that follows the first "-":
$ echo ${str%%-*}
qa_sharutils
This works only if the last 20 bytes are always a date:
$ str="qa_sharutils-2009-04-22-15-20-39"
$ IFS="-"
$ set -- $str
$ echo $1
qa_sharutils
$ unset IFS
Or, when the first dash and everything after it are not needed:
$ echo ${str%%-*}
qa_sharutils
