associative array in bash - bash

I found an implementation of the associative array here and would like to understand what the code actually does. Here are the piece of the code I don't understand, and would appreciate the explanation.
'
put() {
if [ "$#" != 3 ]; then exit 1; fi
mapName=$1; key=$2; value=`echo $3 | sed -e "s/ /:SP:/g"` #dont understand
eval map="\"\$$mapName\"" **#dont understand**
map="`echo "$map" | sed -e "s/--$key=[^ ]*//g"` --$key=$value" #dont understand
eval $mapName="\"$map\"" #dont understand
}
get() {
mapName=$1; key=$2
map=${!mapName}
#dont understand
value="$(echo $map |sed -e "s/.*--${key}=\([^ ]*\).*/\1/" -e 's/:SP:/ /g' )"
}
getKeySet() {
if [ "$#" != 1 ];
then
exit 1;
fi
mapName=$1;
eval map="\"\$$mapName\""
keySet=`
echo $map |
sed -e "s/=[^ ]*//g" -e "s/\([ ]*\)--/\1/g" #dont understand
`
}
Thanks.

So first in order of don't understands:
this simply checks you always have 3 arguments to the function and if different number is provided exits with 1(error)
This escapes the space chars by replacing them with :SP: so Hi how are you becomes Hi:SP:how:SP:are:SP:you
The map is stored in a var with name the first argument provided to put. So this line adds to the var the following text --$key=$value and here key is the second argument and value is the escaped third argument
get simply stores in value the value for the key provided as second argument(first one is the name of the map)
To understand better try printing the variable you decided to use for your map after each operation. It's easy you'll see ;)
Hope this makes it clear.

I'll try to explain every line you highlighted:
if [ "$#" != 3 ]; then exit 1; fi #dont understand
If there aren't exactly three arguments, exit with error.
mapName=$1; key=$2; value=`echo $3 | sed -e "s/ /:SP:/g"` #dont understand
Set the variable $mapName with the value of first argument
Set the variable $key with the value of second argument
Set the variable $value with the value of third argument, after replacing spaces with the string :SP:.
map="`echo "$map" | sed -e "s/--$key=[^ ]*//g"` --$key=$value" #dont understand
This will edit the $map variable, by removing the first occurrance of the value of $key followed by = and then by non-space characters, and then append the string -- followed by the value of $key, then by = and finally the value of $value.
eval $mapName="\"$map\"" #dont understand
This will evaluate a string that was generated by that line. Suppose $mapName is myMap and $map is value, the string that bash will evaluate is:
myMap="value"
So it will actually set a variable, for which its name will be passed by a parameter.
map=${!mapName}
This will set the variable $map with the value the variable that has the same name as the value in $mapName. Example: suppose $mapName has a, then $map would end up with the contents of a.
value="$(echo $map |sed -e "s/.*--${key}=\([^ ]*\).*/\1/" -e 's/:SP:/ /g' )"
Here we set the value of the $value variable as the value of the $map variable, after a couple of edits:
Extract only the contents that are specified in the expression between parenthesis in the sed expression, which matches characters that aren't spaces. The text before it specifies where the match should start, so in this case the spaces must start after the string -- followed by the value of the $key variable, followed by =. The `.*' at the start and the end matches the rest of the line, and is used in order to remove them afterwards.
Restore spaces, ie. replace :SP: with actual spaces.
eval map="\"\$$mapName\""
This creates a string with the value "$mapName", ie. a dollar, followed by the string contained in mapName, surrounded by double quotes. When evaluated, this gets the value of the variable whose name is the contents of $mapName.
Hope this helps a little =)

Bash, since version 4 and upwards, has builtin associative arrays.
To use them, you need to declare one with declare -A
#!/usr/bin/env bash
declare -A arr # 'arr' will be an associative array
# now fill it with: arr['key']='value'
arr=(
[foo]="bar"
["nyan"]="cat"
["a b"]="c d"
["cookie"]="yes please"
["$USER"]="$(id "$LOGNAME")"
)
# -- cmd -- -- output --
echo "${arr[foo]}" # bar
echo "${arr["a b"]}" # c d
user="$USER"
echo "${arr["$user"]}" # uid=1000(c00kiemon5ter) gid=100...
echo "${arr[cookie]}" # yes please
echo "${arr[cookies]}" # <no output - empty value> :(
# keys and values
printf '%s\n' "${arr[#]}" # print all elements, each on a new line
printf '%s\n' "${!arr[#]}" # print all keys, each on a new line
# loop over keys - print <key :: value>
for key in "${!arr[#]}"
do printf '%s :: %s\n' "$key" "${arr["$key"]}"
done
# you can combine those with bash parameter expansions, like
printf '%s\n' "${arr[#]:0:2}" # print only first two elements
echo "${arr[cookie]^^}" # YES PLEASE

Related

Reading a particular Digit from a given number in shell [duplicate]

I have a string in a Bash shell script that I want to split into an array of characters, not based on a delimiter but just one character per array index. How can I do this? Ideally it would not use any external programs. Let me rephrase that. My goal is portability, so things like sed that are likely to be on any POSIX compatible system are fine.
Try
echo "abcdefg" | fold -w1
Edit: Added a more elegant solution suggested in comments.
echo "abcdefg" | grep -o .
You can access each letter individually already without an array conversion:
$ foo="bar"
$ echo ${foo:0:1}
b
$ echo ${foo:1:1}
a
$ echo ${foo:2:1}
r
If that's not enough, you could use something like this:
$ bar=($(echo $foo|sed 's/\(.\)/\1 /g'))
$ echo ${bar[1]}
a
If you can't even use sed or something like that, you can use the first technique above combined with a while loop using the original string's length (${#foo}) to build the array.
Warning: the code below does not work if the string contains whitespace. I think Vaughn Cato's answer has a better chance at surviving with special chars.
thing=($(i=0; while [ $i -lt ${#foo} ] ; do echo ${foo:$i:1} ; i=$((i+1)) ; done))
As an alternative to iterating over 0 .. ${#string}-1 with a for/while loop, there are two other ways I can think of to do this with only bash: using =~ and using printf. (There's a third possibility using eval and a {..} sequence expression, but this lacks clarity.)
With the correct environment and NLS enabled in bash these will work with non-ASCII as hoped, removing potential sources of failure with older system tools such as sed, if that's a concern. These will work from bash-3.0 (released 2005).
Using =~ and regular expressions, converting a string to an array in a single expression:
string="wonkabars"
[[ "$string" =~ ${string//?/(.)} ]] # splits into array
printf "%s\n" "${BASH_REMATCH[#]:1}" # loop free: reuse fmtstr
declare -a arr=( "${BASH_REMATCH[#]:1}" ) # copy array for later
The way this works is to perform an expansion of string which substitutes each single character for (.), then match this generated regular expression with grouping to capture each individual character into BASH_REMATCH[]. Index 0 is set to the entire string, since that special array is read-only you cannot remove it, note the :1 when the array is expanded to skip over index 0, if needed.
Some quick testing for non-trivial strings (>64 chars) shows this method is substantially faster than one using bash string and array operations.
The above will work with strings containing newlines, =~ supports POSIX ERE where . matches anything except NUL by default, i.e. the regex is compiled without REG_NEWLINE. (The behaviour of POSIX text processing utilities is allowed to be different by default in this respect, and usually is.)
Second option, using printf:
string="wonkabars"
ii=0
while printf "%s%n" "${string:ii++:1}" xx; do
((xx)) && printf "\n" || break
done
This loop increments index ii to print one character at a time, and breaks out when there are no characters left. This would be even simpler if the bash printf returned the number of character printed (as in C) rather than an error status, instead the number of characters printed is captured in xx using %n. (This works at least back as far as bash-2.05b.)
With bash-3.1 and printf -v var you have slightly more flexibility, and can avoid falling off the end of the string should you be doing something other than printing the characters, e.g. to create an array:
declare -a arr
ii=0
while printf -v cc "%s%n" "${string:(ii++):1}" xx; do
((xx)) && arr+=("$cc") || break
done
If your string is stored in variable x, this produces an array y with the individual characters:
i=0
while [ $i -lt ${#x} ]; do y[$i]=${x:$i:1}; i=$((i+1));done
The most simple, complete and elegant solution:
$ read -a ARRAY <<< $(echo "abcdefg" | sed 's/./& /g')
and test
$ echo ${ARRAY[0]}
a
$ echo ${ARRAY[1]}
b
Explanation: read -a reads the stdin as an array and assigns it to the variable ARRAY treating spaces as delimiter for each array item.
The evaluation of echoing the string to sed just add needed spaces between each character.
We are using Here String (<<<) to feed the stdin of the read command.
I have found that the following works the best:
array=( `echo string | grep -o . ` )
(note the backticks)
then if you do: echo ${array[#]} ,
you get: s t r i n g
or: echo ${array[2]} ,
you get: r
Pure Bash solution with no loop:
#!/usr/bin/env bash
str='The quick brown fox jumps over a lazy dog.'
# Need extglob for the replacement pattern
shopt -s extglob
# Split string characters into array (skip first record)
# Character 037 is the octal representation of ASCII Record Separator
# so it can capture all other characters in the string, including spaces.
IFS= mapfile -s1 -t -d $'\37' array <<<"${str//?()/$'\37'}"
# Strip out captured trailing newline of here-string in last record
array[-1]="${array[-1]%?}"
# Debug print array
declare -p array
string=hello123
for i in $(seq 0 ${#string})
do array[$i]=${string:$i:1}
done
echo "zero element of array is [${array[0]}]"
echo "entire array is [${array[#]}]"
The zero element of array is [h]. The entire array is [h e l l o 1 2 3 ].
Yet another on :), the stated question simply says 'Split string into character array' and don't say much about the state of the receiving array, and don't say much about special chars like and control chars.
My assumption is that if I want to split a string into an array of chars I want the receiving array containing just that string and no left over from previous runs, yet preserve any special chars.
For instance the proposed solution family like
for (( i=0 ; i < ${#x} ; i++ )); do y[i]=${x:i:1}; done
Have left overs in the target array.
$ y=(1 2 3 4 5 6 7 8)
$ x=abc
$ for (( i=0 ; i < ${#x} ; i++ )); do y[i]=${x:i:1}; done
$ printf '%s ' "${y[#]}"
a b c 4 5 6 7 8
Beside writing the long line each time we want to split a problem, so why not hide all this into a function we can keep is a package source file, with a API like
s2a "Long string" ArrayName
I got this one that seems to do the job.
$ s2a()
> { [ "$2" ] && typeset -n __=$2 && unset $2;
> [ "$1" ] && __+=("${1:0:1}") && s2a "${1:1}"
> }
$ a=(1 2 3 4 5 6 7 8 9 0) ; printf '%s ' "${a[#]}"
1 2 3 4 5 6 7 8 9 0
$ s2a "Split It" a ; printf '%s ' "${a[#]}"
S p l i t I t
If the text can contain spaces:
eval a=( $(echo "this is a test" | sed "s/\(.\)/'\1' /g") )
$ echo hello | awk NF=NF FS=
h e l l o
Or
$ echo hello | awk '$0=RT' RS=[[:alnum:]]
h
e
l
l
o
I know this is a "bash" question, but please let me show you the perfect solution in zsh, a shell very popular these days:
string='this is a string'
string_array=(${(s::)string}) #Parameter expansion. And that's it!
print ${(t)string_array} -> type array
print $#string_array -> 16 items
This is an old post/thread but with a new feature of bash v5.2+ using the shell option patsub_replacement and the =~ operator for regex. More or less same with #mr.spuratic post/answer.
str='There can be only one, the Highlander.'
regexp="${str//?/(&)}"
[[ "$str" =~ $regexp ]] &&
printf '%s\n' "${BASH_REMATCH[#]:1}"
Or by just: (which includes the whole string at index 0)
declare -p BASH_REMATCH
If that is not desired, one can remove the value of the first index (index 0), with
unset -v 'BASH_REMATCH[0]'
instead of using printf or echo to print the value of the array BASH_REMATCH
One can check/see the value of the variable "$regexp" with either
declare -p regexp
Output
declare -- regexp="(T)(h)(e)(r)(e)( )(c)(a)(n)( )(b)(e)( )(o)(n)(l)(y)( )(o)(n)(e)(,)( )(t)(h)(e)( )(H)(i)(g)(h)(l)(a)(n)(d)(e)(r)(.)"
or
echo "$regexp"
Using it in a script, one might want to test if the shopt is enabled or not, although the manual says it is on/enabled by default.
Something like.
if ! shopt -q patsub_replacement; then
shopt -s patsub_replacement
fi
But yeah, check the bash version too! If you're not sure which version of bash is in use.
if ! ((BASH_VERSINFO[0] >= 5 && BASH_VERSINFO[1] >= 2)); then
printf 'No dice! bash version 5.2+ is required!\n' >&2
exit 1
fi
Space can be excluded from regexp variable, change it from
regexp="${str//?/(&)}"
To
regexp="${str//[! ]/(&)}"
and the output is:
declare -- regexp="(T)(h)(e)(r)(e) (c)(a)(n) (b)(e) (o)(n)(l)(y) (o)(n)(e) (t)(h)(e) (H)(i)(g)(h)(l)(a)(n)(d)(e)(r)(.)"
Maybe not as efficient as the other post/answer but it is still a solution/option.
If you want to store this in an array, you can do this:
string=foo
unset chars
declare -a chars
while read -N 1
do
chars[${#chars[#]}]="$REPLY"
done <<<"$string"x
unset chars[$((${#chars[#]} - 1))]
unset chars[$((${#chars[#]} - 1))]
echo "Array: ${chars[#]}"
Array: f o o
echo "Array length: ${#chars[#]}"
Array length: 3
The final x is necessary to handle the fact that a newline is appended after $string if it doesn't contain one.
If you want to use NUL-separated characters, you can try this:
echo -n "$string" | while read -N 1
do
printf %s "$REPLY"
printf '\0'
done
AWK is quite convenient:
a='123'; echo $a | awk 'BEGIN{FS="";OFS=" "} {print $1,$2,$3}'
where FS and OFS is delimiter for read-in and print-out
For those who landed here searching how to do this in fish:
We can use the builtin string command (since v2.3.0) for string manipulation.
↪ string split '' abc
a
b
c
The output is a list, so array operations will work.
↪ for c in (string split '' abc)
echo char is $c
end
char is a
char is b
char is c
Here's a more complex example iterating over the string with an index.
↪ set --local chars (string split '' abc)
for i in (seq (count $chars))
echo $i: $chars[$i]
end
1: a
2: b
3: c
zsh solution: To put the scalar string variable into arr, which will be an array:
arr=(${(ps::)string})
If you also need support for strings with newlines, you can do:
str2arr(){ local string="$1"; mapfile -d $'\0' Chars < <(for i in $(seq 0 $((${#string}-1))); do printf '%s\u0000' "${string:$i:1}"; done); printf '%s' "(${Chars[*]#Q})" ;}
string=$(printf '%b' "apa\nbepa")
declare -a MyString=$(str2arr "$string")
declare -p MyString
# prints declare -a MyString=([0]="a" [1]="p" [2]="a" [3]=$'\n' [4]="b" [5]="e" [6]="p" [7]="a")
As a response to Alexandro de Oliveira, I think the following is more elegant or at least more intuitive:
while read -r -n1 c ; do arr+=("$c") ; done <<<"hejsan"
declare -r some_string='abcdefghijklmnopqrstuvwxyz'
declare -a some_array
declare -i idx
for ((idx = 0; idx < ${#some_string}; ++idx)); do
some_array+=("${some_string:idx:1}")
done
for idx in "${!some_array[#]}"; do
echo "$((idx)): ${some_array[idx]}"
done
Pure bash, no loop.
Another solution, similar to/adapted from Léa Gris' solution, but using read -a instead of readarray/mapfile :
#!/usr/bin/env bash
str='azerty'
# Need extglob for the replacement pattern
shopt -s extglob
# Split string characters into array
# ${str//?()/$'\x1F'} replace each character "c" with "^_c".
# ^_ (Control-_, 0x1f) is Unit Separator (US), you can choose another
# character.
IFS=$'\x1F' read -ra array <<< "${str//?()/$'\x1F'}"
# now, array[0] contains an empty string and the rest of array (starting
# from index 1) contains the original string characters :
declare -p array
# Or, if you prefer to keep the array "clean", you can delete
# the first element and pack the array :
unset array[0]
array=("${array[#]}")
declare -p array
However, I prefer the shorter (and easier to understand for me), where we remove the initial 0x1f before assigning the array :
#!/usr/bin/env bash
str='azerty'
shopt -s extglob
tmp="${str//?()/$'\x1F'}" # same as code above
tmp=${tmp#$'\x1F'} # remove initial 0x1f
IFS=$'\x1F' read -ra array <<< "$tmp" # assign array
declare -p array # verification

Extract variables values from command output Bash Shell

In bash shell I have a command that prints variable name and values such as:
Hello mate this is you variables:
MY_VAR1= this is the value of the first var
MY_VAR2= subvar1=27, subvar2=hello1
Have a good day!
The value of the variables in the output can contain characters as =,commas,;,:.., but I expect to find new line at the end of each variable value.
I need to create a short script which reads the values of MY_VAR1 and MY_VAR2.
So I need to end up with 2 variables as follows:
MY_VAR1 = this is the value of the first var
MY_VAR2 = subvar1=27, subvar2=hello1
I have a basic installation of CentOS 7, and I can't install additional stuff in that machine.
How can I achieve it?
I assume that the variable names are everything from the beginning of the lines, up to the = sign (excluded) with leading and trailing blanks removed (you cannot have blanks in your variable names).
If all you want is print the output you show, you can use something like:
while read line; do
if [[ $line =~ ^[[:blank:]]*([^[:blank:]]+)[[:blank:]]*=(.*)$ ]]; then
var="${BASH_REMATCH[1]}"
val="${BASH_REMATCH[2]}"
echo "$var = $val"
fi
done < <( my_command )
The =~ operator is a pattern matching with regular expressions. The regular expression ^[[:blank:]]*([^[:blank:]]+)[[:blank:]]*=(.*)$ models an output line of your command with variable assignment. It matches even if the variable name is surrounded with blanks. Two sub-expressions (enclosed in ()) are isolated: the variable name and the value. The BASH_REMATCH array contains the patterns that matched the sub-expressions in cells 1 and 2, respectively.
If you also want to assign the variables:
while read line; do
if [[ $line =~ ^[[:blank:]]*([^[:blank:]]+)[[:blank:]]*=(.*)$ ]]; then
var="${BASH_REMATCH[1]}"
val="${BASH_REMATCH[2]}"
echo "$var = $val"
declare $var="$val"
fi
done < <( my_command )

Bash shell test if all characters in one string are in another string

I have two strings which I want to compare for equal chars, the strings must contain the exact chars but mychars can have extra chars.
mychars="abcdefg"
testone="abcdefgh" # false h is not in mychars
testtwo="abcddabc" # true all char in testtwo are in mychars
function test() {
if each char in $1 is in $2 # PSEUDO CODE
then
return 1
else
return 0
fi
}
if test $testone $mychars; then
echo "All in the string" ;
else ; echo "Not all in the string" ; fi
# should echo "Not all in the string" because the h is not in the string mychars
if test $testtwo $mychars; then
echo "All in the string" ;
else ; echo "Not all in the string" ; fi
# should echo 'All in the string'
What is the best way to do this? My guess is to loop over all the chars in the first parameter.
You can use tr to replace any char from mychars with a symbol, then you can test if the resulting string is any different from the symbol, p.e.,:
tr -s "[$mychars]" "." <<< "ggaaabbbcdefg"
Outputs:
.
But:
tr -s "[$mychars]" "." <<< "xxxggaaabbbcdefgxxx"
Prints:
xxx.xxx
So, your function could be like the following:
function test() {
local dictionary="$1"
local res=$(tr -s "[$dictionary]" "." <<< "$2")
if [ "$res" == "." ]; then
return 1
else
return 0
fi
}
Update: As suggested by #mklement0, the whole function could be shortened (and the logic fixed) by the following:
function test() {
local dictionary="$1"
[[ '.' == $(tr -s "[$dictionary]" "." <<< "$2") ]]
}
The accepted answer's solution is short, clever, and efficient.
Here's a less efficient alternative, which may be of interest if you want to know which characters are unique to the 1st string, returned as a sorted, distinct list:
charTest() {
local charsUniqueToStr1
# Determine which chars. in $1 aren't in $2.
# This returns a sorted, distinct list of chars., each on its own line.
charsUniqueToStr1=$(comm -23 \
<(sed 's/\(.\)/\1\'$'\n''/g' <<<"$1" | sort -u) \
<(sed 's/\(.\)/\1\'$'\n''/g' <<<"$2" | sort -u))
# The test succeeds if there are no chars. in $1 that aren't also in $2.
[[ -z $charsUniqueToStr1 ]]
}
mychars="abcdefg" # define reference string
charTest "abcdefgh" "$mychars"
echo $? # print exit code: 1 - 'h' is not in reference string
charTest "abcddabc" "$mychars"
echo $? # print exit code: 0 - all chars. are in reference string
Note that I've renamed test() to charTest() to avoid a name collision with the test builtin/utility.
sed 's/\(.\)/\1\'$'\n''/g' splits the input into individual characters by placing each on a separate line.
Note that the command creates an extra empty line at the end, but that doesn't matter in this case; to eliminate it, append ; ${s/\n$//;} to the sed script.
The command is written in a POSIX-compliant manner, which complicates it, due to having to splice in an \-escaped actual newline (via an ANSI C-quoted string, $\n'); if you have GNU sed, you can simplify to sed -r 's/(.)/\1\n/g
sort -u then sorts the resulting list of characters and weeds out duplicates (-u).
comm -23 compares the distinct set of sorted characters in both strings and prints those unique to the 1st string (comm uses a 3-column layout, with the 1st column containing lines unique to the 1st file, the 2nd column containing lines unique to the 2nd column, and the 3rd column printing lines the two input files have in common; -23 suppresses the 2nd and 3rd columns, effectively only printing the lines that are unique to the 1st input).
[[ -z $charsUniqueToStr1 ]] then tests if $charsUniqueToStr1 is empty (-z);
in other words: success (exit code 0) is indicated, if the 1st string contains no chars. that aren't also contained in the 2nd string; otherwise, failure (exit code 1); by virtue of the conditional ([[ .. ]]) being the last statement in the function, its exit code also becomes the function's exit code.

How to split one string into multiple strings separated by at least one space in bash shell?

I have a string containing many words with at least one space between each two. How can I split the string into individual words so I can loop through them?
The string is passed as an argument. E.g. ${2} == "cat cat file". How can I loop through it?
Also, how can I check if a string contains spaces?
I like the conversion to an array, to be able to access individual elements:
sentence="this is a story"
stringarray=($sentence)
now you can access individual elements directly (it starts with 0):
echo ${stringarray[0]}
or convert back to string in order to loop:
for i in "${stringarray[#]}"
do
:
# do whatever on $i
done
Of course looping through the string directly was answered before, but that answer had the the disadvantage to not keep track of the individual elements for later use:
for i in $sentence
do
:
# do whatever on $i
done
See also Bash Array Reference.
Did you try just passing the string variable to a for loop? Bash, for one, will split on whitespace automatically.
sentence="This is a sentence."
for word in $sentence
do
echo $word
done
This
is
a
sentence.
Probably the easiest and most secure way in BASH 3 and above is:
var="string to split"
read -ra arr <<<"$var"
(where arr is the array which takes the split parts of the string) or, if there might be newlines in the input and you want more than just the first line:
var="string to split"
read -ra arr -d '' <<<"$var"
(please note the space in -d ''; it cannot be omitted), but this might give you an unexpected newline from <<<"$var" (as this implicitly adds an LF at the end).
Example:
touch NOPE
var="* a *"
read -ra arr <<<"$var"
for a in "${arr[#]}"; do echo "[$a]"; done
Outputs the expected
[*]
[a]
[*]
as this solution (in contrast to all previous solutions here) is not prone to unexpected and often uncontrollable shell globbing.
Also this gives you the full power of IFS as you probably want:
Example:
IFS=: read -ra arr < <(grep "^$USER:" /etc/passwd)
for a in "${arr[#]}"; do echo "[$a]"; done
Outputs something like:
[tino]
[x]
[1000]
[1000]
[Valentin Hilbig]
[/home/tino]
[/bin/bash]
As you can see, spaces can be preserved this way, too:
IFS=: read -ra arr <<<' split : this '
for a in "${arr[#]}"; do echo "[$a]"; done
outputs
[ split ]
[ this ]
Please note that the handling of IFS in BASH is a subject on its own, so do your tests; some interesting topics on this:
unset IFS: Ignores runs of SPC, TAB, NL and on line starts and ends
IFS='': No field separation, just reads everything
IFS=' ': Runs of SPC (and SPC only)
Some last examples:
var=$'\n\nthis is\n\n\na test\n\n'
IFS=$'\n' read -ra arr -d '' <<<"$var"
i=0; for a in "${arr[#]}"; do let i++; echo "$i [$a]"; done
outputs
1 [this is]
2 [a test]
while
unset IFS
var=$'\n\nthis is\n\n\na test\n\n'
read -ra arr -d '' <<<"$var"
i=0; for a in "${arr[#]}"; do let i++; echo "$i [$a]"; done
outputs
1 [this]
2 [is]
3 [a]
4 [test]
BTW:
If you are not used to $'ANSI-ESCAPED-STRING' get used to it; it's a timesaver.
If you do not include -r (like in read -a arr <<<"$var") then read does backslash escapes. This is left as exercise for the reader.
For the second question:
To test for something in a string I usually stick to case, as this can check for multiple cases at once (note: case only executes the first match, if you need fallthrough use multiple case statements), and this need is quite often the case (pun intended):
case "$var" in
'') empty_var;; # variable is empty
*' '*) have_space "$var";; # have SPC
*[[:space:]]*) have_whitespace "$var";; # have whitespaces like TAB
*[^-+.,A-Za-z0-9]*) have_nonalnum "$var";; # non-alphanum-chars found
*[-+.,]*) have_punctuation "$var";; # some punctuation chars found
*) default_case "$var";; # if all above does not match
esac
So you can set the return value to check for SPC like this:
case "$var" in (*' '*) true;; (*) false;; esac
Why case? Because it usually is a bit more readable than regex sequences, and thanks to Shell metacharacters it handles 99% of all needs very well.
Just use the shells "set" built-in. For example,
set $text
After that, individual words in $text will be in $1, $2, $3, etc. For robustness, one usually does
set -- junk $text
shift
to handle the case where $text is empty or start with a dash. For example:
text="This is a test"
set -- junk $text
shift
for word; do
echo "[$word]"
done
This prints
[This]
[is]
[a]
[test]
$ echo "This is a sentence." | tr -s " " "\012"
This
is
a
sentence.
For checking for spaces, use grep:
$ echo "This is a sentence." | grep " " > /dev/null
$ echo $?
0
$ echo "Thisisasentence." | grep " " > /dev/null
$ echo $?
1
echo $WORDS | xargs -n1 echo
This outputs every word, you can process that list as you see fit afterwards.
(A) To split a sentence into its words (space separated) you can simply use the default IFS by using
array=( $string )
Example running the following snippet
#!/bin/bash
sentence="this is the \"sentence\" 'you' want to split"
words=( $sentence )
len="${#words[#]}"
echo "words counted: $len"
printf "%s\n" "${words[#]}" ## print array
will output
words counted: 8
this
is
the
"sentence"
'you'
want
to
split
As you can see you can use single or double quotes too without any problem
Notes:
-- this is basically the same of mob's answer, but in this way you store the array for any further needing. If you only need a single loop, you can use his answer, which is one line shorter :)
-- please refer to this question for alternate methods to split a string based on delimiter.
(B) To check for a character in a string you can also use a regular expression match.
Example to check for the presence of a space character you can use:
regex='\s{1,}'
if [[ "$sentence" =~ $regex ]]
then
echo "Space here!";
fi
For checking spaces just with bash:
[[ "$str" = "${str% *}" ]] && echo "no spaces" || echo "has spaces"
$ echo foo bar baz | sed 's/ /\n/g'
foo
bar
baz
For my use case, the best option was:
grep -oP '\w+' file
Basically this is a regular expression that matches contiguous non-whitespace characters. This means that any type and any amount of whitespace won't match. The -o parameter outputs each word matches on a different line.
Another take on this (using Perl):
$ echo foo bar baz | perl -nE 'say for split /\s/'
foo
bar
baz

Passing a string with spaces as a function argument in Bash

I'm writing a Bash script where I need to pass a string containing spaces to a function in my Bash script.
For example:
#!/bin/bash
myFunction
{
echo $1
echo $2
echo $3
}
myFunction "firstString" "second string with spaces" "thirdString"
When run, the output I'd expect is:
firstString
second string with spaces
thirdString
However, what's actually output is:
firstString
second
string
Is there a way to pass a string with spaces as a single argument to a function in Bash?
You should add quotes and also, your function declaration is wrong.
myFunction()
{
echo "$1"
echo "$2"
echo "$3"
}
And like the others, it works for me as well.
Another solution to the issue above is to set each string to a variable, call the function with variables denoted by a literal dollar sign \$. Then in the function use eval to read the variable and output as expected.
#!/usr/bin/ksh
myFunction()
{
eval string1="$1"
eval string2="$2"
eval string3="$3"
echo "string1 = ${string1}"
echo "string2 = ${string2}"
echo "string3 = ${string3}"
}
var1="firstString"
var2="second string with spaces"
var3="thirdString"
myFunction "\${var1}" "\${var2}" "\${var3}"
exit 0
Output is then:
string1 = firstString
string2 = second string with spaces
string3 = thirdString
In trying to solve a similar problem to this, I was running into the issue of UNIX thinking my variables were space delimeted. I was trying to pass a pipe delimited string to a function using awk to set a series of variables later used to create a report. I initially tried the solution posted by ghostdog74 but could not get it to work as not all of my parameters were being passed in quotes. After adding double-quotes to each parameter it then began to function as expected.
Below is the before state of my code and fully functioning after state.
Before - Non Functioning Code
#!/usr/bin/ksh
#*******************************************************************************
# Setup Function To Extract Each Field For The Error Report
#*******************************************************************************
getField(){
detailedString="$1"
fieldNumber=$2
# Retrieves Column ${fieldNumber} From The Pipe Delimited ${detailedString}
# And Strips Leading And Trailing Spaces
echo ${detailedString} | awk -F '|' -v VAR=${fieldNumber} '{ print $VAR }' | sed 's/^[ \t]*//;s/[ \t]*$//'
}
while read LINE
do
var1="$LINE"
# Below Does Not Work Since There Are Not Quotes Around The 3
iputId=$(getField "${var1}" 3)
done<${someFile}
exit 0
After - Functioning Code
#!/usr/bin/ksh
#*******************************************************************************
# Setup Function To Extract Each Field For The Report
#*******************************************************************************
getField(){
detailedString="$1"
fieldNumber=$2
# Retrieves Column ${fieldNumber} From The Pipe Delimited ${detailedString}
# And Strips Leading And Trailing Spaces
echo ${detailedString} | awk -F '|' -v VAR=${fieldNumber} '{ print $VAR }' | sed 's/^[ \t]*//;s/[ \t]*$//'
}
while read LINE
do
var1="$LINE"
# Below Now Works As There Are Quotes Around The 3
iputId=$(getField "${var1}" "3")
done<${someFile}
exit 0
A more dynamic way would be:
function myFunction {
for i in "$*"; do echo "$i"; done;
}
The simplest solution to this problem is that you just need to use \" for space separated arguments when running a shell script:
#!/bin/bash
myFunction() {
echo $1
echo $2
echo $3
}
myFunction "firstString" "\"Hello World\"" "thirdString"
Your definition of myFunction is wrong. It should be:
myFunction()
{
# same as before
}
or:
function myFunction
{
# same as before
}
Anyway, it looks fine and works fine for me on Bash 3.2.48.
Simple solution that worked for me -- quoted $#
Test(){
set -x
grep "$#" /etc/hosts
set +x
}
Test -i "3 rb"
+ grep -i '3 rb' /etc/hosts
I could verify the actual grep command (thanks to set -x).
You could have an extension of this problem in case of your initial text was set into a string type variable, for example:
function status(){
if [ $1 != "stopped" ]; then
artist="ABC";
track="CDE";
album="DEF";
status_message="The current track is $track at $album by $artist";
echo $status_message;
read_status $1 "$status_message";
fi
}
function read_status(){
if [ $1 != "playing" ]; then
echo $2
fi
}
In this case if you don't pass the status_message variable forward as string (surrounded by "") it will be split in a mount of different arguments.
"$variable": The current track is CDE at DEF by ABC
$variable: The
I had the same kind of problem and in fact the problem was not the function nor the function call, but what I passed as arguments to the function.
The function was called from the body of the script - the 'main' - so I passed "st1 a b" "st2 c d" "st3 e f" from the command line and passed it over to the function using myFunction $*
The $* causes the problem as it expands into a set of characters which will be interpreted in the call to the function using whitespace as a delimiter.
The solution was to change the call to the function in explicit argument handling from the 'main' towards the function: the call would then be myFunction "$1" "$2" "$3" which will preserve the whitespace inside strings as the quotes will delimit the arguments...
So if a parameter can contain spaces, it should be handled explicitly throughout all calls of functions.
As this may be the reason for long searches to problems, it may be wise never to use $* to pass arguments...

Resources