bash - split string into array WITH empty values - bash

pattern="::a::b::"
oldIFS=$IFS
IFS="::"
read -r -a extractees <<< $pattern
IFS=$oldIFS
this results in
{"a","b"}
however, I need to maintain the indices, so I want
{"","a","b",""}
(for comparison, if I wanted {"a","b"}, I would have written "a::b".
Why? because these elements are later split again (on a different delimiter) and the empty "" values should result in an empty list then.
How do I achieve this?

No field separator can be longer than 1 character, unfortunately, so '::' → ':'.
Aside of that, globbing should be explicitly turned off to prevent potential filename expansion in an unquoted variable.
set -f # disable globbing
pattern=":a:b c:"
oldIFS=$IFS
IFS=":"
extractees=($pattern)
IFS=$oldIFS
echo "'${extractees[0]}'"
echo "'${extractees[1]}'"
echo "'${extractees[2]}'"
echo "'${extractees[3]}'"

Related

echo inside a for loop lists files matching the output pattern

I have a problem with the following for loop:
X="*back* OLD"
for P in $X
do
echo "-$P"
done
I need it to output just:
-*back*
-OLD
However, it lists all files in the current directory matching the *back* pattern. For example it gives the following:
-backup.bkp
-backup_new.bkp
-backup_X
-OLD
How to force it to output the exact pattern?
Use an array, as unquoted parameter expansions are still subject to globbing.
X=( "*back*" OLD )
for P in "${X[#]}"; do
printf '%s\n' "$P"
done
(Use printf, as echo could try to interpret an argument as an option, for example, if you had n in the value of X.)
Use set -o noglob before your loop and set +o noglob after to disable and enable globbing.
To prevent filename expansion you could read in the string as a Here String.
To iterate over the items, you could turn them into lines using parameter expansion and read them linewise using read. In order to be able to put a - sign as the first character, use printf instead of echo.
X="*back* OLD"
while read -r x
do printf -- '-%s\n' "$x"
done <<< "${X/ /$'\n'}"
Another way could be to use tr to transform the string into lines, then use paste with the - sign as delimiter and "nothing" from /dev/null as first column.
X="*back* OLD"
tr ' ' '\n' <<< "$X" | paste -d- /dev/null -
Both should output:
-*back*
-OLD

bash: IFS "stuck" after temporarily changing it for array building

I've encountered a strange problem after temporarily changing IFS for the purpose of array building:
$ echo "1 2 3" |while read myVar1 myVar2; do echo "myVar1: ${myVar1}"; echo "myVar2: ${myVar2}"; done
myVar1: 1
myVar2: 2 3
$ IFS=':' myPaths=( ${PATH} ) # this works: I have /home/morgwai/bin on ${myPaths[0]} , /usr/local/sbin on ${myPaths[1]} and so on
$ echo "1 2 3" |while read myVar1 myVar2; do echo "myVar1: ${myVar1}"; echo "myVar2: ${myVar2}"; done
myVar1: 1 2 3
myVar2:
$ echo $IFS
$ echo "1:2:3" |while read myVar1 myVar2; do echo "myVar1: ${myVar1}"; echo "myVar2: ${myVar2}"; done ;
myVar1: 1
myVar2: 2:3
Normally when I change IFS temporarily for any other command than array building (for example IFS=',' echo whatever) its value is changed only during the execution of that, however here it seems as if IFS got permanently changed to a colon (although echo $IFS doesn't show this, which is even more strange...).
Is this a bug or somehow an expected behavior that I don't understand?
I'm using bash version 4.4.18 if it matters...
Note: I know that I can build the same array using IFS=':' read -a myPaths <<< ${PATH} and then IFS gets reverted to the default value normally, but that's not the point: I'm trying to understand what actually happens in the example I gave above.
Thanks!
You're just setting variables, not setting a variable followed by executing a command (ie, the way you build array is a pure variable assignment, not a command, hence both assignments become permanent).
The issue with an IFS of : not showing up in echo $IFS is caused by shell parameter expansion and word splitting.
Consider:
$ IFS=:
$ echo $IFS
$ echo "$IFS"
:
When a parameter expansion is not quoted, it undergoes word splitting afterwords. From the manual:
The shell scans the results of parameter expansion, command substitution, and arithmetic expansion that did not occur within double quotes for word splitting.
and
The shell treats each character of $IFS as a delimiter, and splits the results of the other expansions into words using these characters as field terminators. If IFS is unset, or its value is exactly <space><tab><newline>, the default, then sequences of <space>, <tab>, and <newline> at the beginning and end of the results of the previous expansions are ignored, and any sequence of IFS characters not at the beginning or end serves to delimit words.
So when IFS is a colon, splitting a word consisting of just a colon results in a (single) empty word. Always quote your variables to prevent unexpected gotchas like this.

Add substring to a string in bash

I have the following array:
SPECIFIC_FILES=('resources/logo.png' 'resources/splash.png' 'www/img/logo.png' 'www/manifest.json')
And the following variable:
CUSTOMER=default
How can I loop through my array and generate strings that would look like
resources/logo_default.png
depending on the variable.
The below uses parameter expansion to extract the relevant substrings, as also described in BashFAQ #100:
specific_files=('resources/logo.png' 'resources/splash.png' 'www/img/logo.png' 'www/manifest.json')
customer=default
for file in "${specific_files[#]}"; do
[[ $file = *.* ]] || continue # skip files without extensions
prefix=${file%.*} # trim everything including and after last "."
suffix=${file##*.} # trim everything up to and including last "."
printf '%s\n' "${prefix}_$customer.$suffix" # concatenate results of those operations
done
Lower-case variable names are used here in keeping with POSIX-specified conventions (all-caps names are used for variables meaningful to the operating system or shell, whereas variables with at least one lower-case character are reserved for application use; setting a regular shell variable overwrites any like-named environment variable, so the conventions apply to both classes).
Here's a solution with sed:
for f in "${SPECIFIC_FILES[#]}"; do
echo "$f" | sed "s/\(.*\)\.\([^.]*\)/\1_${CUSTOMER}.\2/p"
done
If you know that there is only one period per filename, you can use expansion on each element directly:
$ printf '%s\n' "${SPECIFIC_FILES[#]/./_"$CUSTOMER".}"
resources/logo_default.png
resources/splash_default.png
www/img/logo_default.png
www/manifest_default.json
If you don't, Charles' answer is the robust one covering all cases.

What is the difference between ${#:2} and ${*:2} in these cases?

Given the following code in bash:
filename=${1}
content1=${#:2}
content2="${#:2}"
content3=${*:2}
content4="${*:2}"
echo "${content1}" > "${filename}"
echo "${content2}" >> "${filename}"
echo "${content3}" >> "${filename}"
echo "${content4}" >> "${filename}"
What is the difference between the "contents" ? When I will see the difference? What is the better way to save the content that I get and why?
In an assignment, the right-hand side doesn't have to be quoted to prevent word splitting and globbing, unless it contains a blank.
These two are the same:
myvar=$var
myvar="$var"
but these are not:
myvar='has space'
myvar=has space
The last one tries to run a command space with the environment variable myvar set to value has.
This means that content1 is the same as content2, and content3 is the same as content4, as they only differ in RHS quoting.
The differences thus boil down to the difference between $# and $*; the fact that subarrays are used doesn't matter. Quoting from the manual:
(for $*):
When the expansion occurs within double quotes, it expands to a single word with the value of each parameter separated by the first character of the IFS special variable.
(for $#):
When the expansion occurs within double quotes, each parameter expands to a separate word.
Since you're using quotes (as you almost always should when using $# or $*), the difference is that "${*:2}" is a single string, separated by the first character of IFS, and "${#:2}" expands to separate words, blank separated.
Example:
$ set -- par1 par2 par3 # Set $1, $2, and $3
$ printf '<%s>\n' "${#:2}" # Separate words
<par2>
<par3>
$ printf '<%s>\n' "${*:2}" # Single word, blank separated (default IFS)
<par2 par3>
$ IFS=, # Change IFS
$ printf '<%s>\n' "${*:2}" # Single word, comma separated (new IFS)
<par2,par3>
As for when to use $# vs. $*: as a rule of thumb, you almost always want "$#" and almost never the unquoted version of either, as the latter is subject to word splitting and globbing.
"$*" is useful if you want to join array elements into a single string as in the example, but the most common case, I guess, is iterating over positional parameters in a script or function, and that's what "$#" is for. See also this Bash Pitfall.
The only difference is that using $* will cause the arguments to be concatenated with the first character of IFS, while $# will cause the arguments to be concatenated with a single space (regardless of the value of IFS):
$ set a b c
$ IFS=-
$ c1="$*"
$ c2="$#"
$ echo "$c1"
a-b-c
$ echo "$c2"
a b c
The quotes aren't particularly important here, as expansions on the right-hand side of an assignment aren't subject to word-splitting or pathname expansion; content1 and content2 should always be identical, as should be content3 and content4.
Which is better depends on what you want.

linux shell script: split string, put them in an array then loop through them [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Split string based on delimiter in Bash?
In a bash script how do I split string with a separator like ; and loop through the resulting array?
You can probably skip the step of explicitly creating an array...
One trick that I like to use is to set the inter-field separator (IFS) to the delimiter character. This is especially handy for iterating through the space or return delimited results from the stdout of any of a number of unix commands.
Below is an example using semicolons (as you had mentioned in your question):
export IFS=";"
sentence="one;two;three"
for word in $sentence; do
echo "$word"
done
Note: in regular Bourne-shell scripting setting and exporting the IFS would occur on two separate lines (IFS='x'; export IFS;).
If you don't wish to mess with IFS (perhaps for the code within the loop) this might help.
If know that your string will not have whitespace, you can substitute the ';' with a space and use the for/in construct:
#local str
for str in ${STR//;/ } ; do
echo "+ \"$str\""
done
But if you might have whitespace, then for this approach you will need to use a temp variable to hold the "rest" like this:
#local str rest
rest=$STR
while [ -n "$rest" ] ; do
str=${rest%%;*} # Everything up to the first ';'
# Trim up to the first ';' -- and handle final case, too.
[ "$rest" = "${rest/;/}" ] && rest= || rest=${rest#*;}
echo "+ \"$str\""
done
Here's a variation on ashirazi's answer which doesn't rely on $IFS. It does have its own issues which I ouline below.
sentence="one;two;three"
sentence=${sentence//;/$'\n'} # change the semicolons to white space
for word in $sentence
do
echo "$word"
done
Here I've used a newline, but you could use a tab "\t" or a space. However, if any of those characters are in the text it will be split there, too. That's the advantage of $IFS - it can not only enable a separator, but disable the default ones. Just make sure you save its value before you change it - as others have suggested.
Here is an example code that you may use:
$ STR="String;1;2;3"
$ for EACH in `echo "$STR" | grep -o -e "[^;]*"`; do
echo "Found: \"$EACH\"";
done
grep -o -e "[^;]*" will select anything that is not ';', therefore spliting the string by ';'.
Hope that help.
sentence="one;two;three"
a="${sentence};"
while [ -n "${a}" ]
do
echo ${a%%;*}
a=${a#*;}
done

Resources