Add substring to a string in bash

I have the following array:
SPECIFIC_FILES=('resources/logo.png' 'resources/splash.png' 'www/img/logo.png' 'www/manifest.json')
And the following variable:
CUSTOMER=default
How can I loop through my array and generate strings that would look like
resources/logo_default.png
depending on the variable.

The below uses parameter expansion to extract the relevant substrings, as also described in BashFAQ #100:
specific_files=('resources/logo.png' 'resources/splash.png' 'www/img/logo.png' 'www/manifest.json')
customer=default
for file in "${specific_files[#]}"; do
[[ $file = *.* ]] || continue # skip files without extensions
prefix=${file%.*} # trim everything including and after last "."
suffix=${file##*.} # trim everything up to and including last "."
printf '%s\n' "${prefix}_$customer.$suffix" # concatenate results of those operations
done
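For a single element, the two expansions behave like this (values are illustrative):
file='www/img/logo.png'
echo "${file%.*}"   # www/img/logo -- everything before the last "."
echo "${file##*.}"  # png -- everything after the last "."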
Lower-case variable names are used here in keeping with POSIX-specified conventions (all-caps names are used for variables meaningful to the operating system or shell, whereas variables with at least one lower-case character are reserved for application use; setting a regular shell variable overwrites any like-named environment variable, so the conventions apply to both classes).

Here's a solution with sed:
for f in "${SPECIFIC_FILES[#]}"; do
echo "$f" | sed "s/\(.*\)\.\([^.]*\)/\1_${CUSTOMER}.\2/p"
done

If you know that there is only one period per filename, you can use expansion on each element directly:
$ printf '%s\n' "${SPECIFIC_FILES[#]/./_"$CUSTOMER".}"
resources/logo_default.png
resources/splash_default.png
www/img/logo_default.png
www/manifest_default.json
If you don't, Charles' answer is the robust one covering all cases.
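To see the difference, consider a hypothetical filename with two dots: the single substitution above rewrites the first dot rather than the last one, unlike the parameter-expansion loop:
f='www/archive.tar.gz'
printf '%s\n' "${f/./_default.}"   # www/archive_default.tar.gz -- "_default" lands after the first dot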

Related

Multiple elements instead of one in bash script for loop

I have been following the answers given in these questions
Shellscript Looping Through All Files in a Folder
How to iterate over files in a directory with Bash?
to write a bash script which goes over files inside a folder and processes them. So, here is the code I have:
#!/bin/bash
YEAR="2002/"
INFOLDER="/local/data/datasets/Convergence/"
for f in "$INFOLDER$YEAR*.mdb";
do
echo $f
absname=$INFOLDER$YEAR$(basename $f)
# ... the rest of the script ...
done
I am receiving this error: basename: extra operand.
I added echo $f and I realized that f contains all the filenames separated by space. But I expected to get one at a time. What could be the problem here?
You're running into problems with quoting. In the shell, double-quotes prevent word splitting and wildcard expansion; generally, you don't want these things to happen to variables' values, so you should double-quote variable references. But when you have something that should be word-split or wildcard-expanded, it cannot be double-quoted. In your for statement, you have the entire file pattern in double-quotes:
for f in "$INFOLDER$YEAR*.mdb";
...which prevents word-splitting and wildcard expansion on the variables' values (good) but also prevents it on the * which you need expanded (that's the point of the loop). So you need to quote selectively, with the variables inside quotes and the wildcard outside them:
for f in "$INFOLDER$YEAR"*.mdb;
And then inside the loop, you should double-quote the references to $f in case any filenames contain whitespace or wildcards (which are completely legal in filenames):
echo "$f"
absname="$INFOLDER$YEAR$(basename "$f")"
(Note: the double-quotes around the assignment to absname aren't actually needed -- the right side of an assignment is one of the few places in the shell where it's safe to skip them -- but IMO it's easier and safer to just double-quote all variable references and $( ) expressions than to try to keep track of where it's safe and where it's not.)
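A quick way to see the difference (using the question's variables; the second line assumes matching files exist):
for f in "$INFOLDER$YEAR*.mdb"; do echo "$f"; done    # one iteration, echoing the literal pattern
for f in "$INFOLDER$YEAR"*.mdb; do echo "$f"; done    # one iteration per matching .mdb file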
Just quote your shell variables if they are supposed to contain strings with spaces in between.
basename "$f"
Not doing so will lead to splitting of the string into separate words (see WordSplitting in bash), thereby messing up the basename command, which expects a single string argument rather than several.
Also, it would be wise to keep the * outside the double quotes, as shell globbing doesn't work inside quotes (single or double).
#!/bin/bash
# good practice to lower-case variable names to distinguish them from
# shell environment variables
year="2002/"
in_folder="/local/data/datasets/Convergence/"
for file in "${in_folder}${year}"*.mdb; do
# break the loop gracefully if no files are found
[ -e "$file" ] || continue
echo "$file"
# Worth noting here: $file already contains the file's name
# with its full path prefix, just like the absname built below,
# so you don't need to construct it manually
absname=${in_folder}${year}$(basename "$file")
done
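An alternative to the existence test (not part of the original answer, but a standard bash option) is nullglob, which makes an unmatched pattern expand to nothing so the loop body simply never runs:
shopt -s nullglob
for file in "${in_folder}${year}"*.mdb; do
echo "$file"
done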
just remove "" from this line
for f in "$INFOLDER$YEAR*.mdb";
so that it looks like this:
#!/bin/bash
YEAR="2002/"
INFOLDER="/local/data/datasets/Convergence/"
for f in $INFOLDER$YEAR*.mdb;
do
echo $f
absname=$INFOLDER$YEAR$(basename $f)
# ... the rest of the script ...
done

bash - split string into array WITH empty values

pattern="::a::b::"
oldIFS=$IFS
IFS="::"
read -r -a extractees <<< $pattern
IFS=$oldIFS
this results in
{"a","b"}
however, I need to maintain the indices, so I want
{"","a","b",""}
(For comparison, if I wanted {"a","b"}, I would have written "a::b".)
Why? Because these elements are later split again (on a different delimiter), and the empty "" values should then result in an empty list.
How do I achieve this?
Unfortunately, no IFS field separator can be longer than one character, so '::' → ':'.
Aside from that, globbing should be explicitly turned off to prevent potential filename expansion of the unquoted variable.
set -f # disable globbing
pattern=":a:b c:"
oldIFS=$IFS
IFS=":"
extractees=($pattern)
IFS=$oldIFS
echo "'${extractees[0]}'"
echo "'${extractees[1]}'"
echo "'${extractees[2]}'"
echo "'${extractees[3]}'"

Correctly allow word splitting of command substitution in bash

I write, maintain and use a healthy amount of bash scripts. I would consider myself a bash hacker and strive to someday be a bash ninja (need to learn more awk first). One of the most important features/frustrations of bash to understand is how quotes, and subsequent parameter expansion, work. This is well documented, and for good reason: many pitfalls, bugs and newbie-traps exist in the mysterious world of quoted parameter expansion and word splitting. For this reason, the advice is to "Double quote everything," but what if I want word splitting to occur?
In multiple style guides I cannot find an example of safe and proper use of word splitting after command substitution.
What is the correct way to use unquoted command substitution?
Example:
I don't need help getting this command working, but it seems to be a violation of established patterns; if you would like to give feedback on this command itself, please keep it in the comments.
docker stats $(docker ps | awk '{print $NF}' | grep -v NAMES)
The command substitution returns output such as:
container-1 container-3 excitable-newton
This one-liner uses the command substitution to spit out the names of each of my running docker containers and then feeds them, with word splitting, as separate inputs to the docker stats command, which takes an arbitrary-length list of container names and gives back some info about them.
If I used:
docker stats "$(docker ps | awk '{print $NF}' | grep -v NAMES)"
There would be one string of newline-separated container names passed to docker stats.
This seems like a perfect example of when I would want word splitting, but shellcheck disagrees. Is this somehow unsafe? Is there an established pattern for using word splitting after expansion or substitution?
The safe way to capture output from one command and pass it to another is to temporarily capture the output in an array. This allows splitting on arbitrary delimiters and prevents unintentional splitting or globbing while capturing output as more than one string to be passed on to another command.
If you want to read a whitespace-separated string into an array, use read -a (with -d '' here so the whole multi-line output is consumed):
read -r -d '' -a names < <(docker ps | awk '{print $NF}' | grep -v NAMES)
printf 'Found name: %s\n' "${names[@]}"
Unlike the unquoted-expansion approach, this doesn't expand globs. Thus, foo[bar] can't be replaced with a filesystem entry named foob, or with an empty string if no such filesystem entry exists and the nullglob shell option is set. (Likewise, * will no longer be replaced with a list of files in the current directory).
To go into detail regarding behavior: read -r -a reads up to a newline by default, or up to a delimiter passed as the first character of the option argument following -d (a NUL if that option argument is 0 bytes, as used above), and splits the result into fields based on characters within IFS -- a set which, by default, contains the space, the tab, and the newline; it then assigns those split results to an array.
This behavior does not meaningfully vary based on shell-local configuration, except for IFS, which can be modified scoped to the single command.
mapfile -t and readarray -t are similarly consistent in behavior, and likewise recommended if portability constraints do not prevent their use.
By contrast, array=( $string ) is much more dependent on the shell's configuration and settings, and will behave badly if the shell's configuration is left at defaults:
When using array=( $string ), if set -f is not set, each word created by splitting $string is evaluated as a glob, with further variances based in behavior depending on the shopt settings nullglob (which would cause a pattern which didn't expand to any contents to result in an empty set, rather than the default of expanding to the glob expression itself), failglob (which would cause a pattern which didn't expand to any contents to result in a failure), extglob, dotglob and others.
When using array=( $string ), the value of IFS used for the split operation cannot be easily and reliably altered in a manner scoped to this single operation. By contrast, one can run IFS=: read to force read to split only on :s without modifying the value of IFS outside the scope of that single command; no equivalent for array=( $string ) exists without storing and re-setting IFS (which is an error-prone operation; some common idioms [such as assignment to oIFS or a similar variable name] operate contrary to intent in common scenarios, such as failing to reproduce an unset or empty IFS at the end of the block to which the temporary modification is intended to apply).
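A minimal illustration of scoping IFS to a single read (the values here are made up):
IFS=: read -r -a fields <<<"a:b:c"   # the IFS override applies only to this read
printf '%s\n' "${fields[@]}"         # a, b, c on separate lines; IFS is unchanged afterwards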
Thanks to @I'L'I's pointing to an example of a valid exception to the "Quote Everything" rule, my code does appear to be an exception to the rule.
In my particular use case, using docker container names, the risk of accidental globbing or expansion is low due to the constraints on container names. However, @Charles Duffy provided a surefire and safe way to go about word-splitting one command's output before feeding it into the next command: reading the first output into an array using the bash builtin read (I found readarray better suited my case).
readarray -t names < <(docker ps | awk '{print $NF}' | grep -v NAMES)
docker stats "${names[#]}"
This pattern allows for the output from the first command to be fed to the second command as properly split, separate arguments while avoiding unwanted globbing or splitting. Unfortunately my slick one-liner will perish in favor of safety.

Reading an array from a file in bash - "not found" errors using cat

I have a text file with a few basic words:
-banana
-mango
-sleep
When I run my script:
#!/bin/sh
WORD_FILE="testwoord.txt"
WORDS_ARRAY=cat $WORD_FILE
The output is like:
/home/username/bin/testword.txt: 1 /home/username/bin/restword.txt: banana: not found
/home/username/bin/testword.txt: 1 /home/username/bin/restword.txt: mango: not found
/home/username/bin/testword.txt: 1 /home/username/bin/restword.txt: sleep: not found
Why is it doing this? What I actually want is a script that reads words from a .txt file and puts it in an array.
To explain why this doesn't work:
WORDS_ARRAY=cat $WORD_FILE
runs the command generated by expanding, string-splitting, and glob-expanding $WORD_FILE with the variable WORDS_ARRAY exported in the environment with the value cat.
Instead, consider:
#!/bin/bash
# ^^ -- important that this is bash, not sh: POSIX sh doesn't have arrays!
WORD_FILE=testword.txt
readarray -t WORDS_ARRAY <"$WORD_FILE"
printf 'Read a word: %q\n' "${WORDS_ARRAY[@]}"
...which will create an actual array, not a string variable containing whitespace (as WORDS_ARRAY=$(cat $WORD_FILE) would).
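Assuming testword.txt contains the words banana, mango and sleep, one per line, the printf above should emit:
Read a word: banana
Read a word: mango
Read a word: sleep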
By the way, using all-upper-case variable names is bad form here. To quote the POSIX spec:
Environment variable names used by the utilities in the Shell and Utilities volume of POSIX.1-2008 consist solely of uppercase letters, digits, and the underscore ( '_' ) from the characters defined in the Portable Character Set and do not begin with a digit. Other characters may be permitted by an implementation; applications shall tolerate the presence of such names. Uppercase and lowercase letters shall retain their unique identities and shall not be folded together. The name space of environment variable names containing lowercase letters is reserved for applications. Applications can define any environment variables with names from this name space without modifying the behavior of the standard utilities.
To complement Charles Duffy's helpful answer:
Note that the variable names were changed to lowercase, as per Charles' recommendation.
Here is a bash 3.x (and above) version of the command for reading lines into a bash array (readarray requires bash 4.x):
IFS=$'\n' read -d '' -ra words_array < "$word_file"
If you want to store individual words (across lines), use:
read -d '' -ra words_array < "$word_file"
To print the resulting array:
for ((i=0; i<"${#words_array[#]}"; i++)); do echo "word #$i: "${words_array[i]}""; done

Extract parts of file path and concatenate in bash

I am new to bash scripting and my dir structure looks like below.
"/ABC/DEF/GHI/JKL/2015/01/01"
I am trying to produce the output like this - "JKL_2015-01-01".
I am trying using sed and cut and might take a while but this is needed immediately and any help is appreciated. Thanks.
i=/ABC/DEF/GHI/JKL/2015/01/01
o=`echo $i | sed -r 's|^.+/([^/]+)/([0-9]+)/([0-9]+)/([0-9]+)$|\1_\2-\3-\4|'`
i=xxx is a variable assignment, no whitespace around = allowed!
`command`
enclosed by backticks is a command substitution, which captures the standard output of the command inside as a string.
And sed is the stream editor, applying mostly regex based operations to each line from standard input, and emitting the result on standard output.
sed's s||| operation is regex-based substitution. I capture 4 character groups with parens (): (non-slashes), slash, (numbers), slash, (numbers), slash, (numbers), end-of-string $. Then in the second part of the substitution I print the four captured groups, separated by an underscore and 2 dashes, respectively.
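So, for the path in the question:
echo "$o"   # prints JKL_2015-01-01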
There's no need to use tools that aren't built into bash for this -- using builtins is far more efficient than external tools like sed.
s="/ABC/DEF/GHI/JKL/2015/01/01"
s_re='/([^/]+)/([^/]+)/([^/]+)/([^/]+)$'
if [[ $s =~ $s_re ]]; then
name="${BASH_REMATCH[1]}_${BASH_REMATCH[2]}-${BASH_REMATCH[3]}-${BASH_REMATCH[4]}"
echo "$name"
fi
Alternately, and perhaps more readably (using string manipulation techniques documented in BashFAQ #100):
s="/ABC/DEF/GHI/JKL/2015/01/01"
s_prefix=${s%/*/*/*/*} # find the content we don't care about
s_suffix=${s#"$s_prefix"/} # strip that content
# read the rest into named variables
IFS=/ read -r category year month day <<<"$s_suffix"
# assemble those named variables into the string we care about
echo "${category}_${year}-${month}-${day}"

Resources