Read a single word from a string and set as a variable value in bash

I have a need for a simple function to do the following:
#!/bin/bash
select_word() {
echo "Enter a string."
read STRING
## User enters "This is a string."
## Here will be the command to set each word to the below variables.
WORD_1=
WORD_2=
WORD_3=
WORD_4=
echo -e "Word 1 is: $WORD_1.\n Word 2 is: $WORD_2.\n"
echo -e "Word 3 is: $WORD_3.\n Word 4 is: $WORD_4.\n"
}
I want to avoid external tools such as sed or awk. I am looking for bash builtins that pull each word from the string and assign it to a variable. I will later use "wc" to count the number of characters in each word. I already know how to do that; I just need the bash method for pulling words out of user input.
If this question is a duplicate, I apologize as I could not find this specific question.

read can split the string itself.
read -r WORD1 WORD2 WORD3 WORD4
If you enter fewer than 4 words, the last variable(s) will be set to empty strings. If you enter more than 4 words, WORD4 will be the rest of the string, not just the 4th word.
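A quick way to check this behaviour without typing anything is to feed read from a here-string. This is only a sketch; the sample strings and the SHORT variable names are made up for illustration.

```bash
#!/usr/bin/env bash
# Sketch: how read distributes words over a fixed number of variables.
# The input strings are made-up examples.

# More words than variables: the last variable receives the remainder.
read -r WORD1 WORD2 WORD3 WORD4 <<< "This is a rather long string."
echo "WORD4 is: $WORD4"

# Fewer words than variables: the leftover variables are set to empty strings.
read -r SHORT1 SHORT2 SHORT3 SHORT4 <<< "Only two"
echo "SHORT2 is: '$SHORT2', SHORT3 is: '$SHORT3'"
```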
You can also split the string into an array, if you don't know how many words will be entered ahead of time.
read -a words
WORD1=${words[0]}
WORD2=${words[1]}
WORD3=${words[2]}
WORD4=${words[3]}
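Since the question mentions counting characters afterwards, here is a minimal sketch that combines the array form with the builtin ${#var} length expansion, so no external tool is needed. The function name word_lengths is hypothetical, and the input is passed as an argument instead of being typed.

```bash
#!/usr/bin/env bash
# Sketch: split a line into an array and report each word's length
# using only builtins. word_lengths is a hypothetical helper name.
word_lengths() {
    local -a words
    local i
    # -r keeps backslashes literal; -a stores the words in an array
    read -r -a words <<< "$1"
    for i in "${!words[@]}"; do
        # ${#words[i]} expands to the number of characters in that element
        printf 'Word %d is: %s (%d characters)\n' "$((i + 1))" "${words[i]}" "${#words[i]}"
    done
}

word_lengths "This is a string."
```

Looping over "${!words[@]}" (the list of indices) means the function works for any number of words.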

You can convert the space delimited $STRING into an array, and then reference each array element in $WORD_n variables:
WORDS=($STRING)
WORD_1=${WORDS[0]}
WORD_2=${WORDS[1]}
Or with set -f, which disables globbing (e.g. expanding * to the list of files in the current directory):
set -f
WORDS=($STRING)
set +f
WORD_1=${WORDS[0]}
WORD_2=${WORDS[1]}
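To make the globbing problem concrete, the following sketch runs the unquoted assignment in a scratch directory that contains two files, and compares it with read -a, which splits on IFS but does not glob. The directory, file names, and the UNSAFE/SAFE array names are assumptions for the demo.

```bash
#!/usr/bin/env bash
# Sketch: why unquoted array assignment needs set -f, and a read-based alternative.
orig_dir=$PWD
demo_dir=$(mktemp -d)
touch "$demo_dir/file1" "$demo_dir/file2"
cd "$demo_dir" || exit 1

STRING="count * items"

UNSAFE=($STRING)                 # the * expands to file1 and file2
echo "unquoted assignment: ${#UNSAFE[@]} words"

read -r -a SAFE <<< "$STRING"    # read splits on IFS but never globs
echo "read -a:             ${#SAFE[@]} words"

cd "$orig_dir" || exit 1
rm -rf "$demo_dir"
```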

If you need to be independent of the number of words typed, you can do:
#!/bin/bash
select_word() {
echo "Enter a string."
read STRING
## User enters "This is a string."
GENERAL_VAR_NAME="WORD_"
## Here will be the command to set each word to the below variables.
# Split string into an array, default delimiter whitespace
STRING=( $STRING )
# Number of array elements: ${#STRING[@]}
for (( c=1; c<=${#STRING[@]}; c++ ))
do
# Array indices start at 0, so word number c is element c-1
NEW_WORD="$GENERAL_VAR_NAME$c"
printf -v "$NEW_WORD" '%s' "${STRING[c-1]}"
done
# Check the output
for (( i=1; i<=${#STRING[@]}; i++ )); do
NAME="WORD_$i"
# ${!NAME} is indirect expansion: the value of the variable whose name is in NAME
echo "WORD_$i is ${!NAME}"
done
}
select_word

Related

In bash how can I get the last part of a string after the last hyphen [duplicate]

I have this variable:
A="Some variable has value abc.123"
I need to extract this value i.e abc.123. Is this possible in bash?
Simplest is
echo "$A" | awk '{print $NF}'
Edit: explanation of how this works...
awk breaks the input into different fields, using whitespace as the separator by default. Hardcoding 5 in place of NF prints out the 5th field in the input:
echo "$A" | awk '{print $5}'
NF is a built-in awk variable that gives the total number of fields in the current record. The following returns the number 5 because there are 5 fields in the string "Some variable has value abc.123":
echo "$A" | awk '{print NF}'
Combining $ with NF outputs the last field in the string, no matter how many fields your string contains.
Yes; this:
A="Some variable has value abc.123"
echo "${A##* }"
will print this:
abc.123
(The ${parameter##word} notation is explained in §3.5.3 "Shell Parameter Expansion" of the Bash Reference Manual.)
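Some examples using parameter expansion:
A="Some variable has value abc.123"
Remove the longest leading match of "* " (everything up to the last space):
echo "${A##* }"
abc.123
Remove the shortest trailing match of " *" (the last word):
echo "${A% *}"
Some variable has value
Remove the shortest trailing match of ".*" (the last dot and what follows):
echo "${A%.*}"
Some variable has value abc
Remove the longest trailing match of " *" (everything from the first space):
echo "${A%% *}"
Some
Read more: Shell-Parameter-Expansion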
The documentation is a bit painful to read, so I've summarised it in a simpler way.
Note that the '*' needs to swap places with the ' ' depending on whether you use # or %. (The * is just a wildcard, so you may need to take off your "regex hat" while reading.)
${A% *} - remove shortest trailing * (strip the last word)
${A%% *} - remove longest trailing * (strip the last words)
${A#* } - remove shortest leading * (strip the first word)
${A##* } - remove longest leading * (strip the first words)
Of course a "word" here may contain any character that isn't a literal space.
You might commonly use this syntax to trim filenames:
${A##*/} removes all containing folders, if any, from the start of the path, e.g.
/usr/bin/git -> git
/usr/bin/ -> (empty string)
${A%/*} removes the last file/folder/trailing slash, if any, from the end:
/usr/bin/git -> /usr/bin
/usr/bin/ -> /usr/bin
${A%.*} removes the last extension, if any (just be wary of things like my.path/noext):
archive.tar.gz -> archive.tar
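As a small sketch, the three path expansions above can be wrapped in helper functions. The function names are hypothetical and not part of bash.

```bash
#!/usr/bin/env bash
# Sketch: builtin-only path helpers built on the expansions above.
# The function names are hypothetical.
path_basename() { printf '%s\n' "${1##*/}"; }   # drop everything up to the last /
path_dirname()  { printf '%s\n' "${1%/*}"; }    # drop the last / and what follows
strip_ext()     { printf '%s\n' "${1%.*}"; }    # drop the last .extension

path_basename /usr/bin/git     # git
path_dirname  /usr/bin/git     # /usr/bin
strip_ext     archive.tar.gz   # archive.tar
```

These are not drop-in replacements for the basename and dirname commands: for a name without any slash, ${1%/*} returns the name unchanged, whereas dirname prints a dot.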
How do you know where the value begins? If it's always the 5th and 6th words, you could use e.g.:
B=$(echo "$A" | cut -d ' ' -f 5-)
This uses the cut command to slice out part of the line, using a simple space as the word delimiter.
As pointed out by Zedfoxus, this is a clean method that works on most Unix-like systems (rev is not specified by POSIX, but it is widely available). You also don't need to know the exact position of the substring.
A="Some variable has value abc.123"
echo "$A" | rev | cut -d ' ' -f 1 | rev
# abc.123
More ways to do this:
(Run each of these commands in your terminal to test this live.)
For all answers below, start by typing this in your terminal:
A="Some variable has value abc.123"
The array example (#3 below) is a really useful pattern, and depending on what you are trying to do, sometimes the best.
1. with awk, as the main answer shows
echo "$A" | awk '{print $NF}'
2. with grep:
echo "$A" | grep -o '[^ ]*$'
the -o says to only retain the matching portion of the string
the [^ ] part says "don't match spaces"; ie: "not the space char"
the * means "match 0 or more instances of the preceding pattern (which is [^ ])", and the $ means "match the end of the line". So, this matches the last word after the last space through to the end of the line; i.e., abc.123 in this case.
3. via regular bash "indexed" arrays and array indexing
Convert A to an array, with elements separated by the characters in the default IFS (Internal Field Separator), which are space, tab, and newline:
Option 1 (will "break in mysterious ways", as @tripleee put it in a comment here, if the string stored in the A variable contains certain special shell characters, so Option 2 below is recommended instead!):
# Capture space-separated words as separate elements in array A_array
A_array=($A)
Option 2 [RECOMMENDED!]. Use the read command, as I explain in my answer here, and as is recommended by the bash shellcheck static code analyzer tool for shell scripts, in ShellCheck rule SC2206, here.
# Capture space-separated words as separate elements in array A_array, using
# a "herestring".
# See my answer here: https://stackoverflow.com/a/71575442/4561887
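# No -d '' here: with it, read would keep the here-string's trailing newline
# in the last element and return a non-zero status.
IFS=" " read -r -a A_array <<< "$A"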
Then, print only the last element in the array:
# Print only the last element via bash array right-hand-side indexing syntax
echo "${A_array[-1]}" # last element only
Output:
abc.123
Going further:
What makes this pattern so useful too is that it allows you to easily do the opposite too!: obtain all words except the last one, like this:
array_len="${#A_array[@]}"
array_len_minus_one=$((array_len - 1))
echo "${A_array[@]:0:$array_len_minus_one}"
Output:
Some variable has value
For more on the ${array[@]:start:length} array slicing syntax above, see my answer here: Unix & Linux: Bash: slice of positional parameters, and for more info. on the bash "Arithmetic Expansion" syntax, see here:
https://www.gnu.org/savannah-checkouts/gnu/bash/manual/bash.html#Arithmetic-Expansion
https://www.gnu.org/savannah-checkouts/gnu/bash/manual/bash.html#Shell-Arithmetic
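Negative subscripts such as ${A_array[-1]} are not available in older bash releases (for example the 3.2 that ships with macOS). A minimal sketch that gets both results from the array length instead, assuming the array is not empty:

```bash
#!/usr/bin/env bash
# Sketch: last word and all-but-last words, computed from the array length.
# Assumes A contains at least one word.
A="Some variable has value abc.123"
read -r -a A_array <<< "$A"

last_index=$(( ${#A_array[@]} - 1 ))
last_word=${A_array[last_index]}
# "${array[*]:start:length}" joins the slice with the first character of IFS
# (a space by default); start and length are arithmetic expressions.
all_but_last="${A_array[*]:0:last_index}"

echo "$last_word"
echo "$all_but_last"
```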
You can use a Bash regex:
A="Some variable has value abc.123"
[[ $A =~ [[:blank:]]([^[:blank:]]+)$ ]] && echo "${BASH_REMATCH[1]}" || echo "no match"
Prints:
abc.123
That works with any [:blank:] delimiter in the current locale (usually [ \t]). If you want to be more specific:
A="Some variable has value abc.123"
pat='[ ]([^ ]+)$'
[[ $A =~ $pat ]] && echo "${BASH_REMATCH[1]}" || echo "no match"
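Both patterns above require a blank before the last word, so a one-word string reports "no match". The following sketch relaxes that and also tolerates trailing blanks. The helper name last_word is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: last_word is a hypothetical helper that also accepts one-word input
# and ignores trailing blanks.
last_word() {
    # A run of non-blanks, optionally followed by blanks, anchored at the end
    local pat='([^[:blank:]]+)[[:blank:]]*$'
    if [[ $1 =~ $pat ]]; then
        printf '%s\n' "${BASH_REMATCH[1]}"
    else
        return 1    # empty or blank-only input
    fi
}

last_word "Some variable has value abc.123"
last_word "single"
```

Keeping the pattern in a variable, as in the second example above, avoids quoting problems inside [[ ... =~ ... ]].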
With Perl:
echo "Some variable has value abc.123"| perl -nE'say $1 if /(\S+)$/'

How to avoid the read command cutting the user input which is a string by space

I wrote a bash script to read multiple inputs from the user
Here is the command:
read -a choice
In this way, I can put all the inputs in the choice variable as an array so that I can extract them using an index.
The problem is that when one of the inputs, which is a string has space in it, like
user1 google.com "login: myLogin\npassword: myPassword"
the read command will split the quoted string into 3 words. How can I stop this from happening?
bash doesn't process quotes in user input. The only thing I can think of is to use eval to execute an array assignment.
IFS= read -r input
eval "choice=($input)"
Unfortunately this is dangerous -- if the input contains executable code, it will be executed by eval.
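A harmless way to see the risk: in the sketch below the input contains a command substitution, which stands in for arbitrary code a user could type. The sample input is made up.

```bash
#!/usr/bin/env bash
# Sketch: a harmless demonstration of why eval on user input is risky.
# The command substitution below stands in for arbitrary user-chosen code.
input='user1 $(echo INJECTED) "two words"'

eval "choice=($input)"      # the $(...) inside the input is executed here

printf '<%s>\n' "${choice[@]}"
```

The quoted part is kept together as one element, which is the desired effect, but the second element is the output of a command that the input supplied.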
You can use a tab instead of space as a field delimiter. For instance :
$ IFS=$'\t' read -a choice
value1<TAB>value2<TAB>a value with many words ## This is typed; <TAB> stands for pressing the Tab key
$ echo "${choice[2]}"
a value with many words
Regards!
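The same idea as a non-interactive sketch, with the tabs written explicitly through $'...' quoting. The sample line is illustrative.

```bash
#!/usr/bin/env bash
# Sketch: split on tabs only, so spaces inside a field survive.
# $'...' quoting turns \t into a real tab character.
line=$'value1\tvalue2\ta value with many words'

IFS=$'\t' read -r -a choice <<< "$line"

echo "${#choice[@]} fields"
echo "${choice[2]}"
```

One caveat: tab counts as IFS whitespace, so consecutive tabs are treated as a single separator and empty fields between them are dropped.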
Given the risk of using eval, and the fact that the input seems to have only two types of tokens (unquoted and quoted), consider using a scripting engine to put all the text into a format that is easy to read.
It's not clear from the example what other quoting rules are used. The example assumes 'standard' escapes that can be processed with bash's @E parameter transformation.
The following uses a Perl one-liner to generate TAB-delimited tokens (hopefully, raw tabs cannot be part of the input, but another character can be used instead).
input='user1 google.com "login: myLogin\npassword: myPassword"'
tsv_input=$(perl -e '$_ = " $ARGV[0]" ; print $2 // $3, "\t" while ( /\s+("([^"]*)"|(\S*))/g) ;' "$input")
IFS=$'\t' read -d '' id domain values <<< $(echo -e "${tsv_input@E}")
Or using a function to get more readable code
function data_to_tsv {
# Translate to TSV
local tsv_input=$(perl -e '$_ = " $ARGV[0]" ; print $2 // $3, "\t" while ( /\s+("([^"]*)"|(\S*))/g) ;' "$1")
# Process escapes
echo -n "${tsv_input@E}"
}
input='user1 google.com "login: myLogin\npassword: myPassword"'
IFS=$'\t' read -d '' id domain values <<< $(data_to_tsv "$input")

Shell script to find possible string sequences

I have a text file in the following format:
A Apple
A Ant
B Bat
B Ball
The number of definitions of each character can be any number.
I am writing a shell script which will receive inputs like "A B". The output of the shell script I am expecting is the possible string sequences which can be created.
For input "A B", the outputs will be:
Apple Bat
Apple Ball
Ant Bat
Ant Ball
I tried arrays, It is not working as expected. Can anyone help with some ideas on how to solve this issue?
Use associative arrays to accomplish this:
#!/usr/bin/env bash
first_letter=$1
second_letter=$2
declare -A words # declare associative array
while read -r alphabet word; do # split each line into letter and word
[[ -n $word ]] || continue # skip blank lines (an empty key is not allowed)
words+=(["$word"]="$alphabet") # key = word, value = alphabet
done < words.txt
for word1 in "${!words[@]}"; do
alphabet1="${words[$word1]}"
[[ $alphabet1 != "$first_letter" ]] && continue
for word2 in "${!words[@]}"; do
alphabet2="${words[$word2]}"
[[ $alphabet2 != "$second_letter" ]] && continue
printf '%s %s\n' "$word1" "$word2" # print matching word pairs
done
done
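Output with A B passed in as arguments (with the content in your question). The iteration order of an associative array is not guaranteed, so the lines may appear in a different order: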
Apple Ball
Apple Bat
Ant Ball
Ant Bat
You may want to refer to this post for more info on associative arrays:
Appending to a hash table in Bash
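If the output should follow the order of the file, as in the question's expected output, and the number of letters is not fixed at two, one possible sketch groups the words by letter and builds the sequences one letter at a time. The function name combos and the temporary file are assumptions for the demo; it needs bash 4 or later for associative arrays.

```bash
#!/usr/bin/env bash
# Sketch: print every word sequence for the given letters, keeping file order.
# combos is a hypothetical helper; usage: combos FILE LETTER...
combos() {
    local file=$1; shift
    local -A by_letter=()          # letter -> newline-separated words
    local letter word prefix
    while read -r letter word; do
        [[ -n $word ]] || continue             # skip blank or incomplete lines
        by_letter[$letter]+="$word"$'\n'
    done < "$file"

    local -a results=("")          # partial sequences built so far
    local -a next
    for letter in "$@"; do
        next=()
        for prefix in "${results[@]}"; do
            while IFS= read -r word; do
                [[ -n $word ]] || continue
                next+=("${prefix:+$prefix }$word")   # append word to the prefix
            done <<< "${by_letter[$letter]}"
        done
        results=("${next[@]}")
    done
    if (( ${#results[@]} )); then
        printf '%s\n' "${results[@]}"
    fi
}

# Demo with the data from the question
demo_file=$(mktemp)
printf '%s\n' 'A Apple' 'A Ant' 'B Bat' 'B Ball' > "$demo_file"
combos "$demo_file" A B
rm -f "$demo_file"
```

A letter with no definitions in the file yields no output at all, since no complete sequence can be formed.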

Set bash variable equal to result of string where newlines are replaced by spaces

I have a variable equal to a string, which is a series of key/value pairs separated by newlines.
I want to then replace these newline characters with spaces, and set a new variable equal to the result
From various answers on the internet I've arrived at the following:
#test.txt has the content:
#test=example
#what=s0omething
vars="$(cat ./test.txt)"
formattedVars= $("$vars" | tr '\n' ' ')
echo "$taliskerEnvVars"
Problem is when I try to set formattedVars it tries to execute the second line:
script.sh: line 7: test=example
what=s0omething: command not found
I just want formattedVars to equal test=example what=s0omething
What trick am I missing?
Change your line to:
formattedVars=$(tr '\n' ' ' <<< "$secretsContent")
Notice the space after = in your code, which is not permitted in assignment statements.
I see that you are not setting secretsContent in your code, you are setting vars instead.
If possible, use an array to hold contents of the file:
readarray -t vars < ./test.txt # bash 4
or
# bash 3.x
declare -a vars
while IFS= read -r line; do
vars+=( "$line" )
done < ./test.txt
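Here is a minimal end-to-end sketch of the loop above, followed by two things one might do with the array. It assumes the file holds KEY=value lines as in the question; the temporary file and the sh -c command are only for the demo. env is used in the last step because words produced by an expansion are not treated as assignments by the shell.

```bash
#!/usr/bin/env bash
# Sketch: read KEY=value lines into an array, then use them two ways.
# The content mirrors the question's test.txt; the temp file is only for the demo.
tmp_file=$(mktemp)
printf '%s\n' 'test=example' 'what=s0omething' > "$tmp_file"

vars=()
while IFS= read -r line; do
    vars+=("$line")
done < "$tmp_file"
rm -f "$tmp_file"

# Join the elements with a space (the first character of the default IFS)
formattedVars="${vars[*]}"
echo "$formattedVars"

# Pass the pairs as environment variables for a single command via env
env "${vars[@]}" sh -c 'echo "test is $test, what is $what"'
```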
Then you can do what you need with the array. You can make your space-separated list with
formattedVars="${vars[*]}"
but consider whether you need to. If the goal is to use them as a pre-command modifier, pass them through env, since words produced by an expansion are not treated as assignments by the shell:
env "${vars[@]}" my_command arg1 arg2

UNIX:Create array from space delimited string while ignoring space in quotes

I'm trying to create an array from a space delimited string, this works fine till i have to ignore the space within double quotes for splitting the string.
I Tried:
inp='ROLE_NAME="Business Manager" ROLE_ID=67686'
arr=($(echo $inp | awk -F" " '{$1=$1; print}'))
This splits the array like:
${arr[0]}: ROLE_NAME=Business
${arr[1]}: Manager
${arr[2]}: ROLE_ID=67686
when actually i want it:
${arr[0]}: ROLE_NAME=Business Manager
${arr[1]}: ROLE_ID=67686
Im not really good with awk so can't figure out how to fix it.
Thanks
This is bash specific, may work with ksh/zsh
inp='ROLE_NAME="Business Manager" ROLE_ID=67686'
set -- $inp
arr=()
while (( $# > 0 )); do
word=$1
shift
# count the number of quotes
tmp=${word//[^\"]/}
if (( ${#tmp}%2 == 1 )); then
# if the word has an odd number of quotes, join it with the next
# word, re-set the positional parameters and keep looping
word+=" $1"
shift
set -- "$word" "$@"
else
# this word has zero or an even number of quotes.
# add it to the array and continue with the next word
arr+=("$word")
fi
done
for i in "${!arr[@]}"; do printf "%d\t%s\n" "$i" "${arr[i]}"; done
0 ROLE_NAME="Business Manager"
1 ROLE_ID=67686
This specifically breaks words on arbitrary whitespace but joins with a single space, so your custom whitespace within quotes will be lost.
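If the whitespace inside the quotes matters, one alternative sketch is to consume the string token by token with a bash regex instead of splitting and re-joining. The function name split_quoted and the result array tokens are hypothetical; the sketch assumes balanced double quotes and stops at the first unbalanced one.

```bash
#!/usr/bin/env bash
# Sketch: tokenize on whitespace while keeping double-quoted runs intact.
# split_quoted and the result array name "tokens" are hypothetical.
split_quoted() {
    local s=$1
    # A token is one or more of: a non-space, non-quote character, or a "..." run
    local pat='^[[:space:]]*(([^[:space:]"]|"[^"]*")+)'
    tokens=()
    while [[ $s =~ $pat ]]; do
        tokens+=("${BASH_REMATCH[1]}")
        s=${s:${#BASH_REMATCH[0]}}    # drop what was just consumed
    done
}

split_quoted 'ROLE_NAME="Business   Manager" ROLE_ID=67686'
printf '%s\n' "${tokens[@]}"
```

As in the answer above, the quote characters themselves stay in the elements; they can be stripped afterwards with "${tokens[i]//\"/}" if needed.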
