Shell script to find possible string sequences - bash

I have a text file in the following format:
A Apple
A Ant
B Bat
B Ball
Each character can have any number of definitions.
I am writing a shell script which will receive inputs like "A B". The output I expect from the script is every string sequence that can be created from those definitions.
For input "A B", the outputs will be:
Apple Bat
Apple Ball
Ant Bat
Ant Ball
I tried arrays, but it is not working as expected. Can anyone help with some ideas on how to solve this issue?

Use associative arrays to accomplish this:
#!/usr/bin/env bash
first_letter=$1
second_letter=$2
declare -A words # declare associative array
while read -r alphabet word; do
[[ -n $word ]] || continue # skip blank or malformed lines
words+=(["$word"]="$alphabet") # key = word, value = alphabet
done < words.txt
for word1 in "${!words[@]}"; do
alphabet1="${words[$word1]}"
[[ $alphabet1 != "$first_letter" ]] && continue
for word2 in "${!words[@]}"; do
alphabet2="${words[$word2]}"
[[ $alphabet2 != "$second_letter" ]] && continue
printf '%s %s\n' "$word1" "$word2" # print matching word pairs
done
done
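Output with A B passed in as arguments (with the content in your question); the order of the lines may vary, because the keys of an associative array are not stored in any particular order: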
Apple Ball
Apple Bat
Ant Ball
Ant Bat
You may want to refer to this post for more info on associative arrays:
Appending to a hash table in Bash
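If the script should accept any number of letters and keep the order of the input file, one possible approach is to build the combinations one letter at a time. This is only a sketch: the function name combos and the choice to pass the file name as the first argument are assumptions made for illustration, not part of the answer above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: combos FILE LETTER...
# FILE holds "LETTER WORD" lines; prints one combination per line.
combos() {
    local file=$1; shift
    local -a results=("") next
    local letter key word prefix
    for letter in "$@"; do
        next=()
        # Extend every partial combination with each word of this letter.
        for prefix in "${results[@]}"; do
            while read -r key word; do
                [[ $key == "$letter" && -n $word ]] || continue
                next+=("${prefix:+$prefix }$word")
            done < "$file"
        done
        results=("${next[@]}")
    done
    # Print nothing if some letter had no words at all.
    if (( ${#results[@]} )); then
        printf '%s\n' "${results[@]}"
    fi
}
```

Called as combos words.txt A B, this should list the pairs in file order; it rereads the file for every partial combination, so it is meant for small inputs.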

Related

How to get output values in bash array by calling other program from bash?

I am stuck with a peculiar situation: from Python I am printing two strings one by one and reading them in a bash script (which calls the Python code).
I am expecting an array size of 2, but bash also treats spaces as element separators and returns a size of 3.
Example scripts
multi_line_return.py file has following content
print("foo bar")
print(5)
multi_line_call.sh has following content
#!/bin/bash
PYTHON_EXE="ABSOLUTE_PATH TO PYTHON EXECUTABLE IN LINUX"
CURR_DIR=$(cd $(dirname ${BASH_SOURCE[0]}) && pwd)/
array=()
while read line ; do
array+=($line)
done < <(${PYTHON_EXE} ${CURR_DIR}multi_line_return.py)
echo "array length --> ${#array[@]}"
echo "each variable in new line"
for i in "${array[@]}"
do
printf $i
printf "\n"
done
Now keep both of the above files in the same directory and make the following call to see the result.
bash multi_line_call.sh
As you can see in result,
I am getting
array length = 3
1.foo, 2.bar & 3. 5
The expectation is
One complete line of python output (stdout) as one element of bash array
array length = 2
1. foo bar & 2. 5
Put quotes around $line to prevent it from being split:
array+=("$line")
You can also do it without a loop using readarray; the -t option strips the trailing newline from each element:
readarray -t array < <(${PYTHON_EXE} ${CURR_DIR}multi_line_return.py)
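To see the difference side by side, here is a small sketch. The function emit_lines is a hypothetical stand-in for the Python script, so the example runs without Python.

```bash
#!/usr/bin/env bash
# Stand-in for the Python script: prints "foo bar" and then 5.
emit_lines() { printf '%s\n' 'foo bar' 5; }

unquoted=()
quoted=()
while IFS= read -r line; do
    unquoted+=($line)    # word splitting: "foo bar" becomes two elements
    quoted+=("$line")    # quoted: the whole line stays one element
done < <(emit_lines)

# Same result as the quoted loop, without writing the loop.
readarray -t from_readarray < <(emit_lines)

echo "unquoted:  ${#unquoted[@]} elements"
echo "quoted:    ${#quoted[@]} elements"
echo "readarray: ${#from_readarray[@]} elements"
```

The unquoted array ends up with 3 elements, the other two with 2.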

Read a single word from a string and set as a variable value in bash

I have a need for a simple function to do the following:
#!/bin/bash
select_word() {
echo "Enter a string."
read STRING
## User enters "This is a string."
## Here will be the command to set each word to the below variables.
WORD_1=
WORD_2=
WORD_3=
WORD_4=
echo -e "Word 1 is: $WORD_1.\n Word 2 is: $WORD_2.\n"
echo -e "Word 3 is: $WORD_3.\n Word 4 is: $WORD_4.\n"
}
I want to avoid using external tools such as sed or awk. I am looking for bash builtins that pull each word from the string and set that word as a variable value. I will later use "wc" to count the number of characters in each word. I already know how to do that; I just need to know the bash method for pulling a word from user input strings.
If this question is a duplicate, I apologize as I could not find this specific question.
read can split the string itself.
read -r WORD1 WORD2 WORD3 WORD4
If you enter fewer than 4 words, the last variable(s) will be set to empty strings. If you enter more than 4 words, WORD4 will be the rest of the string, not just the 4th word.
You can also split the string into an array, if you don't know how many words will be entered ahead of time.
read -r -a words
WORD1=${words[0]}
WORD2=${words[1]}
WORD3=${words[2]}
WORD4=${words[3]}
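As a rough illustration of the array approach, the sketch below splits a string with read -a and prints each word with its length. The function name split_words is made up for this example, and ${#var} is used for the character count as a builtin alternative to wc.

```bash
#!/usr/bin/env bash
# Hypothetical helper: split its argument into words and report each one.
split_words() {
    local -a words
    local i
    read -r -a words <<< "$1"    # split on whitespace into an array
    for i in "${!words[@]}"; do
        # Array indices start at 0, so word number is index + 1.
        printf 'WORD_%d=%s (%d chars)\n' "$((i + 1))" "${words[i]}" "${#words[i]}"
    done
}

split_words "This is a string."
```

Because read does the splitting, characters such as * in the input are not expanded to file names.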
You can convert the space delimited $STRING into an array, and then reference each array element in $WORD_n variables:
WORDS=($STRING)
WORD_1=${WORDS[0]}
WORD_2=${WORDS[1]}
Or with set -f, which disables globbing (e.g. expanding * to the list of files in the current directory):
set -f
WORDS=($STRING)
set +f
WORD_1=${WORDS[0]}
WORD_2=${WORDS[1]}
If you need to be independent of the number of words typed, you can do:
#!/bin/bash
select_word() {
echo "Enter a string."
read STRING
## User enters "This is a string."
GENERAL_VAR_NAME="WORD_"
## Here will be the command to set each word to the below variables.
# Split string into an array, default delimiter whitespace
STRING=( $STRING )
# Number of array elements: ${#STRING[@]}
for (( c=1; c<=${#STRING[@]}; c++ ))
do
# Array indices start at 0, so WORD_1 takes element 0
NEW_WORD="${GENERAL_VAR_NAME}${c}"
printf -v "$NEW_WORD" '%s' "${STRING[c-1]}"
done
# Check the output, using indirect expansion ${!NAME}
for (( i=1; i<=${#STRING[@]}; i++ )); do
NAME="WORD_$i"
echo "$NAME is ${!NAME}"
done
}
select_word

Loop two variables through one command in shell

I want to run a shell script that can simultaneously loop through two variables.
That way I can have an input and an output file name in each iteration. I feel like this isn't too hard of a concept but any help is appreciated.
Files = "File1,
File2,
...
FileN
"
Output = OutFile1,
Outfile2,
...
OutfileN
"
and in theory my code would be:
for File in $Files
do
COMMAND --file $File --output $Output
done
Obviously, there needs to be another loop but I'm stuck, any help is appreciated.
You don't really need to loop 2 variables, just use 2 BASH arrays:
input=("File1" "File2" "File3")
output=("OutFile1" "OutFile2" "OutFile3")
for ((i=0; i<${#input[@]}; i++)); do
echo "Processing input=${input[$i]} and output=${output[$i]}"
done
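A slightly more defensive sketch of the same idea checks that both arrays have the same length before looping. The names run_command and process_pairs are hypothetical; run_command only prints the command line that the real COMMAND would receive.

```bash
#!/usr/bin/env bash
inputs=("File1" "File2" "File3")
outputs=("OutFile1" "OutFile2" "OutFile3")

# Hypothetical stand-in for the real COMMAND.
run_command() { printf 'COMMAND --file %s --output %s\n' "$1" "$2"; }

process_pairs() {
    if (( ${#inputs[@]} != ${#outputs[@]} )); then
        echo "input and output lists differ in length" >&2
        return 1
    fi
    local i
    for i in "${!inputs[@]}"; do    # iterate over the indices 0, 1, 2
        run_command "${inputs[i]}" "${outputs[i]}"
    done
}

process_pairs
```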
zsh enables multiple loop variables before the list.
#!/bin/zsh
input2output=(
'File1' 'Outfile1'
'File2' 'Outfile2'
)
for input output in $input2output
do
echo "[$input] --> [$output]"
done
Quoted from the zsh (5.9) manual, man zshmisc:
for name ... [ in word ... ] term do list done
More than one parameter name can appear before the list of words. If N names are given, then on each execution of the loop the next N words are assigned to the corresponding parameters. If there are more names than remaining words, the remaining parameters are each set to the empty string.
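Bash has no multi-variable for loop, but a similar effect can be sketched by stepping through a flat array two elements at a time. The function name show_pairs is only for illustration.

```bash
#!/usr/bin/env bash
# Flat list of input/output pairs.
input2output=(
    'File1' 'Outfile1'
    'File2' 'Outfile2'
)

show_pairs() {
    local i
    # Advance by 2: element i is the input, element i+1 the output.
    for ((i = 0; i < ${#input2output[@]}; i += 2)); do
        printf '[%s] --> [%s]\n' "${input2output[i]}" "${input2output[i+1]}"
    done
}

show_pairs
```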

Save a newline separated list into several bash variables

I'm relatively new to shell scripting and am writing a script to organize my music library. I'm using awk to parse the id3 tag info and am generating a newline separated list like so:
Kanye West
College Dropout
All Falls Down
I want to store each field in a separate variable so I can easily compose some mkdir and mv commands. I've tried piping the output to IFS=$'\n' read artist album title but each variable remains empty. I'm open to producing a different output from awk, but I still want to know how to parse a newline separated list using bash.
Edit:
It turns out that piping directly to read by doing:
id3info "$filename" | awk "$awkscript" | { read artist; read album; read title; }
WILL NOT WORK. In bash, each part of a pipeline runs in a subshell by default, so the variables are set in a different scope and are gone once the pipeline finishes. I found that using a herestring works best:
{ read artist; read album; read title; } <<< "$(id3info "$filename" | awk "$awkscript")"
read normally reads one line at a time. So, if your id3 info is in the file testfile.txt, you can read it in as follows:
{ read artist ; read album ; read song ; } <testfile.txt
echo "artist='$artist' album='$album' song='$song'"
# insert your mkdir and mv commands....
When run on your test file, the above outputs:
artist='Kanye West' album='College Dropout' song='All Falls Down'
You can just read the file into a bash array and loop through the array like so:
IFS=$'\r\n' content=($(cat ${filepath}))
for ((idx = 0; idx < ${#content[@]}; idx+=3)); do
artist=${content[idx]}
album=${content[idx+1]}
title=${content[idx+2]}
done
Or read three lines in a loop.
yourscript |
while read artist; do # read first line of input
read album # read second line of input
read song # read third line of input
: self-destruct if the genre is rap
done
This loop will consume input lines in groups of three. If the number of input lines is not a multiple of three, the later reads inside the final iteration will simply fail and the corresponding variables will be empty.
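A variation that keeps the variables in the current shell and skips an incomplete trailing group is to chain the three reads with && and feed the loop from process substitution. This is a sketch; tag_lines is a hypothetical stand-in for the id3info and awk pipeline from the question.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for: id3info "$filename" | awk "$awkscript"
tag_lines() {
    printf '%s\n' 'Kanye West' 'College Dropout' 'All Falls Down'
}

paths=()
# The loop body runs only when all three reads succeed.
while IFS= read -r artist && IFS= read -r album && IFS= read -r title; do
    paths+=("$artist/$album/$title")
done < <(tag_lines)    # process substitution: no subshell around the loop

printf '%s\n' "${paths[@]}"
```

Since the loop is not part of a pipeline, paths is still set after done and can be used for the mkdir and mv commands.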
You can read the output from awk into an array. E.g.
readarray -t array <<< "$(printf '%s\n' 'Kanye West' 'College Dropout' 'All Falls Down')"
for ((i=0; i<${#array[@]}; i++ )) ; do
echo "array[$i]=${array[$i]}"
done
Produces:
array[0]=Kanye West
array[1]=College Dropout
array[2]=All Falls Down

Capturing multiple line output into a Bash variable

I've got a script 'myscript' that outputs the following:
abc
def
ghi
in another script, I call:
declare RESULT=$(./myscript)
and $RESULT gets the value
abc def ghi
Is there a way to store the result either with the newlines, or with '\n' character so I can output it with 'echo -e'?
Actually, RESULT contains what you want — to demonstrate:
echo "$RESULT"
What you show is what you get from:
echo $RESULT
As noted in the comments, the difference is that (1) the double-quoted version of the variable (echo "$RESULT") preserves internal spacing of the value exactly as it is represented in the variable — newlines, tabs, multiple blanks and all — whereas (2) the unquoted version (echo $RESULT) replaces each sequence of one or more blanks, tabs and newlines with a single space. Thus (1) preserves the shape of the input variable, whereas (2) creates a potentially very long single line of output with 'words' separated by single spaces (where a 'word' is a sequence of non-whitespace characters; there needn't be any alphanumerics in any of the words).
Another pitfall with this is that command substitution — $() — strips trailing newlines. Probably not always important, but if you really want to preserve exactly what was output, you'll have to use another line and some quoting:
RESULTX="$(./myscript; echo x)"
RESULT="${RESULTX%x}"
This is especially important if you want to handle all possible filenames (to avoid undefined behavior like operating on the wrong file).
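If you're interested in specific lines, use a result array (note that the unquoted expansion splits on any whitespace, so this only works when no line contains spaces or tabs):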
declare RESULT=($(./myscript)) # (..) = array
echo "First line: ${RESULT[0]}"
echo "Second line: ${RESULT[1]}"
echo "N-th line: ${RESULT[N]}"
In addition to the answer given by @l0b0, I just had the situation where I needed to both keep any trailing newlines output by the script and check the script's return code.
And the problem with l0b0's answer is that the 'echo x' was resetting $? back to zero... so I managed to come up with this very cunning solution:
RESULTX="$(./myscript; echo x$?)"
RETURNCODE=${RESULTX##*x}
RESULT="${RESULTX%x*}"
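To check the effect of the sentinel, here is a small sketch in which a function named myscript stands in for ./myscript (an assumption made so the example is self-contained). It prints two trailing newlines and returns status 3.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for ./myscript: two trailing newlines, exit status 3.
myscript() { printf 'abc\ndef\n\n'; return 3; }

plain=$(myscript) || true          # trailing newlines stripped; status ignored here
resultx=$(myscript; echo "x$?")    # sentinel x protects the newlines, $? is appended
returncode=${resultx##*x}          # everything after the last x
result=${resultx%x*}               # everything before the last x

echo "plain length:  ${#plain}"
echo "result length: ${#result}"
echo "return code:   $returncode"
```

The plain capture is 7 characters long, while result keeps both trailing newlines (9 characters) and returncode is 3.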
Parsing multiple output
Introduction
So your myscript outputs 3 lines; it could look like:
myscript() { echo $'abc\ndef\nghi'; }
or
myscript() { local i; for i in abc def ghi ;do echo $i; done ;}
OK, this is a function, not a script (no need for the ./ path), but the output is the same:
myscript
abc
def
ghi
Considering result code
To check for result code, test function will become:
myscript() { local i;for i in abc def ghi ;do echo $i;done;return $((RANDOM%128));}
1. Storing multiple output in one single variable, showing newlines
Your operation is correct:
RESULT=$(myscript)
About result code, you could add:
RCODE=$?
even in same line:
RESULT=$(myscript) RCODE=$?
Then
echo $RESULT $RCODE
abc def ghi 66
echo "$RESULT"
abc
def
ghi
echo ${RESULT@Q}
$'abc\ndef\nghi'
printf '%q\n' "$RESULT"
$'abc\ndef\nghi'
but for showing variable definition, use declare -p:
declare -p RESULT RCODE
declare -- RESULT="abc
def
ghi"
declare -- RCODE="66"
2. Parsing multiple output in array, using mapfile
Storing answer into myvar variable:
mapfile -t myvar < <(myscript)
echo ${myvar[2]}
ghi
Showing $myvar:
declare -p myvar
declare -a myvar=([0]="abc" [1]="def" [2]="ghi")
Considering result code
In case you have to check for result code, you could:
RESULT=$(myscript) RCODE=$?
mapfile -t myvar <<<"$RESULT"
declare -p myvar RCODE
declare -a myvar=([0]="abc" [1]="def" [2]="ghi")
declare -- RCODE="40"
3. Parsing multiple output by consecutive reads in a command group
{ read firstline; read secondline; read thirdline;} < <(myscript)
echo $secondline
def
Showing variables:
declare -p firstline secondline thirdline
declare -- firstline="abc"
declare -- secondline="def"
declare -- thirdline="ghi"
I often use:
{ read foo;read foo total use free foo ;} < <(df -k /)
Then
declare -p use free total
declare -- use="843476"
declare -- free="582128"
declare -- total="1515376"
Considering result code
Same prepended step:
RESULT=$(myscript) RCODE=$?
{ read firstline; read secondline; read thirdline;} <<<"$RESULT"
declare -p firstline secondline thirdline RCODE
declare -- firstline="abc"
declare -- secondline="def"
declare -- thirdline="ghi"
declare -- RCODE="50"
After trying most of the solutions here, the easiest thing I found was the obvious - using a temp file. I'm not sure what you want to do with your multiple line output, but you can then deal with it line by line using read. About the only thing you can't really do is easily stick it all in the same variable, but for most practical purposes this is way easier to deal with.
./myscript.sh > /tmp/foo
while read line ; do
echo 'whatever you want to do with $line'
done < /tmp/foo
Quick hack to make it do the requested action:
result=""
./myscript.sh > /tmp/foo
while read line ; do
result="$result$line\n"
done < /tmp/foo
echo -e $result
Note this adds an extra line. If you work on it you can code around it, I'm just too lazy.
EDIT: While this case works perfectly well, people reading this should be aware that a command inside the while loop can easily consume your stdin, giving you a script that processes one line, drains stdin, and exits. I think ssh does this, for example. I just saw it recently; there are other code examples here: https://unix.stackexchange.com/questions/24260/reading-lines-from-a-file-with-bash-for-vs-while
One more time! This time with a different filehandle (stdin, stdout, stderr are 0-2, so we can use &3 or higher in bash).
result=""
./test>/tmp/foo
while read line <&3; do
result="$result$line\n"
done 3</tmp/foo
echo -e $result
You can also use mktemp, but this is just a quick code example. Usage for mktemp looks like:
filenamevar=$(mktemp /tmp/tempXXXXXX)
./test > "$filenamevar"
Then use $filenamevar like you would the actual name of a file. Probably doesn't need to be explained here but someone complained in the comments.
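Putting the pieces together, a sketch of the temp file approach with mktemp, cleanup on exit, and a separate file descriptor might look like this. The function myscript is a hypothetical stand-in for ./myscript.sh.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for ./myscript.sh
myscript() { printf '%s\n' abc def ghi; }

tmpfile=$(mktemp) || exit 1
trap 'rm -f "$tmpfile"' EXIT       # remove the temp file when the script ends

myscript > "$tmpfile"

result=""
while IFS= read -r line <&3; do    # read from fd 3, leaving stdin untouched
    result+="$line"$'\n'           # append a real newline, no echo -e needed
done 3< "$tmpfile"

printf '%s' "$result"
```

Appending $'\n' stores actual newlines in the variable, so printf '%s' reproduces the output without adding an extra line.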
How about this? It reads each line into a variable that can be used subsequently.
Say the output of myscript is redirected to a file called myscript_output:
awk 'BEGIN { while ((getline var < "myscript_output") > 0) { print var } close("myscript_output") }'

Resources