Process contents in array based on type in shellscript - bash

I have an array that has three types of data in it, integer, integer/integer, and the string value.
I have shown a sample below.
myarr = (2301/2320,Team Lifeline, 2311, 7650/7670, 232)
I have the following algorithm that I want to come up with.
For index in myarr
if index contains data as number1/number2; then
create an array, "mynumbers" to hold all the numbers starting from number1 to number2
else if index is a string
add it in "mystrarr"
else
add it in "myintarr"
done
For the first case, if I have an entry in myarr such as 2301/2320,
then mynumbers, as shown in the pseudocode, will have the entries {2301, 2302, ... , 2320}. I don't understand how to parse an entry in myarr and detect that it contains a /.
For the second case, I am also not sure how to tell that an entry in myarr is a string. mystrarr should have {Team Lifeline}.
For the final case, the myintarr should have {2311, 232}.
Any help would be appreciated. I am very new to shell script.

Stack Overflow is not a coding service.... but I was bored so here you go...
#!/bin/bash
myarr=(2301/2320 'Team Lifeline' 2311 7650/7670 232)
for element in "${myarr[@]}"; do
if [[ $element =~ ^[0-9]+/[0-9]+$ ]]; then
range="{${element%/*}..${element##*/}}"
mynumbers=( $(eval "echo $range") )
elif [ $element -eq $element ] 2>> /dev/null; then
intarr+=( $element )
else
strarr+=( "$element" )
fi
done
echo "mynumbers = ${mynumbers[*]}"
echo "intarr = ${intarr[*]}"
echo "strarr = ${strarr[*]}"
There is a lot to unpack here if you are inexperienced, so ask questions about anything I didn't cover. Things to note:
Assignments have no spaces around =.
Array assignments are of the format ( element1 element2 ... )
Appending to arrays uses the +=(...) format
Looping through array elements: for element in "${myarr[@]}"
Note that the array generated by 7650/7670 will overwrite the array generated by 2301/2320. I assume you have some kind of plan for this array, so I didn't do anything to stop it from being overwritten.
More details
This line is validating the format for 111/222:
if [[ $element =~ ^[0-9]+/[0-9]+$ ]]; then
[[ x =~ x ]] performs a regex comparison and this regex essentially just means:
^ - beginning of the string
[0-9]+ - at least one digit
/ - character literal
$ - end of string
These lines are expanding your beginning and ending numbers:
range="{${element%/*}..${element##*/}}"
mynumbers=( $(eval "echo $range") )
This is maybe more complicated than it needs to be as most people try to avoid eval in general for security reasons. I'm leveraging bash's brace expansion. If you run echo {5..9}, it will output 5 6 7 8 9. This does not trigger with variables, so I cheated and used eval.
This line is checking if we are dealing with an integer:
[ $element -eq $element ] 2>> /dev/null
This works by running an integer -eq (equals) comparison on the variable against itself. This will actually fail and throw an error message on anything but an integer. This is not the way it was designed to be used which is why we discard all the error messages (2>> /dev/null).
This is a nice succinct script, but is using some unconventional practices. A longer more verbose version may be better for a beginner.
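If you would rather avoid eval and the -eq trick, here is one possible sketch of a longer version. It assumes you want the ranges accumulated in mynumbers rather than overwritten, and expand_range is just a helper name I made up for illustration. It uses an arithmetic for loop for the range and a second regex for the plain-integer test.

```bash
#!/bin/bash
mynumbers=() intarr=() strarr=()

# expand_range LO HI (hypothetical helper): append every integer from
# LO to HI to the global array mynumbers, without eval or seq.
expand_range() {
    local lo=$1 hi=$2 i
    for (( i = lo; i <= hi; i++ )); do
        mynumbers+=( "$i" )
    done
}

myarr=(2301/2320 'Team Lifeline' 2311 7650/7670 232)
for element in "${myarr[@]}"; do
    if [[ $element =~ ^[0-9]+/[0-9]+$ ]]; then
        # number1/number2: strip the "/..." suffix and the ".../" prefix
        expand_range "${element%/*}" "${element##*/}"
    elif [[ $element =~ ^[0-9]+$ ]]; then
        # digits only: a plain integer
        intarr+=( "$element" )
    else
        # anything else is treated as a string
        strarr+=( "$element" )
    fi
done

echo "mynumbers has ${#mynumbers[@]} entries"
echo "intarr = ${intarr[*]}"
echo "strarr = ${strarr[*]}"
```

One caveat: inside (( )), a number written with a leading zero is read as octal, so this sketch assumes the ranges are written without leading zeros.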

You can use regular expressions to match elements that are nothing but digits, or digits/digits, and assume everything else is a string:
#!/bin/bash
myarr=(2301/2320 "Team Lifeline" 2311 7650/7670 232)
declare -a mynumbers mystrarr myintarr
for elem in "${myarr[@]}"; do
if [[ $elem =~ ^([0-9]+)/([0-9]+)$ ]]; then
mynumbers+=($(seq ${BASH_REMATCH[1]} ${BASH_REMATCH[2]}))
elif [[ $elem =~ ^[0-9]+$ ]]; then
myintarr+=($elem)
else
mystrarr+=("$elem")
fi
done
echo mynumbers is "${mynumbers[@]}"
echo myintarr is "${myintarr[@]}"
echo mystrarr is "${mystrarr[*]}"
Jason explained a lot in his (very similar; there's only so many obvious ways to do this) answer, so to expand on where ours are different:
We both use regular expressions to match the integer/integer case, but he then goes on to extract the two numbers using parameter expansion with pattern removal options, while mine captures the two integers in the regular expression, and uses the BASH_REMATCH array to access their values as well as the seq command to generate the numbers between the two.
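To make the difference concrete, here is a small sketch of what the capture groups leave behind in BASH_REMATCH. The function name parse_range and the variables range_lo and range_hi are made up for this example.

```bash
#!/bin/bash
# parse_range STRING (hypothetical helper): if STRING looks like
# number1/number2, store the two numbers in range_lo and range_hi and
# return success; otherwise return failure.
parse_range() {
    if [[ $1 =~ ^([0-9]+)/([0-9]+)$ ]]; then
        # BASH_REMATCH[0] is the whole match, [1] and [2] are the
        # first and second parenthesised groups.
        range_lo=${BASH_REMATCH[1]}
        range_hi=${BASH_REMATCH[2]}
        return 0
    fi
    return 1
}

if parse_range 2301/2320; then
    echo "from $range_lo to $range_hi"
fi
```

The parameter expansion approach gets the same two values with ${element%/*} and ${element##*/}, so which one to use is mostly a matter of taste.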

How to extend string to certain length

Hey basically right now my program gives me this output:
BLABLABLA
TEXTEXOUAIGJIOAJGOAJFKJAFKLAJKLFJKL
TEXT
MORE TEXT OF RANDOM CHARACTER OVER LIMIT
which is a result of for loop. Now here's what i want:
if the string reaches over 10 characters, cut the rest and add two dots and a colon to the end: "..:"
otherwise (if the string has fewer than 10 characters) fill the gap with spaces so they're aligned
so on the example i provided i'd want something like this as output:
BLABLABLA :
TEXTEXOUA..:
TEXT :
MORE TEXT..:
I also solved the first part of the problem (when its over 10 characters), only the second one gives me trouble.
AMOUNT=definition here, just simplyfying so not including it
for (( i=1; i<="$AMOUNT"; i++ )); do
STRING=definition here, just simplyfying so not including it
DOTS="..:"
STRING_LENGTH=`echo -n "$STRING" | wc -c`
if [ "$STRING_LENGTH" -gt 10 ]
then
#Takes
STRING=`echo -n "${STRING:0:10}"$DOTS`
else
#now i dont know what to do here, how can i take my current $STRING
#and add spaces " " until we reach 10 characters. Any ideas?
fi
Bash provides a simple way to get the length of a string stored in a variable: ${#STRING}
STRING="definition here, just simplyfying so not including it"
if [ ${#STRING} -gt 10 ]; then
STR12="${STRING:0:10}.."
else
STR12="$STRING            " # 12 spaces here
STR12="${STR12:0:12}"
fi
echo "$STR12:"
The expected output you posted doesn't match the requirements in the question. I tried to follow the requirements and ignored the sample expected output and the code you posted.
Use printf:
PADDED_STRING=$(printf '%-10s' "$STRING")
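Putting the two ideas together (substring expansion for the long case, printf padding for the short case), a possible sketch could look like this. The function name fit_label is made up, and I am assuming plain ASCII text and a label width of 12 columns before the colon, so that both branches produce lines of the same width.

```bash
#!/bin/bash
# fit_label STRING (hypothetical helper): print STRING as a fixed-width label.
# Longer than 10 characters: keep the first 10 and append "..:".
# Otherwise: left-align in 12 columns (padding with spaces) and append ":".
fit_label() {
    local s=$1
    if (( ${#s} > 10 )); then
        printf '%s..:\n' "${s:0:10}"
    else
        printf '%-12s:\n' "$s"
    fi
}

for s in BLABLABLA TEXTEXOUAIGJIOAJGOAJFKJAFKLAJKLFJKL TEXT \
         "MORE TEXT OF RANDOM CHARACTER OVER LIMIT"; do
    fit_label "$s"
done
```

Every line printed this way is 13 characters wide, so the colons line up.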

How to loop through the first n letters of the alphabet in bash

I know that to loop through the alphabet, one can do
for c in {a..z}; do something; done
My question is, how can I loop through the first n letters (e.g. to build a string) where n is a variable/parameter given in the command line.
I searched SO, and only found answers doing this for numbers, e.g. using C-style for loop or seq (see e.g. How do I iterate over a range of numbers defined by variables in Bash?). And I don't have seq in my environment.
Thanks.
The straightforward way is sticking them in an array and looping over that by index:
#!/bin/bash
chars=( {a..z} )
n=3
for ((i=0; i<n; i++))
do
echo "${chars[i]}"
done
Alternatively, if you just want them dash-separated:
printf "%s-" "${chars[@]:0:n}"
that other guy's answer is probably the way to go, but here's an alternative that doesn't require an array variable:
n=3 # sample value
i=0 # var. for counting iterations
for c in {a..z}; do
echo $c # do something with "$c"
(( ++i == n )) && break # exit loop, once desired count has been reached
done
@rici points out in a comment that you could do without the auxiliary variable $i by using the conditional (( n-- )) || break to exit the loop, but note that this modifies $n.
Here's another array-free, but less efficient approach that uses substring extraction (parameter expansion):
n=3 # sample value
# Create a space-separated list of letters a-z.
# Note that chars={a..z} does NOT work.
chars=$(echo {a..z})
# Extract the substring containing the specified number
# of letters using parameter expansion with an arithmetic expression,
# and loop over them.
# Note:
# - The variable reference must be _unquoted_ for this to work.
# - Since the list is space-separated, each entry spans 2
# chars., hence `2*n` (you could subtract 1 after, but it'll work either way).
for c in ${chars:0:2*n}; do
echo $c # do something with "$c"
done
Finally, you can combine the array and list approaches for concision, although the pure array approach is more efficient:
n=3 # sample value
chars=( {a..z} ) # create array of letters
# `${chars[@]:0:n}` returns the first n array elements as a space-separated list
# Again, the variable reference must be _unquoted_.
for c in ${chars[@]:0:n}; do
echo $c # do something with "$c"
done
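If the goal is to build a string rather than to loop, the array slice can also be joined directly. This is only a sketch, and first_letters is a made-up name; it relies on "${array[*]}" joining its elements with the first character of IFS.

```bash
#!/bin/bash
# first_letters N [SEP] (hypothetical helper): print the first N lowercase
# letters joined by SEP (no separator if SEP is omitted).
first_letters() {
    local n=$1
    local -a chars
    chars=( {a..z} )
    # "${chars[*]...}" joins elements with the first character of IFS;
    # an empty IFS joins them with nothing in between.
    local IFS=${2-}
    echo "${chars[*]:0:n}"
}

first_letters 4 -    # prints a-b-c-d
first_letters 3      # prints abc
```

Unlike printf "%s-", this leaves no trailing separator.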
Are you only iterating over the alphabet to create a subset? If that's the case, just make it simple:
$ alpha=abcdefghijklmnopqrstuvwxyz
$ n=4
$ echo ${alpha:0:$n}
abcd
Edit. Based on your comment below, do you have sed?
% sed -e 's/./&-/g' <<< ${alpha:0:$n}
a-b-c-d-
You can loop through the character code of the letters of the alphabet and convert back and forth:
# suppose $INPUT is your input
INPUT='x'
# get the decimal character code and increment it by one
INPUT_CHARCODE=$(printf %d "'$INPUT")
let INPUT_CHARCODE++
# start from character code 97 = 'a'
I=97
while [ $I -ne $INPUT_CHARCODE ]; do
# convert the code to a letter (the \x escape wants the code in hex)
CURRENT_CHAR=$(printf "\x$(printf %x "$I")")
echo "current character is: $CURRENT_CHAR"
let I++
done
This question and the answers partially helped me with my problem.
I needed to loop over a part of the alphabet, based on a letter, in bash.
Brace expansion is strictly textual and happens before variables are expanded,
but I found a solution using eval and made it even simpler:
START=A
STOP=D
for letter in $(eval echo {$START..$STOP}); do
echo $letter
done
Which results in:
A
B
C
D
Hope it's helpful for someone looking for the same problem i had to solve,
and ends up here as well
(also answered here)
And the complete answer to the original question is:
START=A
n=4
OFFSET=$(( $(printf "%d" "'$START") + n - 1 ))
STOP=$(printf "\x$(printf "%x" "$OFFSET")")
for letter in $(eval echo {$START..$STOP}); do
echo $letter
done
Which results in the same:
A
B
C
D
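For anyone who wants to skip eval here as well, the same result can be had by stepping through the character codes directly. This is a rough sketch under the assumption of single-byte ASCII letters; letters_from is a made-up name, and it does not check whether the range runs past Z or z.

```bash
#!/bin/bash
# letters_from START N (hypothetical helper): print N consecutive
# characters beginning at START, one per line.
letters_from() {
    local code i
    code=$(printf '%d' "'$1")    # decimal code of START, e.g. A -> 65
    for (( i = 0; i < $2; i++ )); do
        # printf turns a code back into a character via an octal escape
        printf "\\$(printf '%03o' $(( code + i )))\n"
    done
}

letters_from A 4
```

This prints A, B, C and D on separate lines, the same as the eval version above.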

stopping 'sed' after match found on a line; don't let sed keep checking all lines to EOF

I have a text file in which the first block of text on each line is separated by a tab from a second block of text, like so:
VERBS, AUXILIARY. "Be," subjunctive and quasi-subjunctive Be, Beest, &c., was used in A.-S. (beon) generally in a future sense.
In case it is hard to tell, tab is long space between "quasi-subjunctive" and "Be".
So I am thinking off the top of my head a 'for' loop in which a var is set using 'sed' to read the first block of text of a line, upto and including the tab (or not, doesn't really matter) and then the 'var' is used to find subsequent matches adding a "(x)" right before the tab to make the line unique. The 'x' of course would be a running counter numbering the first instance '1' incrementing and then each subsequent match one number higher.
One problem I see is stopping 'sed' after each subsequent match so the counter can be incremented. Is there a way to do this, since it is "sed's" normal behaviour to continue on thru without stop (as far as I know) until all lines are processed.
You can set the IFS to TAB character and read the line into variables. Something like:
$ while IFS=$'\t' read block1 block2;do
echo "block1 is $block1"
echo "block2 is $block2"
done < file
block1 is VERBS, AUXILIARY. "Be," subjunctive and quasi-subjunctive
block2 is Be, Beest, &c., was used in A.-S. (beon) generally in a future sense.
Ok so I got the job done with this little (or perhaps big if too much overkill?) script I whipped up:
#!/bin/bash
sedLnCnt=1
while [[ "$sedLnCnt" -lt 521 ]] ; do
lN=$(sed -n "${sedLnCnt} p" sGNoSecNums.html|sed -r 's/^([^\t]*\t).*$/\1/') #; echo "\$lN: $lN"
lnNum=($(grep -n "$lN" sGNoSecNums.html|sed -r 's/^([0-9]+):.*$/\1/')) #; echo "num of matches: ${#lnNum[@]}"
if [[ "${#lnNum[@]}" -gt 1 ]] ; then #'if'
lCnt="${#lnNum[@]}"
((eleN = $lCnt-1)) #; echo "\$eleN: ${eleN}" # var $eleN needs to be 1 less than total line count of zero-based array
while [[ "$lCnt" -gt 0 ]] ; do
sed -ri "${lnNum[$eleN]}s/^([^\t]*)\t/\1 \(${lCnt}\)\t/" sGNoSecNums.html
((lCnt--))
((eleN--))
done
fi
((sedLnCnt++))
done
Grep was the perfect way to find line numbers of matches, jamming them into an array and then editing each line appending the unique identifier.
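For comparison, the same numbering can probably be done without calling sed and grep once per line, by reading the file twice in pure bash with the IFS/read approach from the other answer. This is only a sketch: number_duplicates is a made-up name, it needs bash 4 or later for associative arrays, it compares the first field exactly rather than as a grep pattern, and it assumes every line contains exactly one leading block followed by a tab.

```bash
#!/bin/bash
# number_duplicates FILE (hypothetical helper): print FILE with " (k)"
# appended to the first tab-separated field of every line whose first
# field occurs more than once. Lines with a unique first field are
# printed unchanged.
number_duplicates() {
    local file=$1 key rest
    local -A total=() seen=()

    # Pass 1: count how often each first field occurs.
    while IFS=$'\t' read -r key rest; do
        [[ -n $key ]] || continue
        total[$key]=$(( ${total[$key]:-0} + 1 ))
    done < "$file"

    # Pass 2: print each line, numbering the repeated first fields.
    while IFS=$'\t' read -r key rest; do
        if [[ -n $key ]] && (( ${total[$key]} > 1 )); then
            seen[$key]=$(( ${seen[$key]:-0} + 1 ))
            printf '%s (%d)\t%s\n' "$key" "${seen[$key]}" "$rest"
        else
            printf '%s\t%s\n' "$key" "$rest"
        fi
    done < "$file"
}
```

It writes to standard output instead of editing in place, so it would be used along the lines of number_duplicates sGNoSecNums.html > numbered.html (the output filename is arbitrary).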

bash find keyword in an associative array

I have incoming messages from a chat server that need to be compared against a list of keywords. I was using regular arrays, but would like to switch to associative arrays to try to increase the speed of the processing.
The list of words would be in an array called aWords and the values would be a 'type' indicator, i.e. aWords[damn]="1", with 1 being swear word in a legend to inform the user.
The issue is that I need to compare every index value with the input $line looking for substrings. I'm trying to avoid a loop thru each index value if at all possible.
From http://tldp.org/LDP/abs/html/string-manipulation.html, I'm thinking of the Substring Removal section.
${string#substring}
Deletes shortest match of $substring from front of $string.
A comparison of the 'removed' string from the $line, may help, but will it match also words in the middle of other words? i.e. matching the keyword his inside of this.
Sorry for the long-winded post, but I tried to cover all of what I'm attempting to accomplish as best I could.
# create a colon-separated string of the array keys
# you can do this once, after the array is created.
keys=$(IFS=:; echo "${!aWords[*]}")
if [[ ":$keys:" == *:"$word":* ]]; then
# $word is a key in the array
case ${aWords[$word]} in
1) echo "Tsk tsk: $word is a swear word" ;;
# ...
esac
fi
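The snippet above tests a single $word. To get from an incoming $line to individual words, one option is to split the line and look each word up as a key, which loops over the words of the message rather than over every keyword and only matches whole words (so his is not found inside this). A rough sketch, assuming bash 4+, whitespace-separated words without punctuation, and made-up sample entries in aWords; find_keywords is a hypothetical name:

```bash
#!/bin/bash
# Keyword -> type table (sample data invented for this example).
declare -A aWords=( [damn]=1 [his]=2 )

# find_keywords LINE (hypothetical helper): print "word type" for every
# whitespace-separated word of LINE that is a key of aWords.
find_keywords() {
    local -a words
    local w
    read -ra words <<< "$1"      # split the line on whitespace
    for w in "${words[@]}"; do
        w=${w,,}                 # lowercase the word (bash 4+)
        # ${aWords[$w]+x} expands to x only if the key exists
        if [[ -n ${aWords[$w]+x} ]]; then
            echo "$w ${aWords[$w]}"
        fi
    done
}

find_keywords "this is his damn problem"
```

With the sample table this prints his 2 and damn 1; the word this is not reported.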
This is the first time I've heard of associative arrays in bash. It inspired me to try adding something too, with the chance, of course, that I completely miss the point.
Here is a code snippet. I hope I understood how it works:
declare -A SWEAR #create associative array of swearwords (only once)
while read LINE
do
[ "$LINE" ] && SWEAR["$LINE"]=X
done < "/path/to/swearword/file"
while :
do
OUTGOING="" #reset output "buffer"
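read REST || break #read a sentence from stdin; stop at end of input
while [ "$REST" ] #evaluate every word in the sentence
do
WORD=${REST%% *}
[ "$WORD" = "$REST" ] && REST="" || REST=${REST#* } #last word: nothing left to process
[ "$WORD" ] && [ "${SWEAR[$WORD]}" ] && WORD="XXXX"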
OUTGOING="$OUTGOING $WORD"
done
echo "$OUTGOING" #output to stdout
done

atoi() like function in bash?

Imagine that I use a state file to store a number, I read the number like this:
COUNT=$(< /tmp/state_file)
But since the file could be disrupted, $COUNT may not contain a "number", but any characters.
Other than using regex, i.e. if [[ $COUNT =~ ^[0-9]+$ ]]; then blabla; fi, is there an "atoi" function that converts it to a number (0 if invalid)?
EDIT
Finally I decided to use something like this:
let a=$(($a+0))
Or
declare -i a; a="abcd123"; echo $a # got 0
Thanks to J20 for the hint.
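You don't need an atoi equivalent; Bash variables are untyped. In arithmetic, a variable whose value is not a number is treated as the name of another variable, and an unset name evaluates to 0, so stray letters are usually ignored silently (a value such as 12abc is a syntax error instead). E.g.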
foo1=1
foo2=bar
let foo3=foo1+foo2
echo $foo3
Gives the result 1.
See this reference
echo $COUNT | bc should be able to cast a number, prone to error as per jurgemaister's comments...
echo ${COUNT/[a-zA-Z]*} | bc which is similar to your regex method but not prone to error.
case "$c" in
[0-9])...
You should eat the input string charwise.
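A case pattern can also do the whole check at once, without a regex and without walking the string character by character. Here is a minimal sketch of an atoi-like function, assuming only non-negative decimal integers count as valid; the function name is just borrowed from C.

```bash
#!/bin/bash
# atoi STRING: print STRING as a decimal integer, or 0 if it is empty
# or contains anything other than the digits 0-9.
atoi() {
    case ${1-} in
        ''|*[!0-9]*) echo 0 ;;
        # 10# forces base 10, so "08" becomes 8 instead of an octal error
        *)           echo $(( 10#$1 )) ;;
    esac
}

atoi 42       # prints 42
atoi 08       # prints 8
atoi 12abc    # prints 0
```

It could then be used as COUNT=$(atoi "$COUNT"). Very large digit strings will still overflow bash's integer range.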
