Check if any substring is contained in an array in Bash - bash

Suppose I have a string,
a="This is a string"
and an array,
b=("This is my" "sstring")
I want to run an if branch when any substring of a occurs in an element of b. That is the case here, because "This is" is a substring of the first element of b.
In case of two strings I know how to check if $x is a substring of $y using,
if [[ $y == *$x* ]]; then
#Something
fi
but since $x is an array of strings I don't know how to do it without having to explicitly loop through the array.

This might be all you need:
$ printf '%s\n' "${b[@]}" | grep -wFf <(tr ' ' $'\n' <<<"$a")
This is my
Otherwise - a shell is a tool to manipulate files/processes and sequence calls to tools. The guys who invented shell also invented awk for shell to call to manipulate text. What you're trying to do is manipulate text so there's a good chance you should be using awk instead of shell for whatever it is you're doing that this task is a part of.
$ printf '%s\n' "${b[@]}" |
awk -v a="$a" '
BEGIN { split(a,words) }
{ for (i in words) if (index($0,words[i])) { print; f=1; exit} }
END { exit !f }
'
This is my
The above assumes a doesn't contain any backslashes (awk -v interprets escape sequences in the assigned value). If it can, use this instead:
printf '%s\n' "${b[@]}" | a="$a" awk 'BEGIN{split(ENVIRON["a"],words)} ...'
If any element in b can contain newlines then:
printf '%s\0' "${b[@]}" | a="$a" awk -v RS='\0' 'BEGIN{split(ENVIRON["a"],words)} ...'

Here is how to match the maximum number of words from string a to entries of array b:
#!/usr/bin/env bash
a="this is a string"
b=("this is my" "string" )
# tokenize a words into an array
read -ra a_words <<<"$a"
match()
{
# iterate entries of array b
for e in "${b[@]}"; do
# tokenize entry words into an array
read -ra e_words <<<"$e"
# initialize counter/length to the shortest MIN words count
i=$(( ${#a_words[@]} < ${#e_words[@]} ? ${#a_words[@]} : ${#e_words[@]} ))
# iterate matching decreasing number of words
while [ 0 -lt "$i" ]; do
# return true if it matches
[ "${e_words[*]::$i}" = "${a_words[*]::$i}" ] && return
# decrease number of words to match
i=$(( i - 1 ))
done
done
# reaching here means no match found, return false
return 1
}
if match; then
printf %s\\n 'It matches!'
fi

You can split the $a into an array, then loop both arrays to find matches:
a="this is a string"
b=( "this is my" "string")
# Make an array by splitting $a on spaces
IFS=' ' read -ra aarr <<< "$a"
for i in "${aarr[@]}"
do
for j in "${b[@]}"
do
if [[ $j == *"$i"* ]]; then
echo "Match: $i : $j"
break
fi
done
done
# Match: this : this is my
# Match: is : this is my
# Match: string : string
If you need to handle substrings in $a (e.g. this is, is my etc) then you will need to loop over the array, generating all possible substrings:
for (( length=1; length <= "${#aarr[@]}"; ++length )); do
for (( start=0; start + length <= "${#aarr[@]}"; ++start )); do
substr="${aarr[@]:start:length}"
for j in "${b[@]}"; do
if [[ $j == *"${substr}"* ]]; then
echo "Match: $substr : $j"
break
fi
done
done
done
# Match: this : this is my
# Match: is : this is my
# Match: string : string
# Match: this is : this is my
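If all you need is a yes/no answer for the if condition, the word-by-word loop above can be wrapped in a function that returns an exit status. This is only a minimal sketch: any_word_in is a name made up for this example, and it assumes the words of the string are separated by whitespace.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the answers above): succeed if any
# whitespace-separated word of $1 is a substring of any later argument.
any_word_in() {
    local text=$1 word elem
    local -a words
    shift
    # Split the string into words
    read -ra words <<<"$text"
    for word in "${words[@]}"; do
        for elem in "$@"; do
            # Quoting "$word" makes the match literal, not a glob
            [[ $elem == *"$word"* ]] && return 0
        done
    done
    return 1
}

a="This is a string"
b=("This is my" "sstring")
if any_word_in "$a" "${b[@]}"; then
    echo "match"    # prints: match
fi
```

The function stops at the first hit, so it does less work than printing every match.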

Related

Bash check string that don't match on order

How can I compare strings in bash? I only want to compare the words, not the word order.
For example, I have a variable
VAR1='eu-endpoint-2021.09.20 prod-store-2021.09.20 service-trace-2021.09.20'
and another variable that stores the same info but in a different order
VAR2='prod-store-2021.09.20 eu-endpoint-2021.09.20 service-trace-2021.09.20'
How can I compare these by their words only, ignoring the word order?
for example
if $VAR1 == $VAR2
then
do smth;
else
do smth;
fi
Since both of your input strings consist of words that contain no spaces, we can:
Convert the strings into arrays ($VAR1)
Loop over the first array (see: Loop through an array of strings in Bash?)
Check if the current element exists in the second array (see: Check if a Bash array contains a value)
If not, set the result to false and break out of the loop
#!/bin/bash
VAR1='eu-endpoint-2021.09.20 prod-store-2021.09.20 service-trace-2021.09.20'
VAR2='prod-store-2021.09.20 eu-endpoint-2021.09.20 service-trace-2021.09.20'
ARR1=($VAR1)
ARR2=($VAR2)
RES=1
for i in "${ARR1[@]}"; do
[[ ! " ${ARR2[*]} " =~ " ${i} " ]] && RES=0 && break
done
[ $RES -eq 1 ] && echo 'Equal' || echo 'Not equal'
This will show Equal for the provided example strings.
If you change any of the strings, you'll get Not equal.
I'd just sort them then compare the result, e.g.:
$ VAR1='eu-endpoint-2021.09.20 prod-store-2021.09.20 service-trace-2021.09.20'
$ VAR2='prod-store-2021.09.20 eu-endpoint-2021.09.20 service-trace-2021.09.20'
$ if [[ $(tr ' ' '\n' <<<"$VAR1" | sort) = $(tr ' ' '\n' <<<"$VAR2" | sort) ]]; then echo same; else echo diff; fi
same
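One thing to keep in mind: the loop-based answer only checks that every word of VAR1 occurs in VAR2, so VAR1='a' and VAR2='a b' would still report Equal. The sort-based comparison does not have that problem. Here is a sketch of it as a reusable function; same_words is a hypothetical name, and it assumes the words are separated by whitespace on a single line.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if both arguments contain the same words
# (including duplicates), regardless of order.
same_words() {
    local -a x y
    read -ra x <<<"$1"
    read -ra y <<<"$2"
    # One word per line, sorted, then compared as plain strings
    [[ $(printf '%s\n' "${x[@]}" | sort) == "$(printf '%s\n' "${y[@]}" | sort)" ]]
}

VAR1='eu-endpoint-2021.09.20 prod-store-2021.09.20 service-trace-2021.09.20'
VAR2='prod-store-2021.09.20 eu-endpoint-2021.09.20 service-trace-2021.09.20'
if same_words "$VAR1" "$VAR2"; then echo same; else echo diff; fi   # prints: same
```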

Changing alternative character from lower to upper and upper to low - Unix shell script

How can I convert every alternate character of a string passed to a script? If the character is lowercase it should become uppercase, and if it is uppercase it should become lowercase.
read -p " Enter string" str
for i in `seq 0 ${#str}`
do
#echo $i
rem=$(($i % 2 ))
if [ $rem -eq 0 ]
then
echo ${str:$i:1}
else
fr=${str:$i:1}
if [[ "$fr" =~ [A-Z] ]]
then
echo ${str:$i:1} | tr '[:upper:]' '[:lower:]'
elif [[ "$fr" =~ [a-z] ]]
then
echo ${str:$i:1} | tr '[:lower:]' '[:upper:]'
else
echo ""
fi
fi
done
Your question is a bit challenging given that it is tagged shell and not as a question pertaining to an advanced shell like bash or zsh. In POSIX shell, you have no string indexes, no C-style for loop, and no [[ .. ]] operator to use character class pattern matching.
However, with a bit of awkward creativity, the old expr, POSIX string and arithmetic operations, and limiting your character strings to ASCII characters, you can iterate over a string changing uppercase to lowercase and lowercase to uppercase while leaving all other characters unchanged.
I wouldn't recommend the approach if you have an advanced shell available, but if you are limited to POSIX shell, as your question is tagged, it will work, but don't expect it to be super-fast...
#!/bin/sh
a=${1:-"This Is My 10TH String"} ## input and output strings
b=
i=1 ## counter and string length
len=$(expr length "$a")
asciiA=$(printf "%d" "'A") ## ASCII values for A,Z,a,z
asciiZ=$(printf "%d" "'Z")
asciia=$(printf "%d" "'a")
asciiz=$(printf "%d" "'z")
echo "input : $a" ## output original string
while [ "$i" -le "$len" ]; do ## loop over each character
c=$(expr substr "$a" "$i" "1") ## extract char from string
asciic=$(printf "%d" "'$c") ## convert to ASCII value
## check if asciic is [A-Za-z]
if [ "$asciiA" -le "$asciic" -a "$asciic" -le "$asciiZ" ] ||
[ "$asciia" -le "$asciic" -a "$asciic" -le "$asciiz" ]
then ## toggle the case bit (bit-6)
b="${b}$(printf "\x$(printf "%x" $((asciic ^ 1 << 5)))\n")"
else
b="$b$c" ## otherwise copy as is
fi
i=$(expr $i + 1)
done
echo "output: $b" ## output resulting string
The case change is effected by a simple bit-toggle of the case bit (bit-6) in the ASCII value of each upper- or lowercase character, changing it from lower to upper or vice versa. (Note that you can exchange the printf and bit-shift for tr of asciic as an alternative.)
Example Use/Output
$ sh togglecase.sh
input : This Is My 10TH String
output: tHIS iS mY 10th sTRING
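As a side note, if the goal is to toggle the case of the whole string rather than to walk it character by character, tr can be applied to the string in one go. A minimal sketch, assuming a tr that supports the POSIX character classes; toggle_case is a name made up for this example:

```shell
#!/bin/sh
# Hypothetical helper: swap upper and lower case across the whole string.
toggle_case() {
    printf '%s\n' "$1" | tr '[:upper:][:lower:]' '[:lower:][:upper:]'
}

toggle_case "This Is My 10TH String"   # prints: tHIS iS mY 10th sTRING
```

This forks a single tr process per string instead of several processes per character, at the cost of not giving you a hook to treat individual positions differently.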
When you want to swap the case of every second character, try this:
read -p " Enter string " str
for i in `seq 0 ${#str}`; do
rem=$(($i % 2 ))
if [ $rem -eq 0 ]
then
printf "%s" "${str:$i:1}"
else
fr=${str:$i:1}
printf "%s" "$(tr '[:upper:][:lower:]' '[:lower:][:upper:]' <<< "${str:$i:1}")"
fi
done
echo
EDIT: Second solution
Switch case of str and merge the old and new string.
#!/bin/bash
str="part is lowercase & PART IS UPPERCASE"
str2=$(tr '[:upper:][:lower:]' '[:lower:][:upper:]' <<< "${str}")
str_chopped=$(sed -r 's/(.)./\1\n/g' <<< "${str}");
# Will have 1 additional char for odd length str
# str2_chopped_incorrect=$(sed -r 's/.(.)/\1\n/g' <<< "${str2}");
str2_chopped=$(fold -w2 <<< "${str2}" | sed -nr 's/.(.)/\1/p' );
paste -d '\n' <(echo "${str_chopped}") <(echo "${str2_chopped}") | tr -d '\n'; echo
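If bash 4 or later is available, the per-character loop from the first solution can probably be written without calling tr at all, using the ${var^^} and ${var,,} case-conversion expansions. This is a sketch under that assumption; toggle_alternate is a hypothetical name, and like the loop above it leaves characters at even (zero-based) positions unchanged.

```shell
#!/usr/bin/env bash
# Hypothetical helper: flip the case of every second character
# (zero-based odd positions). Needs bash 4+ for ${c^^} and ${c,,}.
toggle_alternate() {
    local str=$1 out= c i
    for (( i = 0; i < ${#str}; i++ )); do
        c=${str:i:1}
        if (( i % 2 )); then
            if [[ $c == [[:upper:]] ]]; then
                c=${c,,}    # upper -> lower
            elif [[ $c == [[:lower:]] ]]; then
                c=${c^^}    # lower -> upper
            fi
        fi
        out+=$c
    done
    printf '%s\n' "$out"
}

toggle_alternate "Hello World"   # prints: HElLo WOrLd
```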

Concatenating digits from a string in sh

Assuming that I have a string like this one:
string="1 0 . # 1 1 ? 2 2 4"
Is it possible to concatenate digits that are next to each other?
The string should then look like: 10 . # 11 ? 224
I found only basic things how to distinguish integers from other characters and how to "connect" them. But I have no idea how to iterate properly.
num=""
for char in $string; do
if [ $char -eq $char 2>/dev/null ] ; then
num=$num$char
Here's an almost pure-shell implementation -- transforming the string into a character per line and using a BashFAQ #1 while read loop.
string="1 0 . # 1 1 ? 2 2 4"
output=''
# replace spaces with newlines for easier handling
string=$(printf '%s\n' "$string" | tr ' ' '\n')
last_was_number=0
printf '%s\n' "$string" | {
while read -r char; do
if [ "$char" -eq "$char" ] 2>/dev/null; then # it's a number
if [ "$last_was_number" -eq "1" ]; then
output="$output$char"
last_was_number=1
continue
fi
last_was_number=1
else
last_was_number=0
fi
output="$output $char"
done
printf '%s\n' "$output"
}
To complement Charles Duffy's helpful, POSIX-compliant sh solution with a more concise perl alternative:
Note: perl is not part of POSIX, but it is preinstalled on most modern Unix-like platforms.
$ printf '%s\n' "1 0 . # 1 1 ? 2 2 4" | perl -pe 's/\d( \d)+/$& =~ s| ||gr/eg'
10 . # 11 ? 224
The outer substitution, s/\d( \d)+/.../eg, globally (g) finds runs of at least 2 adjacent digits (\d( \d)+), and replaces each run with the result of the expression (e) specified as the replacement string (represented as ... here).
The expression in the inner substitution, $& =~ s| ||gr, whose result is used as the replacement string, removes all spaces from each run of adjacent digits:
$& represents what the outer regex matched - the run of adjacent digits.
=~ applies the s call on the RHS to the LHS, i.e., $& (without this, the s call would implicitly apply to the entire input string, $_).
s| ||gr replaces all (g) instances of <space> in the value of $& and returns (r) the result, effectively removing all spaces.
Note that | is used arbitrarily as the delimiter character for the s call, so as to avoid a clash with the customary / delimiter used by the outer s call.
POSIX compliant one-liner with sed:
string="1 0 . # 1 1 ? 2 2 4"
printf '%s\n' "$string" | sed -e ':b' -e ' s/\([0-9]\) \([0-9]\)/\1\2/g; tb'
It iteratively removes any space between two digits until there aren't any more, resulting in:
10 . # 11 ? 224
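For comparison, the same "keep removing a space between two digits until nothing changes" idea can be sketched in bash itself with [[ =~ ]] and BASH_REMATCH. Note that this is bash, not POSIX sh, and join_digits is a name invented for this example:

```shell
#!/usr/bin/env bash
# Hypothetical helper: repeatedly join two digits separated by one space.
join_digits() {
    local s=$1
    local re='([0-9]) ([0-9])'
    while [[ $s =~ $re ]]; do
        # Replace the leftmost "d d" with "dd"; quoting keeps it literal
        s=${s/"${BASH_REMATCH[0]}"/"${BASH_REMATCH[1]}${BASH_REMATCH[2]}"}
    done
    printf '%s\n' "$s"
}

join_digits "1 0 . # 1 1 ? 2 2 4"   # prints: 10 . # 11 ? 224
```

Each pass removes exactly one space, so the loop runs once per space removed; the sed version is likely the better choice for large inputs.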
Here is my solution:
string="1 0 . # 1 1 ? 2 2 4"
array=(${string/// })
arraylength=${#array[@]}
pattern="[0-9]"
i=0
while true; do
str=""
start=$i
if [ $i -eq $arraylength ]; then
break;
fi
for (( j=$start; j<${arraylength}; j++ )) do
curr=${array[$j]}
i=$((i + 1))
if [[ $curr =~ $pattern ]]; then
str="$str$curr"
else
break
fi
done
echo $str
done

How to Get a Substring Using Positive and Negative Indexes in Bash

What I want is pretty simple. Given a string 'this is my test string,' I want to return the substring from the 2nd position to the 2nd to last position. Something like:
substring 'this is my test string' 1,-1. I know I can get stuff from the beginning of the string using cut, but I'm not sure how I can calculate easily from the end of the string. Help?
Turns out I can do this with awk pretty easily as follows:
echo 'this is my test string' | awk '{ print substr( $0, 2, length($0)-2 ) }'
This would be cleaner in awk, python, perl, etc., but here's one way to do it:
#!/usr/bin/bash
msg="this is my test string"
start=2
len=$((${#msg} - ${start} - 2))
echo $len
echo ${msg:2:$len}
results in is is my test stri
You can do this with just pure bash
$ string="i.am.a.stupid.fool.are.you?"
$ echo ${string: 2:$((${#string}-4))}
am.a.stupid.fool.are.yo
Look ma, no global variables or forks (except for the obvious printf) and thoroughly tested:
substring()
{
# Extract substring with positive or negative indexes
# @param $1: String
# @param $2: Start (default start of string)
# @param $3: Length (default until end of string)
local -i strlen="${#1}"
local -i start="${2-0}"
local -i length="${3-${#1}}"
if [[ "$start" -lt 0 ]]
then
let start+=$strlen
fi
if [[ "$length" -lt 0 ]]
then
let length+=$strlen
let length-=$start
fi
if [[ "$length" -lt 0 ]]
then
return
fi
printf %s "${1:$start:$length}"
}
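It may also be worth knowing that bash 4.2 and later accept a negative length directly in substring expansion, where it is treated as an offset from the end of the string. A short sketch, assuming such a bash version; the variable names are only for illustration:

```shell
#!/usr/bin/env bash
# Assumes bash 4.2+ (negative length in ${var:offset:length}).
msg='this is my test string'

# From the 2nd character up to, but not including, the last one
inner=${msg:1:-1}
# A negative offset needs a space (or parentheses) so that it is not
# parsed as the ${var:-default} expansion
tail6=${msg: -6}

echo "$inner"   # prints: his is my test strin
echo "$tail6"   # prints: string
```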

Bash Select position of array element by args

I want to make a bash script that returns the position of an element in an array, given the element as an argument. See the code below, which is what I have so far:
#!/bin/bash
args=("$@")
echo ${args[0]}
test_array=('AA' 'BB' 'CC' 'DD' 'EE')
echo $test_array
elem_array=${#test_array[@]}
for args in $test_array
do
echo
done
Finally I should have output like:
$script.sh DD
4
#!/bin/bash
A=(AA BB CC DD EE)
for i in "${!A[@]}"; do
if [[ "${A[i]}" = "$1" ]]; then
echo "$i"
fi
done
Note the "${!A[@]}" notation that gives the list of valid indexes in the array. In general you cannot just go from 0 to "${#A[@]}" - 1, because the indexes are not necessarily contiguous. There can be gaps in the index range if there were gaps in the array element assignments or if some elements have been unset.
The script above will output all indexes of the array for which its content is equal to the first command line argument of the script.
EDIT:
In your question, you seem to want the result as a one-based array index. In that case you can just increment the result by one:
#!/bin/bash
A=(AA BB CC DD EE)
for i in "${!A[@]}"; do
if [[ "${A[i]}" = "$1" ]]; then
let i++;
echo "$i"
fi
done
Keep in mind, though, that this index will have to be decremented before being used with a zero-based array.
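To illustrate why iterating over the valid indexes matters, here is a sketch of the zero-based lookup as a function, run against an array with a gap in it. index_of is a hypothetical name, and the nameref (local -n) assumes bash 4.3 or later:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first index whose element equals $1.
# $2 is the NAME of the array (nameref, bash 4.3+).
index_of() {
    local needle=$1 i
    local -n _arr=$2
    for i in "${!_arr[@]}"; do
        if [[ ${_arr[i]} == "$needle" ]]; then
            printf '%s\n' "$i"
            return 0
        fi
    done
    return 1    # not found
}

A=(AA BB CC DD EE)
unset 'A[1]'      # the array is now sparse: indexes 0 2 3 4
index_of DD A     # prints: 3
```

A loop counting from 0 to the element count minus one would stop at index 3 here and never look at EE, which is the situation described above.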
Trying to avoid complex tools:
test_array=('AA' 'BB' 'CC' 'D D' 'EE')
OLD_IFS="$IFS"
IFS="
"
element=$(grep -n '^D D$' <<< "${test_array[*]}" | cut -d ":" -f 1)
IFS="$OLD_IFS"
echo $element
However, it consumes 2 processes. If we allow ourselves sed, we could do it with a single process:
test_array=('AA' 'BB' 'CC' 'D D' 'EE')
OLD_IFS="$IFS"
IFS="
"
element=$(sed -n -e '/^D D$/=' <<< "${test_array[*]}")
IFS="$OLD_IFS"
echo $element
Update:
As pointed out by thkala in the comments, this solution is broken in 3 cases. Be careful not to use it if:
You want a zero-indexed offset.
You have newlines in your array elements.
You have a sparse array, or keys other than integers.
Loop over the array and keep track of the position.
When you find the element matching the input argument, print out the position of the element. You need to add one to the position, because arrays have zero-based indexing.
#!/bin/bash
arg=$1
echo $arg
test_array=('AA' 'BB' 'CC' 'DD' 'EE')
element_count=${#test_array[@]}
index=0
while [ $index -lt $element_count ]
do
if [ "${test_array[index]}" = "$arg" ]
then
echo $((index+1))
break
fi
((index++))
done
Without loop:
#!/bin/bash
index() {
local IFS=$'\n';
echo "${*:2}" | awk '$0 == "'"${1//\"/\\\"}"'" { print NR-1; exit; }'
}
array=("D A D" "A D" bBb "D WW" D "\" D \"" e1e " D " E1E D AA "" BB)
element=${array[5]}
index "$element" "${array[@]}"
Output:
5

Resources