Concatenating digits from a string in sh - shell

Assuming that I have a string like this one:
string="1 0 . # 1 1 ? 2 2 4"
Is it possible to concatenate digits that are next to each other?
The result should look like this: 10 . # 11 ? 224
I have only found the basics: how to distinguish integers from other characters and how to join them. But I have no idea how to iterate properly.
num=""
for char in $string; do
if [ $char -eq $char 2>/dev/null ] ; then
num=$num$char

Here's an almost pure-shell implementation -- transforming the string into a character per line and using a BashFAQ #1 while read loop.
string="1 0 . # 1 1 ? 2 2 4"
output=''
# replace spaces with newlines for easier handling
string=$(printf '%s\n' "$string" | tr ' ' '\n')
last_was_number=0
printf '%s\n' "$string" | {
while read -r char; do
if [ "$char" -eq "$char" ] 2>/dev/null; then # it's a number
if [ "$last_was_number" -eq "1" ]; then
output="$output$char"
last_was_number=1
continue
fi
last_was_number=1
else
last_was_number=0
fi
output="${output:+$output }$char" # add a separating space only when output is non-empty
done
printf '%s\n' "$output"
}

To complement Charles Duffy's helpful, POSIX-compliant sh solution with a more concise perl alternative:
Note: perl is not part of POSIX, but it is preinstalled on most modern Unix-like platforms.
$ printf '%s\n' "1 0 . # 1 1 ? 2 2 4" | perl -pe 's/\d( \d)+/$& =~ s| ||gr/eg'
10 . # 11 ? 224
The outer substitution, s/\d( \d)+/.../eg, globally (g) finds runs of at least 2 adjacent digits (\d( \d)+), and replaces each run with the result of the expression (e) specified as the replacement string (represented as ... here).
The expression in the inner substitution, $& =~ s| ||gr, whose result is used as the replacement string, removes all spaces from each run of adjacent digits:
$& represents what the outer regex matched - the run of adjacent digits.
=~ applies the s call on the RHS to the LHS, i.e., $& (without this, the s call would implicitly apply to the entire input string, $_).
s| ||gr replaces all (g) instances of <space> in the value of $& and returns (r) the result, effectively removing all spaces.
Note that | is used arbitrarily as the delimiter character for the s call, so as to avoid a clash with the customary / delimiter used by the outer s call.

POSIX compliant one-liner with sed:
string="1 0 . # 1 1 ? 2 2 4"
printf '%s\n' "$string" | sed -e ':b' -e ' s/\([0-9]\) \([0-9]\)/\1\2/g; tb'
It iteratively removes any space between two digits until there aren't any more, resulting in:
10 . # 11 ? 224
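A small sketch of why the loop (the :b label and the tb branch) matters: matches found by the g flag cannot overlap, so a single pass leaves a space inside any run of three or more digits. The variable names single_pass and looped are only for illustration.

```shell
string="1 0 . # 1 1 ? 2 2 4"

# Single pass: in "2 2 4" the middle digit is consumed by the first match,
# so the space before "4" survives.
single_pass=$(printf '%s\n' "$string" | sed 's/\([0-9]\) \([0-9]\)/\1\2/g')

# Looped: tb jumps back to :b for as long as a substitution was made.
looped=$(printf '%s\n' "$string" | sed -e ':b' -e 's/\([0-9]\) \([0-9]\)/\1\2/g' -e 'tb')

printf '%s\n' "$single_pass"   # 10 . # 11 ? 22 4
printf '%s\n' "$looped"        # 10 . # 11 ? 224
```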

Here is my solution:
string="1 0 . # 1 1 ? 2 2 4"
array=(${string/// })
arraylength=${#array[@]}
pattern="[0-9]"
i=0
while true; do
str=""
start=$i
if [ $i -eq $arraylength ]; then
break;
fi
for (( j=$start; j<${arraylength}; j++ )) do
curr=${array[$j]}
i=$((i + 1))
if [[ $curr =~ $pattern ]]; then
str="$str$curr"
else
break
fi
done
echo $str
done

Related

How to insert variable and code after pattern using sed?

I'm using a shell script to insert code with a variable after a previous code pattern in script.tex, however sed is not adding anything after the expected pattern.
cat script.tex
\multicolumn{1}{c}{st_var}
Expected result (script.tex) after script.sh is run:
\multicolumn{1}{c}{A} & \multicolumn{1}{c}{B} & \multicolumn{1}{c}{C} & \multicolumn{1}{c}{D} & \multicolumn{1}{c}{E} & \multicolumn{1}{c}{F} \\
Current result (script.tex):
\multicolumn{1}{c}{A}
The first part of the conditional is working as expected. The remaining is not being found by sed.
cat script.sh:
#!/bin/bash
var=("NA" "A" "B" "C" "D" "E" "F")
clen=$(( ${#var[@]} - 1 ))
cind=1
for (( i=1; i<${#var[@]}; i++ )) ; do
if [[ "$cind" -eq 1 ]]; then
sed -i 's/st_var/'${var[$i]//\"/}'/g' script.tex
elif [[ "$cind" -gt 1 ]] && [[ "$cind" -lt "$clen" ]]; then
sstr="\multicolumn{1}{c}{${var[$i-1]//\"/}}"
estr=" & \multicolumn{1}{c}{${var[$i]//\"/}}"
festr=" & \multicolumn{1}{c}{${var[$i]//\"/}} \\\\"
sed -i '/^${sstr}/ s/$/${estr}/' script.tex
else
sed -i '/^${sstr}/ s/$/${festr}/' script.tex
fi
cind=$((cind + 1))
done
The var array here must have all elements double quoted for other purposes outside of this question. Also, the var array is shown here for simplicity - the letters A-F could be any random string. The first element in the array here is skipped (NA).
The best attempt so far:
script.sh:
#!/bin/bash -x
var=("NA" "A" "B" "C" "D" "E" "F")
clen=$(( ${#var[@]} - 1 ))
cind=1
for (( i=1; i<${#var[@]}; i++ )) ; do
if [[ "$cind" -eq 1 ]]; then
sed -i 's/st_var/'${var[$i]//\"/}'/g' script.tex
elif [[ "$cind" -gt 1 ]] && [[ "$cind" -lt "$clen" ]]; then
sstr='\multicolumn{1}{c}{'${var[$i-1]//\"/}'}'
estr=' \& \multicolumn{1}{c}{'${var[$i]//\"/}'}'
festr=' \& \multicolumn{1}{c}{'${var[$i+1]//\"/}'} \\'
# sed -i '/$sstr/r $estr/' script.tex
# sed -i '/^'"${sstr}"'/'"${estr}"'/' script.tex
sed -i "s/$sstr/&$estr/" script.tex
else
sed -i "s/$sstr/&$festr/" script.tex
# sed -i '/^'"${sstr}"'/'"${festr}"'/' script.tex
fi
cind=$((cind + 1))
done
Result:
\multicolumn{1}{c}{A} & multicolumn{1}{c}{B} & multicolumn{1}{c}{C} & multicolumn{1}{c}{D} & multicolumn{1}{c}{F} \ & multicolumn{1}{c}{E}
The ampersands are coming through, however the backslashes before multicolumn aren't coming through, and neither are the two backslashes at the end of the line. E and F are also flipped - F should be last.
Consider a different approach. Instead of adding anything incrementally, which might be hard and confusing because you have to keep "state", just do one single run. One replacement and regex pattern.
var=("A" "B" "C" "D" "E" "F")
# Generate replacement for the line.
repl=$(
# Print var on separate lines with the stub
printf " \multicolumn{1}{c}{%s} \n" "${var[@]}" |
# join lines with & + space character
paste -sd '&'
)
# add trailing \\
repl+="\\\\"
# Remove leading space
repl=${repl:1}
# Properly escape
# see https://stackoverflow.com/questions/407523/escape-a-string-for-a-sed-replace-pattern
ESCAPED_REPLACE=$(printf '%s\n' "$repl" | sed -e 's/[\/&]/\\&/g')
KEYWORD="\multicolumn{1}{c}{st_var}";
ESCAPED_KEYWORD=$(printf '%s\n' "$KEYWORD" | sed -e 's/[]\/$*.^[]/\\&/g');
# Finally run sed
set -x
sed "s/^$ESCAPED_KEYWORD$/$ESCAPED_REPLACE/"
When executed, for the following input:
\multicolumn{1}{c}{st_var}
outputs:
+ sed 's/^\\multicolumn{1}{c}{st_var}$/\\multicolumn{1}{c}{A} \& \\multicolumn{1}{c}{B} \& \\multicolumn{1}{c}{C} \& \\multicolumn{1}{c}{D} \& \\multicolumn{1}{c}{E} \& \\multicolumn{1}{c}{F} \\\\/'
\multicolumn{1}{c}{A} & \multicolumn{1}{c}{B} & \multicolumn{1}{c}{C} & \multicolumn{1}{c}{D} & \multicolumn{1}{c}{E} & \multicolumn{1}{c}{F} \\
The following code works:
#!/bin/bash -x
var=("NA" "A" "B" "C" "D" "E" "F")
clen=$(( ${#var[@]} - 1 ))
cind=1
for (( i=1; i<${#var[@]}; i++ )) ; do
if [[ "$cind" -eq 1 ]]; then
sed -i 's/st_var/'${var[$i]//\"/}'/g' script.tex
elif [[ "$cind" -gt 1 ]] && [[ "$cind" -lt "$clen" ]]; then
sstr='\\multicolumn{1}{c}{'${var[$i-1]//\"/}'}'
estr=' \& \\multicolumn{1}{c}{'${var[$i]//\"/}'}'
festr=' \& \\multicolumn{1}{c}{'${var[$i+1]//\"/}'} \\\\'
sed -i "s/$sstr/&$estr/" script.tex
else
sed -i "s/$estr/&$festr/" script.tex
fi
cind=$((cind + 1))
done
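The difference between the failing and the working versions comes down to how sed reads backslashes in the replacement text. A minimal sketch, assuming GNU sed (POSIX leaves the meaning of a backslash before an ordinary letter unspecified); the sample line and the names single and double are illustrative only:

```shell
line='\multicolumn{1}{c}{A}'

# One backslash in the replacement: sed reads \m as an escaped "m",
# so the backslash is dropped from the output.
single=$(printf '%s\n' "$line" | sed 's/$/ \& \multicolumn{1}{c}{B}/')

# Two backslashes: \\ produces one literal backslash in the output.
double=$(printf '%s\n' "$line" | sed 's/$/ \& \\multicolumn{1}{c}{B}/')

printf '%s\n' "$single"
printf '%s\n' "$double"   # \multicolumn{1}{c}{A} & \multicolumn{1}{c}{B}
```

The same rule explains the trailing \\\\: four backslashes in the sed replacement are needed to emit the two that LaTeX expects.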
This might work for you (GNU sed):
sed -E 's/\\multicolumn\{1\}\{c\}\{st_var\}/ ABCDEF\n&/
:a;ta;s/(\S)(\S*\n(.*)\{st_var\})/\3{\1} \& \2/;ta
s/ (.*)\&.*/\1\\\\/' file
Prepend a space, the values to be substituted for st_var, and a newline to the original string \multicolumn{1}{c}{st_var}.
Iterate through each value prepending the original string with the new value substituted until no more values to be substituted exist.
Clean up the new string, removing the introduced newline and the original string and append \\.

Check if any substring is contained in an array in Bash

Suppose I have a string,
a="This is a string"
and an array,
b=("This is my" "sstring")
I want to execute an if condition if any substring of a lies in b which is true because "This is" is a substring of the first element of b.
In case of two strings I know how to check if $x is a substring of $y using,
if [[ $y == *$x* ]]; then
#Something
fi
but since $x is an array of strings I don't know how to do it without having to explicitly loop through the array.
This might be all you need:
$ printf '%s\n' "${b[@]}" | grep -wFf <(tr ' ' $'\n' <<<"$a")
This is my
Otherwise - a shell is a tool to manipulate files/processes and sequence calls to tools. The guys who invented shell also invented awk for shell to call to manipulate text. What you're trying to do is manipulate text so there's a good chance you should be using awk instead of shell for whatever it is you're doing that this task is a part of.
$ printf '%s\n' "${b[@]}" |
awk -v a="$a" '
BEGIN { split(a,words) }
{ for (i in words) if (index($0,words[i])) { print; f=1; exit} }
END { exit !f }
'
This is my
The above assumes a doesn't contain any backslashes. If it can, use this instead:
printf '%s\n' "${b[@]}" | a="$a" awk 'BEGIN{split(ENVIRON["a"],words)} ...'
If any element in b can contain newlines, then:
printf '%s\0' "${b[@]}" | a="$a" awk -v RS='\0' 'BEGIN{split(ENVIRON["a"],words)} ...'
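The backslash caveat can be seen in isolation: awk processes escape sequences in values assigned with -v, while values read through ENVIRON arrive untouched. A minimal sketch; the sample value and the names via_v and via_env are made up for illustration:

```shell
a='C:\temp'

# -v assignment: awk interprets \t as a tab character.
via_v=$(awk -v a="$a" 'BEGIN { print a }')

# ENVIRON lookup: the value is passed through verbatim.
via_env=$(a="$a" awk 'BEGIN { print ENVIRON["a"] }')

printf '%s\n' "$via_v"
printf '%s\n' "$via_env"   # C:\temp
```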
Here is how to match the maximum number of words from string a to entries of array b:
#!/usr/bin/env bash
a="this is a string"
b=("this is my" "string" )
# tokenize a words into an array
read -ra a_words <<<"$a"
match()
{
# iterate entries of array b
for e in "${b[@]}"; do
# tokenize entry words into an array
read -ra e_words <<<"$e"
# initialize counter/length to the shortest MIN words count
i=$(( ${#a_words[@]} < ${#e_words[@]} ? ${#a_words[@]} : ${#e_words[@]} ))
# iterate matching decreasing number of words
while [ 0 -lt "$i" ]; do
# return true if it matches
[ "${e_words[*]::$i}" = "${a_words[*]::$i}" ] && return
# decrease number of words to match
i=$(( i - 1 ))
done
done
# reaching here means no match found, return false
return 1
}
if match; then
printf %s\\n 'It matches!'
fi
You can split the $a into an array, then loop both arrays to find matches:
a="this is a string"
b=( "this is my" "string")
# Make an array by splitting $a on spaces
IFS=' ' read -ra aarr <<< "$a"
for i in "${aarr[@]}"
do
for j in "${b[@]}"
do
if [[ $j == *"$i"* ]]; then
echo "Match: $i : $j"
break
fi
done
done
# Match: this : this is my
# Match: is : this is my
# Match: string : string
If you need to handle substrings in $a (e.g. this is, is my etc) then you will need to loop over the array, generating all possible substrings:
for (( length=1; length <= "${#aarr[@]}"; ++length )); do
for (( start=0; start + length <= "${#aarr[@]}"; ++start )); do
substr="${aarr[@]:start:length}"
for j in "${b[@]}"; do
if [[ $j == *"${substr}"* ]]; then
echo "Match: $substr : $j"
break
fi
done
done
done
# Match: this : this is my
# Match: is : this is my
# Match: string : string
# Match: this is : this is my

Changing alternative character from lower to upper and upper to low - Unix shell script

How can I convert every alternate character of a string passed to a script, so that a lowercase character becomes uppercase and an uppercase character becomes lowercase?
read -p " Enter string" str
for i in `seq 0 ${#str}`
do
#echo $i
rem=$(($i % 2 ))
if [ $rem -eq 0 ]
then
echo ${str:$i:1}
else
fr=${str:$i:1}
if [[ "$fr" =~ [A-Z] ]]
then
echo ${str:$i:1} | tr '[:upper:]' '[:lower:]'
elif [[ "$fr" =~ [a-z] ]]
then
echo ${str:$i:1} | tr '[:lower:]' '[:upper:]'
else
echo ""
fi
fi
done
Your question is a bit challenging given that it is tagged shell and not as a question pertaining to an advanced shell like bash or zsh. In POSIX shell, you have no string indexes, no C-style for loop, and no [[ .. ]] operator to use character class pattern matching.
However, with a bit of awkward creativity, the old expr and POSIX string and arithmetic operations, and limiting your character strings to ASCII characters, you can iterate over a string changing uppercase to lowercase and lowercase to uppercase while leaving all other characters unchanged.
I wouldn't recommend the approach if you have an advanced shell available, but if you are limited to a POSIX-style shell, as your question is tagged, it will work (note that expr length, expr substr, and the \x escape in printf are common extensions rather than strictly POSIX), though don't expect it to be super-fast...
#!/bin/sh
a=${1:-"This Is My 10TH String"} ## input and output strings
b=
i=1 ## counter and string length
len=$(expr length "$a")
asciiA=$(printf "%d" "'A") ## ASCII values for A,Z,a,z
asciiZ=$(printf "%d" "'Z")
asciia=$(printf "%d" "'a")
asciiz=$(printf "%d" "'z")
echo "input : $a" ## output original string
while [ "$i" -le "$len" ]; do ## loop over each character
c=$(expr substr "$a" "$i" "1") ## extract char from string
asciic=$(printf "%d" "'$c") ## convert to ASCII value
## check if asciic is [A-Za-z]
if [ "$asciiA" -le "$asciic" -a "$asciic" -le "$asciiZ" ] ||
[ "$asciia" -le "$asciic" -a "$asciic" -le "$asciiz" ]
then ## toggle the case bit (bit-6)
b="${b}$(printf "\x$(printf "%x" $((asciic ^ 1 << 5)))\n")"
else
b="$b$c" ## otherwise copy as is
fi
i=$(expr $i + 1)
done
echo "output: $b" ## output resulting string
The case change is effected by a simple bit-toggle of the case bit (bit-6, value 32) in the ASCII value of each upper or lower case character, which changes it from lower to upper or vice versa. (Note that you can exchange the printf and bit-shift for tr of asciic as an alternative.)
Example Use/Output
$ sh togglecase.sh
input : This Is My 10TH String
output: tHIS iS mY 10th sTRING
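The bit toggle can be checked on a single character. A minimal sketch in plain sh arithmetic, using an octal escape to turn the code back into a character; the variable names upper, lower, back and char are illustrative only:

```shell
upper=$(printf '%d' "'A")        # ASCII code of "A": 65
lower=$(( upper ^ (1 << 5) ))    # flip the case bit (32): 97
back=$(( lower ^ (1 << 5) ))     # flipping again restores 65

# Convert the numeric code back to a character via a three-digit octal escape.
char=$(printf "\\$(printf '%03o' "$lower")")

printf '%s %s %s %s\n' "$upper" "$lower" "$back" "$char"   # 65 97 65 a
```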
When you want to swap the case of every second character, try this:
read -p " Enter string " str
for i in `seq 0 ${#str}`; do
rem=$(($i % 2 ))
if [ $rem -eq 0 ]
then
printf "%s" "${str:$i:1}"
else
fr=${str:$i:1}
printf "%s" "$(tr '[:upper:][:lower:]' '[:lower:][:upper:]' <<< "${str:$i:1}")"
fi
done
echo
EDIT: Second solution
Switch case of str and merge the old and new string.
#!/bin/bash
str="part is lowercase & PART IS UPPERCASE"
str2=$(tr '[:upper:][:lower:]' '[:lower:][:upper:]' <<< "${str}")
str_chopped=$(sed -r 's/(.)./\1\n/g' <<< "${str}");
# Will have 1 additional char for odd length str
# str2_chopped_incorrect=$(sed -r 's/.(.)/\1\n/g' <<< "${str2}");
str2_chopped=$(fold -w2 <<< "${str2}" | sed -nr 's/.(.)/\1/p' );
paste -d '\n' <(echo "${str_chopped}") <(echo "${str2_chopped}") | tr -d '\n'; echo

Absolute value of a number

I want to take the absolute value of a number with the following code in bash:
#!/bin/bash
echo "Enter the first file name: "
read first
echo "Enter the second file name: "
read second
s1=$(stat --format=%s "$first")
s2=$(stat -c '%s' "$second")
res= expr $s2 - $s1
if [ "$res" -lt 0 ]
then
res=$res \* -1
fi
echo $res
The problem I am facing is in the if statement: no matter what I change, it always goes into the if. I tried putting [[ ]] around the statement, but that did not help.
Here is the error:
./p6.sh: line 13: [: : integer expression expected
You might just take ${var#-}.
${var#Pattern}: Remove from $var the shortest part of $Pattern that matches the front end of $var. (Source: TLDP)
Example:
s2=5; s1=4
s3=$((s1-s2))
echo $s3
-1
echo ${s3#-}
1
$ s2=5 s1=4
$ echo $s2 $s1
5 4
$ res= expr $s2 - $s1
1
$ echo $res
What's actually happening on the fourth line is that res is being set to nothing and exported for the expr command. Thus, when you run [ "$res" -lt 0 ] res is expanding to nothing and you see the error.
You could just use an arithmetic expression:
$ (( res=s2-s1 ))
$ echo $res
1
Arithmetic context guarantees the result will be an integer, so even if all your terms are undefined to begin with, you will get an integer result (namely zero).
$ (( res = whoknows - whocares )); echo $res
0
Alternatively, you can tell the shell that res is an integer by declaring it as such:
$ declare -i res
$ res=s2-s1
The interesting thing here is that the right hand side of an assignment is treated in arithmetic context, so you don't need the $ for the expansions.
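Applied to the original script, the fix is to capture the output of expr with command substitution instead of writing res= expr .... A minimal sketch, where the fixed values of s1 and s2 stand in for the file sizes returned by stat:

```shell
s1=4   # stand-in for the size of the first file
s2=5   # stand-in for the size of the second file

# Command substitution stores what expr prints in res.
res=$(expr "$s1" - "$s2")

# Negate the value when it is below zero.
if [ "$res" -lt 0 ]; then
    res=$(( -res ))
fi

echo "$res"   # 1
```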
I know this thread is WAY old at this point, but I wanted to share a function I wrote that could help with this:
abs() {
[[ $[ $@ ] -lt 0 ]] && echo "$[ ($@) * -1 ]" || echo "$[ $@ ]"
}
This will take any mathematical/numeric expression as an argument and return the absolute value. For instance: abs -4 => 4 or abs 5-8 => 3
A workaround: try to eliminate the minus sign.
with sed
x=-12
x=$( sed "s/-//" <<< $x )
echo $x
12
Checking the first character with parameter expansion
x=-12
[[ ${x:0:1} = '-' ]] && x=${x:1} || :
echo $x
12
This && ... || construct acts like a ternary operator. The colon ':' is the do-nothing instruction.
or substitute the '-' sign with nothing (again parameter expansion)
x=-12
echo ${x/-/}
12
Personally, scripting bash appears easier to me when I think string-first.
I translated this solution to bash. I like it more than the accepted string manipulation method or other conditionals because it keeps the abs() process inside the mathematical section
abs_x=$(( x * ((x>0) - (x<0)) ))
x=-3
abs_x= -3 * (0-1) = 3
x=4
abs_x= 4 * (1-0) = 4
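The sign trick can be wrapped in a small function. This is a sketch only; the name abs_sign is hypothetical and the argument is assumed to be a plain integer:

```shell
# Multiply the value by its own sign: (x>0) - (x<0) is 1, 0 or -1.
abs_sign() {
    echo $(( $1 * (($1 > 0) - ($1 < 0)) ))
}

abs_sign -3   # 3
abs_sign 4    # 4
abs_sign 0    # 0
```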
For the purist, assuming bash and a relatively recent one (I tested on 4.2 and 5.1):
abs() {
declare -i _value
_value=$1
(( _value < 0 )) && _value=$(( _value * -1 ))
printf "%d\n" $_value
}
If you don't care about the math and only the result matters, you may use
echo $res | awk -F- '{print $NF}'
The simplest solution:
res="${res/#-}"
The single / replaces only one occurrence, and the # anchors the pattern at the start of the string, so only a leading - is deleted.

Simple bash script (input letter output number)

Hi, I'm looking to write a simple script which takes an input letter and outputs its numerical equivalent.
I was thinking of listing all letters as variables, then have bash read the input as a variable but from here I'm pretty stuck, any help would be awesome!
#!/bin/bash
echo "enter letter"
read "LET"
a=1
b=2
c=3
d=4
e=5
f=6
g=7
h=8
i=9
j=10
k=11
l=12
m=13
n=14
o=15
p=16
q=17
r=18
s=19
t=20
u=21
v=22
w=23
x=24
y=25
z=26
LET=${a..z}
if
$LET = [ ${a..z} ];
then
echo $NUM
sleep 5
echo "success!"
sleep 1
exit
else
echo "FAIL :("
exit
fi
Try this:
echo "Input letter"
read letter
result=$(($(printf "%d\n" \'$letter) - 65))
echo $result
0
ASCII equivalent of 'A' is 65 so all you've got to do to is to take away 65 (or 64, if you want to start with 1, not 0) from the letter you want to check. For lowercase the offset will be 97.
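Both offsets can be combined so that upper and lower case letters map to 1 through 26. A minimal sketch under the assumption of single ASCII letters; the function name letter_to_number is made up for illustration:

```shell
# Print 1..26 for A-Z or a-z; return non-zero for anything else.
letter_to_number() {
    code=$(printf '%d' "'$1")    # numeric code of the first character
    if [ "$code" -ge 65 ] && [ "$code" -le 90 ]; then
        echo $(( code - 64 ))    # uppercase: A=65 maps to 1
    elif [ "$code" -ge 97 ] && [ "$code" -le 122 ]; then
        echo $(( code - 96 ))    # lowercase: a=97 maps to 1
    else
        return 1
    fi
}

letter_to_number A   # 1
letter_to_number z   # 26
```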
A funny one, abusing Bash's radix system:
read -n1 -p "Type a letter: " letter
if [[ $letter = [[:alpha:]] && $letter = [[:ascii:]] ]]; then
printf "\nCode: %d\n" "$((36#$letter-9))"
else
printf "\nSorry, you didn't enter a valid letter\n"
fi
The interesting part is the $((36#$letter-9)). The 36# part tells Bash to understand the following string as a number in radix 36 which consists of a string containing the digits and letters (case not important, so it'll work with uppercase letters too), with 36#a=10, 36#b=11, …, 36#z=35. So the conversion is just a matter of subtracting 9.
The read -n1 only reads one character from standard input. The [[ $letter = [[:alpha:]] && $letter = [[:ascii:]] ]] checks that letter is really an ascii letter. Without the [[:ascii:]] test, we would validate characters like é (depending on locale) and this would mess up with the conversion.
Use these two functions to get chr and ord:
chr() {
[ "$1" -lt 256 ] || return 1
printf "\\$(printf '%03o' "$1")"
}
ord() {
LC_CTYPE=C printf '%d' "'$1"
}
echo $(chr 97)
a
Using od and tr
echo "type letter: "
read LET
echo "$LET" | tr -d "\n" | od -An -t uC
OR using -n
echo -n "$LET" | od -An -t uC
If you want it to start at a=1
echo $(( $(echo -n "$LET" | od -An -t uC) - 96 ))
Explanation
Pipes into the tr to remove the newline.
Use od to change to unsigned decimal.
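The same pipeline can be sketched with printf in place of echo -n, whose behavior differs between shells, and with the od padding stripped so the result can be used in arithmetic. The fixed value of LET and the names code and position are illustrative:

```shell
LET=a

# printf '%s' writes the letter without a trailing newline;
# od prints the byte as an unsigned decimal, padded with whitespace.
code=$(printf '%s' "$LET" | od -An -t u1 | tr -d '[:space:]')

# Shift so that a=1.
position=$(( code - 96 ))

echo "$code $position"   # 97 1
```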
Late to the party: use an associative array:
# require bash version 4
declare -A letters
for letter in {a..z}; do
letters[$letter]=$((++i))
done
read -p "enter a single lower case letter: " letter
echo "the value of $letter is ${letters[$letter]:-N/A}"
