Determine the number of characters in a variable - bash

How can I determine the number of characters in a variable?
FOO="blabla.bla.blabla.bla."
--check--
echo $FOO # 4 dot
FOO="..bla.bla.bla.blabla.bla."
--check--
echo $FOO # 7 dot

You should try this:
echo ${#FOO}
${#VARIABLE_NAME} gives you the length of a string, that is, the total number of characters it contains.

awk -F. '{print NF-1}' <<<$FOO
example:
kent$ FOO="blabla.bla.blabla.bla."
kent$ awk -F. '{print NF-1}' <<<$FOO
4
kent$ FOO="..bla.bla.bla.blabla.bla."
kent$ awk -F. '{print NF-1}' <<<$FOO
7

echo "$FOO" | tr -dc '.' | wc -c
Does that answer your question?

Strip the non-dots and count the length of the result.
$ x=..bla.bla.bla.blabla.bla.
$ _=${x//[^.]} count=${#_}; echo "$count"
7
$ printf -v _ %s%n "${x//[^.]}" count; echo "$count"
7
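The strip-and-measure idea above can be wrapped in a small function. This is only a sketch: count_char is a hypothetical helper name, it assumes bash (for the ${var//pattern/} expansion), and it assumes the character being counted is an ordinary one such as a dot.

```shell
#!/usr/bin/env bash
# count_char STRING CHAR: print how many times CHAR occurs in STRING.
# (count_char is a hypothetical helper name, not from the answers above.)
count_char() {
    local s=$1 c=$2
    # Delete every character that is not CHAR, then measure what is left.
    local only=${s//[^"$c"]/}
    printf '%s\n' "${#only}"
}

count_char "blabla.bla.blabla.bla." .      # prints: 4
count_char "..bla.bla.bla.blabla.bla." .   # prints: 7
```

No external process is started, which matters mostly when the count runs inside a loop.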

Related

How to get version number from string in bash

I have a variable with the following format:
bundle="chn-pro-X.Y-Z.el8.x86_64"
X,Y,Z are numbers having any number of digits
Ex:
1.0-2 # X=1 Y=0 Z=2
12.45-9874 # X=12 Y=45 Z=9874
How can I grab X.Y and store it in another variable?
EDIT:
I wasn't right with my wording, but
I want to store X.Y into new variable not individual X & Y's
I'm looking to finally have a variable version which has X.Y grabbed from bundle:
version="X.Y"
I would use awk:
bundle="chn-pro-12.45-9874.el8.x86_64"
echo "$bundle" | awk -F "[.-]" '{print $3,$4,$5}'
12 45 9874
Now if you want to assign to x, y, z use read and process substitution:
read -r x y z < <(echo "$bundle" | awk -F "[.-]" '{print $3,$4,$5}')
echo "x=$x, y=$y, z=$z"
x=12, y=45, z=9874
If you just want the value of X.Y as a single value this is still great use for awk:
bundle="chn-pro-12.45-9874.el8.x86_64"
echo "$bundle" | awk -F "[-]" '{print $3}'
12.45
And if you then want to put that into a variable:
x_y=$(echo "$bundle" | awk -F "[-]" '{print $3}')
echo "x_y=$x_y"
x_y=12.45
Or you can use cut in this case to get the third field:
echo "$bundle" | cut -d- -f3
12.45
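The same field splitting can probably be done by the shell's own read builtin, without awk or cut. A minimal sketch, assuming bash and the name-name-X.Y-Z.rest layout from the question; split_bundle and its local variable names are hypothetical.

```shell
#!/usr/bin/env bash
# split_bundle is a hypothetical helper; it assumes the layout
# name-name-X.Y-Z.rest shown in the question.
split_bundle() {
    local part1 part2 x y z rest
    # With IFS set to "." and "-", read splits the string at every dot or hyphen.
    IFS='.-' read -r part1 part2 x y z rest <<< "$1"
    printf '%s %s %s\n' "$x" "$y" "$z"
}

split_bundle "chn-pro-12.45-9874.el8.x86_64"   # prints: 12 45 9874
```

Setting IFS on the same line as read limits the change to that one command, so the rest of the script keeps the default splitting.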
Like that:
$ bundle="chn-pro-1.0-2.el8.x86_64"
$ X="$(echo "$bundle" | cut -d . -f1 | cut -d- -f3)"
$ Y="$(echo "$bundle" | cut -d . -f2 | cut -d- -f1)"
$ Z="$(echo "$bundle" | cut -d . -f2 | cut -d- -f2)"
$ echo "$X"
1
$ echo "$Y"
0
$ echo "$Z"
2
You can merge X and Y into a single variable:
$ XY="$X.$Y"
$ echo $XY
1.0
Use regex to separate numbers:
numbers=$(echo $bundle | grep -Eo '([0-9]+\.[0-9]+\-[0-9]+)' | sed 's/\./\t/g;s/\-/\t/g')
Then assign them to variables with using awk or tr or cut, whatever you want:
X=$(echo $numbers| awk '{print $1}')
Y=$(echo $numbers| awk '{print $2}')
Z=$(echo $numbers| awk '{print $3}')
EDIT
For storing X.Y in a single version variable, you can skip the previous commands:
version=$(echo $bundle | grep -Eo '([0-9]+\.[0-9]+\-[0-9]+)' | grep -Eo '([0-9]+\.[0-9]+)')
Given this input:
$ bundle="chn-pro-12.45-9874.el8.x86_64"
using GNU or BSD sed for -E:
$ foo=$(echo "$bundle" | sed -E 's/.*-([0-9]+\.[0-9]+)-[0-9].*/\1/')
$ echo "$foo"
12.45
or with any sed:
$ foo=$(echo "$bundle" | sed 's/.*-\([0-9][0-9]*\.[0-9][0-9]*\)-[0-9].*/\1/')
$ echo "$foo"
12.45
Assumptions:
the input string will always contain (at least) 3 hyphens
the desired version string will always reside between the 2nd and 3rd hyphens of the input string
we need to maintain the input string (ie, don't clobber/overwrite the variable containing the input string)
We can eliminate the subprocess calls (needed for echo/sed/grep/awk/cut) by using parameter expansions:
$ bundle="chn-pro-X.Y-Z.el8.x86_64"
$ temp="${bundle#*-}" # strip off 1st hyphen delimited string
$ echo "${temp}"
pro-X.Y-Z.el8.x86_64
$ temp="${temp#*-}" # strip off 2nd hyphen delimited string
$ echo "${temp}"
X.Y-Z.el8.x86_64
$ version="${temp%%-*}" # save 3rd hyphen delimited string (aka our version)
$ echo "${version}"
X.Y
NOTE: We can eliminate the temp variable by replacing all occurrences of temp with version with the understanding version does not contain what we want until after the 3rd parameter expansion has occurred, eg:
$ bundle="chn-pro-X.Y-Z.el8.x86_64"
$ version="${bundle#*-}"
$ version="${version#*-}"
$ version="${version%%-*}"
$ echo "${version}"
X.Y
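If the input might not always follow the assumed layout, a bash regex match is one way to extract X.Y and detect a mismatch at the same time. This is a sketch under the assumption that the version always appears as -X.Y-Z with purely numeric parts; get_version is a hypothetical name.

```shell
#!/usr/bin/env bash
# get_version is a hypothetical helper name. It prints X.Y and returns
# non-zero when the input does not contain a -X.Y-Z sequence.
get_version() {
    local re='-([0-9]+\.[0-9]+)-[0-9]+'
    [[ $1 =~ $re ]] || return 1
    printf '%s\n' "${BASH_REMATCH[1]}"
}

version=$(get_version "chn-pro-12.45-9874.el8.x86_64") && echo "version=$version"
# prints: version=12.45
```

Keeping the regex in a variable avoids quoting surprises on the right-hand side of =~.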

How to get the line number of a string in another string in Shell

Given
str="hij";
sourceStr="abc\nefg\nhij\nlmn\nhij";
I'd like to get the line number of the first occurrence of $str in $sourceStr, which should be 3.
I don't know how to do it.
I have tried:
awk 'match($0, v) { print NR; exit }' v=$str <<<$sourceStr
grep -n $str <<< $sourceStr | grep -Eo '^[^:]+';
grep -n $str <<< $sourceStr | cut -f1 -d: | sort -ug
grep -n $str <<< $sourceStr | awk -F: '{ print $1 }' | sort -u
All output 1, not 3.
How can I get the line number of $str in $sourceStr?
Thanks!
You may use this awk + printf in bash:
awk -v s="$str" '$0 == s {print NR; exit}' <(printf "%b\n" "$sourceStr")
3
Or even this awk without any bash support:
awk -v s="$str" -v source="$sourceStr" 'BEGIN {
split(source, a); for (i=1; i in a; ++i) if (a[i] == s) {print i; exit}}'
3
You may use this sed as well:
sed -n "/^$str$/{=;q;}" <(printf "%b\n" "$sourceStr")
3
Or this grep + cut:
printf "%b\n" "$sourceStr" | grep -nxF -m 1 "$str" | cut -d: -f1
3
It's not clear if you've just made a cut-n-paste error, but your sourceStr is not a multiline string (as demonstrated below). Also, you really need to quote your herestring (also demonstrated below). Perhaps you just want:
$ sourceStr="abc\nefg\nhij\nlmn\nhij"
$ echo "$sourceStr"
abc\nefg\nhij\nlmn\nhij
$ sourceStr=$'abc\nefg\nhij\nlmn\nhij'
$ echo "$sourceStr"
abc
efg
hij
lmn
hij
$ cat <<< $sourceStr
abc efg hij lmn hij
$ cat <<< "$sourceStr"
abc
efg
hij
lmn
hij
$ str=hij
$ awk "/${str}/ {print NR; exit}" <<< "$sourceStr"
3
Just use sed!
printf 'abc\nefg\nhij\nlmn\nhij\n' \
| sed -n '/hij/ { =; q; }'
Explanation: if sed meets a line that contains "hij" (regex /hij/), it prints the line number (the = command) and exits (the q command). Else it doesn't print anything (the -n switch) and goes on with the next line.
[update] Hmmm, sorry, I just noticed your "All output 1, not 3".
The primary reason why your commands don't output 3 is that sourceStr="abc\nefg\nhij\nlmn\nhij" doesn't automagically change your \n into new lines, so it ends up being one single line and that's why your commands always display 1.
If you want a multiline string, here are two solutions with bash:
printf -v sourceStr "abc\nefg\nhij\nlmn\nhij"
sourceStr=$'abc\nefg\nhij\nlmn\nhij'
And now that your variable contains whitespace characters (newlines), as stated by William Pursell, you must enclose $sourceStr in double quotes to preserve them:
grep -n "$str" <<< "$sourceStr" | ...
There's always a hard way to do it:
str="hij";
sourceStr="abc\nefg\nhij\nlmn\nhij";
echo -e $sourceStr | nl | grep $str | head -1 | gawk '{ print $1 }'
or, a bit more efficient:
str="hij";
sourceStr="abc\nefg\nhij\nlmn\nhij";
echo -e $sourceStr | gawk '/'$str/'{ print NR; exit }'
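For completeness, the lookup can also be sketched in pure bash with a read loop. It assumes the string already contains real newlines (built with $'...' here) and that a whole-line match is wanted; first_line_of is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# first_line_of NEEDLE TEXT: print the number of the first line of TEXT
# that equals NEEDLE exactly. (Hypothetical helper name.)
first_line_of() {
    local needle=$1 line n=0
    while IFS= read -r line; do
        n=$((n + 1))
        if [[ $line == "$needle" ]]; then
            printf '%s\n' "$n"
            return 0
        fi
    done <<< "$2"
    return 1   # no line matched
}

sourceStr=$'abc\nefg\nhij\nlmn\nhij'   # $'...' turns \n into real newlines
first_line_of "hij" "$sourceStr"       # prints: 3
```

Quoting "$needle" on the right-hand side of == makes the comparison literal rather than a glob match.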

Get only numbers in output

I need to get only numbers from this:
release/M_0.1.0
so I need to extract the following with bash:
0.1.0.
I have tried this but cannot finish it:
echo "release/M_0.1.0" | awk -F'/' '{print $2}'
And what if the string looks like relea234se/sdf23_4Mm0.1.0.8? How do I get only 0.1.0.8? Note that the version can be any dotted digits, such as 0.2 or 1.9.1.
Please check if this grep command works
echo "release/M_0.1.0" | grep -Eo '[0-9.]+'
You could also use general parameter expansion parsing to literally remove characters up through the last that isn't digits or dots.
$: ver() { echo "${1//*[^.0-9]/}"; }
$: ver release/M_0.1.0
0.1.0
$: ver relea234se/sdf23_4Mm0.1.0.8
0.1.0.8
With sed you can do:
echo "release/M_0.1.0" | sed 's#.*_##'
Output:
0.1.0
Assuming your input always has the same form as the samples shown:
echo "$var" | awk -F'_' '{print $2}'
OR could use sub:
echo "$var" | awk '{sub(/.*_/,"")} 1'
With simple bash you could use:
echo "${var#*_}"
echo release/M_0.1.0 | awk -F\_ '{print $2}'
0.1.0
Take your pick:
$ var='relea234se/sdf23_4Mm0.1.0.8'
$ [[ $var =~ .*[^0-9.](.*) ]] && echo "${BASH_REMATCH[1]}"
0.1.0.8
$ echo "$var" | sed 's/.*[^0-9.]//'
0.1.0.8
$ echo "$var" | awk -F'[^0-9.]' '{print $NF}'
0.1.0.8
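If the data is in a file named d, this works with GNU sed (a different delimiter is needed because the pattern itself contains a slash):
sed -E 's#relea.*/.*[^0-9.]([0-9][0-9.]*)$#\1#' d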
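The prefix-stripping idea used in several answers above can also be written with the ## form of parameter expansion. This is a sketch that assumes the wanted digits and dots always sit at the very end of the string; trailing_version is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# trailing_version is a hypothetical helper name.
trailing_version() {
    # ${1##*[!0-9.]} removes the longest prefix that ends in a character
    # other than a digit or a dot, leaving only the trailing run.
    printf '%s\n' "${1##*[!0-9.]}"
}

trailing_version "release/M_0.1.0"              # prints: 0.1.0
trailing_version "relea234se/sdf23_4Mm0.1.0.8"  # prints: 0.1.0.8
```

If the string contains nothing but digits and dots, the pattern does not match and the input is printed unchanged.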

Extract data between delimiters from a Shell Script variable

I have this shell script variable, var. It keeps 3 entries separated by new line. From this variable var, I want to extract 2, and 0.078688. Just these two numbers.
var="USER_ID=2
# 0.078688
Suhas"
This is the code I tried:
echo "$var" | grep -o -P '(?<=\=).*(?=\n)' # For extracting 2
echo "$var" | awk -v FS="(# |\n)" '{print $2}' # For extracting 0.078688
Neither of these works. What is the problem, and how can I fix it?
Just use tr alone for retaining the numerical digits, the dot (.) and the white-space and remove everything else.
tr -cd '0-9. ' <<<"$var"
2 0.078688
From the tr man page, on the -c and -d flags:
tr [OPTION]... SET1 [SET2]
-c, -C, --complement
use the complement of SET1
-d, --delete
delete characters in SET1, do not translate
To store it in variables,
IFS=' ' read -r var1 var2 < <(tr -cd '0-9. ' <<<"$var")
printf "%s\n" "$var1"
2
printf "%s\n" "$var2"
0.078688
Or in an array as
IFS=' ' read -ra numArray < <(tr -cd '0-9. ' <<<"$var")
printf "%s\n" "${numArray[@]}"
2
0.078688
Note: The -c and -d flags of tr are POSIX compliant and will work on any system that has tr installed.
echo "$var" |grep -oP 'USER_ID=\K.*'
2
echo "$var" |grep -oP '# \K.*'
0.078688
Your solution is nearly perfect; you need to change \n to $, which represents the end of the line.
echo "$var" |awk -F'# ' '/#/{print $2}'
0.078688
echo "$var" |awk -F'=' '/USER_ID/{print $2}'
2
You can do it with pure bash using a regex:
#!/bin/bash
var="USER_ID=2
# 0.078688
Suhas"
[[ ${var} =~ =([0-9]+).*#[[:space:]]([0-9\.]+) ]] && result1="${BASH_REMATCH[1]}" && result2="${BASH_REMATCH[2]}"
echo "${result1}"
echo "${result2}"
With awk:
First value:
echo "$var" | grep 'USER_ID' | awk -F "=" '{print $2}'
Second value:
echo "$var" | grep '#' | awk '{print $2}'
Assuming this is the format of data as your sample
# For extracting 2
echo "$var" | sed -e '/.*=/!d' -e 's///'
echo "$var" | awk -F '=' 'NR==1{ print $2}'
# For extracting 0.078688
echo "$var" | sed -e '/.*#[[:blank:]]*/!d' -e 's///'
echo "$var" | awk -F '#' 'NR==2{ print $2}'
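Since the variable is line oriented, another option is a read loop with case patterns, which needs no external tools. A sketch assuming the same three-line layout as the sample; extract_values and marker are hypothetical names.

```shell
#!/usr/bin/env bash
# extract_values is a hypothetical helper; it reads lines on stdin and
# prints the text after "USER_ID=" and after "# ".
extract_values() {
    local line marker='# '
    while IFS= read -r line; do
        case $line in
            USER_ID=*)   printf '%s\n' "${line#USER_ID=}" ;;
            "$marker"*)  printf '%s\n' "${line#"$marker"}" ;;
        esac
    done
}

var="USER_ID=2
# 0.078688
Suhas"

extract_values <<< "$var"
```

Lines that match neither pattern, such as the name on the last line, are skipped.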

Count the number of digits in a bash variable

I have a number num=010. I would like to count the number of digits contained in this number. If the number of digits is above a certain number, I would like to do some processing.
In the above example, the number of digits is 3.
Thanks!
Assuming the variable contains only digits, the shell already does what you want with the length form of shell parameter expansion.
$ var=012
$ echo "${#var}"
3
In BASH you can do this:
num='a0b1c0d23'
n="${num//[^[:digit:]]/}"
echo ${#n}
5
Using awk you can do:
num='012'
awk -F '[0-9]' '{print NF-1}' <<< "$num"
3
num='00012'
awk -F '[0-9]' '{print NF-1}' <<< "$num"
5
num='a0b1c0d'
awk -F '[0-9]' '{print NF-1}' <<< "$num"
3
Assuming that the variable x is the "certain number" in the question
chars=$(echo -n "$num" | wc -c)
if [ $chars -gt $x ]; then
....
fi
This works for an arbitrary string that mixes digits and non-digits:
ndigits=$(echo "$str" | grep -P -o '\d' | wc -l)
demo:
$ echo sf293gs192 | grep -P -o '\d' | wc -l
6
Using sed:
s="string934 56 96containing digits98w6"
num=$(echo "$s" |sed 's/[^0-9]//g')
echo ${#num}
10
Using grep:
s="string934 56 96containing digits98w6"
echo "$s" |grep -o "[0-9]" |grep -c ""
10
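To tie the counting back to the threshold check from the question, the two steps could be combined roughly as follows. This is a sketch assuming bash; count_digits and limit are hypothetical names, and the value of limit is arbitrary.

```shell
#!/usr/bin/env bash
# count_digits and limit are hypothetical names used for illustration.
count_digits() {
    local only=${1//[!0-9]/}   # drop everything that is not a digit
    printf '%s\n' "${#only}"
}

num=010
limit=2
if (( $(count_digits "$num") > limit )); then
    echo "$num has more than $limit digits"
fi
# prints: 010 has more than 2 digits
```

Because the value is handled as a string, the leading zero in 010 is counted like any other digit.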

Resources