Extracting a pattern from a variable in a Unix shell

I have a string, for example:
type=hello(10,1)
How can I extract the text into one variable and the values in parentheses into other variables in a shell script?
The desired output is
text=hello
p1=10
p2=1
Appreciate any help in this regard.
Thanks

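You can use awk with -F to define the field delimiter.
In this case the delimiter can be the bracket expression [=(,)], quoted so that the shell does not interpret it:
$ y=$(echo "type=hello(10,1)" | awk -F '[=(,)]' '{print $2}')
$ echo "$y"
hello
awk fields are numbered from 1, so $1 = type, $2 = hello, $3 = 10 and $4 = 1. Including ) in the delimiter set keeps the closing parenthesis out of $4.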
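If all three values are wanted, a single awk call can print them together and read can assign them. This is only a minimal sketch: it assumes bash (for process substitution and the here-string) and values that contain no whitespace. The variable names follow the question.

```bash
#!/usr/bin/env bash
input='type=hello(10,1)'

# Split on =, (, comma and ) so the fields are: type, hello, 10, 1.
# awk prints the three wanted fields separated by spaces; read assigns them.
read -r text p1 p2 < <(awk -F '[=(,)]' '{print $2, $3, $4}' <<< "$input")

printf 'text=%s\np1=%s\np2=%s\n' "$text" "$p1" "$p2"
```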

If you are using bash, you can use a regular expression with capture groups. A simple example:
[[ $type =~ (.*)\((.*),(.*)\) ]]
text=${BASH_REMATCH[1]}
p1=${BASH_REMATCH[2]}
p2=${BASH_REMATCH[3]}
If you are using a shell that does not directly support regular expression matching, you can use the expr command, but it only supports one capture group at a time.
$ text=$( expr "$type" : '\(.*\)(' )
$ p1=$( expr "$type" : '.*(\(.*\),' )
$ p2=$( expr "$type" : '.*(.*,\(.*\))' )
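The capture-group approach can be wrapped in a small function that also accepts the full type=hello(10,1) form. This is a sketch under some assumptions: parse_call is a hypothetical name, the input has exactly two comma-separated values, and the results are returned through the global variables text, p1 and p2.

```bash
#!/usr/bin/env bash
# parse_call is a hypothetical helper, not part of the answers above.
parse_call() {
    local input=${1#*=}     # drop a leading "name=" prefix if there is one
    # Keeping the regex in a variable avoids quoting surprises inside [[ ]].
    local re='^([^(]*)\(([^,]*),([^)]*)\)$'
    [[ $input =~ $re ]] || return 1
    text=${BASH_REMATCH[1]}
    p1=${BASH_REMATCH[2]}
    p2=${BASH_REMATCH[3]}
}

if parse_call 'type=hello(10,1)'; then
    printf 'text=%s\np1=%s\np2=%s\n' "$text" "$p1" "$p2"
fi
```

The function returns a non-zero status when the input does not have the expected shape, so callers can test it with if instead of inspecting empty variables.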

Related

How to parse multiple line output as separate variables

I'm relatively new to bash scripting and I would like someone to explain this properly, thank you. Here is my code:
#! /bin/bash
echo "first arg: $1"
echo "first arg: $2"
var="$( grep -rnw $1 -e $2 | cut -d ":" -f1 )"
var2=$( grep -rnw $1 -e $2 | cut -d ":" -f1 | awk '{print substr($0,length,1)}')
echo "$var"
echo "$var2"
The problem I have is with the output, the script I'm trying to write is a c++ function searcher, so upon launching my script I have 2 arguments, one for the directory and the second one as the function name. This is how my output looks like:
first arg: Projekt
first arg: iseven
Projekt/AX/include/ax.h
Projekt/AX/src/ax.cpp
h
p
Now my question is: how can I save the output line by line in a variable, so that later on I can use var as a path, or use var2 as a character to compare? My plan was to use if statements to determine the file type, for example: IF(last_char == p){echo "something"}. I tried the approach from the question "Capturing multiple line output into a Bash variable" and then treated the result as an array, so my code looked like "${var[0]}". Please explain how I can use each line of the output later on as a variable.
I'd use readarray to populate an array variable, one element per line. Unlike foo=( ... ), it does not split on spaces in your command's output, so paths containing spaces stay intact. You can also use the shell's substring parameter expansion to get the last character of a variable, so there is no need for the awk step in your var2:
#!/usr/bin/env bash
readarray -t lines < <(printf "%s\n" "Projekt/AX/include/ax.h" "Projekt/AX/src/ax.cpp")
for line in "${lines[@]}"; do
    printf "%s\n%s\n" "$line" "${line: -1}" # Note the space before the -1
done
will display
Projekt/AX/include/ax.h
h
Projekt/AX/src/ax.cpp
p
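To act on the last character, as the question planned to do with if statements, a case statement is one option. The sketch below reuses the two example paths; the labels "header", "source" and "other" are made up for illustration, and readarray needs bash 4 or later.

```bash
#!/usr/bin/env bash
# Stand-in for the grep | cut pipeline from the question.
readarray -t paths < <(printf '%s\n' 'Projekt/AX/include/ax.h' 'Projekt/AX/src/ax.cpp')

kinds=()
for path in "${paths[@]}"; do
    # ${path: -1} is the last character; the space before -1 is required.
    case ${path: -1} in
        h) kinds+=("header") ;;
        p) kinds+=("source") ;;
        *) kinds+=("other") ;;
    esac
done

# Each line is now available by index, e.g. "${paths[0]}" and "${kinds[0]}".
for i in "${!paths[@]}"; do
    printf '%s: %s\n' "${paths[i]}" "${kinds[i]}"
done
```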

How to add a hyphen after every fifth character of a word in bash

Given "ABCDEFGHIJKLMNOPQRSTUVWXY"
How does one achieve this outcome? "ABCDE-FGHIJ-KLMNO-PQRST-UVWXY"
With sed you can do this by first adding a - after every 5 characters, then removing the trailing - at the end of the line:
$ sed -E 's/.{5}/&-/g; s/-$//' <<<"ABCDEFGHIJKLMNOPQRSTUVWXY"
ABCDE-FGHIJ-KLMNO-PQRST-UVWXY
In extended (-E) mode:
.{5} matches any 5 characters
&- replaces with the whole match (the 5 characters) plus -
Then the second substitution command matches - at the end of the line ($) and replaces with nothing.
With GNU awk, one option would be to use FPAT to define the way the line is interpreted as a series of fields, then add - between each field:
$ awk -v FPAT='.{5}' -v OFS='-' '{ $1 = $1 } 1' <<<"ABCDEFGHIJKLMNOPQRSTUVWXY"
ABCDE-FGHIJ-KLMNO-PQRST-UVWXY
The field pattern FPAT is defined as any 5 characters and the Output Field Separator OFS is defined as -. $1 = $1 "touches" every line, causing it to be reformatted (without this part, nothing would happen). 1 is the shortest true condition causing each line to be printed.
It's not too difficult to do this in bash either:
#!/bin/bash
input="ABCDEFGHIJKLMNOPQRSTUVWXY"
parts=()
# build an array from slices of length 5
for (( i = 0; i < ${#input}; i += 5 )); do
    parts+=( "${input:i:5}" )
done
# join the array on IFS (use a subshell to avoid modifying IFS for rest of script)
( IFS=-; echo "${parts[*]}" )
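The slicing loop generalizes to any group size and separator. In this sketch, group_chars is a hypothetical function name, and the defaults (groups of 5, joined with a hyphen) are taken from the question.

```bash
#!/usr/bin/env bash
# group_chars STRING [SIZE] [SEPARATOR]  (hypothetical helper)
group_chars() {
    local input=$1 size=${2:-5} sep=${3:--} out='' i
    for (( i = 0; i < ${#input}; i += size )); do
        # Prepend the separator only when something has already been collected,
        # so no trailing (or leading) separator is produced.
        out+=${out:+$sep}${input:i:size}
    done
    printf '%s\n' "$out"
}

group_chars "ABCDEFGHIJKLMNOPQRSTUVWXY"
group_chars "ABCDEFG" 3 ":"
```

Building the result in one string avoids both the temporary array and the subshell used to change IFS.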
You could try the following:
echo "ABCDEFGHIJKLMNOPQRSTUVWXY" | sed 's/...../&-/g;s/-$//'
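A simple solution for letters only:
sed -E 's/[A-Z]{4}./&-/g' file.txt
For a 24-letter input such as ABCDEFGHIJKLMOPQRSTUVWXY (no N), the output will be:
ABCDE-FGHIJ-KLMOP-QRSTU-VWXY
If the input length is a multiple of 5, this leaves a trailing hyphen, which an extra s/-$// removes. If you want to include more than capital letters, use:
sed -E 's/[A-Za-z]{4}./&-/g' file.txt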
Try this
#!/bin/bash
s="ABCDEFGHIJKLMNOPQRSTUVWXY"
a=($(echo ${s} | grep -o .))
o=""
i=0
while [[ ${i} -lt ${#a[@]} ]]; do
    o="${o}${a[${i}]}"
    (( i++ ))
    [[ $(( i % 5 )) -eq 0 ]] && [[ ${i} -ne ${#a[@]} ]] && o="${o}-"
done
echo ${o}
exit 0
Another solution, using fold and paste:
$ echo {A..Y} | tr -d ' ' | # this is to generate the string
fold -w5 | paste -sd-
ABCDE-FGHIJ-KLMNO-PQRST-UVWXY
This might work for you (GNU sed):
sed 's/.\{5\}\B/&-/g' file
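\B matches only where there is no word boundary, so a hyphen is inserted after every five characters as long as the fifth character is inside a word. At the end of the word there is a boundary, which avoids a trailing hyphen.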
Yet another choice
perl -pe 's/(.{5})(?=.)/$1-/g' file
Match 5 characters that are followed by another character (to avoid the trailing hyphen problem)

Trying to retrieve first 5 characters (only number & alphabet) from string in bash

I have a string like that
1-a-bc-dxyz
I want to get 1-a-bc-d (the prefix that contains the first 5 characters, counting only digits and letters).
Thanks
With gawk:
awk '{ for ( i=1;i<=length($0);i++) { if ( match(substr($0,i,1),/[[:alnum:]]/)) { cnt++;if ( cnt==5) { print substr($0,1,i) } } } }' <<< "1-a-bc-dxyz"
Read each character one by one and then if there is a pattern match for an alpha-numeric character (using the match function), increment a variable cnt. When cnt gets to 5, print the string we have seen so far (using the substr function)
Output:
1-a-bc-d
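a='1-a-bc-dxyz'
count=0
for ((i=0;i<${#a};i++)); do
    # [[:alnum:]] is used because a range like [a-Z] is invalid in the C locale
    if [[ "${a:$i:1}" =~ [[:alnum:]] ]] && [[ $((++count)) -eq 5 ]]; then
        echo "${a:0:$((i+1))}"
        break   # stop the loop without exiting the whole shell
    fi
done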
You can shrink this further to:
a='1-a-bc-dxyz'
count=0
for ((i=0;i<${#a};i++)); do [[ "${a:$i:1}" =~ [[:alnum:]] ]] && [[ $((++count)) -eq 5 ]] && echo "${a:0:$((i+1))}"; done
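The same character-counting idea can be sketched as a reusable function. take_alnum is a hypothetical name, and printing the whole string when it holds fewer than the requested number of alphanumeric characters is an assumption, since the question does not cover that case.

```bash
#!/usr/bin/env bash
# take_alnum STRING [N]  (hypothetical helper)
# Prints the shortest prefix of STRING that contains N alphanumeric characters.
take_alnum() {
    local s=$1 n=${2:-5} count=0 i
    for (( i = 0; i < ${#s}; i++ )); do
        # Glob-style match of one character against the alnum class.
        if [[ ${s:i:1} == [[:alnum:]] ]] && (( ++count == n )); then
            printf '%s\n' "${s:0:i+1}"
            return 0
        fi
    done
    # Assumption: fewer than N alphanumerics, so print everything and signal it.
    printf '%s\n' "$s"
    return 1
}

take_alnum '1-a-bc-dxyz'
```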
Using GNU awk:
$ echo 1-a-bc-dxyz | \
awk -F '' '{b=i="";while(gsub(/[0-9a-z]/,"&",b)<5)b=b $(++i);print b}'
1-a-bc-d
Explained:
awk -F '' '{ # separate each char to its own field
b=i="" # if you have more than one record to process
while(gsub(/[0-9a-z]/,"&",b)<5) # using gsub for counting (adjust regex if needed)
b=b $(++i) # gather buffer
print b # print buffer
}'
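GNU sed lets you combine a number with the g flag to replace the k-th match and every match after it. Here the sixth and all later alphanumeric characters are deleted, together with any separators in front of them: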
echo "1-a-bc-dxyz" | sed 's/[^a-zA-Z0-9]*[a-zA-Z0-9]//g6'
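Using a combination of sed and awk (GNU awk, for the empty field separator):
echo 1-a-bc-dxyz | sed 's/[-*%$#@]//g' | awk -F '' '{print $1 $2 $3 $4 $5}'
Note that this strips the separators, so it prints 1abcd rather than 1-a-bc-d. You could also use a for loop to print the characters.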
echo '1-a-bc-dxyz' | grep -Eo '^[[:alnum:]]([^[:alnum:]]*[[:alnum:]]){4}'
That is pretty simple and needs neither sed nor awk. It assumes the string starts with an alphanumeric character.

Substring from a string in bash using scripting language

How can we fetch a substring from a string in bash using scripting language?
Example:
fullstring="mnuLOCNMOD.URL = javascript:parent.doC...something"
The substring I want is everything before ".URL" in the full string.
With Parameter Expansion, you can do:
fullstring="mnuLOCNMOD.URL = javascript:parent.doC...something"
echo "${fullstring%.URL*}"
prints:
mnuLOCNMOD
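For comparison, here is a short sketch of the related expansion operators applied to the same string. The variable names before, after and first_field are only illustrative.

```bash
#!/usr/bin/env bash
fullstring="mnuLOCNMOD.URL = javascript:parent.doC...something"

# %  removes the shortest matching suffix, %% the longest.
# #  removes the shortest matching prefix, ## the longest.
before=${fullstring%%.URL*}     # everything before ".URL"
after=${fullstring#*.URL}       # everything after ".URL"
first_field=${fullstring%%.*}   # everything before the first dot

printf '[%s]\n' "$before" "$after" "$first_field"
```

With a single % instead of %%, the last expansion would remove only the final ".something", which is why the longest-match form is used when splitting at the first dot.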
$ fullstring="mnuLOCNMOD.URL = javascript:parent.doC...something"
$ sed -r 's/^(.*)\.URL.*$/\1/g' <<< "$fullstring"
mnuLOCNMOD
$
You can use grep:
echo "mnuLOCNMOD.URL = javas" | grep -oP '\w+(?=\.URL)'
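and assign the result to a variable. I used a positive lookahead, (?=regex), because it is a zero-length assertion: the text must be present for the match to succeed, but it is not part of the output.
The -o flag prints only the matched part of the line, and -P enables Perl-compatible regular expressions (run grep --help for details).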
Parameter Expansion is the way to go.
If you are interested in a simple grep:
% fullstring="mnuLOCNMOD.URL = javascript:parent.doC...something"
% grep -o '^[^.]*' <<<"$fullstring"
mnuLOCNMOD
fullstring="mnuLOCNMOD.URL = javascript:parent.doC...something"
menuID=$(echo "$fullstring" | cut -f 1 -d '.')
Here the dot is used as the field separator.
This works in .sh script files.
To offer yet another alternative: Bash's regular-expression matching operator, =~:
fullstring="mnuLOCNMOD.URL = javascript:parent.doC...something"
echo "$([[ $fullstring =~ ^(.*)'.URL' ]] && echo "${BASH_REMATCH[1]}")"
Note how the (one and only) capture group ((.*)) is reported through element 1 of the special "${BASH_REMATCH[@]}" array variable.
While in this case l3x's parameter expansion solution is simpler, =~ generally offers more flexibility.
awk offers an easy solution as well:
echo "$(awk -F'\\.URL' '{ print $1 }' <<<"$fullstring")"

Extract numbers from strings

I have a file containing on each line a string of the form
string1.string2:\string3{string4}{number}
and what I want to extract is the number. I've searched and tried for a while to get this done using sed or bash, but failed. Any help would be much appreciated.
Edit 1: The strings may contain numbers.
$ echo 'string1.string2:\string3{string4}{number}' |\
cut -d'{' -f3 | cut -d'}' -f 1
number
Using sed:
sed 's/[^}]*}{\([0-9]*\)}/\1/' input_file
Description:
[^}]*} : match anything that is not } and the following }
{\([0-9]*\)}: capture the following digits within {...}
/\1/ : substitute all with the captured number
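Use grep (in a basic regular expression the braces are literal when left unescaped, and the $ anchor selects the final group, since the strings may contain numbers):
grep -o '{[0-9]\+}$' input_file | tr -d '{}'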
In bash:
sRE='[[:alnum:]]+'
nRE='[[:digit:]]+'
[[ $str =~ $sRE\.$sRE:\\$sRE\{$sRE\}\{($nRE)\} ]] && number=${BASH_REMATCH[1]}
You can drop the first part of the regular expression, if your text file is sufficiently uniform (the braces stay escaped so that they are treated as literals):
[[ $str =~ \\$sRE\{$sRE\}\{($nRE)\} ]] && number=${BASH_REMATCH[1]}
or even
[[ $str =~ \{$sRE\}\{($nRE)\} ]] && number=${BASH_REMATCH[1]}
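To apply the bash approach to every line of a file, a read loop can collect the numbers into an array. This is a sketch only: the two sample lines in the here-document are made up to follow the format in the question, and in practice the loop would read from the real file with done < input_file.

```bash
#!/usr/bin/env bash
# Match a final {digits} group; the regex is kept in a variable for safe quoting.
re='\{([[:digit:]]+)\}$'

numbers=()
while IFS= read -r line; do
    # -r keeps the backslash in the input intact.
    if [[ $line =~ $re ]]; then
        numbers+=("${BASH_REMATCH[1]}")
    fi
done <<'EOF'
foo1.bar2:\baz3{qux4}{42}
a.b:\c{d}{7}
EOF

printf '%s\n' "${numbers[@]}"
```

Anchoring on the end of the line means digits inside the earlier strings are ignored.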

Resources