How to extract multiple text and numbers from a string using sed? [duplicate] - bash

This question already has answers here:
Linux bash: Multiple variable assignment
(6 answers)
Closed 7 years ago.
How can I extract three or more separate pieces of text from a line using sed?
I have the following line:
echo '<MX><[Mike/DOB-029/Post-555/Male]><MX>'
So far I am able to extract the 'DOB-029' by doing
sed -n 's/.*\(DOB-[0-9]*\).*/\1/p'
but I am not getting the other texts such as the name or the post.
My expected output should be Mike DOB-029 Post-555
Edited
Say I have a list within a file and I want to extract specific text/IDs from the entire list and save it to a .txt file

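sed 's/.*\[\(.*\).\(DOB-[0-9]*\).\(Post-[0-9]*\).*/\1 \2 \3/' should do the trick!
The parts between \( and \) are capture groups; you can refer to them in the replacement as \i, where i is the index of the group. Note that the literal [ must be escaped as \[, otherwise sed reads it as the start of a bracket expression.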
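As a minimal sketch of the same capture-group idea, the variant below uses [^/]* so that each group stops at the next slash instead of relying on backtracking. The | delimiter and the variable names line and result are choices made for this illustration, not part of the answer above.

```shell
line='<MX><[Mike/DOB-029/Post-555/Male]><MX>'

# \1 = name, \2 = DOB field, \3 = post field; each group ends at the next "/".
# Using "|" as the s-command delimiter avoids escaping the slashes.
result=$(printf '%s\n' "$line" |
    sed 's|.*\[\([^/]*\)/\([^/]*\)/\([^/]*\)/.*|\1 \2 \3|')

echo "$result"
```

For the sample line this prints Mike DOB-029 Post-555.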
Script for custom use:
#! /bin/bash
fields=${1:-123}
file='/path/to/input'
name=$(sed 's/.*\[\([^\/]*\)\/.*/\1/' "$file")
dob=$(sed 's/.*\(DOB-[0-9]*\).*/\1/' "$file")
post=$(sed 's/.*\(Post-[0-9]*\).*/\1/' "$file")
[[ $fields =~ .*1.* ]] && output=$name
[[ $fields =~ .*2.* ]] && output="$output $dob"
[[ $fields =~ .*3.* ]] && output="$output $post"
echo $output
Set the file variable to the file that contains the line you want to parse (I can add more functionality, such as supplying the file as an argument or picking the line out of a larger file, if you like). Then run the script with an integer argument: if it contains 1 the script prints the name, 2 prints the DOB, and 3 prints the post information. You can combine the digits, e.g. '123' or '32'; the output order is always name, DOB, post, whatever the order of the digits.
Stdin
If you want to read from stdin, use following script:
#! /usr/bin/env bash
line=$(cat /dev/stdin)
fields=${1:-123}
name=$(echo "$line" | sed 's/.*\[\([^\/]*\)\/.*/\1/')
dob=$(echo "$line" | sed 's/.*\(DOB-[0-9]*\).*/\1/')
post=$(echo "$line" | sed 's/.*\(Post-[0-9]*\).*/\1/')
[[ $fields =~ .*1.* ]] && output=$name
[[ $fields =~ .*2.* ]] && output="$output $dob"
[[ $fields =~ .*3.* ]] && output="$output $post"
echo $output
Example usage:
$ chmod +x script.sh
$ echo '<MX><[Mike/DOB-029/Post-555/Male]><MX>' | ./script.sh 123
Mike DOB-029 Post-555
$ echo '<MX><[Mike/DOB-029/Post-555/Male]><MX>' | ./script.sh 12
Mike DOB-029
$ echo '<MX><[Mike/DOB-029/Post-555/Male]><MX>' | ./script.sh 32
DOB-029 Post-555
$ echo '<MX><[Mike/DOB-029/Post-555/Male]><MX>' | ./script.sh
Mike DOB-029 Post-555

A solution with awk:
echo "<MX><[Mike/DOB-029/Post-555/Male]><MX>" | awk -F'[/[]' '{print $2, $3, $4}'
We set the field delimiter to / or [ (-F'[/[]', quoted so the shell does not treat it as a glob), then print fields $2, $3 and $4, which hold the name, the DOB and the post.
With sed:
echo "<MX><[Mike/DOB-029/Post-555/Male]><MX>" | sed 's/\(^.*\[\)\(.*\)\(\/[^/]*$\)/\2/; s/\// /g'
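For the edited question (a whole list in a file, saved to a .txt file), the awk approach carries over directly. The following is a sketch, assuming one record per line in the same layout; the second record and the temporary file names are made up for the illustration.

```shell
# Hypothetical sample data standing in for the list file from the question.
input=$(mktemp)
output=$(mktemp)
printf '%s\n' \
    '<MX><[Mike/DOB-029/Post-555/Male]><MX>' \
    '<MX><[Anna/DOB-031/Post-777/Female]><MX>' > "$input"

# Split every line on "/" or "[" and save fields 2-4 to the output file.
awk -F'[/[]' '{print $2, $3, $4}' "$input" > "$output"

result=$(cat "$output")
echo "$result"
rm -f "$input" "$output"
```

In real use, replace the temporary files with your own input and output paths.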

Use the bash parameter-expansion builtins:
line="<MX><[Mike/DOB-029/Post-555/Male]><MX>"
linel=${line#*\[}; liner=${linel%/*}; echo "${liner//\// }"

Related

How to cut variables which are between quotes from a string

I have a problem cutting variables out of a string in double quotes. I have some scripts to write for my sys classes, and in one of them I have to read input from the user in the form (a="var1", b="var2").
I tried the code below:
#!/bin/bash
read input
a=$($input | cut -d '"' -f3)
echo $a
It returns a "command not found" error on line 3. I tried double parentheses, like
a=$(($input | cut -d '"' -f3)
but it's still wrong.
In a comment the OP gave a working answer (should post it as an answer):
#!/bin/bash
read input
a=$(echo $input | cut -d '"' -f2)
b=$(echo $input | cut -d '"' -f4)
echo sum: $(( a + b))
echo difference: $(( a - b))
This will work for user input that is exactly like a="8", b="5".
Never trust input.
You might want to add the check
if [[ ${input} =~ ^[a-z]+=\"[0-9]+\",\ [a-z]+=\"[0-9]+\"$ ]]; then
echo "Use your code"
else
echo "Incorrect input"
fi
And when you add a check, you might want to execute the input (after replacing the comma with a semicolon).
input='testa="8", testb="5"'
if [[ ${input} =~ ^[a-z]+=\"[0-9]+\",\ [a-z]+=\"[0-9]+\"$ ]];
then
eval $(tr "," ";" <<< ${input})
set | grep -E "^test[ab]="
else
echo no
fi
EDIT:
@PesaThe correctly pointed out BASH_REMATCH:
When you use bash and test the input, you can capture the values directly:
if [[ ${input} =~ ^[a-z]+=\"([0-9]+)\",\ [a-z]+=\"([0-9]+)\"$ ]];
then
a="${BASH_REMATCH[1]}"
b="${BASH_REMATCH[2]}"
fi
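Putting the validation and the extraction together, a minimal sketch could look like this. The sample input is an assumption, and storing the pattern in a variable (re) is just one way to avoid escaping the quotes and the space inside [[ ]].

```shell
#!/usr/bin/env bash
# Hypothetical sample input in the expected form.
input='a="8", b="5"'

# Keeping the regex in a variable means quotes and spaces need no escaping.
re='^[a-z]+="([0-9]+)", [a-z]+="([0-9]+)"$'

if [[ $input =~ $re ]]; then
    a=${BASH_REMATCH[1]}   # first captured number
    b=${BASH_REMATCH[2]}   # second captured number
    echo "sum: $(( a + b ))"
    echo "difference: $(( a - b ))"
else
    echo "Incorrect input" >&2
fi
```

With the sample input this prints sum: 13 and difference: 3. No eval is needed, because only the captured digits are ever used.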
To extract the digit 1 from a string "var1" you would use a Bash substring replacement most likely:
$ s="var1"
$ echo "${s//[^0-9]/}"
1
Or,
$ a="${s//[^0-9]/}"
$ echo "$a"
1
This works by replacing every non-digit in the string with nothing. That is fine for your example, which has a single number field, but may not be what you need if the string has multiple number fields:
$ s2="1 and a 2 and 3"
$ echo "${s2//[^0-9]/}"
123
In this case, you would use sed, grep, awk, or a Bash regex to capture the individual number fields and keep them distinct:
$ echo "$s2" | grep -o -E '[[:digit:]]+'
1
2
3
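The Bash regex route mentioned above could be sketched as a loop that repeatedly matches the next run of digits and then drops everything up to and including that match. The names numbers and rest are illustrative.

```shell
#!/usr/bin/env bash
s2="1 and a 2 and 3"
numbers=()
rest=$s2

# Each iteration captures the leftmost run of digits, then removes the
# text up to and including that run before searching again.
while [[ $rest =~ [0-9]+ ]]; do
    numbers+=("${BASH_REMATCH[0]}")
    rest=${rest#*"${BASH_REMATCH[0]}"}
done

printf '%s\n' "${numbers[@]}"
```

This keeps each number as a separate array element, so multi-digit fields stay intact.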

Replacing part of a string in bash using sed [duplicate]

This question already has answers here:
unix sed substitute nth occurence misfunction?
(3 answers)
Closed 4 years ago.
In bash, suppose I have the input:
ATGTGSDTST
and I want to print:
AT
ATGT
ATGTGSDT
ATGTGSDTST
which means that I need to look for all the substrings that end with 'T' and print them.
I thought I should use sed inside a for loop, but I don't understand how to use sed correctly in this case.
Any help?
Thanks
The following script uses sed:
#!/usr/bin/env bash
pattern="ATGTGSDTST"
sub="T"
# Get number of T in $pattern:
num=$(grep -o "T" <<< "$pattern" | wc -l)
i=1
text=$(sed -n "s/T.*/T/p" <<< "$pattern")
echo $text
while [ $i -lt $num ]; do
text=$(sed -n "s/\($sub[^T]\+T\).*/\1/p" <<< "$pattern")
sub=$text
echo $text
((i++))
done
gives output:
AT
ATGT
ATGTGSDT
ATGTGSDTST
No sed needed, just use parameter expansion:
#! /bin/bash
string=ATGTGSDTST
length=${#string}
prefix=''
while (( ${#prefix} != $length )) ; do
sub=${string%%T*}
sub+=T
echo $prefix$sub
string=${string#$sub}
prefix+=$sub
done
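Another way to sketch the same idea is to walk the string by index and print a prefix whenever the current character is T. Unlike the loop above, this does not assume the string ends in T. The array name prefixes is illustrative.

```shell
#!/usr/bin/env bash
string=ATGTGSDTST
prefixes=()

# ${string:i:1} is the character at index i; ${string:0:i+1} is the
# prefix that ends at that character.
for (( i = 0; i < ${#string}; i++ )); do
    if [[ ${string:i:1} == T ]]; then
        prefixes+=("${string:0:i+1}")
    fi
done

printf '%s\n' "${prefixes[@]}"
```

For ATGTGSDTST this prints AT, ATGT, ATGTGSDT and ATGTGSDTST, one per line.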

Parsing .ini file in bash

I have the properties file below and would like to parse it as shown further down. Please help me do this.
The .ini file I created:
[Machine1]
app=version1
[Machine2]
app=version1
app=version2
[Machine3]
app=version1
app=version3
I am looking for a solution that parses the ini file into:
[Machine1]app = version1
[Machine2]app = version1
[Machine2]app = version2
[Machine3]app = version1
[Machine3]app = version3
Thanks.
Try:
$ awk '/\[/{prefix=$0; next} $1{print prefix $0}' file.ini
[Machine1]app=version1
[Machine2]app=version1
[Machine2]app=version2
[Machine3]app=version1
[Machine3]app=version3
How it works
/\[/{prefix=$0; next}
If a line contains [, we save the line in the variable prefix, skip the rest of the commands, and jump to the next line. (The pattern is unanchored, so it matches [ anywhere in the line; use /^\[/ to require it at the start.)
$1{print prefix $0}
If the current line is not empty, we print the prefix followed by the current line.
Adding spaces
To add spaces around any occurrence of =:
$ awk -F= '/\[/{prefix=$0; next} $1{$1=$1; print prefix $0}' OFS=' = ' file.ini
[Machine1]app = version1
[Machine2]app = version1
[Machine2]app = version2
[Machine3]app = version1
[Machine3]app = version3
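This works by using = as the field separator on input and ' = ' (an equals sign with a space on each side) as the field separator on output. The assignment $1=$1 forces awk to rebuild the line with the output separator.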
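If values may themselves contain an = sign, one option is to pad only the first = on each line with sub() rather than changing the field separators. This is a sketch under that assumption; the temporary file and the sample value with an extra = are made up for the illustration.

```shell
# Hypothetical sample file; the last value contains a second "=".
ini=$(mktemp)
printf '%s\n' '[Machine1]' 'app=version1' '[Machine2]' 'app=version1' 'opts=a=b' > "$ini"

# /^\[/ anchors the section test to the start of the line.
# NF skips empty lines; sub() pads only the first "=" on the line.
flat=$(awk '/^\[/ {prefix = $0; next} NF {sub(/=/, " = "); print prefix $0}' "$ini")

printf '%s\n' "$flat"
rm -f "$ini"
```

The last line comes out as [Machine2]opts = a=b, leaving the second = untouched.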
I love John1024's answer. I was looking for exactly that. I have created a bash function that allows me to lookup sections or specific keys based on his idea:
function iniget() {
if [[ $# -lt 2 || ! -f $1 ]]; then
echo "usage: iniget <file> [--list|<section> [key]]"
return 1
fi
local inifile=$1
if [ "$2" == "--list" ]; then
for section in $(cat $inifile | grep "\[" | sed -e "s#\[##g" | sed -e "s#\]##g"); do
echo $section
done
return 0
fi
local section=$2
local key
[ $# -eq 3 ] && key=$3
# https://stackoverflow.com/questions/49399984/parsing-ini-file-in-bash
# This awk line turns ini sections => [section-name]key=value
local lines=$(awk '/\[/{prefix=$0; next} $1{print prefix $0}' $inifile)
for line in $lines; do
if [[ "$line" = \[$section\]* ]]; then
local keyval=$(echo $line | sed -e "s/^\[$section\]//")
if [[ -z "$key" ]]; then
echo $keyval
else
if [[ "$keyval" = $key=* ]]; then
echo $(echo $keyval | sed -e "s/^$key=//")
fi
fi
fi
done
}
So given this as file.ini
[Machine1]
app=version1
[Machine2]
app=version1
app=version2
[Machine3]
app=version1
app=version3
then the following results are produced
$ iniget file.ini --list
Machine1
Machine2
Machine3
$ iniget file.ini Machine3
app=version1
app=version3
$ iniget file.ini Machine1 app
version1
$ iniget file.ini Machine2 app
version1
version2
Again, thanks to @John1024 for his answer. I was pulling my hair out trying to create a simple bash ini parser that supported sections.
Tested on Mac using GNU bash, version 5.0.0(1)-release (x86_64-apple-darwin18.2.0)
You can try using awk:
awk '/\[[^]]*\]/{ # Match pattern like [...]
a=$1;next # store the pattern in a
}
NF{ # Match non empty line
gsub("=", " = ") # Add space around the = character
print a $0 # print the line
}' file
Excellent answers here. I made some modifications to @davfive's function to better fit my use case. This version is largely the same, except that it allows whitespace before and after = characters and allows values to contain spaces.
# Get values from a .ini file
function iniget() {
if [[ $# -lt 2 || ! -f $1 ]]; then
echo "usage: iniget <file> [--list|<section> [key]]"
return 1
fi
local inifile=$1
if [ "$2" == "--list" ]; then
for section in $(cat $inifile | grep "^\\s*\[" | sed -e "s#\[##g" | sed -e "s#\]##g"); do
echo $section
done
return 0
fi
local section=$2
local key
[ $# -eq 3 ] && key=$3
# This awk line turns ini sections => [section-name]key=value
local lines=$(awk '/\[/{prefix=$0; next} $1{print prefix $0}' $inifile)
lines=$(echo "$lines" | sed -e 's/[[:blank:]]*=[[:blank:]]*/=/g')
while read -r line ; do
if [[ "$line" = \[$section\]* ]]; then
local keyval=$(echo "$line" | sed -e "s/^\[$section\]//")
if [[ -z "$key" ]]; then
echo $keyval
else
if [[ "$keyval" = $key=* ]]; then
echo $(echo $keyval | sed -e "s/^$key=//")
fi
fi
fi
done <<<"$lines"
}
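For comparison, the section/key lookup can also be sketched as a single awk program, which avoids the shell loop and the repeated sed calls. The function name ini_value and the sample file are hypothetical; this is not a replacement for iniget, only an illustration of the same lookup.

```shell
# ini_value <file> <section> <key>   (hypothetical helper)
# Prints every value of <key> inside <section>.
ini_value() {
    awk -v section="$2" -v key="$3" '
        /^[ \t]*\[/ {                       # section header line
            gsub(/^[ \t]*\[|\][ \t]*$/, "") # strip brackets and padding
            current = $0
            next
        }
        current == section {
            pos = index($0, "=")            # split at the first "="
            if (pos == 0) next
            k = substr($0, 1, pos - 1)
            v = substr($0, pos + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", k)  # trim the key
            gsub(/^[ \t]+|[ \t]+$/, "", v)  # trim the value
            if (k == key) print v
        }
    ' "$1"
}

# Demo on a made-up file with padded "=" and a value containing a space.
demo=$(mktemp)
printf '%s\n' '[Machine1]' 'app=version1' '[Machine2]' 'app = version1' 'app=version 2' > "$demo"
ini_value "$demo" Machine2 app
rm -f "$demo"
```

The demo prints version1 and version 2 on separate lines.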
To prefix each keyword line (the keyword, the = and its keyvalue) with the name of its section, treating lines that precede any section header as a 'no-section'/Default group, this awk one-liner does the trick when coupled with a few clean-up regexes.
ini_buffer="$(echo "$raw_buffer" | awk '/^\[.*\]$/{obj=$0}/=/{print obj $0}')"
It takes your lines and outputs them the way you wanted:
+++ awk '/^\[.*\]$/{obj=$0}/=/{print obj $0}'
++ ini_buffer='[Machine1]app=version1
[Machine2]app=version1
[Machine2]app=version2
[Machine3]app=version1
[Machine3]app=version3'
A complete solution to the INI-format File
As the Cloanto INI specification notes for the latest INI version 1.4 (2009-10-23), there are several other tricky aspects to the INI file:
character set constraint for section name
character set constraint for keyword
And lastly, the keyvalue must be able to handle pretty much anything that is not used in the section and keyword names; that includes nesting of quotes inside a pair of the same single/double quotes.
Except for the nesting of quotes, here is a complete solution (available on GitHub) for parsing an INI-format file with a default section:
# syntax: ini_file_read <raw_buffer>
# outputs: formatted bracket-nested "[section]keyword=keyvalue"
ini_file_read()
{
local ini_buffer raw_buffer hidden_default
raw_buffer="$1"
# somebody has to remove the 'inline' comment
# there is a most complex SED solution to nested
# quotes inline comment coming ... TBA
raw_buffer="$(echo "$raw_buffer" | sed '
s|[[:blank:]]*//.*||; # remove //comments
s|[[:blank:]]*#.*||; # remove #comments
t prune
b
:prune
/./!d; # remove empty lines, but only those that
# become empty as a result of comment stripping'
)"
# awk prefixes each keyword line with its section header;
# the sed calls below then trim the spaces around the brackets
ini_buffer="$(echo "$raw_buffer" | awk '/^\[.*\]$/{obj=$0}/=/{print obj $0}')" # original
ini_buffer="$(echo "$ini_buffer" | sed 's/^\s*\[\s*/\[/')"
ini_buffer="$(echo "$ini_buffer" | sed 's/\s*\]\s*/\]/')"
# finds all 'no-section' and inserts '[Default]'
hidden_default="$(echo "$ini_buffer" \
| egrep '^[-0-9A-Za-z_\$\.]+=' | sed 's/^/[Default]/')"
if [ -n "$hidden_default" ]; then
echo "$hidden_default"
fi
# finds sectional and outputs as-is
echo "$(echo "$ini_buffer" | egrep '^\[\s*[-0-9A-Za-z_\$\.]+\s*\]')"
}
The unit test for this StackOverflow post is included in this file:
https://github.com/egberts/bash-ini-file
Source:
https://github.com/egberts/easy-admin/blob/main/test/section-regex.sh
https://cloanto.com/specs/ini/#escapesequences

Validate user password according to regex bash

I've been trying to write a bash script that validates user input against these rules: length > 8, at least one digit, and at least one of these: [@, #, $]
So the regex for that is this:
((?=.*\d)(?=.*[@#$%&*+-=]).{8,})
I've tried this, but with no result:
result=$(echo $1 | egrep "((?=.*\d)(?=.*[@#$%&*+-=]).{8,})")
echo $result
with $1 being the input parameter. Also, I'd like to wrap it in an if clause, but echo never outputs anything. What am I doing wrong?
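egrep (grep -E) uses POSIX extended regular expressions, which support neither lookaheads such as (?=...) nor \d, so the pattern never matches. Testing each rule separately might help:
[[ ${#1} -ge 8 && $1 =~ [0-9] && $1 =~ [@#$] ]] && result="$1"
or with three grep calls:
result=$(grep -E '.{8}' <<< "$1" | grep '[0-9]' | grep '[@#$]')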
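To use the check in an if clause, as the question asks, the three tests can be wrapped in a function whose exit status is the result. This is a sketch; valid_password and the sample passwords are hypothetical, and it keeps the answer's -ge 8 (eight characters or more), matching the .{8,} in the original regex.

```shell
#!/usr/bin/env bash
# Returns 0 if the password is at least 8 characters long and contains
# at least one digit and at least one of @, # or $.
valid_password() {
    local pw=$1
    local digit='[0-9]'
    local special='[@#$]'
    [[ ${#pw} -ge 8 && $pw =~ $digit && $pw =~ $special ]]
}

# Made-up sample passwords: the first passes, the second is too short.
for candidate in 'abcdefg1#' 'short1#'; do
    if valid_password "$candidate"; then
        echo "$candidate: valid"
    else
        echo "$candidate: invalid"
    fi
done
```

Keeping the bracket expressions in variables avoids any quoting surprises with # and $ inside [[ ]].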

How to get the 'variable' line from file? [duplicate]

This question already has answers here:
Bash tool to get nth line from a file
(22 answers)
Closed 7 years ago.
This is my script. It prints every row in the file along with its row number.
Next I want to read which row the user chose and save it to a variable.
I=1
for ROW in $(cat file.txt)
do
echo "$I $ROW"
I=`expr $I + 1`
done
read var
awk 'FNR = $var {print}' file.txt
Then I want to print / save the chosen row to a file.
How can I do this?
When I echo $var it shows the number correctly. But when I try to use this variable in awk, it prints every line.
How do I read line 'var' from the file?
And moreover, how do I save this line in another variable?
Example file.txt
1 line1
2 line2
3 line3
4 line4
When I enter 3, I want to read the third line from the file.
Try this:
cat -n file.txt; read -r var; line="$(sed -n "${var}p" file.txt)"; echo "$line"
With more focus on Dryingsoussage's version:
#!/bin/bash
file="file.txt"
declare -i counter=0 # set integer attribute
var=0
while read -r line; do
counter=counter+1
printf "%d %s\n" "$counter" "$line"
done < "$file"
# check for number and greater-than 0 and less-than-or-equal $counter
until [[ $var =~ ^[0-9]+$ ]] && [[ $var -gt 0 ]] && [[ $var -le $counter ]]; do
read -p "Enter line number:" var
done
awk -v var="$var" 'FNR==var {print}' "$file"
You cannot use $varname inside single quotes (' '); it will not be expanded.
Take a look at this other post; it should help you:
How to use shell variables in an awk script
cat -n file.txt
read var
row="$(awk -v tgt="$var" 'NR==tgt{print;exit}' file.txt)"
First: You cannot use $var inside single quotes, as echo '$var' prints the literal $var, not its value.
Second: You used the = (assignment) operator instead of the == (equality) operator.
Third: You don't have to write { print } if you want the line to be printed. You can leave the action out instead.
Fourth: As was explained in a now-deleted comment, do not let bash expand variables inside the awk script code, as it can lead to code injection.
So the conclusion is:
awk -v var="$var" 'FNR == var' file.txt
should do what you want.
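The four points can be combined into a small helper that also stores the line in a variable, as the question asks. This is a sketch; nth_line and the temporary sample file are hypothetical.

```shell
# nth_line <n> <file>   (hypothetical helper)
# Passes the line number with -v, compares with ==, and exits after the
# match so the rest of the file is not read.
nth_line() {
    awk -v n="$1" 'FNR == n { print; exit }' "$2"
}

# Made-up sample file mirroring the example in the question.
file=$(mktemp)
printf '%s\n' line1 line2 line3 line4 > "$file"

row=$(nth_line 3 "$file")   # save the chosen line in a variable
echo "$row"
rm -f "$file"
```

With the sample file this prints line3. In an interactive script, replace the literal 3 with the validated value read from the user.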
