I want to extract the substring up to the point where the last digit ends.
For example:
In the string "abcd123z", I want the output to be "abcd123".
In the string "abcdef123gh01yz", I want the output to be "abcdef123gh01".
In the string "abcd123", I want the output to be "abcd123".
How can I do this in the Unix shell?
Try this sed command:
sed 's/^\(.*[0-9]\).*$/\1/g' file
Example:
$ echo 'abcdef123gh01yz' | sed 's/^\(.*[0-9]\).*$/\1/g'
abcdef123gh01
You can do this in BASH regex:
str='abcdef123gh01yz'
[[ "$str" =~ ^(.*[[:digit:]]) ]] && echo "${BASH_REMATCH[1]}"
abcdef123gh01
Or with parameter expansion alone:
tmp="${str##*[0-9]}" # everything after the last digit
echo "${str%"$tmp"}" # remove that trailing part from the end of the string (quoted so it is taken literally)
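The parameter-expansion approach can be wrapped in a small function and run against the three strings from the question. This is only a minimal sketch: the name upto_last_digit is made up for illustration, and it assumes a Bash-compatible shell.

```bash
#!/usr/bin/env bash
# upto_last_digit is a hypothetical helper name, not part of any library.
upto_last_digit() {
  local str="$1"
  local tail="${str##*[0-9]}"      # text after the last digit (the whole string if there is no digit)
  printf '%s\n' "${str%"$tail"}"   # strip that tail, taken literally, from the end
}

upto_last_digit 'abcd123z'
upto_last_digit 'abcdef123gh01yz'
upto_last_digit 'abcd123'
```

Note that a string without any digit produces an empty line here, whereas the sed command above leaves such a line unchanged.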
I have the file below, which contains some data:
name:Mark
age:23
salary:100
I want to read only name and age and assign them to variables in a shell script.
How can I achieve this?
I am able to read all of the file's data with the script below, but not a particular field:
#!/bin/bash
file="/home/to/person.txt"
val=$(cat "$file")
echo $val
Please suggest.
Rather than running multiple greps or bash loops, you could just run a single read that reads the output of a single invocation of awk:
read age salary name <<< $(awk -F: '/^age/{a=$2} /^salary/{s=$2} /^name/{n=$2} END{print a,s,n}' file)
Results
echo $age
23
echo $salary
100
echo $name
Mark
If the awk script sees an age, it sets a to the age. If it sees a salary, it sets s to the salary. If it sees a name, it sets n to the name. At the end of the input file, it prints what it has seen for the read command to read.
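The single-pass awk idea described above can be tried without a real person.txt. The sketch below writes the sample data to a temporary file (an assumption made only so it is self-contained) and matches the key exactly with $1 == "..." rather than with a regex. It assumes the values contain no whitespace, since read splits on it.

```bash
#!/usr/bin/env bash
# Sample data goes to a temporary file so the sketch runs on its own.
tmpfile=$(mktemp)
printf 'name:Mark\nage:23\nsalary:100\n' > "$tmpfile"

# One awk pass prints the three values on a single line; read splits them on whitespace.
read -r age salary name <<EOF
$(awk -F: '$1 == "age"    { a = $2 }
           $1 == "salary" { s = $2 }
           $1 == "name"   { n = $2 }
           END            { print a, s, n }' "$tmpfile")
EOF
rm -f "$tmpfile"

echo "name=$name age=$age salary=$salary"
```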
Using grep: \K is part of Perl-compatible regular expressions (enabled by -P in GNU grep). It drops everything matched to its left from the reported match, so with -o only the text after the prefix is printed.
name=$(grep -oP 'name:\K.*' person.txt)
age=$(grep -oP 'age:\K.*' person.txt)
salary=$(grep -oP 'salary:\K.*' person.txt)
Or using an awk one-liner. This may break if a line contains an extra : or if a value contains whitespace.
declare $(awk '{sub(/:/,"=")}1' person.txt )
This gives the following result:
sh-4.1$ echo $name
Mark
sh-4.1$ echo $age
23
sh-4.1$ echo $salary
100
You could try this. If your data is in a file data.txt:
name:vijay
age:23
salary:100
then you could use a script like this:
#!/bin/bash
# read will read a line until it hits a record separator i.e. newline, at which
# point it will return true, and store the line in variable $REPLY
while read
do
if [[ $REPLY =~ ^name:.* || $REPLY =~ ^age:.* ]]
then
declare "${REPLY%%:*}=${REPLY#*:}" # name before the first ':', value after it; avoids running eval on file contents
fi
done < data.txt # read data.txt from STDIN into the while loop
echo $name
echo $age
output
vijay
23
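A variant of the loop above lets read do the splitting by setting IFS to : for that one command, and assigns through a case statement. This is only a sketch, with the sample data inlined in a here-document instead of data.txt so it runs on its own.

```bash
#!/usr/bin/env bash
name='' age=''

# IFS=: applies only to read; key gets the text before the first ':',
# value gets the rest of the line (including any further ':').
while IFS=: read -r key value; do
  case $key in
    name) name=$value ;;
    age)  age=$value ;;
  esac
done <<'EOF'
name:vijay
age:23
salary:100
EOF

echo "$name"
echo "$age"
```

Because only the listed keys are assigned, unexpected lines in the file cannot create or overwrite other variables.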
Well, if you can store the data in JSON or another similar format, it becomes very easy to access complex data.
data.json
{
"name":"vijay",
"salary":"100",
"age": 23
}
Then you can use jq to parse the JSON and get the data easily:
jq -r '.name' data.json
vijay
The initial string is RU="903B/100ms"
from which I wish to obtain B/100ms.
Currently, I have written:
#!/bin/bash
RU="903B/100ms"
RU=${RU#*[^0-9]}
echo $RU
which returns /100ms since the parameter expansion removes up to and including the first non-numeric character. I would like to keep the first non-numeric character in this case. How would I do this by amending the above text?
You can use BASH_REMATCH to extract the desired matching value:
$ RU="903B/100ms"
$ [[ $RU =~ ^([[:digit:]]+)(.*) ]] && echo ${BASH_REMATCH[2]}
B/100ms
Or just catch the desired part as:
$ [[ $RU =~ ^[[:digit:]]+(.*) ]] && echo ${BASH_REMATCH[1]}
B/100ms
Assuming shopt -s extglob:
RU="${RU##+([0-9])}"
Or with sed:
echo "903B/100ms" | sed 's/^[0-9]*//'
B/100ms
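The same result is possible with two plain parameter expansions and no extglob: first isolate the leading run of digits, then strip it from the front. This is a sketch; the function name strip_leading_digits is made up for illustration.

```bash
#!/usr/bin/env bash
# strip_leading_digits is a hypothetical helper name.
strip_leading_digits() {
  local s="$1"
  local digits="${s%%[^0-9]*}"     # leading digits: cut from the first non-digit to the end
  printf '%s\n' "${s#"$digits"}"   # remove that run of digits from the front
}

strip_leading_digits '903B/100ms'
```

A string that does not start with a digit comes back unchanged, and a string made only of digits yields an empty line.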
I have the following string
git@bitbucket.org:user/my-repo-name.git
I want to extract this part
my-repo-name
With bash:
s='git@bitbucket.org:user/my-repo-name.git'
[[ $s =~ ^.*/(.*)\.git$ ]]
echo ${BASH_REMATCH[1]}
Output:
my-repo-name
Another method, using bash's variable substitution:
s='git@bitbucket.org:user/my-repo-name.git'
s1=${s#*/}
echo ${s1%.git}
Output:
my-repo-name
I'm not sure if there's a way to combine the # and % operators into a single substitution.
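Bash does not allow one parameter expansion to be nested inside another, so # and % need two steps. If a single command is preferred, the standard basename utility strips both the leading path and a given suffix. A small sketch showing both:

```bash
#!/usr/bin/env bash
s='git@bitbucket.org:user/my-repo-name.git'

# One call: basename drops everything up to the last '/' and the trailing '.git'.
repo=$(basename "$s" .git)
echo "$repo"

# Two expansions: '##' removes up to the LAST '/', which also copes with extra path levels.
tmp=${s##*/}
echo "${tmp%.git}"
```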
The following line removes the leading text before the variable $PRECEDING
temp2=${content#$PRECEDING}
But now I want $PRECEDING to be matched case-insensitively. This works with sed's I flag, but I can't figure out the whole command.
No need to call out to sed or use shopt. The easiest and quickest way to do this (as long as you have Bash 4):
if [ "${var1,,}" = "${var2,,}" ]; then
echo "matched"
fi
All you're doing there is converting both strings to lowercase and comparing the results.
Here's a way to do it with sed:
temp2=$(sed -e "s/^.*$PRECEDING//I" <<< "$content")
Explanation:
^.*$PRECEDING: ^ means start of string, . means any character, and .* means any character zero or more times. Together this means "match everything from the start of the string up to and including the string stored in $PRECEDING". Because .* is greedy, the match extends to the last occurrence, and the contents of $PRECEDING are interpreted as a regular expression.
The I flag (a GNU sed extension) means case-insensitive; the g flag (if you use it) means "match all occurrences" instead of just the first.
The <<< notation is a here-string, which saves you an echo.
The only bash way I can think of is to check if there's a match (case-insensitively) and if yes, exclude the appropriate number of characters from the beginning of $content:
content=foo_bar_baz
PRECEDING=FOO
shopt -s nocasematch
[[ $content == ${PRECEDING}* ]] && temp2=${content:${#PRECEDING}}
echo $temp2
Outputs: _bar_baz
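The nocasematch approach can be packaged so that the option does not stay switched on for the rest of the script. The sketch below assumes Bash and uses a made-up function name, strip_prefix_ci. Its body is written in parentheses, so it runs in a subshell and the shopt change ends with it.

```bash
#!/usr/bin/env bash
# strip_prefix_ci is a hypothetical helper name.
# The ( ) body runs in a subshell, so nocasematch does not leak to the caller.
strip_prefix_ci() (
  shopt -s nocasematch
  content=$1
  prefix=$2
  if [[ $content == "$prefix"* ]]; then
    printf '%s\n' "${content:${#prefix}}"   # drop as many characters as the prefix has
  else
    printf '%s\n' "$content"                # no match: return the input unchanged
  fi
)

strip_prefix_ci 'foo_bar_baz' 'FOO'
```

Quoting "$prefix" makes it match literally, so glob characters stored in the variable are not treated as patterns.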
Your examples rely on external commands, which cost a context switch.
A better way (Bash 4):
VAR1="HELLoWORLD"
VAR2="hellOwOrld"
if [[ "${VAR1^^}" = "${VAR2^^}" ]]; then
echo MATCH
fi
link: Converting string from uppercase to lowercase in Bash
If you don't have Bash 4, I find the easiest way is to first convert your strings to lowercase using tr:
VAR1=HelloWorld
VAR2=helloworld
VAR1_LOWER=$(echo "$VAR1" | tr '[:upper:]' '[:lower:]')
VAR2_LOWER=$(echo "$VAR2" | tr '[:upper:]' '[:lower:]')
if [ "$VAR1_LOWER" = "$VAR2_LOWER" ]; then
echo "Match"
else
echo "Invalid"
fi
This also makes it easy to assign the result to a variable by changing the echo lines to OUTPUT="Match" and OUTPUT="Invalid".
Is there any way in bash to parse this filename :
$file = dos1-20120514104538.csv.3310686
into variables like $date = 2012-05-14 10:45:38 and $id = 3310686 ?
Thank you
All of this can be done with Parameter Expansion. Please read about it in the bash manpage.
$ file='dos1-20120514104538.csv.3310686'
$ date="${file#*-}" # Use Parameter Expansion to strip off the part before '-'
$ date="${date%%.*}" # Use PE again to strip after the first '.'
$ id="${file##*.}" # Use PE to get the id as the part after the last '.'
$ echo "$date"
20120514104538
$ echo "$id"
3310686
Combine PEs to put date back together in a new format. You could also parse the date with GNU date, but that would still require rearranging the date so it can be parsed. In its current format, this is how I would approach it:
$ date="${date:0:4}-${date:4:2}-${date:6:2} ${date:8:2}:${date:10:2}:${date:12:2}"
$ echo "$date"
2012-05-14 10:45:38
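Slicing by fixed offsets silently produces nonsense if the file name does not have the expected shape, so it may be worth checking the timestamp first. A minimal sketch, assuming Bash and a made-up function name parse_name that sets the variables date and id:

```bash
#!/usr/bin/env bash
# parse_name is a hypothetical helper; on success it sets the globals date and id.
parse_name() {
  local file=$1 stamp
  stamp=${file#*-}       # drop everything up to the first '-'
  stamp=${stamp%%.*}     # drop everything from the first '.'
  id=${file##*.}         # keep what follows the last '.'
  # Refuse to slice unless the timestamp is exactly 14 digits.
  [[ $stamp =~ ^[0-9]{14}$ ]] || return 1
  date="${stamp:0:4}-${stamp:4:2}-${stamp:6:2} ${stamp:8:2}:${stamp:10:2}:${stamp:12:2}"
}

if parse_name 'dos1-20120514104538.csv.3310686'; then
  echo "$date"
  echo "$id"
fi
```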
Using Bash's regular expression feature:
file='dos1-20120514104538.csv.3310686'
pattern='^[^-]+-([[:digit:]]{4})'
for i in {1..5}
do
pattern+='([[:digit:]]{2})'
done
pattern+='\.[^.]+\.([[:digit:]]+)$'
[[ $file =~ $pattern ]]
read -r _ Y m d H M S id <<< "${BASH_REMATCH[@]}"
date="$Y-$m-$d $H:$M:$S"
echo "$date"
echo "$id"
Extract id:
f='dos1-20120514104538.csv.3310686'
echo ${f/*./}
# 3310686
id=${f/*./}
Remove prefix, and extract core date numbers:
noprefix=${f/*-/}
echo ${noprefix/.csv*/}
# 20120514104538
ds=${noprefix/.csv*/}
Format the date like this (only partially done):
echo $ds | sed -r 's/(.{4})(.{2})(.{2})/\1.\2.\3/'
You can alternatively split the initial variable into an array,
echo $f
# dos1-20120514104538.csv.3310686
after exchanging - and . like this:
echo ${f//[-.]/ }
# dos1 20120514104538 csv 3310686
ar=(${f//[-.]/ })
echo ${ar[1]}
# 20120514104538
echo ${ar[3]}
# 3310686
The date transformation can be done via an array similarly:
dp=($(echo 20120514104538 | sed -r 's/(.{2})/ \1/g'))
echo ${dp[0]}${dp[1]}-${dp[2]}-${dp[3]} ${dp[4]}:${dp[5]}:${dp[6]}
It splits everything into groups of 2 characters:
echo ${dp[@]}
# 20 12 05 14 10 45 38
and merges 2012 together in the output.
You can tokenize the string first on - and then on the . character. There are various threads on Stack Overflow on how to do this:
How do I split a string on a delimiter in Bash?
Bash: How to tokenize a string variable?
To transform 20120514104538 into 2012-05-14 10:45:38 :
Since we know that the first 4 characters are the year, the next 2 are the month, and so on, you first need to break this token into substrings and then recombine them into a single string. You can start with the following answer:
https://stackoverflow.com/a/428580/365188
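As a rough illustration of the tokenize-then-recombine idea, the sketch below splits on both delimiters in one step by giving read a two-character IFS, then rebuilds the date with substring expansion. It assumes Bash (for read -a and the here-string); the array name parts is arbitrary.

```bash
#!/usr/bin/env bash
file='dos1-20120514104538.csv.3310686'

# Split on '-' and '.' at once; IFS is changed only for this read command.
IFS='-.' read -r -a parts <<< "$file"

stamp=${parts[1]}   # 20120514104538
id=${parts[3]}      # 3310686

# Recombine fixed-width pieces: 4-digit year, then 2 digits each for the rest.
date="${stamp:0:4}-${stamp:4:2}-${stamp:6:2} ${stamp:8:2}:${stamp:10:2}:${stamp:12:2}"

echo "$date"
echo "$id"
```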