BASH - Capture string between a FIXED and 2 possible variables - bash

I need to capture what lies between "aa=" and either a % or the end of the string.
string="aa=value%bb"
string2="bb=%aa=value"
The rule must work on both strings to get the value of "aa=".
I would like a BASH LANGUAGE solution if possible.

Use this:
result=$(echo "$string" | grep -o 'aa=[^%]*')
result=${result:3} # remove aa=
[^%]* matches any sequence of characters that doesn't contain %, so it stops at the first % or at the end of the string. ${result:3} expands to the substring starting at offset 3 (offsets are zero-based), which removes aa= from the beginning.
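The same extraction can also be sketched with parameter expansion alone, which avoids the grep subprocess. The function name extract_aa is hypothetical, and the sketch assumes that fields are separated by % as in the two sample strings.

```shell
# Hypothetical helper: print the value of the "aa=" field of a %-separated string.
# The body runs in a subshell, so the variable s does not leak to the caller.
extract_aa() (
    s=$1
    case $s in
        aa=*)   s=${s#aa=} ;;      # field is at the start of the string
        *%aa=*) s=${s#*%aa=} ;;    # field follows an earlier %-separated field
        *)      exit 1 ;;          # no aa= field present
    esac
    printf '%s\n' "${s%%%*}"       # operator %% with pattern %*: cut at the next %
)

extract_aa 'aa=value%bb'    # prints: value
extract_aa 'bb=%aa=value'   # prints: value
```

The function returns a non-zero status when no aa= field is found, so a caller can test for that case.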

How to convert a semantic version shell variable to a shifted integer?

Given a shell variable whose value is a semantic version, how can I create another shell variable whose value is (tuple 1 × 1000000) + (tuple 2 × 1000) + (tuple 3) ?
E.g.
$ FOO=1.2.3
$ BAR=#shell magic that, given ${FOO} returns `1002003`
# Shell-native string-manipulation? sed? ...?
I'm unclear about how POSIX-compliance vs. shell-specific syntax comes into play here, but I think a solution not bash-specific is preferred.
Update: To clarify: this isn't as straightforward as replacing "." with zero(es), which was my initial thought.
E.g. The desired output for 1.12.30 is 1012030, not 100120030, which is what a .-replacement approach might provide.
Bonus if the answer can be a one-liner variable-assignment.
A perl one-liner:
echo "$FOO" | perl -pe 's/\.(\d+)/sprintf "%03d", $1/eg'
How it works:
perl -pe reads the input line by line, runs the supplied program on each line, and prints the result
The program is a single substitution, s///
The search pattern is the regex \.(\d+), which matches a dot followed by one or more digits and captures those digits
The e modifier evaluates the right-hand side of s/// as an expression, so sprintf formats the captured digits as a zero-padded, three-digit integer
The g modifier replaces all matches of the regex in the input line
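The same zero-padding idea can be sketched with awk, which is specified by POSIX and fits the one-line assignment the question asks for. This is a swapped-in alternative to the perl command, not part of the original answer. It assumes exactly three numeric components, each below 1000.

```shell
FOO=1.12.30
# Split on dots; print the major number as-is and pad the other two to three digits.
BAR=$(printf '%s\n' "$FOO" | awk -F. '{ printf "%d%03d%03d\n", $1, $2, $3 }')
echo "$BAR"   # prints: 1012030
```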
Split on dots, then loop and multiply/add:
version="1.12.30"
# Split on dots instead of spaces from now on
IFS="."
# Loop over each number and accumulate
int=0
for n in $version
do
int=$((int*1000 + n))
done
echo "$version is $int"
Be aware that this treats 1.2 and 0.1.2 the same. If you want to always treat the first number as major/million, consider padding/truncating beforehand.
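One way to follow that padding advice is to address the three components by position instead of looping. This is a minimal sketch: the function name version_to_int is hypothetical, and it assumes each component is a plain decimal integer below 1000 without leading zeros (a leading zero would be read as octal in shell arithmetic).

```shell
# Hypothetical helper: always weight the first component as millions.
# The body runs in a subshell, so the IFS and set -f changes stay local.
version_to_int() (
    IFS=.          # split on dots
    set -f         # no pathname expansion during the unquoted split
    set -- $1      # $1, $2, $3 now hold major, minor, patch
    echo "$(( ${1:-0} * 1000000 + ${2:-0} * 1000 + ${3:-0} ))"
)

version_to_int 1.12.30   # prints: 1012030
version_to_int 1.2       # prints: 1002000
```

Missing components default to 0, so 1.2 is treated as 1.2.0 rather than 0.1.2.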
This works only when every component after the first is a single digit (1.12.30 would become 100120030):
echo "$FOO" | sed 's/\./00/g'
How about this?
$ ver=1.12.30
$ foo=$(bar=($(echo $ver|sed 's/\./ /g')); expr ${bar[0]} \* 1000000 + ${bar[1]} \* 1000 + ${bar[2]})
$ echo $foo
1012030

Ruby: How to insert three backslashes into a string?

I want to use backticks in Ruby for a program call.
The parameter is a String variable containing one or more backticks, e.g.
"&E?##A`?". The following command yields a new label as its return value:
echo "&E?##A\`?" | nauty-labelg 2>/dev/null
From a ruby program I can call it as follows and get the correct result:
new_label = `echo "&E?##A\\\`?" | nauty-labelg 2>/dev/null`
I want to achieve the same using a variable for the label.
So I have to insert three backslashes into my variable label = "&E?##A`?" in order to escape the backtick. The following seems to work, though it is not very elegant:
escaped_label = label.gsub(/`/, '\\\`').gsub(/`/, '\\\`').gsub(/`/, '\\\`')
But the new variable cannot be used in the program call:
new_label = `echo "#{escaped_label}" | nauty-labelg 2>/dev/null`
In this case I do not get an answer from nauty-labelg.
Quoting the question: "So I have to insert three backslashes into my variable label = "&E?##A`?" in order to escape the backtick."
No, you only need to add one backslash to the output, to escape the backtick, which is special in bash. The other two are needed only in the Ruby source, because otherwise it isn't valid Ruby code.
new_label = `echo "&E?##A\\\`?" | nauty-labelg 2>/dev/null`
The first backslash will escape the second one (outputting one single backslash). The third backslash escapes the ` character (outputting one single `).
You should only add backslashes before characters that have a special meaning inside a double-quoted bash string. The code below escapes $, `, ", \ and newline (\n):
def escape_bash_string(string)
string.gsub(/([$`"\\\n])/, '\\\\\1')
end
For label = "&E?##A`?" only the ` should be escaped.
escaped_string = escape_bash_string("&E?##A\`?")
puts escaped_string
# &E?##A\`?
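The escaping rule can also be checked from the shell side. The sketch below is a rough shell counterpart of the Ruby helper; the function name escape_dq is hypothetical, and unlike the Ruby version it leaves newlines untouched.

```shell
# Hypothetical helper: backslash-escape the characters that stay special
# inside a double-quoted shell string: $, `, " and \.
escape_dq() {
    printf '%s\n' "$1" | sed 's/[$`"\\]/\\&/g'
}

escape_dq '&E?##A`?'    # prints: &E?##A\`?
escape_dq 'cost: $5'    # prints: cost: \$5
```

Only one backslash is added per special character, which matches the point made above.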

Build a variable made with 2 substrings of another variable in bash

Here is a script I use:
for dir in $(find . -type d -name "single_copy_busco_sequences"); do
  sppname=$(dirname $(dirname $(dirname $dir)) | sed 's#./##g')
  for file in ${dir}/*.faa; do
    name=$(basename $file)
    cp $file /Users/admin/Documents/busco_aa/${sppname}_${name}
    sed -i '' 's#>#>'${sppname}'|#g' /Users/admin/Documents/busco_aa/${sppname}_${name}
    cut -f 1 -d ":" /Users/admin/Documents/busco_aa/${sppname}_${name} > /Users/admin/Documents/busco_aa/${sppname}_${name}.1
  done
done
The sppname variable is something like Gender_species.
Do you know how I could add a line to my script to create a new variable called abbrev which transforms Gender_species into Genspe, that is, the first 3 letters concatenated with the first 3 letters after the _?
Examples:
Homo_sapiens gives Homsap
Canis_lupus gives Canlup
etc.
Thanks for your help :)
You can achieve this using a regular expression with sed:
echo "Homo_sapiens" | sed -e s'/^\(...\).*_\(...\).*/\1\2/'
Homsap
The pattern reads: start of line, 3 characters (kept in \1), anything, _, 3 characters (kept in \2), anything.
Replace echo "Homo_sapiens" with echo "$sppname" in your script.
PS: this will fail if either word has fewer than 3 characters.
You can do it all with bash built-in parameter expansions. Specifically, string indexes and substring removal.
$ a=Homo_sapiens; prefix=${a:0:3}; a=${a#*_}; postfix=${a:0:3}; echo $prefix$postfix
Homsap
$ a=Canis_lupus; prefix=${a:0:3}; a=${a#*_}; postfix=${a:0:3}; echo $prefix$postfix
Canlup
Using bash built-ins is generally more efficient than spawning separate subshells to invoke utilities that accomplish the same thing.
Explanation
The substring expansion form (not POSIX; available in bash) lets you extract characters from a string by position:
${parameter:offset:length} ## offsets are zero-based, so ${a:0:2} is the first 2 chars
where parameter is simply the name of the variable holding the string.
(You can index from the end of a string by using a negative offset preceded by a space or enclosed in parentheses, e.g. a=12345; echo ${a: -3:2} outputs "34".)
prefix=${a:0:3} ## save the first 3 characters in prefix
a=${a#*_} ## remove the front of the string through '_' (see below)
postfix=${a:0:3} ## save the first 3 characters after '_'
The substring removal forms (POSIX) are:
${parameter#word} remove the shortest match of word from the left of parameter
${parameter##word} remove the longest match of word from the left of parameter
and
${parameter%word} remove the shortest match of word from the right of parameter
${parameter%%word} remove the longest match of word from the right of parameter
(word is a glob pattern, so it may contain wildcards such as *)
a=${a#*_} ## trim from left up to (and including) the first '_'
See bash(1) - Linux manual page for full details.
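Since only the substring removal forms are POSIX, the abbreviation can also be sketched without ${parameter:offset:length}. The function name abbrev is hypothetical, and the sketch assumes each part of the name has at least three characters (a shorter part comes out empty).

```shell
# Hypothetical helper: first 3 characters before '_' plus first 3 after it.
# The body runs in a subshell, so the helper variables stay local.
abbrev() (
    genus=${1%%_*}                    # text before the first '_'
    species=${1#*_}                   # text after the first '_'
    g3=${genus%"${genus#???}"}        # strip everything after the first 3 chars
    s3=${species%"${species#???}"}
    printf '%s%s\n' "$g3" "$s3"
)

abbrev Homo_sapiens   # prints: Homsap
abbrev Canis_lupus    # prints: Canlup
```

Inside the loop from the question this could be used as abbrev=$(abbrev "$sppname"), at the cost of one subshell per call.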

adding a colon after every two letters in an alphanumeric string in shell

I have an alphanumeric string, 10006cc2190ab011, and I am trying to add a colon after every two characters.
This is the string: 10006cc2190ab011
I want it to be: 10:00:6c:c2:19:0a:b0:11
Thanks in advance.
A sed solution:
$ echo 10006cc2190ab011 | sed 's/../&:/g; s/:$//'
10:00:6c:c2:19:0a:b0:11
This replaces each non-overlapping pair of characters with the same pair followed by :, then removes the trailing : (which is present only if the input had even length).
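The even and odd cases can be seen side by side by wrapping the sed command in a small function. The name add_colons is hypothetical and used here only for illustration.

```shell
# Hypothetical wrapper around the sed command from the answer.
add_colons() {
    printf '%s\n' "$1" | sed 's/../&:/g; s/:$//'
}

add_colons 10006cc2190ab011   # prints: 10:00:6c:c2:19:0a:b0:11
add_colons abcde              # prints: ab:cd:e
```

With odd-length input the last character is not part of a pair, so no trailing colon is produced and the second substitution does nothing.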
In ksh, the same replacement can be done without an external command, using only shell internals:
str=10006cc2190ab011; str="${str//??/${.sh.match}:}"; echo ${str%:}
As in the sed answer, every (//) pair of characters (??) in $str is replaced (/) with the matched text, which ksh keeps in the variable ${.sh.match}, followed by a :. The final expansion ${str%:} prints $str without the trailing :.
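For shells without ${.sh.match}, a loop over POSIX parameter expansions gives the same result without an external command. This is a sketch of an alternative technique, not the ksh method above, and the function name colonize is hypothetical.

```shell
# Hypothetical helper: insert ':' after every two characters, using only
# POSIX parameter expansion. The subshell body keeps the variables local.
colonize() (
    str=$1 out=
    while [ "${#str}" -gt 2 ]; do
        rest=${str#??}               # everything after the first two characters
        out=$out${str%"$rest"}:      # append the first two characters and a colon
        str=$rest
    done
    printf '%s\n' "$out$str"         # the last one or two characters get no colon
)

colonize 10006cc2190ab011   # prints: 10:00:6c:c2:19:0a:b0:11
```

Because the loop stops once two or fewer characters remain, there is no trailing colon to strip afterwards.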

How do I prevent whitespace from appearing in these bash variables?

I'm reading in values from an .ini file, and sometimes may get trailing or leading whitespace.
How do I amend this first line to prevent that?
db=$(sed -n 's/.*DB_USERNAME *= *\([^ ]*.*\)/\1/p' < config.ini);
echo -"$db"-
Result:
-myinivar -
I need:
-myinivar-
Use parameter expansion.
echo "=${db% }="
You don't need the .* inside the capturing group (or the semicolon at the end of line):
db="$(sed -n 's/.*DB_USERNAME *= *\([^ ]*\).*/\1/p' < config.ini)"
To elaborate:
.* matches anything at all
DB_USERNAME matches that literal string
* (a single space followed by an asterisk) matches any number of spaces
= matches that literal string
* (a single space followed by an asterisk) matches any number of spaces
\( starts the capturing group that is used for \1 later
[^ ] matches anything which is not a space character
* repeats that zero or more times
\) ends the capturing group
.* matches anything at all
Therefore, the result will be all the characters after DB_USERNAME = and any number of spaces, up to the next space or end of line, whichever comes first.
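The behaviour described above can be checked without a real file by piping a sample line into the same sed command. The sample line is an assumption standing in for the contents of config.ini.

```shell
# Sample line standing in for config.ini (an assumption for this demo).
line='DB_USERNAME =   myinivar   '
db=$(printf '%s\n' "$line" | sed -n 's/.*DB_USERNAME *= *\([^ ]*\).*/\1/p')
printf '%s\n' "-$db-"   # prints: -myinivar-
```

Because the capturing group stops at the first space, a value that itself contains spaces would be cut short.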
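You can use echo to trim whitespace. This relies on word splitting of the unquoted $db, so it also collapses internal runs of whitespace and expands any glob characters in the value: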
db='myinivar '
echo -"$(echo $db)"-
-myinivar-
Use crudini which handles these ini file edge cases transparently
db=$(crudini --get config.ini '' DB_USERNAME)
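To remove several trailing spaces, use %%, which removes the longest match of the pattern from the end of the string. The pattern " *" is a glob (a space followed by anything), so everything from the first space onward is removed, and a value with a leading space comes out empty:
echo "=${db%% *}="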
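To strip both leading and trailing whitespace while keeping any spaces inside the value, two nested parameter expansions can be combined. This is a minimal sketch; the function name trim is hypothetical.

```shell
# Hypothetical helper: remove leading and trailing whitespace only.
# The body runs in a subshell, so the variable s stays local.
trim() (
    s=$1
    s=${s#"${s%%[![:space:]]*}"}   # inner expansion yields the leading whitespace
    s=${s%"${s##*[![:space:]]}"}   # inner expansion yields the trailing whitespace
    printf '%s\n' "$s"
)

db=$(trim '  myinivar  ')
printf '%s\n' "=$db="   # prints: =myinivar=
```

Each inner expansion isolates the whitespace run at one end, and the outer expansion then removes exactly that run.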
