My non-working code sample (erroneous line -> empty=$empty\n$url):
empty=""
IFS=$'\n'
for line in $s; do
if [[ $line =~ $regex ]]; then
url="${BASH_REMATCH[2]}${BASH_REMATCH[1]}"
echo $url
empty=$empty\n$url
else
echo "$s does not match"
fi
done
echo $empty|sort -f -t/ -k 4
I try to rebuild the modified lines splitted by the for cycle.
empty="$empty"$'\n'"$url"
$'\n' is a literal newline in bash (double-quoting the variable references is not strictly necessary here, but helps readability; alternative: empty=${empty}$'\n'${url}).
Alternative solution with printf:
printf -v empty '%s\n%s' "$empty" "$url"
String literals can have embedded newlines:
entry="$entry
$url"
or
entry+="
$url"
Related
Here is my code
vmname="$1"
EXCEPTLIST="desktop-01|desktop-02|desktop-03|desktop-04"
if [[ $vmname != #(${EXCEPTLIST}) ]]; then
echo "${vmname}"
else
echo "Its in the exceptlist"
fi
The above code works perfectly but my question is , the EXCEPTLIST can be a long line, say 100 server names. In that case its hard to put all that names in one line. In that situation is there any way to make the variable EXCEPTLIST to be a multiline variable ? something like as follows:
EXCEPTLIST="desktop-01|desktop-02|desktop-03| \n
desktop-04|desktop-05|desktop-06| \n
desktop-07|desktop-08"
I am not sure but was thinking of possibilities.
Apparently I would like to know the terminology of using #(${})- Is this called variable expansion or what ? Does anyone know the documentation/explain to me about how this works in bash. ?
One can declare an array if the data/string is long/large. Use IFS and printf for the format string, something like:
#!/usr/bin/env bash
exceptlist=(
desktop-01
desktop-02
desktop-03
desktop-04
desktop-05
desktop-06
)
pattern=$(IFS='|'; printf '#(%s)' "${exceptlist[*]}")
[[ "$vmname" != $pattern ]] && echo good
In that situation is there any way to make the variable EXCEPTLIST to be a multiline variable ?
With your given input/data an array is also a best option, something like:
exceptlist=(
'desktop-01|desktop-02|desktop-03'
'desktop-04|desktop-05|desktop-06'
'desktop-07|desktop-08'
)
Check what is the value of $pattern variable one way is:
declare -p pattern
Output:
declare -- pattern="#(desktop-01|desktop-02|desktop-03|desktop-04|desktop-05|desktop-06)"
Need to test/check if $vmname is an empty string too, since it will always be true.
On a side note, don't use all upper case variables for purely internal purposes.
The $(...) is called Command Substitution.
See LESS=+'/\ *Command Substitution' man bash
In addition to what was mentioned in the comments about pattern matching
See LESS=+/'(pattern-list)' man bash
See LESS=+/' *\[\[ expression' man bash
s there any way to make the variable EXCEPTLIST to be a multiline variable ?
I see no reason to use matching. Use a bash array and just compare.
exceptlist=(
desktop-01
desktop-02
desktop-03
desktop-04
desktop-05
desktop-06
)
is_in_list() {
local i
for i in "${#:2}"; do
if [[ "$1" = "$i" ]]; then
return 0
fi
done
return 1
}
if is_in_list "$vmname" "${EXCEPTLIST[#]}"; then
echo "is in exception list ${vmname}"
fi
#(${})- Is this called variable expansion or what ? Does anyone know the documentation/explain to me about how this works in bash. ?
${var} is a variable expansion.
#(...) are just characters # ( ).
From man bash in Compund commands:
[[ expression ]]
When the == and != operators are used, the string to the right of the operator is considered a pattern and matched according to the rules
described below under Pattern Matching, as if the extglob shell option were enabled. ...
From Pattern Matching in man bash:
#(pattern-list)
Matches one of the given patterns
[[ command receives the #(a|b|c) string and then matches the arguments.
There is absolutely no need to use Bash specific regex or arrays and loop for a match, if using grep for raw string on word boundary.
The exception list can be multi-line, it will work as well:
#!/usr/bin/sh
exceptlist='
desktop-01|desktop-02|desktop-03|
deskop-04|desktop-05|desktop-06|
desktop-07|deskop-08'
if printf %s "$exceptlist" | grep -qwF "$1"; then
printf '%s is in the exceptlist\n' "$1"
fi
I wouldn't bother with multiple lines of text. This is would be just fine:
EXCEPTLIST='desktop-01|desktop-02|desktop-03|'
EXCEPTLIST+='desktop-04|desktop-05|desktop-06|'
EXCEPTLIST+='desktop-07|desktop-08'
The #(...) construct is called extended globbing pattern and what it does is an extension of what you probably already know -- wildcards:
VAR='foobar'
if [[ "$VAR" == fo?b* ]]; then
echo "Yes!"
else
echo "No!"
fi
A quick walkthrough on extended globbing examples: https://www.linuxjournal.com/content/bash-extended-globbing
#!/bin/bash
set +o posix
shopt -s extglob
vmname=$1
EXCEPTLIST=(
desktop-01 desktop-02 desktop-03
...
)
if IFS='|' eval '[[ ${vmname} == #(${EXCEPTLIST[*]}) ]]'; then
...
Here's one way to load a multiline string into a variable:
fn() {
cat <<EOF
desktop-01|desktop-02|desktop-03|
desktop-04|desktop-05|desktop-06|
desktop-07|desktop-08
EOF
}
exceptlist="$(fn)"
echo $exceptlist
As to solving your specific problem, I can think of a variety of approaches.
Solution 1, since all the desktop has the same desktop-0 prefix and only differ in the last letter, we can make use of {,} or {..} expansion as follows:
vmname="$1"
found=0
for d in desktop-{01..08}
do
if [[ "$vmname" == $d ]]; then
echo "It's in the exceptlist"
found=1
break
fi
done
if (( !found )); then
echo "Not found"
fi
Solution 2, sometimes, it is good to provide a list in a maintainable clear text list. We can use a while loop and iterate through the list
vmname="$1"
found=0
while IFS= read -r d
do
if [[ "$vmname" == $d ]]; then
echo "It's in the exceptlist"
found=1
break
fi
done <<EOF
desktop-01
desktop-02
desktop-03
desktop-04
desktop-05
desktop-06
desktop-07
desktop-08
EOF
if (( !found )); then
echo "Not found"
fi
Solution 3, we can desktop the servers using regular expressions:
vmname="$1"
if [[ "$vmname" =~ ^desktop-0[1-8]$ ]]; then
echo "It's in the exceptlist"
else
echo "Not found"
fi
Solution 4, we populate an array, then iterate through an array:
vmname="$1"
exceptlist=()
exceptlist+=(desktop-01 desktop-02 desktop-03 deskop-04)
exceptlist+=(desktop-05 desktop-06 desktop-07 deskop-08)
found=0
for d in ${exceptlist[#]}
do
if [[ "$vmname" == "$d" ]]; then
echo "It's in the exceptlist"
found=1
break;
fi
done
if (( !found )); then
echo "Not found"
fi
I'am trying to get the first character of each string using regex and BASH_REMATCH in shell script.
My input text file contain :
config_text = STACK OVER FLOW
The strings STACK OVER FLOW must be uppercase like that.
My output should be something like this :
SOF
My code for now is :
var = config_text
values=$(grep $var test_file.txt | tr -s ' ' '\n' | cut -c 1)
if [[ $values =~ [=(.*)]]; then
echo $values
fi
As you can see I'am using tr and cut but I'am looking to replace them with only BASH_REMATCH because these two commands have been reported in many links as not functional on MacOs.
I tried something like this :
var = config_text
values=$(grep $var test_file.txt)
if [[ $values =~ [=(.*)(\b[a-zA-Z])]]; then
echo $values
fi
VALUES as I explained should be :
S O F
But it seems \b does not work on shell script.
Anyone have an idea how to get my desired output with BASH_REMATCH ONLY.
Thanks in advance for any help.
A generic BASH_REMATCH solution handling any number of words and any separator.
local input="STACK OVER FLOW" pattern='([[:upper:]]+)([^[:upper:]]*)' result=""
while [[ $input =~ $pattern ]]; do
result+="${BASH_REMATCH[1]::1}${BASH_REMATCH[2]}"
input="${input:${#BASH_REMATCH[0]}}"
done
echo "$result"
# Output: "S O F"
Bash's regexes are kind of cumbersome if you don't know how many words there are in the input string. How's this instead?
config_text="STACK OVER FLOW"
sed 's/\([^[:space:]]\)[^[:space:]]*/\1/g' <<<"$config_text"
First Put a valid shebang and paste your script at https://shellcheck.net for validation/recommendation.
With the assumption that the line starts with config and ends with FLOW e.g.
config_text = STACK OVER FLOW
Now the script.
#!/usr/bin/env bash
values="config_text = STACK OVER FLOW"
regexp="config_text = ([[:upper:]]{1})[^ ]+ ([[:upper:]]{1})[^ ]+ ([[:upper:]]{1}).+$"
while IFS= read -r line; do
[[ "$line" = "$values" && "$values" =~ $regexp ]] &&
printf '%s %s %s\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}" "${BASH_REMATCH[3]}"
done < test_file.txt
If there is Only one line or the target string/pattern is at the first line of the test_file.txt, the while loop is not needed.
#!/usr/bin/env bash
values="config_text = STACK OVER FLOW"
regexp="config_text = ([[:upper:]]{1})[^ ]+ ([[:upper:]]{1})[^ ]+ ([[:upper:]]{1}).+$"
IFS= read -r line < test_file.txt
[[ "$line" = "$values" && "$values" =~ $regexp ]] &&
printf '%s %s %s\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}" "${BASH_REMATCH[3]}"
Make sure you have and running/using Bashv4+ since MacOS, defaults to Bashv3
See How can I read a file (data stream, variable) line-by-line (and/or field-by-field)?
Another option rather than bash regex would be to utilize bash parameter expansion substring ${parameter:offset:length} to extract the desired characters:
$ read -ra arr <text.file ; printf "%s%s%s\n" "${arr[2]:0:1}" "${arr[3]:0:1}" "${arr[4]:0:1}"
SOF
I am trying to capture the longest match of a repeating pattern
do_run() {
local regex='.*((abc)+).*'
local str='_abcabcabc123_'
echo "regex=${regex}"$'\n'
echo "str=${str}"$'\n'
if [[ "${str}" =~ ${regex} ]]
then
for i in ${!BASH_REMATCH[#]}
do
echo "$i=${BASH_REMATCH[i]}"
done
else
echo "no match"
fi
}
I get the following output :
regex=.*((abc)+).*
str=_abcabcabc_
0=_abcabcabc123_
1=abc
2=abc
I am trying to get something like :
regex=.*((abc)+).*
str=_abcabcabc123_
0=_abcabcabc123_
x=abcabcabc
(Update : x is just here to indicate that the index of the matching group does not matter but I need to know what number to use to retrieve the matching group ...)
Update:
After reading comment, the following regex will work : ((abc)+)
However, I also need to capture what precedes and what follows ((abc)+).
I had not mentionned it earlier because I thought the same solution would be applied.
So the new code would be :
do_run() {
local regex='(.*)((abc)+)(.*)'
local str='_abcabcabc123_'
echo "regex=${regex}"$'\n'
echo "str=${str}"$'\n'
if [[ "${str}" =~ ${regex} ]]
then
for i in ${!BASH_REMATCH[#]}
do
echo "$i=${BASH_REMATCH[i]}"
done
else
echo "no match"
fi
}
I get then the following output :
regex=(.*)((abc)+)(.*)
str=_abcabcabc123_
0=_abcabcabc123_
1=_abcabc
2=abc
3=abc
4=123_
I want to be able to retrieve abcabcabc from a matching group but also what precedes it and what follows it
As a workaround you can do like this:
[STEP 101] $ cat foo.sh
v=_abcabcabc123_
if [[ $v =~ (abc)+ ]]; then
middle=${BASH_REMATCH[0]}
[[ $v =~ (.*)"$middle" ]]
before=${BASH_REMATCH[1]}
[[ $v =~ "$middle"(.*) ]]
after=${BASH_REMATCH[1]}
echo "before: $before"
echo "middle: $middle"
echo "after : $after"
fi
[STEP 102] $ bash foo.sh
before: _
middle: abcabcabc
after : 123_
[STEP 103] $
I also need to capture what precedes and what follows ((abc)+).
For that, typically you'll need a negative lookahead with perl regex, something along (?<!abc)((abs)+)(.*).
I am bad at perl regex, with perl-enabled grep I was able to this:
$ grep -oxP '(.*)(?<!abc)((abc)+)\K(.*)' <<<'_abcabcabc123_'
123_
$ grep -oP '((abc)+)' <<<'_abcabcabc123_'
abcabcabc
$ rev <<<'_abcabcabc123_' | grep -oP '(.*)(?<!cba)((cba)+)\K(.*)' | rev
_
Bash has no lookarounds and no perl regex. Consider using python or perl.
But you may use sed by splitting the part on the regex and then reading lines, which may be simpler:
$ readarray -t lines < <(<<<'_abcabcabc123_' sed -E 's/((abc)+)/\n&\n/'); declare -p lines
declare -a lines=([0]="_" [1]="abcabcabc" [2]="123_")
Another idea: you may use bash expansion to replace the abc parts by something unique, then split it on that separator:
$ IFS=' ' read -r before post < <(printf "%s\n" "${str//abc/ }") ; declare -p before post
declare -- before="_"
declare -- post="123_"
# or
$ IFS='#' read -r before post < <(<<<"${str//abc/#}" tr -s '#') ; declare -p before post
declare -- before="_"
declare -- post="123_"
For your given input this regex would work:
re='^([^a]|a[^b]*|ab[^c]*)((abc)+)(.*)'
str='_abcabcabc123_'
[[ $str =~ $re ]] && declare -p BASH_REMATCH
Output:
declare -ar BASH_REMATCH=([0]="_abcabcabc123_" [1]="_" [2]="abcabcabc" [3]="abc" [4]="123_")
So you can use:
"${BASH_REMATCH[1]}" # string before
"${BASH_REMATCH[2]}" # string containing all "abc"s
"${BASH_REMATCH[4]}" # string after
RegEx Demo
I need to to analyze (with grep) and print (with some formatting) the content of an
app's log.
This log contains text data in variable length lines. What I need is, after some grepping, loop each line of this output and print it with a maximum fixed length of 50 characters. If a line is longer than 50 chars, it should print a newline and then continue with the rest in the following line and so on until the line is completed.
I tried to use printf to do this, but it's not working and I don't know why. It just outputs the lines in same fashion of echo, without any consideration about printf formatting, though the \t character (tab) works.
function printContext
{
str="$1"
log="$2"
tmp="/tmp/deluge/$$"
rm -f $tmp
echo ""
echo -e "\tLog entries for $str :"
ln=$(grep -F "$str" "$log" &> "$tmp" ; cat "$tmp" | wc -l)
if [ $ln -gt 0 ];
then
while read line
do
printf "\t%50s\n" "$line"
done < $tmp
fi
}
What's wrong? I Know that I can make a substring routine to accomplish this task, but printf should be handy for stuff like this.
Instead of:
printf "\t%50s\n" "$line"
use
printf "\t%.50s\n" "$line"
to truncate your line to 50 characters only.
I'm not sure about printf but seeing as how perl is installed everywhere, how about a simple 1 liner?
echo $ln | perl -ne ' while( m/.{1,50}/g ){ print "$&\n" } '
Here's a clunky bash-only way to break the string into 50-character chunks
i=0
chars=50
while [[ -n "${y:$((chars*i)):$chars}" ]]; do
printf "\t%s\n" "${y:$((chars*i)):$chars}"
((i++))
done
Using bash, I have a string:
str='a $s'
echo ${str/\$/\$a}
# a $as
str='a $1'
echo ${str/\$/\$a}
# a $a1
How modify pattern that replacement performs only if word starts with a letter?
I want
str='a $s'
echo ${str/??/??}
# a $as
str='a $1'
echo ${str/??/??}
# a $1
This would have been trivial if the regex support in Bash allowed lookahead. The solution would then simply be ${str/\$(?=[a-zA-Z])/\$a}. Unfortunately, that is not the case.
You can still do it if you don't might using sed.
str='a $s'
echo $str | sed 's/\$\([a-zA-Z]\)/\$a\1/g' # gives you a $as
str='a $1'
echo $str | sed 's/\$\([a-zA-Z]\)/\$a\1/g' # gives you a $1
\$\([a-zA-Z]\) matches $ followed by an alphabet and store the alphabet. If a match is found, it is replaced with $a followed by the stored alphabet (\1). For more details, see this tutorial.
The trailing g means it will match all occurrences in the string. Remove that if you only want to only replace the first occurrence.
You can use an IF to check if any any letter follows the dollar sign and in case do the replacement:
if [[ "$str" =~ \$[[:alpha:]] ]]; then
echo ${str/$/\$a};
else
echo $str;
fi
Demonstration:
for str in 'a $1' 'a $s'
do
[[ $str =~ \$([[:alpha:]]) ]]
echo ${str/\$/\$${BASH_REMATCH[1]:+a}}
done
Result:
a $1
a $as