Shell Script : How to add in the end of lines which contains special string - bash

I have a variable which contains:
list_data="toto
titi
tata
tete"
My array can contain values like this example, or more than 3 entries:
arraytest[0]="Hello|Test|env|tata|POLO|GER|GO|"
arraytest[1]="GOODNIGHT|Test2|env2|tete|GOLF|ITA|NOTGO|"
arraytest[2]="AFTER|Test3|env3|JAJA|CIT|FRA|GO|"
and my string is
string="INSERT"
What I want to do is append this string to the end of every line that contains one of the values in list_data.
For example :
arraytest[0]="Hello|Test|env|tata|POLO|GER|GO|INSERT"
arraytest[1]="GOODNIGHT|Test2|env2|tete|GOLF|ITA|NOTGO|INSERT"
arraytest[2]="AFTER|Test3|env3|JAJA|CIT|FRA|GO|"
I tried this:
echo ${arraytest[*]} | sed -i 's/$/ $string'
Please Help.
Thank you.

bash solution:
string="INSERT"
pat="(${list_data//$'\n'/|})"
for i in "${!arraytest[@]}"; do
    [[ "${arraytest[$i]}" =~ $pat ]] && arraytest[$i]+=$string
done
The final arraytest contents:
echo ${arraytest[*]}
Hello|Test|env|tata|POLO|GER|GO|INSERT GOODNIGHT|Test2|env2|tete|GOLF|ITA|NOTGO|INSERT AFTER|Test3|env3|JAJA|CIT|FRA|GO|
Details:
pat="(${list_data//$'\n'/|})" - constructing regex pattern(pat contains (toto|titi|tata|tete))
${!arraytest[@]} - the general form is ${!name[@]} (or ${!name[*]}): if name is an array variable, it expands to the list of array indices (keys) assigned in name.
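One caveat with the pattern above is that it matches the values anywhere in the line, so tata would also match a field such as tatami. A minimal sketch that anchors the alternation to whole |-separated fields follows; the fourth array element and the nl variable are additions for illustration and are not part of the original data.

```shell
#!/usr/bin/env bash
# Sample data from the question, plus one extra line (index 3) that only
# contains "tata" as part of a longer field. That line is an assumption
# added to show the difference.
list_data=$'toto\ntiti\ntata\ntete'
arraytest=(
  "Hello|Test|env|tata|POLO|GER|GO|"
  "GOODNIGHT|Test2|env2|tete|GOLF|ITA|NOTGO|"
  "AFTER|Test3|env3|JAJA|CIT|FRA|GO|"
  "X|Test4|env4|tatami|CIT|FRA|GO|"
)
string="INSERT"

nl=$'\n'
# (^|\|) and (\||$) require the matched value to be a complete field.
pat="(^|\|)(${list_data//$nl/|})(\||\$)"

for i in "${!arraytest[@]}"; do
  if [[ ${arraytest[$i]} =~ $pat ]]; then
    arraytest[$i]+=$string
  fi
done

printf '%s\n' "${arraytest[@]}"
```

Only the first two elements should end in INSERT; the tatami line is left alone because tata is not a whole field there.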

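Loop over the individual elements of the array (rather than ${arraytest[*]}). Note that this appends the string to every element, not only to the matching ones, and prints the result without modifying the array:
for ((i=0 ; i<${#arraytest[*]} ; i++))
do
    echo "${arraytest[$i]}" | sed "s/$/${string}/"
done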

Related

Extract value for a key in a key/pair string

I have key value pairs in a string like this:
key1 = "value1"
key2 = "value2"
key3 = "value3"
In a bash script, I need to extract the value of one of the keys; for key2, for example, I should get value2 without the quotes.
My bash script needs to work in both Redhat and Ubuntu Linux hosts.
What would be the easiest and most reliable way of doing this?
I tried something like this simplified script:
pattern='key2\s*=\s*\"(.*?)\".*$'
if [[ "$content" =~ $pattern ]]
then
key2="${BASH_REMATCH[1]}"
echo "key2: $key2"
else
echo 'not found'
fi
But it does not work consistently.
Any better/easier/more reliable way of doing this?
To separate the key and value from your $content variable, you can use:
[[ $content =~ (^[^ ]+)[[:blank:]]*=[[:blank:]]*[[:punct:]](.*)[[:punct:]]$ ]]
That will properly populate the BASH_REMATCH array with both values where your key is in BASH_REMATCH[1] and the value in BASH_REMATCH[2].
Explanation
In bash, [[ ... ]] treats the right-hand side of =~ as a POSIX extended regular expression and matches it according to man 3 regex. See man 1 bash under the section heading for [[ expression ]] (4th paragraph). Sub-expressions in parentheses (..) are saved in the array variable BASH_REMATCH: BASH_REMATCH[0] contains the portion of the string (your $content) matched by the entire regex, and each remaining element contains the text matched by one parenthesized sub-expression, in the order the parentheses appear in the regex.
The Regular Expression (^[^ ]+)[[:blank:]]*=[[:blank:]]*[[:punct:]](.*)[[:punct:]]$ is explained as:
(^[^ ]+) - '^' anchors at the beginning of the line, and [^ ]+ matches one or more characters that are not a space. Since this sub-expression is enclosed in (..) it will be saved as BASH_REMATCH[1], followed by;
[[:blank:]]* - zero or more blank characters (spaces or tabs), followed by;
= - an equal sign, followed by;
[[:blank:]]* - zero or more blank characters (spaces or tabs), followed by;
[[:punct:]] - a punctuation character (matching the '"', which avoids caveats associated with using quotes within the regex), followed by the sub-expression;
(.*) - zero or more characters (the rest of the characters), and since it is a sub-expression in (..) the characters will be stored in BASH_REMATCH[2], followed by;
[[:punct:]] - a punctuation character (matching the '"' ... ditto), at the;
$ - end of line anchor.
So if you match input lines whose key and value are separated by an = sign, the regex will separate the key and value into the array BASH_REMATCH as you wanted.
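As a quick check of the explanation above, here is a minimal sketch that applies the regex to one sample line. The sample content and the variable names re, key and value are chosen here for illustration.

```shell
#!/usr/bin/env bash
content='key2 = "value2"'

# Keeping the regex in a variable avoids quoting problems inside [[ ]].
re='(^[^ ]+)[[:blank:]]*=[[:blank:]]*[[:punct:]](.*)[[:punct:]]$'

if [[ $content =~ $re ]]; then
  key=${BASH_REMATCH[1]}     # first parenthesized group
  value=${BASH_REMATCH[2]}   # second parenthesized group
  printf 'key=%s value=%s\n' "$key" "$value"
else
  echo 'no match'
fi
```

For this input the sketch should report key2 as the key and value2 as the value, with the surrounding quotes consumed by the two [[:punct:]] atoms.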
Bash's =~ operator uses POSIX extended regular expressions (ERE), which have neither \s nor the non-greedy .*?, so you cannot rely on either.
As an alternative, please try:
found=0
while IFS= read -r content; do
    # pattern='key2\s*=\s*\"(.*)\".*$'
    pattern='key2[[:blank:]]*=[[:blank:]]*"([^"]*)"'
    if [[ $content =~ $pattern ]]
    then
        key2="${BASH_REMATCH[1]}"
        echo "key2: $key2"
        (( found++ ))
    fi
done < input-file.txt
if (( found == 0 )); then
    echo "not found"
fi
When you start talking about key-value pairs, it is best to use an associative array:
declare -A map
Now looking at your lines, they look like key = "value" where we assume that:
value is always encapsulated by double quotes, but also could contain a quote
an unknown number of white spaces is before and/or after the equal sign.
So assuming we have a variable line which contains key = "value", the following operations will extract the key and the value:
key="${line%%=*}"; key="${key// /}"
value="${line#*=}"; value="${value#*\"}"; value="${value%\"*}"
This allows us now to have something like:
declare -A map
while read -r line; do
    key="${line%%=*}"; key="${key// /}"
    value="${line#*=}"; value="${value#*\"}"; value="${value%\"*}"
    map["$key"]="$value"
done <inputfile
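A self-contained sketch of the associative-array approach, assuming bash 4 or later. The here-document stands in for the input file, and the third sample line (with quotes inside the value) is an assumption added to show that embedded quotes survive.

```shell
#!/usr/bin/env bash
declare -A map

while IFS= read -r line; do
  [[ $line == *=* ]] || continue        # skip lines without an equal sign
  key=${line%%=*}                       # text before the first "="
  key=${key//[[:blank:]]/}              # drop spaces and tabs from the key
  value=${line#*=}                      # text after the first "="
  value=${value#*\"}                    # drop everything up to the first quote
  value=${value%\"*}                    # drop the last quote and what follows
  map[$key]=$value
done <<'EOF'
key1 = "value1"
key2 = "value2"
key3 = "say "hi" twice"
EOF

echo "${map[key2]}"
echo "${map[key3]}"
```

Because the prefix and suffix removals are both the shortest-match forms, only the outermost pair of quotes is stripped.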
With awk:
awk -v key="key2" '$1 == key { gsub("\"","",$3);print $3 }' <<< "$string"
This reads the contents of the variable string and passes the wanted key in as the awk variable key. If the first whitespace-delimited field equals the key, gsub removes the quotes from the third field, which is then printed.
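The field-based awk command above relies on the value being a single word, since $3 stops at the first space. A possible variant, sketched here under the assumption that each line holds exactly one quoted value, splits on the double quote instead; the sample data and the result variable are illustrative.

```shell
#!/usr/bin/env bash
string='key1 = "value1"
key2 = "value two"
key3 = "value3"'

# With the double quote as field separator, $1 is the part before the
# value (key, blanks and equal sign) and $2 is the value itself.
result=$(awk -F'"' -v key="key2" '
  { k = $1; gsub(/[ \t=]/, "", k) }   # reduce $1 to the bare key
  k == key { print $2 }
' <<< "$string")

echo "$result"
```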
Ok, after spending so many hours, this is how I solved the problem:
If you don't know where your script will run or what type of file (Windows/Mac/Linux) you are reading:
Try to avoid non-greedy matching in Linux bash instead of tweaking different switches.
Don't trust the end-of-line match $ when you might get data from Windows or Mac.
This post solved my problem: Non greedy text matching and extrapolating in bash
This pattern works for me in many Linux environments and with all types of line endings:
pattern='key2\s*=\s*"([^"]*)"'
The value is in BASH_REMATCH[1]
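To illustrate why dropping the trailing $ helps with Windows line endings, here is a small sketch that matches a line ending in a carriage return. It uses [[:blank:]] in place of \s, which is a substitution made here for portability, and the sample content is an assumption.

```shell
#!/usr/bin/env bash
# A line as it might arrive from a file with CRLF line endings.
content=$'key2 = "value2"\r'

# No end-of-line anchor, so the trailing carriage return is irrelevant.
pattern='key2[[:blank:]]*=[[:blank:]]*"([^"]*)"'

value=''
if [[ $content =~ $pattern ]]; then
  value=${BASH_REMATCH[1]}
fi
echo "value: $value"
```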

In bash how can I get the last part of a string after the last hyphen [duplicate]

I have this variable:
A="Some variable has value abc.123"
I need to extract this value i.e abc.123. Is this possible in bash?
Simplest is
echo "$A" | awk '{print $NF}'
Edit: explanation of how this works...
awk breaks the input into different fields, using whitespace as the separator by default. Hardcoding 5 in place of NF prints out the 5th field in the input:
echo "$A" | awk '{print $5}'
NF is a built-in awk variable that gives the total number of fields in the current record. The following returns the number 5 because there are 5 fields in the string "Some variable has value abc.123":
echo "$A" | awk '{print NF}'
Combining $ with NF outputs the last field in the string, no matter how many fields your string contains.
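The three awk forms discussed above can be tried side by side. This is a minimal sketch; the variable names count, last and second_last are chosen here, and $(NF-1) is shown as one further use of the same idea.

```shell
#!/usr/bin/env bash
A="Some variable has value abc.123"

count=$(printf '%s\n' "$A" | awk '{ print NF }')              # number of fields
last=$(printf '%s\n' "$A" | awk '{ print $NF }')              # last field
second_last=$(printf '%s\n' "$A" | awk '{ print $(NF-1) }')   # field before it

echo "$count fields; last=$last; second to last=$second_last"
```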
Yes; this:
A="Some variable has value abc.123"
echo "${A##* }"
will print this:
abc.123
(The ${parameter##word} notation is explained in §3.5.3 "Shell Parameter Expansion" of the Bash Reference Manual.)
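Some examples using parameter expansion:
A="Some variable has value abc.123"
Remove the longest leading match of "* " (everything up to the last space):
echo "${A##* }"
abc.123
Remove the shortest trailing match of " *" (the last word):
echo "${A% *}"
Some variable has value
Remove the shortest trailing match of ".*" (from the last dot onward):
echo "${A%.*}"
Some variable has value abc
Remove the longest trailing match of " *" (everything after the first word):
echo "${A%% *}"
Some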
Read more Shell-Parameter-Expansion
The documentation is a bit painful to read, so I've summarised it in a simpler way.
Note that the '*' needs to swap places with the ' ' depending on whether you use # or %. (The * is just a wildcard, so you may need to take off your "regex hat" while reading.)
${A% *} - remove shortest trailing * (strip the last word)
${A%% *} - remove longest trailing * (strip the last words)
${A#* } - remove shortest leading * (strip the first word)
${A##* } - remove longest leading * (strip the first words)
Of course a "word" here may contain any character that isn't a literal space.
You might commonly use this syntax to trim filenames:
${A##*/} removes all containing folders, if any, from the start of the path, e.g.
/usr/bin/git -> git
/usr/bin/ -> (empty string)
${A%/*} removes the last file/folder/trailing slash, if any, from the end:
/usr/bin/git -> /usr/bin
/usr/bin/ -> /usr/bin
${A%.*} removes the last extension, if any (just be wary of things like my.path/noext):
archive.tar.gz -> archive.tar
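The filename cases above can be combined in one short sketch. The path value and the variable names (file, dir, stem, base, ext) are made up for illustration.

```shell
#!/usr/bin/env bash
path="/usr/bin/archive.tar.gz"

file=${path##*/}   # strip the longest leading */    (directory part)
dir=${path%/*}     # strip the shortest trailing /*  (file name)
stem=${file%.*}    # strip the shortest trailing .*  (last extension)
base=${file%%.*}   # strip the longest trailing .*   (all extensions)
ext=${file##*.}    # strip the longest leading *.    (keep last extension)

printf '%s\n' "$file" "$dir" "$stem" "$base" "$ext"
```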
How do you know where the value begins? If it's always the 5th and 6th words, you could use e.g.:
B=$(echo "$A" | cut -d ' ' -f 5-)
This uses the cut command to slice out part of the line, using a simple space as the word delimiter.
As pointed out by Zedfoxus here, this is a very clean method that works on all Unix-based systems. Besides, you don't need to know the exact position of the substring.
A="Some variable has value abc.123"
echo "$A" | rev | cut -d ' ' -f 1 | rev
# abc.123
More ways to do this:
(Run each of these commands in your terminal to test this live.)
For all answers below, start by typing this in your terminal:
A="Some variable has value abc.123"
The array example (#3 below) is a really useful pattern, and depending on what you are trying to do, sometimes the best.
1. with awk, as the main answer shows
echo "$A" | awk '{print $NF}'
2. with grep:
echo "$A" | grep -o '[^ ]*$'
the -o says to only retain the matching portion of the string
the [^ ] part says "don't match spaces"; ie: "not the space char"
the * means "match 0 or more instances of the preceding match pattern (which is [^ ])", and the $ means "match the end of the line." So, this matches the last word after the last space through to the end of the line; ie: abc.123 in this case.
3. via regular bash "indexed" arrays and array indexing
Convert A to an array, with elements being separated by the default IFS (Internal Field Separator) char, which is space:
Option 1 (will "break in mysterious ways", as @tripleee put it in a comment here, if the string stored in the A variable contains certain special shell characters, so Option 2 below is recommended instead!):
# Capture space-separated words as separate elements in array A_array
A_array=($A)
Option 2 [RECOMMENDED!]. Use the read command, as I explain in my answer here, and as is recommended by the bash shellcheck static code analyzer tool for shell scripts, in ShellCheck rule SC2206, here.
# Capture space-separated words as separate elements in array A_array, using
# a "herestring".
# See my answer here: https://stackoverflow.com/a/71575442/4561887
IFS=" " read -r -a A_array <<< "$A"
Then, print only the last element in the array:
# Print only the last element via bash array right-hand-side indexing syntax
echo "${A_array[-1]}" # last element only
Output:
abc.123
Going further:
This pattern is also useful because it makes the opposite easy: obtaining all words except the last one, like this:
array_len="${#A_array[@]}"
array_len_minus_one=$((array_len - 1))
echo "${A_array[@]:0:$array_len_minus_one}"
Output:
Some variable has value
For more on the ${array[@]:start:length} array slicing syntax above, see my answer here: Unix & Linux: Bash: slice of positional parameters, and for more info on the bash "Arithmetic Expansion" syntax, see here:
https://www.gnu.org/savannah-checkouts/gnu/bash/manual/bash.html#Arithmetic-Expansion
https://www.gnu.org/savannah-checkouts/gnu/bash/manual/bash.html#Shell-Arithmetic
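Putting the array steps together, a compact sketch might look as follows. The names words, n, last and rest are chosen here, and the last element is addressed by computed index rather than -1 so that the sketch does not depend on negative subscripts.

```shell
#!/usr/bin/env bash
A="Some variable has value abc.123"

# Split on spaces only; IFS is changed just for this read command.
IFS=" " read -r -a words <<< "$A"

n=$(( ${#words[@]} - 1 ))   # index of the last element
last=${words[n]}            # last word
rest="${words[*]:0:n}"      # all words except the last, joined by spaces

echo "last: $last"
echo "rest: $rest"
```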
You can use a Bash regex:
A="Some variable has value abc.123"
[[ $A =~ [[:blank:]]([^[:blank:]]+)$ ]] && echo "${BASH_REMATCH[1]}" || echo "no match"
Prints:
abc.123
That works with any [:blank:] delimiter in the current locale (usually [ \t]). If you want to be more specific:
A="Some variable has value abc.123"
pat='[ ]([^ ]+)$'
[[ $A =~ $pat ]] && echo "${BASH_REMATCH[1]}" || echo "no match"
With Perl:
echo "Some variable has value abc.123" | perl -nE 'say $1 if /(\S+)$/'

values to array from variable names with pattern

I have an unknown number of variable names with the pattern rundate*. For example, rundate=180618 && rundate2=180820. I know from here that I can send multiple variable names to a third variable: alld=(`echo "${!rundate*}"`), and while attempting to solve my problem, I figured out how to send multiple variable indices to a third variable: alld_indices=(`echo "${!alld[@]}"`). But how do I send multiple values to my third variable, alld_values, such that echo ${alld_values[@]} gives 180618 180820? I know from here how I can get the first value: firstd_value=(`echo "${!alld}"`). I suspect I've seen the answer already in my searching but did not realize it. Happy to delete my question if that is the case. Thanks!
#!/usr/bin/env bash
# set up some test data
rundate="180618"
rundate1="180820"
rundate2="Values With Spaces Work Too"
# If we know all values are numeric, we can use a regular indexed array
# otherwise, the below would need to be ''declare -A alld=( )''
alld=( ) # initialize an array
for v in "${!rundate@}"; do # using @ instead of * avoids IFS-related bugs
    alld[${v#rundate}]=${!v} # populate the array, using varname w/o prefix as key
done
# print our results
printf 'Full array definition:\n '
declare -p alld # emits code that, if run, will redefine the array
echo; echo "Indexes only:"
printf ' - %s\n' "${!alld[@]}" # "${!varname[@]}" expands to the list of keys
echo; echo "Values only:"
printf ' - %s\n' "${alld[@]}" # "${varname[@]}" expands to the list of values
...properly emits as output:
Full array definition:
declare -a alld=([0]="180618" [1]="180820" [2]="Values With Spaces Work Too")
Indexes only:
- 0
- 1
- 2
Values only:
- 180618
- 180820
- Values With Spaces Work Too
...as you can see running at https://ideone.com/yjSD1J
eval in a loop will do it.
$: for v in ${!rundate*}
> do eval "alld_values+=( \$$v )"
> done
$: echo "${alld_values[@]}"
180618 180820
or
$: eval "alld_values=( $( sed 's/ / $/g' <<< " ${!rundate*}" ) )"
or
$: echo "alld_values=( $( sed 's/ / $/g' <<< " ${!rundate*}" ) )" > tmp && . tmp
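The eval variants above can usually be avoided with indirect expansion, which also keeps values containing spaces intact. A minimal sketch, using the two variables from the question:

```shell
#!/usr/bin/env bash
rundate=180618
rundate2=180820

alld_values=()
for v in "${!rundate@}"; do     # names of all variables starting with "rundate"
  alld_values+=( "${!v}" )      # ${!v} is the value of the variable named in v
done

echo "${alld_values[@]}"
```

This assumes no other variables with the rundate prefix are set in the current shell, since every matching name is collected.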

String capturing and print the next characters.

I have tried a few options, but they are not working in my case. My requirement is as follows.
Suppose I have a parameter in a file and want to capture the details as below and run a shell script (ksh).
PARAMETR=aname1:7,aname2:5
The parameter contains 2 values delimited by a comma, and each value is separated by a colon.
So I want to process it so that if the string matches aname1, both parts are stored in different variables, $v1=aname1 and $v2=7. The same applies to the other value: if the string searched is aname2, then $v1=aname2 and $v2=5.
Thank you in advance.
That will do what you're asking for
#!/bin/ksh
typeset -A valueArray
PARAMETR=aname1:7,aname2:5
paramArray=(${PARAMETR//,/ })
for ((i=0;i<${#paramArray[@]};i++)); do
    valueArray[${paramArray[$i]%:*}]=${paramArray[$i]#*:}
done
for j in "${!valueArray[@]}"; do
    print "$j = ${valueArray[$j]}"
done
Hope it can help
First split the line into two sets and then process each set.
echo "${PARAMETR}" | tr "," "\n" | while IFS=: read -r v1 v2; do
echo "v1=$v1 and v2=$v2"
done
Result:
v1=aname1 and v2=7
v1=aname2 and v2=5
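For comparison, here is how a lookup by name could be sketched in bash rather than ksh (read -a is the bash spelling; ksh93 uses read -A). The lookup function is a hypothetical helper introduced here, not something from the answers above.

```shell
#!/usr/bin/env bash
PARAMETR=aname1:7,aname2:5

# Hypothetical helper: sets v1 and v2 for the requested name and
# returns non-zero if the name is not present in PARAMETR.
lookup() {
  local wanted=$1 pair
  local -a pairs
  IFS=, read -r -a pairs <<< "$PARAMETR"   # split on commas
  for pair in "${pairs[@]}"; do
    if [[ ${pair%%:*} == "$wanted" ]]; then
      v1=${pair%%:*}   # part before the colon
      v2=${pair#*:}    # part after the colon
      return 0
    fi
  done
  return 1
}

if lookup aname2; then
  echo "v1=$v1 and v2=$v2"
fi
```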

List path with word match in bash script

I have a string that contains a list of lines. I want to search for a particular string and list all the paths that contain it.
The given string contains the following:
755677 myfile/Edited-WAV-Files
756876 orignalfile/videofile
758224 orignalfile/audiofile
758224 orignalfile/photos
758225 others/video
758267 others/photo
758268 orignalfile/videofile1
758780 others/photo1
I want to extract and list only the paths that start with orignalfile. My output should be like this:
756876 orignalfile/videofile
758224 orignalfile/audiofile
758224 orignalfile/photos
758268 orignalfile/videofile1
That looks easy enough...
echo "$string" | grep orignalfile/
or
grep orignalfile/ << eof
$string
eof
or, if it's in a file,
grep orignalfile/ sourcefile
A bash solution:
while read f1 f2
do
[[ "$f2" =~ ^orignal ]] && echo $f1 $f2
done < file
If your string spans several lines like this:
755677 myfile/Edited-WAV-Files
756876 orignalfile/videofile
758224 orignalfile/audiofile
758224 orignalfile/photos
758225 others/video
758267 others/photo
758268 orignalfile/videofile1
758780 others/photo1
Then you can use this code:
echo "$(echo "$S" | grep -F ' orignalfile/')"
If the string is not separated by new lines then
echo $S | grep -oE "[0-9]+ orignalfile/[^ ]+"
Are you sure that your string contains linebreaks/newlines?
If it does then the solution of DigitalRoss will apply.
If it doesn't contain newlines then you must include them. For example, if your code looks like
string=$(ls -l)
then you must prepend it with a field separator string that has no linefeed:
IFS=$'\t| ' string=$(ls -l)
or with an empty IFS var:
IFS='' string=$(ls -l)
Docs for IFS from the bash man page:
IFS The Internal Field Separator that is used for word splitting after
expansion and to split lines into words with the read builtin command. The
default value is ``<space><tab><newline>''.
grep -E '^[0-9]{6} orignalfile/' <<<"$string"
note:
the ^ matches the start of the string. You don't want to match things that happen to have orignalfile/ somewhere in the middle
[0-9]{6} matches the six digits at the start of each line
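A runnable sketch of the anchored match, using the sample data from the question. It relaxes {6} to + on the assumption that the leading number may not always have six digits; the matches variable is introduced here for illustration.

```shell
#!/usr/bin/env bash
string='755677 myfile/Edited-WAV-Files
756876 orignalfile/videofile
758224 orignalfile/audiofile
758224 orignalfile/photos
758225 others/video
758267 others/photo
758268 orignalfile/videofile1
758780 others/photo1'

# Quoting "$string" keeps the newlines, so grep sees one path per line.
matches=$(grep -E '^[0-9]+ orignalfile/' <<< "$string")

printf '%s\n' "$matches"
```

Note that the search term keeps the spelling used in the data (orignalfile); a correctly spelled originalfile would match nothing.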

Resources