edit attribute value in bash shell unix script - bash

Guys, I have an XML file that contains a requestID attribute and I want to change its value. It can appear in a lot of forms, like this:
<ns0:requestID>12345</ns0:requestID>
<requestID>12345</requestID>
<requestID>12345667</requestID><anyOtherAttribute>131241</anyOtherAttribute>
Any suggestions on how to do this with "sed"? Thanks

This simple substitution command handles the shown cases:
sed 's,\<requestID>[^<]*,requestID>CHANGE,'
Can you tell me what it means?
s,regexp,replacement, - Attempt to match regexp against the pattern space. If successful, replace that portion matched with replacement.
\< - match the empty string at the beginning of a word
[^<] - matches any character but <
* - The preceding item will be matched zero or more times.
CHANGE - value to which you want to change
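To see the command at work, here is a minimal sketch that runs it over the three sample lines from the question. It assumes GNU sed, since \< is a GNU extension, and CHANGE stands in for whatever new value you want. The variable names input and result are only for illustration.

```shell
# Sample lines from the question, one per line.
input='<ns0:requestID>12345</ns0:requestID>
<requestID>12345</requestID>
<requestID>12345667</requestID><anyOtherAttribute>131241</anyOtherAttribute>'

# Without the g flag only the first match on each line is replaced,
# so the closing </requestID> tag is left untouched.
result=$(printf '%s\n' "$input" | sed 's,\<requestID>[^<]*,requestID>CHANGE,')
printf '%s\n' "$result"
```

Note that adding a g flag here would probably not be what you want: the pattern also matches requestID> in the closing tag (with zero characters after it), so CHANGE would be inserted there as well.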

Related

Remove string before and after characters in bash

I have a string like : 2021_03_19/ 19-Mar-2021 11:55 -
stored in a variable a
I tried to extract the sequence 2021_03_19 from it (the second occurrence, the one after the /"> sequence) with the following script:
a=${a##'/">'}
a=${a%%'/</a'}
But the final result is the same string as the input.
The pattern in the parameter expansion needs to match the entire string you want to remove. You are trying to trim the literal prefix /"> but of course the string does not begin with this string, so the parameter expansion does nothing.
Try
a=${a##*'/">'}
a=${a%%'/</a'*}
The single quotes are kind of unusual; I would perhaps instead backslash-escape each metacharacter which should be matched literally.
a=${a##*/\"\>}
a=${a%%/\</a*}
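A small sketch of the corrected expansions in action. The input string here is an assumption: it is the kind of HTML directory-listing line that the /"> and /</a patterns imply, not necessarily the exact string from the question.

```shell
# Assumed input: an HTML anchor followed by a date, as the patterns suggest.
a='<a href="2021_03_19/">2021_03_19/</a> 19-Mar-2021 11:55 -'

a=${a##*'/">'}    # remove the longest prefix that ends in /">
a=${a%%'/</a'*}   # remove the longest suffix that starts with /</a
printf '%s\n' "$a"
```

The leading * in the first pattern and the trailing * in the second are what let the pattern cover everything before and after the part you want to keep.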
You have to match the before and after pattern too.
a=${a##*'/">'}
a=${a%%'/</a'*}
You could use:
a='2021_03_19/ 19-Mar-2021 11:55 -'
b=${a#*>}
c=${b%%/<*}
Based on Extract substring in Bash
Note that the number of # characters has nothing to do with the length of the pattern: # removes the shortest matching prefix and ## the longest. There is no ### operator, which is why trying it did not work either. So, therefore, an alternative solution.

Extract a substring (value of an HTML node tag) in a bash/zsh script

I'm trying to extract a tag value of an HTML node that I already have in a variable.
I'm currently using Zsh but I'm trying to make it work in Bash as well.
The current variable has the value:
<span class="alter" fill="#ffedf0" data-count="0" data-more="none"/>
and I would like to get the value of data-count (in this case 0, but could be any length integer).
I have tried using cut, sed and variable expansion as explained in this question, but I haven't managed to adapt the regexes, or maybe it has to be done differently for Zsh.
There is no reason why sed would not work in this situation. For your specific case, I would do something like this:
sed 's/.*data-count="\([0-9]*\)".*/\1/g' file_name.txt
Basically, sed looks for a pattern that contains data-count=, saves everything matched within the parentheses \(...\) into \1, and then prints \1 in place of the match (the full line, due to the .* on both sides).
Could you please try the following.
awk 'match($0,/data-count=[^ ]*/){print substr($0,RSTART+12,RLENGTH-13)}' Input_file
Explanation: awk's match function tests the regex data-count=[^ ]*, which matches from data-count= up to the next space. If a match is found, the built-in variables RSTART and RLENGTH are set to its start position and length. substr then skips the 12 characters of data-count=" and drops the closing quote, so only the value of data-count is printed.
With sed, could you please try the following.
sed 's/.*data-count=\"\([^"]*\).*/\1/' Input_file
Explanation: The group \([^"]*\) captures everything after data-count=" up to the next double quote. Because the .* on both sides makes the regex match the whole line, the s (substitute) command replaces the entire line with \1, the captured value.
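Since the question has the node in a variable rather than in a file, here is a sketch that pipes the variable into sed. It uses only POSIX features, so it should behave the same in Bash and Zsh. The variable names html and count are just for illustration.

```shell
# The HTML node from the question, held in a shell variable.
html='<span class="alter" fill="#ffedf0" data-count="0" data-more="none"/>'

# -n together with the p flag prints only when the substitution succeeds,
# so input without a data-count attribute produces no output at all.
count=$(printf '%s\n' "$html" | sed -n 's/.*data-count="\([0-9]*\)".*/\1/p')
printf '%s\n' "$count"
```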
As was said before, to be on the safe side and handle any syntactically valid HTML tag, a parser would be strongly advised. But if you know in advance what the general format of your HTML element will look like, the following hack might come in handy:
Assume that your variable is called "html"
html='<span class="alter" fill="#ffedf0" data-count="0" data-more="none"/>'
First adapt it a bit:
htmlx="tag ${html%??}"
This will add the string tag in front and remove the final />
Now make an associative array:
declare -A fields
fields=( ${=$(tr = ' ' <<<$htmlx)} )
The tr turns the equal sign into a space and the ${= flag (which is specific to Zsh) handles word splitting. You can now access the values of your attributes by, say,
echo $fields[data-count]
Note that this still has the surrounding double quotes. You can easily remove them by
echo ${${fields[data-count]%?}#?}
Of course, once you do this hack, you have access to all attributes in the same way.

Simple sed search and replace

I use sed very rarely, perhaps once a year, and it always seems to take me hours to work out how to make it do even the simplest of tasks - a simple search/replace in a text file.
I need a single command line (for Windows) that will search for a simple string in a text file and replace all instances of that string with another.
For example - replace   with in an xml file.
sed -i "s/ / /g" 20151201120758.xml
sed - invokes the program - remember to add it to your path.
-i - in place - modifies the original file.
"..." - remember to add quotes if there are special characters in your regex such as or &.
s/x/y/ - search for regex x and replace with string y.
.../g - global search - without this it will only replace the first match on each line.
xxx.yyy - file to search - can use wild cards.
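As a quick way to try the pieces above, this sketch creates a throwaway file and edits it in place. It assumes GNU sed (BSD sed needs an argument after -i). The strings foo and bar and the temporary file are hypothetical stand-ins, not the values from the question.

```shell
# Create a scratch file with two lines; foo appears twice on the first line.
tmpfile=$(mktemp)
printf 'foo one foo\nfoo two\n' > "$tmpfile"

# -i edits the file in place; g replaces every match on each line.
sed -i 's/foo/bar/g' "$tmpfile"

result=$(cat "$tmpfile")
printf '%s\n' "$result"
rm -f "$tmpfile"
```

Dropping the g would leave the second foo on the first line unchanged.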

How to remove all characters but dots and numbers

I need to clear all characters but numbers and dots in a file.
The numbers are formatted as follows:
$(24.50)
I'm using the following code to accomplish the task:
sed 's/[^0-9]*//'
It works, but the last parenthesis is not removed. After running the code I get:
24.50)
I should get:
24.50
Please help
I think you could use the following:
sed 's/[^0-9.]//g'
Your regular expression only replaces a single match of [^0-9]*, namely the $( at the beginning. To get sed to replace every match, you need to put a g at the end, and the dot has to be added to the bracket expression so that it is kept, as in:
sed 's/[^0-9.]*//g'
The g basically means "replace every match on the line". By default, sed replaces only the first match it encounters on each line, and then stops.
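The difference is easy to check side by side. This sketch feeds the sample value from the question through the original command and through the corrected one; the variable names are only for illustration.

```shell
amount='$(24.50)'

# Original attempt: no g flag, so only the leading "$(" is removed.
first_only=$(printf '%s\n' "$amount" | sed 's/[^0-9]*//')

# Corrected: the dot is kept and every other non-digit is removed.
all_matches=$(printf '%s\n' "$amount" | sed 's/[^0-9.]//g')

printf '%s\n' "$first_only" "$all_matches"
```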

extract string conditionally from variable column sized text file

From a text file with variable number of columns per row (tab delimited), I would like to extract value with specific condition.
The text file looks like:
S1=dhs Sb=skf S3=ghw QS=ghr
S1=dhf QS=thg S3=eiq
QS=bhf S3=ruq Gq=qpq GW=tut
Sb=ruw QS=ooe Gq=qfj GW=uvd
I would like to have a result like:
QS=ghr
QS=thg
QS=bhf
QS=ooe
Please excuse my naive question but I am a beginner trying to learn some basic bash scripting technique for text manipulation.
Thanks in advance!
You could use awk:
awk '{for(i=1;i<=NF;i++){if($i~/^QS=/){print $i}}}' file
This awk command iterates through each field and checks for a column that has the string QS= at the start. If it finds one, the corresponding column is printed.
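Here is a sketch that runs the awk command on the sample rows. The temporary file is a stand-in for your real input, written with tab separators as described in the question.

```shell
# Build a tab-delimited sample file from the rows in the question.
tmpfile=$(mktemp)
printf 'S1=dhs\tSb=skf\tS3=ghw\tQS=ghr\n'  > "$tmpfile"
printf 'S1=dhf\tQS=thg\tS3=eiq\n'         >> "$tmpfile"
printf 'QS=bhf\tS3=ruq\tGq=qpq\tGW=tut\n' >> "$tmpfile"
printf 'Sb=ruw\tQS=ooe\tGq=qfj\tGW=uvd\n' >> "$tmpfile"

# awk splits on any whitespace by default, so tabs need no extra option.
result=$(awk '{for(i=1;i<=NF;i++){if($i~/^QS=/){print $i}}}' "$tmpfile")
printf '%s\n' "$result"
rm -f "$tmpfile"
```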
Through grep,
grep -oP '(^|\t)\KQS=\S*' file
-o parameter means only matching. So it prints only the characters which are matched.
-P this enables the Perl-regex mode.
(^|\t) matches the start of a line or a tab character.
\K discards the previously matched tab or start of the line boundary.
QS= Now it matches the QS= string.
\S* Matches zero or more non-space characters.
