Sed - Conditionally match and add additional string after the find - macos

Let's say I have a line like this in a file "config.xml":
<widget android-packageName="com.myproject" android-versionCode="12334" ios-CFBundleIdentifier="com.myproject" ios-CFBundleVersion="12334" version="1.5.2" versionCode="1.5.2" xmlns="http://www.w3.org/ns/widgets" xmlns:android="http://schemas.android.com/apk/res/android" xmlns:cdv="http://cordova.apache.org/ns/1.0">
And I want to use a single sed command to change it into this, appending ".1" to the current version numbers:
<widget android-packageName="com.myproject" android-versionCode="12334" ios-CFBundleIdentifier="com.myproject" ios-CFBundleVersion="12334" version="1.5.2.1" versionCode="1.5.2.1" xmlns="http://www.w3.org/ns/widgets" xmlns:android="http://schemas.android.com/apk/res/android" xmlns:cdv="http://cordova.apache.org/ns/1.0">
The version number may change, so I likely need to match the string between version=" and the closing " first and then add something after it. How can I achieve that?
My attempt (originally posted, wrongly, as an answer):
sed -i '' -e 's/\" versionCode=\"/\.1\" versionCode=\"/g' config.xml
sed -i '' -e 's/\" xmlns=\"/\.1\" xmlns=\"/g' config.xml

You can use this sed command to append .1 to the value of any attribute whose name starts with version:
sed -i.bak -E 's/( version[^=]*="[.0-9]+)/\1.1/g' file
Output:
<widget android-packageName="com.myproject" android-versionCode="12334" ios-CFBundleIdentifier="com.myproject" ios-CFBundleVersion="12334" version="1.5.2.1" versionCode="1.5.2.1" xmlns="http://www.w3.org/ns/widgets" xmlns:android="http://schemas.android.com/apk/res/android" xmlns:cdv="http://cordova.apache.org/ns/1.0">
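Breakdown:
(: Start capture group
 version: match a space followed by the text version (the leading space keeps android-versionCode from matching, and the lowercase v keeps ios-CFBundleVersion from matching)
[^=]*: match 0 or more of any character that is not =
=: match a =
": match a "
[.0-9]+: match 1 or more characters that are digits or dots
): End capture group
\1.1: in the replacement, put back the captured text and append .1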
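
To check the pattern without touching a file, the same substitution can be run on a single line from stdin. This is a minimal sketch; the sample line is a shortened, hypothetical version of the one in the question, and the variable names line and result are made up for the example.

```shell
# Shortened sample line modeled on the question (not the full original).
line='<widget android-versionCode="12334" ios-CFBundleVersion="12334" version="1.5.2" versionCode="1.5.2" xmlns="http://www.w3.org/ns/widgets">'

# Same substitution as in the answer, reading stdin instead of editing in place.
result=$(printf '%s\n' "$line" | sed -E 's/( version[^=]*="[.0-9]+)/\1.1/g')
printf '%s\n' "$result"
```

Only version and versionCode should change; android-versionCode and ios-CFBundleVersion should stay as they are.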

Related

How to add "," after every sed match?

I have this code:
cat response_error.xml | sed -ne 's#\s*<[^>]*>\s*##gp' >> response_error.csv
but all the sed matches from the XML are run together, for example:
084521AntonioCallas
I want to get this result:
084521,Antonio,Callas,
Is it possible?
I have to write a script that collects the XML documents from the previous day, extracts only the data outside the <...> tags, and saves it to a CSV file with the values separated by commas, like this: 084521,Antonio,Callas. The XML looks like this:
<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/">
<s:Body xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<GenerarInformeResponse xmlns="http://experian.servicios.CAIS">
<GenerarInformeResult>
<InformeResumen xmlns="http://experian.servicios.CAIS.V2">
<IdSuscriptor>084521</IdSuscriptor>
<ReferenciaConsulta>Antonio Callas 00000000</ReferenciaConsulta>
<Error>
<Codigo>0000</Codigo>
<Descripcion>OK</Descripcion>
</Error>
<Documento>
<TipoDocumento>
<Codigo>01</Codigo>
<Descripcion>NIF</Descripcion>
</TipoDocumento>
<NumeroDocumento>000000000</NumeroDocumento>
<PaisDocumento>
<Codigo>000</Codigo>
<Descripcion>ESPAÑA</Descripcion>
</PaisDocumento>
</Documento>
<Resumen>
<Nombre>
<Nombre1>XXX</Nombre1>
<Nombre2>XXX</Nombre2>
<ApellidosRazonSocial>XXX</ApellidosRazonSocial>
</Nombre>
<Direccion>
<Direccion>XXX</Direccion>
<NombreLocalidad>XXX</NombreLocalidad>
<CodigoLocalidad/>
<Provincia>
<Codigo>39</Codigo>
<Descripcion>XXX</Descripcion>
</Provincia>
<CodigoPostal>39012</CodigoPostal>
</Direccion>
<NumeroTotalOperacionesImpagadas>1</NumeroTotalOperacionesImpagadas>
<NumeroTotalCuotasImpagadas>0</NumeroTotalCuotasImpagadas>
<PeorSituacionPago>
<Codigo>6</Codigo>
<Descripcion>XXX</Descripcion>
</PeorSituacionPago>
<PeorSituacionPagoHistorica>
<Codigo>6</Codigo>
<Descripcion>XXX</Descripcion>
</PeorSituacionPagoHistorica>
<ImporteTotalImpagado>88.92</ImporteTotalImpagado>
<MaximoImporteImpagado>88.92</MaximoImporteImpagado>
<FechaMaximoImporteImpagado>
<DD>27</DD>
<MM>03</MM>
<AAAA>2019</AAAA>
</FechaMaximoImporteImpagado>
<FechaPeorSituaiconPagoHistorica>
<DD>27</DD>
<MM>03</MM>
<AAAA>2019</AAAA>
</FechaPeorSituaiconPagoHistorica>
<FechaAltaOperacionMasAntigua>
<DD>16</DD>
<MM>12</MM>
<AAAA>2015</AAAA>
</FechaAltaOperacionMasAntigua>
<FechaUltimaActualizacion>
<DD>27</DD>
<MM>03</MM>
<AAAA>2019</AAAA>
</FechaUltimaActualizacion>
</Resumen>
</InformeResumen>
</GenerarInformeResult>
</GenerarInformeResponse>
</s:Body>
</s:Envelope>

You can extract the IdSuscriptor using the following command :
xmllint --xpath '//*[local-name()="IdSuscriptor"]/text()' response_error.xml
And the ReferenciaConsulta using the following command :
xmllint --xpath '//*[local-name()="ReferenciaConsulta"]/text()' response_error.xml
To produce the desired IdSuscriptor,FirstName,LastName line I would use the following script:
id_suscriptor=$(xmllint --xpath '//*[local-name()="IdSuscriptor"]/text()' response_error.xml)
referencia_consulta=$(xmllint --xpath '//*[local-name()="ReferenciaConsulta"]/text()' response_error.xml)
# cut splits on tabs by default, so set the delimiter to a space
first_name=$(echo "$referencia_consulta" | cut -d' ' -f1)
last_name=$(echo "$referencia_consulta" | cut -d' ' -f2)
echo "$id_suscriptor,$first_name,$last_name"
Note that this assumes the ReferenciaConsulta field will always contain a string starting with the first name and last name separated with a space.
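
As an alternative to calling cut twice, the shell's own read can split the ReferenciaConsulta value on whitespace. This is only a sketch under the same assumption (first name, last name, then anything else); the two values are hard-coded from the sample XML instead of being fetched with xmllint, and the names rest and csv_line are made up for the example.

```shell
# Values hard-coded from the sample XML; in the real script they would
# come from the xmllint calls.
id_suscriptor='084521'
referencia_consulta='Antonio Callas 00000000'

# read splits on whitespace; the last variable collects whatever is left.
read -r first_name last_name rest <<EOF
$referencia_consulta
EOF

csv_line="$id_suscriptor,$first_name,$last_name"
printf '%s\n' "$csv_line"
```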

If you want to parse XML, use a dedicated XML parser like Saxon.
If you want to parse a strange text file with some funny unrelated angle brackets, try this:
#! /bin/sed -nf
s/^<IdSuscriptor>\([0-9]\+\)<\/IdSuscriptor>/\1,/
t match1
b next
: match1
h
b
: next
s/^<ReferenciaConsulta>\([^ ]\+\) \([^ ]\+\) [0-9]\+<\/ReferenciaConsulta>/\1,\2,/
t match2
b
: match2
H
g
s/\n//
p
Explanation
t jumps to match1 if the preceding s command did a replacement. Otherwise b jumps to next.
In case of a match, h copies the matching string into the hold space and b stops the processing of the current line.
The second s command works the same way, with the difference that in case of no match b continues with the next line.
In case of the second match, H appends the pattern space to the hold space, g copies the hold space to the pattern space, s removes the newline between the two matches, and p prints the result.
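
The hold-space steps (h, H, g, then s to drop the newline) can be tried in isolation. The following is a minimal sketch with made-up two-line input, addressed by line number instead of by pattern, and joining with a comma so the effect is visible.

```shell
# Line 1 is saved with h. On line 2, H appends the pattern space to the
# hold space, g copies the hold space back, and s replaces the newline
# between the two parts with a comma before p prints the result.
joined=$(printf 'first\nsecond\n' | sed -n '1h;2{H;g;s/\n/,/;p;}')
printf '%s\n' "$joined"
```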
Conclusion
If you do not know how to do it with sed, don't try it. Try to learn a real programming language like Perl, JavaScript, or Python. sed is a relic of bygone times.

If your data is in a file named d, try GNU sed:
sed -Ez 's/<[^>]*>//g;s/\n+|\s+/,/g;' d

How to get XML uncommented section using sed/awk

I have an XML file on my Linux box and I want to read the lines that are not commented out.
Example :
Input File
<?xml version="1.0" encoding="UTF-8"?>
<!-- This is an example
don't read it while working -->
<ccb>
<ccc>
<aaa>true</aaa>
<bbb>name_1</bbb>
<Port>1534</Port>
<datPort>1532</datPort>
<!--
<e214>
<ImsiPrefixLen>5</ImsiPrefixLen>
<LocalPrefix>97252</LocalPrefix>
</e214>
-->
</ccc>
</ccb>
Output file:
<?xml version="1.0" encoding="UTF-8"?>
<ccb>
<ccc>
<aaa>true</aaa>
<bbb>name_1</bbb>
<Port>1534</Port>
<datPort>1532</datPort>
</ccc>
</ccb>

Note that in XML a comment starts with <!-- and ends with -->; it cannot contain -- in between.
perl -pe 'BEGIN{undef$/}s/<!--.*?-->//gs' <<END
<?xml version="1.0" encoding="UTF-8"?>
<!-- This is an example
don't read it while working -->
<ccb>
<ccc>
<aaa>true</aaa>
<bbb>name_1</bbb>
<Port>1534</Port>
<datPort>1532</datPort>
<!--
<e214>
<ImsiPrefixLen>5</ImsiPrefixLen>
<LocalPrefix>97252</LocalPrefix>
</e214>
-->
</ccc>
</ccb>
END
Explanation
-p : (from perl -h) assume a loop like -n but print each line too, like sed
BEGIN block : executed once at the beginning to unset the input record separator ($/), so the whole input is read at once and <!-- --> can match across lines
s/// : substitution operator (/ can be replaced by another delimiter)
<!--.*?--> : .* matches any string; ? makes it lazy, so the shortest match is taken
g : modifier to replace every match, not just the first
s : modifier so that . also matches the newline character

Search and delete matches of patterns array

I built an array of the names of the files that match a pattern:
lista=($(grep -El "<LastVisitedURL>.+</LastVisitedURL>.*<FavoriteTopic>0</FavoriteTopic>" *))
Now I want to delete from index.xml every TopicKey element whose FileName is one of the filenames in the array.
for e in ${lista[*]}
do
sed '/\<TopicKey FileName=\"$e\"\>.*\<\/TopicKey\>/d' index.xml
done
The complete script is:
#! /bin/bash
#search xml files watched and no favorites.
lista=($(grep -El "<LastVisitedURL>.+</LastVisitedURL>.*<FavoriteTopic>0</FavoriteTopic>" *))
#declare -p lista
for e in ${lista[*]}
do
sed '/<TopicKey FileName=\"$e\">.*<\/TopicKey>/d' index.xml
done
The regex pattern doesn't work. Besides, using sed's -i option to edit index.xml in place would reload the index file as many times as there are filenames in the array, which is bad.
Any suggestions?

Here is an example using xmlstarlet in a shell:
% cat file.xml
<?xml version="1.0"?>
<root>
<foobar>aaa</foobar>
<LastVisitedURL>http://foo.bar/?a=1</LastVisitedURL>
<LastVisitedURL>http://foo.bar/?a=2</LastVisitedURL>
<LastVisitedURL>http://foo.bar/?a=3</LastVisitedURL>
</root>
Then run this command line:
% xmlstarlet edit --delete '//LastVisitedURL' file.xml
<?xml version="1.0"?>
<root>
<foobar>aaa</foobar>
</root>
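
If sed is still preferred for the TopicKey deletion, one way to avoid reloading index.xml for every filename is to generate all the delete commands first and apply them in a single pass. This is only a sketch: the sample index.xml, the two filenames, and the temporary delete.sed file are hypothetical, and it assumes each TopicKey element sits on its own line.

```shell
# Hypothetical sample data standing in for the real index.xml.
workdir=$(mktemp -d)
cat > "$workdir/index.xml" <<'EOF'
<Index>
<TopicKey FileName="a.xml">Topic A</TopicKey>
<TopicKey FileName="keep.xml">Topic K</TopicKey>
<TopicKey FileName="b.xml">Topic B</TopicKey>
</Index>
EOF

# In the real script this list would come from the grep -El command.
printf '%s\n' a.xml b.xml |
while IFS= read -r e; do
  # Escape characters that are special in a sed regex or as the delimiter.
  escaped=$(printf '%s\n' "$e" | sed 's|[.[\*^$/]|\\&|g')
  # Emit one delete command per filename.
  printf '/<TopicKey FileName="%s">.*<\\/TopicKey>/d\n' "$escaped"
done > "$workdir/delete.sed"

# A single pass over index.xml applies every delete command.
result=$(sed -f "$workdir/delete.sed" "$workdir/index.xml")
printf '%s\n' "$result"
rm -rf "$workdir"
```

Building the commands outside the quotes of the final sed call also sidesteps the original problem that $e is not expanded inside single quotes.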

I need to remove with bash two characters from one long line xml string

I'm reading from stdin line by line strings like:
<xml version="1.0" encoding="UTF-8">\n<Datanode ....
I need to get rid of that \n. It is not a newline, just a literal backslash followed by n.
I need to read it from a pipe, process it, and pipe it further.
Usually tr or cut helps, but I cannot find a way to handle this sequence: they either do not remove it, or they also remove other "n"s from the XML string.

So you want to remove the string made of '\' followed by 'n' ok?
Something like this should work:
... | sed 's/\\n//' | ...
or this if you want to remove multiple sequences:
... | sed 's/\\n//g' | ...
And, if you want to anchor the sequence to be removed:
... | sed 's/>\\n</></' | ...
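
The difference from tr is easy to see on a sample string. In this sketch the input is hypothetical (the question elides the rest of the Datanode element, so a stand-in <Datanode/> is used): sed removes the two-character sequence as a unit, while tr treats its argument as a set of single characters and deletes every backslash and every n.

```shell
# Hypothetical input; the real Datanode element is elided in the question.
input='<xml version="1.0" encoding="UTF-8">\n<Datanode/>'

# sed matches the sequence backslash + n as a whole.
cleaned=$(printf '%s\n' "$input" | sed 's/\\n//g')

# tr deletes each listed character on its own, so other n's vanish too.
mangled=$(printf '%s\n' "$input" | tr -d '\\n')

printf '%s\n%s\n' "$cleaned" "$mangled"
```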
UPDATE
In case you don't want to remove the two-character sequence '\' 'n' but replace it with a real newline (and I did notice your osx tag), you might want to use the following:
... | sed -e 's/\\n/\'$'\n/' | ...

I'm assuming here that your document isn't valid XML on account of containing a text node outside the root, which would explain why you can't use conventional XML-centric tools.
To truly use only bash, and do this in a manner that's safe against corrupting your file (performs the replacement only for the exact header text only on the very first line):
correct_xml_header() {
local bad_header correct_header content
bad_header='<xml version="1.0" encoding="UTF-8">\n'
correct_header='<?xml version="1.0" encoding="UTF-8"?>'
IFS= read -r -d '' content
if [[ $content = "$bad_header"* ]]; then
content=${correct_header}${content#"$bad_header"}
fi
printf '%s' "$content"
}
You can then pipe through this function:
generate_bad_xml | correct_xml_header | consume_good_xml
If you want to add a literal newline, add $'\n' to the end of the definition of correct_header, as in:
correct_header='<?xml version="1.0" encoding="UTF-8"?>'$'\n'
Note that I'm also changing <xml ...> to <?xml ...?>, which is a change similarly necessary to make this tool's output parse correctly with XML-compliant tools.

sed command not replacing "\"

I am using sed to replace a placeholder in a file with the string "dba01upc\Fusion_test". After the replacement the '\' character is missing: the replaced string is dba01upcFusion_test. It looks like sed ignores '\' while replacing.
Can anyone tell me a sed command that keeps all characters?
My Sed Command:
sed -i "s%{"sara_ftp_username"}%"dba01upc\Fusion_test"%g" /home_ldap/user1/placeholder/Sara.xml
Before Replacement : Sara.xml
<?xml version="1.0" encoding="UTF-8"?>
<ser:service-account >
<ser:description/>
<ser:static-account>
<con:username>{sara_ftp_username}</con:username>
</ser:static-account>
</ser:service-account>
After Replacement : Sara.xml
<?xml version="1.0" encoding="UTF-8"?>
<ser:service-account>
<ser:description/>
<ser:static-account>
<con:username>dba01upcFusion_test</con:username>
</ser:static-account>
</ser:service-account>
Thanks

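sed -i 's%{sara_ftp_username}%dba01upc\\Fusion_test%g' /home_ldap/user1/placeholder/Sara.xml
# or
sed -i "s%{sara_ftp_username}%dba01upc\\\Fusion_test%g" /home_ldap/user1/placeholder/Sara.xml
Escape the \ in the replacement: sed needs to receive \\ to write a single backslash. Inside single quotes, write \\ as is. Inside double quotes the shell consumes one level of escaping first, so write \\\ (the shell turns \\ into \ and leaves \F alone) or \\\\. A bracket expression such as [\] does not help here, because brackets have no special meaning in the replacement part.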
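
The quoting levels can be checked on a single line without editing Sara.xml. This is a sketch; the variable names template, single, double, value, escaped, and from_var are made up for the example, and the last part shows one possible way to escape a value held in a shell variable before handing it to sed.

```shell
template='<con:username>{sara_ftp_username}</con:username>'

# Single quotes: sed receives \\ and writes one backslash.
single=$(printf '%s\n' "$template" | sed 's%{sara_ftp_username}%dba01upc\\Fusion_test%g')

# Double quotes: the shell turns \\\\ into \\ before sed sees it.
double=$(printf '%s\n' "$template" | sed "s%{sara_ftp_username}%dba01upc\\\\Fusion_test%g")

# When the value lives in a variable, escape \, & and the % delimiter first.
value='dba01upc\Fusion_test'
escaped=$(printf '%s\n' "$value" | sed 's/[\\&%]/\\&/g')
from_var=$(printf '%s\n' "$template" | sed "s%{sara_ftp_username}%$escaped%g")

printf '%s\n' "$single" "$double" "$from_var"
```

All three commands should produce the same line with the backslash intact.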

Try this (the braces are part of the pattern; without them they would be left behind in the output):
sed -i "s%{sara_ftp_username}%dba01upc\\\Fusion_test%g" /home_ldap/user1/placeholder/Sara.xml
