extract element from xml using xmlstarlet

I'm trying to extract the "enclosure url" from the XML below using xmlstarlet.
I found the question "Extract values from XML using xmlstarlet", which gave me a couple of pointers, but I'm a little confused as to why ias=http://www.oracle.com/ias-instance is set. Shouldn't the element be referenced rather than the content?
Could something similar be used against the XML below, or am I barking up the wrong tree?
<rss version="2.0">
<channel>
<title>Wrestle Me</title>
<description>
</description>
<acast:showId>fba44d70-2a73-44d9-8bf9-726e9fcc2b3e</acast:showId>
<acast:showUrl>wrestleme</acast:showUrl>
<link>https://play.acast.com/s/wrestleme</link>
<atom:link rel="self" type="application/rss+xml" href="https://rss.acast.com/wrestleme"/>
<pingback:receiver>https://interaction.acast.com/pingback</pingback:receiver>
<lastBuildDate>Thu, 18 Jul 2019 04:05:27 GMT</lastBuildDate>
<pubDate>Thu, 18 Jul 2019 04:00:00 GMT</pubDate>
<ttl>30</ttl>
<language>en</language>
<copyright>
</copyright>
<docs>https://www.acast.com/wrestleme</docs>
<acast:signature key="EXAMPLE" algorithm="aes-256-cbc">9w1/WFWZb9YCoFYTxRGhxA==</acast:signature>
<acast:network>
</acast:network>
<acast:settings>
</acast:settings>
<image>
</image>
<itunes:image href="https://thumborcdn.acast.com/66RENnKTBsvtDbIhZez9wTij8hg=/1500x1500/https://mediacdn.acast.com/assets/fba44d70-2a73-44d9-8bf9-726e9fcc2b3e/-jbxjj7ek-whatsapp_image_2018-01-02_at_11-16-53.jpeg"/>
<itunes:subtitle>
</itunes:subtitle>
<itunes:type>episodic</itunes:type>
<itunes:author>Radio Stakhanov</itunes:author>
<itunes:summary></itunes:summary>
<itunes:owner>
</itunes:owner>
<itunes:explicit>yes</itunes:explicit>
<itunes:keywords/>
<itunes:category text="Sports &amp; Recreation">
</itunes:category>
<media:credit role="author">Radio Stakhanov</media:credit>
<media:description type="html">
</media:description>
<item>
</item>
<item>
<title>It just makes me sad - WrestleMania XX part 4</title>
<acast:episodeId>b51a603d-8980-4cef-9a6c-b5cf0e5038a0</acast:episodeId>
<acast:episodeUrl>itjustmakesmesad-wrestlemaniaxxpart4</acast:episodeUrl>
<itunes:subtitle>
</itunes:subtitle>
<itunes:summary>
</itunes:summary>
<guid isPermaLink="false">b51a603d-8980-4cef-9a6c-b5cf0e5038a0</guid>
<pubDate>Thu, 11 Jul 2019 01:00:00 GMT</pubDate>
<itunes:duration>00:40:32</itunes:duration>
<itunes:keywords/>
<itunes:explicit>no</itunes:explicit>
<itunes:episodeType>full</itunes:episodeType>
<itunes:image href="https://thumborcdn.acast.com/iRcL017SnqZUc-ne7xmt6JnGZQw=/3000x3000/https://mediacdn.acast.com/assets/b51a603d-8980-4cef-9a6c-b5cf0e5038a0/cover-image-jwamv8ll-wrestleme_1500.jpeg"/>
<description>
</description>
<acast:settings>
</acast:settings>
<link>
</link>
<enclosure url="https://media.acast.com/wrestleme/itjustmakesmesad-wrestlemaniaxxpart4/media.mp3" length="38927363" type="audio/mpeg"/>
</item>
</channel>
</rss>

Related

Is there a way to find a specific string in an xml file and then replace the next string underneath it with a batch script?

Is this possible? I need to edit the following xml file. For every "BASerialKeyND", I need to replace the 987654321 right underneath it. Same thing for "BASerialKey" and 98-7654-321. I cannot count the lines in the file and assign it to a variable and then replace those specific lines because BASerialKeyND and BASerialKey occur on different lines in different files.
Thanks so much for your help!
<?xml version="1.0" encoding="utf-8"?>
<SerializableDictionary>
<item>
<key>BASerialKeyND</key>
<value>987654321</value>
</item>
<item>
<key>BASerialKey</key>
<value>98-7654-321</value>
</item>
<item>
<key>MACHINETYPE</key>
<value>Max</value>
</item>
<item>
<key>PC1NAME</key>
<value>987654321PC1</value>
</item>
<item>
<key>PC2NAME</key>
<value>987654321PC2</value>
</item>
<item>
<key>REPORTPRINTER</key>
<value>None</value>
</item>
<item>
<key>PC1PRINTERS</key>
<value>Name=Microsoft XPS Document Writer
</SerializableDictionary>

How to highlight blocks of HTML code in C++ code?

I am writing a web project in C++. My C++ code has to embed HTML, for example:
void CPage::putBaseFooter() {
if(m_canRender) {
HTML(
<!++
</main>
<footer>
<f++ composePageFooter(); ++f>
</footer>
</body>
</html>
++!>);
}
}
That is, all of the HTML code sits between <!++ and ++!> (before compilation, my own preprocessor converts it to a string).
The preprocessor also has its own macros, for example:
<f++ composeHead(); ++f>
<v++ ts.tm_year + 1900++v>
<paged_list++ [day_tasks_control] [/tasks/list] [taskListRenderer]>
...
<++paged_list>
<labeled_control++ [Description] [taskDescription]>
<textarea></textarea>
<++labeled_control>
How can I highlight HTML keywords and my own macros in the Qt Creator code editor? I tried to write a highlighting XML file for Kate (inheriting the C++ highlighting), but I am probably misunderstanding something, since the highlighting does not work.
Here is my draft of the syntax highlighting definition:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE language SYSTEM "language.dtd"
[
<!ENTITY space " ">
<!ENTITY separators ",;">
<!ENTITY ns_punctuators "!%&space;&()+-/.*<=>?[]{|}~^&separators;">
]>
<!--
Copyright (c) 2012 by Alex Turbov (i.zaufi#gmail.com)
-->
<language
name="C++"
section="Sources"
version="1.0"
kateversion="2.4"
indenter="cstyle"
style="C++"
mimetype="text/x-c++src;text/x-c++hdr;text/x-chdr"
extensions="*.c++;*.cxx;*.cpp;*.cc;*.C;*.h;*.hh;*.H;*.h++;*.hxx;*.hpp;*.hcc;*.moc"
author="Sheridan (gorlov.maxim#gmail.com)"
license="LGPL"
priority="11"
>
<highlighting>
<list name="InplaceHTML">
<item> form </item>
<item> table </item>
<item> div </item>
<item> td </item>
<item> tr </item>
<item> th </item>
<item> span </item>
<item> input </item>
<item> textarea </item>
<item> label </item>
<item> a </item>
<item> head </item>
<item> link </item>
<item> script </item>
</list>
<contexts>
<context attribute="Normal Text" lineEndContext="#stay" name="Normal">
<IncludeRules context="##C++" />
<IncludeRules context="DetectInplaceHTML" />
</context>
<context attribute="Normal Text" lineEndContext="#stay" name="DetectInplaceHTML">
<keyword attribute="Inplace HTML" context="#stay" String="InplaceHTML" />
</context>
</contexts>
<itemDatas>
<itemData name="Normal Text" defStyleNum="dsNormal" spellChecking="false" />
<itemData name="Inplace HTML" defStyleNum="dsKeyword" color="#0095ff" selColor="#ffffff" bold="1" italic="0" spellChecking="false" />
</itemDatas>
</highlighting>
</language>

The Kate C++ highlighting is not used in Qt Creator, so changing or extending the Kate configuration file won't have any effect. You could try to register a separate MIME type that Qt Creator does not recognize as C or C++ and write a Kate highlighter for that, but I don't know whether this would work.

Actually, you can have several highlighters for the same language, each with a different priority.
Take a look at the syntax files: there is ISO C++ and C++ (which is pure C++ syntax plus Qt4 add-ons). There are also alternative C++ highlighters where C++ is the pure C++ syntax and C++/Qt4 is secondary. You can use the configuration settings to change the priority as needed. Personally, I prefer to have pure C++ over the "default" C++/Qt4.
So you could try adding your own C++/Custom syntax and boosting its priority. Take a look at C++/Qt4 to get an idea of how to "reuse" the pure C++ syntax.
Finally, considering your example syntax, you had better detect your extension before falling into the inherited C++ contexts.

Description for URL in Google Sitemap

Is it possible to specify a description for the URL element in Google Sitemaps?
Example (see the description tag):
<url>
<loc>http://www.example.com/dogs</loc>
<description>This page for dogs, you can find new products for dogs here.</description>
<changefreq>hourly</changefreq>
<priority>1.00</priority>
<image:image>
<image:loc>http://www.example.com/image.jpg</image:loc>
</image:image>
</url>
<url>
<loc>http://www.example.com/cats</loc>
<description>This page for cats, you can find new products for cats here.</description>
<changefreq>hourly</changefreq>
<priority>1.00</priority>
<image:image>
<image:loc>http://www.example.com/image.jpg</image:loc>
</image:image>
</url>
Thank you!

According to Google's sitemap support documentation, you can submit an RSS or Atom file (an XML file) as a sitemap.
In an RSS or Atom file you can include a description for each URL. See the RSS specification for details.
An example:
<?xml version="1.0" encoding="UTF-8" ?>
<rss version="2.0">
<channel>
<title>RSS Title</title>
<description>This is an example of an RSS feed</description>
<link>http://www.someexamplerssdomain.com/main.html</link>
<lastBuildDate>Mon, 06 Sep 2010 00:01:00 +0000 </lastBuildDate>
<pubDate>Mon, 06 Sep 2009 16:45:00 +0000 </pubDate>
<ttl>1800</ttl>
<item>
<title>Example entry</title>
<description>Here is some text containing an interesting description.</description>
<link>http://www.wikipedia.org/</link>
<guid>unique string per item</guid>
<pubDate>Mon, 06 Sep 2009 16:45:00 +0000 </pubDate>
</item>
</channel>
</rss>

Ruby zlib::Gzip not working properly

I was in the middle of creating an import/export system that could encode a custom data structure in an XML tree and then read it back and recreate the object.
I got the XML part working fine, but when I discovered that the XML file was 1.5 MB while the original Ruby Marshal file was only 105 KB, I decided that it would be a good idea to compress the file.
So I did this:
require "rexml/document"
require "zlib"
include REXML
tilesetfile = File.new( "tilesets.rmpy", "w+" )
buffer = ""
tilesetgz = Zlib::GzipWriter.new(tilesetfile)
puts "Compressing output for: tilesets.rxdata ..."
tilesetdoc.write(buffer, 0)
tilesetgz.write(buffer)
tilesetgz.close
Then I tried to get the buffer string back so that I could parse it as XML again, like so:
require "rexml/document"
require "zlib"
include REXML
tilesetfile = File.open("tilesets.rmpy", "r")
tilesetgz = Zlib::GzipReader.new(tilesetfile)
testfile = File.new("importtest.txt", "w")
tilesetdoc = Document.new tilesetgz
It should be noted that neither of these snippets contains the entire system, just the require header and the last few lines that do the compression.
But I get a parsing error because the XML document has been corrupted somehow.
This is the output of the import script before I gzipped it (truncated, of course; the file is 1.5 MB after all):
<tilesetdata>
<tileset>
<id>
1
</id>
<022-Roof01/>
<tileset_name>
019-DesertTown01
</tileset_name>
<autotile_names>
<item>
015-Sa_Water01
</item>
<item>
016-Sa_Shadow01
</item>
<item>
018-Sa_Ground01
</item>
<item>
019-Sa_Grass02
</item>
<item>
020-Sa_Grass03
</item>
<item>
021-Sa_Road01
</item>
<item>
022-Roof01
</item>
</autotile_names>
<panorama_name>
</panorama_name>
<panorama_hue>
0
</panorama_hue>
<fog_name>
</fog_name>
<fog_hue>
0
</fog_hue>
<fog_opacity>
64
</fog_opacity>
<fog_blend_type>
0
</fog_blend_type>
<fog_zoom>
200
</fog_zoom>
<fog_sx>
0
</fog_sx>
<fog_sy>
0
</fog_sy>
<battleback_name>
</battleback_name>
<passages>
<item>
15
</item>
<item>
15
</item>
<item>
15
</item>
<item>
15
</item>
<item>
15
</item>
<item>
15
</item>
On the import side I intercepted the un-gzipped file. Strangely enough, the file is 1.3 MB this time:
<tilesetdata>
<tileset>
<id>
1
</id>
<022-Roof01/>
<tileset_name>
019-DesertTown01
</tileset_name>
<autotile_names>
<item>
015-Sa_Water01
</item>
<item>
016-Sa_Shadow01
</item>
<item>
018-Sa_Ground01
</item>
<item>
019-Sa_Grass02
</item>
<item>
020-Sa_Grass03
</item>
<item>
021-Sa_Road01
</item>
<item>
022-Roof01
</item>
</autotile_names>
<panorama_name>
</panorama_name>
<panorama_hue>
0
</panorama_hue>
<fog_name>
</fog_name>
<fog_hue>
0
</fog_hue>
<fog_opacity>
64
</fog_opacity>
<fog_blend_type>
0
</fog_blend_type>
<fog_zoom>
200
</fog_zoom>
<fog_sx>
0
</fog_sx>
<fog_syy>
0
</fog_sx>
<fog_syy>
0
</fog_sx>
<fog_syy>
0
</fog_sx>
<fog_syy>
0
</fog_sx>
<fog_syy>
0
</fog_sx>
The corruption of the original only gets worse from here on.
To clarify, I ran the import script and it generated the file tilesets.rmpy (at 18 KB),
and then I ran the import to test the system and discovered this.
Any idea what went wrong? Or, if not, how to fix it, or an alternative?

It seems that Gzip only works properly with files opened in binary mode:
require "rexml/document"
require "zlib"
include REXML
tilesetfile = File.new( "tilesets.rmpy", "wb" )
buffer = ""
tilesetgz = Zlib::GzipWriter.new(tilesetfile)
puts "Compressing output for: tilesets.rxdata ..."
tilesetdoc.write(buffer, 0)
tilesetgz.write(buffer)
tilesetgz.close
and
require "rexml/document"
require "zlib"
include REXML
tilesetfile = File.open("tilesets.rmpy", "rb")
tilesetgz = Zlib::GzipReader.new(tilesetfile)
tilesetdoc = Document.new tilesetgz.read.to_s
worked without any problems.
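The binary-mode point can be checked with a small round trip. The following is only a sketch under stated assumptions: the payload is a made-up stand-in for the real XML text, and the file is written to a temporary directory. It uses Zlib::GzipWriter.open and Zlib::GzipReader.open, which open the named file in binary mode themselves, so the mode string cannot be forgotten. A likely reason text mode breaks things is that on Windows it translates line endings, which damages the compressed byte stream.

```ruby
require "zlib"
require "tmpdir"

# Hypothetical payload standing in for the serialized XML text.
xml_text = "<tilesetdata>\n" + ("<item>\n15\n</item>\n" * 1000) + "</tilesetdata>\n"

# Temporary location for the demo; the file name mirrors the one in the question.
path = File.join(Dir.mktmpdir, "tilesets.rmpy")

# Zlib::GzipWriter.open opens the file in binary mode itself
# and closes it when the block returns.
Zlib::GzipWriter.open(path) { |gz| gz.write(xml_text) }

# Zlib::GzipReader.open likewise reads the file in binary mode.
restored = Zlib::GzipReader.open(path) { |gz| gz.read }

puts "round trip intact: #{restored == xml_text}"
puts "compressed: #{File.size(path)} bytes, original: #{xml_text.bytesize} bytes"
```

If the restored string matches the original, it can be handed to Document.new as in the answer above.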

Handling an XML file with Ruby and Nokogiri

I am new to programming so bear with me. I have many XML documents that look like this:
File name: PRIDE_Exp_Complete_Ac_10094.xml.gz
<ExperimentCollection version="2.1">
<Experiment>
<ExperimentAccession>1015</ExperimentAccession>
<Title>Protein complexes in Saccharomyces cerevisiae (GPM06600002310)</Title>
<ShortLabel>GPM06600002310</ShortLabel>
<Protocol>
<ProtocolName>None</ProtocolName>
</Protocol>
<mzData version="1.05" accessionNumber="1015">
<cvLookup cvLabel="RESID" fullName="RESID Database of Protein Modifications" version="0.0" address="http://www.ebi.ac.uk/RESID/" />
<cvLookup cvLabel="UNIMOD" fullName="UNIMOD Protein Modifications for Mass Spectrometry" version="0.0" address="http://www.unimod.org/" />
<description>
<admin>
<sampleName>GPM06600002310</sampleName>
<sampleDescription comment="Ho, Y., et al., Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry. Nature. 2002 Jan 10;415(6868):180-3.">
<cvParam cvLabel="NEWT" accession="4932" name="Saccharomyces cerevisiae (Baker's yeast)" value="Saccharomyces cerevisiae" />
</sampleDescription>
</admin>
</description>
<spectrumList count="0" />
</mzData>
</Experiment>
I want to extract the text inside "Title", "ProtocolName", and "SampleName" and save it into a text file that has the same name as the .xml.gz. I have the following code so far (based on posts I saw on this site), but it does not seem to work:
require 'rubygems'
require 'nokogiri'
doc = Nokogiri::XML(File.open("PRIDE_Exp_Complete_Ac_10094.xml.gz"))
#ExperimentCollection = doc.css("ExperimentCollection Title").map {|node| node.children.text }
Can someone help me?
Thanks

If you are happy with REXML, and there's only one <Experiment> per file, then something like the following should help. (By the way, the text above is invalid XML since there is no closing </ExperimentCollection> tag.)
require "rexml/document"
include REXML
xml=<<EOD
<Experiment>
<ExperimentAccession>1015</ExperimentAccession>
<Title>Protein complexes in Saccharomyces cerevisiae (GPM06600002310)</Title>
<ShortLabel>GPM06600002310</ShortLabel>
<Protocol>
<ProtocolName>None</ProtocolName>
</Protocol>
<mzData version="1.05" accessionNumber="1015">
<cvLookup cvLabel="RESID" fullName="RESID Database of Protein Modifications" version="0.0" address="http://www.ebi.ac.uk/RESID/" />
<cvLookup cvLabel="UNIMOD" fullName="UNIMOD Protein Modifications for Mass Spectrometry" version="0.0" address="http://www.unimod.org/" />
<description>
<admin>
<sampleName>GPM06600002310</sampleName>
<sampleDescription comment="Ho, Y., et al., Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry. Nature. 2002 Jan 10;415(6868):180-3.">
<cvParam cvLabel="NEWT" accession="4932" name="Saccharomyces cerevisiae (Baker's yeast)" value="Saccharomyces cerevisiae" />
</sampleDescription>
</admin>
</description>
<spectrumList count="0" />
</mzData>
</Experiment>
EOD
doc = Document.new xml
# Print the text of each element of interest
puts doc.elements["Experiment/Title"].text
puts doc.elements["Experiment/Protocol/ProtocolName"].text
puts doc.elements["Experiment/mzData/description/admin/sampleName"].text
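The question also mentions that the input is gzipped and that the values should end up in a text file, which the snippet above does not cover. Here is a minimal sketch of those two steps, assuming REXML is available. The shortened sample XML, the temporary directory, and the choice to name the output by swapping ".xml.gz" for ".txt" are all assumptions made for illustration. Note that the element is spelled sampleName (lowercase s) in the document.

```ruby
require "rexml/document"
require "zlib"
require "tmpdir"

# Shortened, hypothetical stand-in for one PRIDE experiment file.
sample_xml = <<~XML
  <ExperimentCollection version="2.1">
    <Experiment>
      <Title>Protein complexes in Saccharomyces cerevisiae (GPM06600002310)</Title>
      <Protocol><ProtocolName>None</ProtocolName></Protocol>
      <mzData version="1.05" accessionNumber="1015">
        <description><admin><sampleName>GPM06600002310</sampleName></admin></description>
      </mzData>
    </Experiment>
  </ExperimentCollection>
XML

# Create a gzipped input file so the sketch is self-contained.
gz_path = File.join(Dir.mktmpdir, "PRIDE_Exp_Complete_Ac_10094.xml.gz")
Zlib::GzipWriter.open(gz_path) { |gz| gz.write(sample_xml) }

# Decompress first: the parser needs the XML text, not the gzip bytes.
xml_text = Zlib::GzipReader.open(gz_path) { |gz| gz.read }
doc = REXML::Document.new(xml_text)

# "//name" finds the first element with that name anywhere in the document.
values = ["Title", "ProtocolName", "sampleName"].map do |name|
  doc.elements["//#{name}"].text
end

# Assumed naming scheme: same base name as the input, with a .txt extension.
txt_path = gz_path.sub(/\.xml\.gz\z/, ".txt")
File.write(txt_path, values.join("\n") + "\n")
puts File.read(txt_path)
```

To process many files, the same steps could be wrapped in a loop over Dir.glob("*.xml.gz").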
