Replacing one XML element with another

I'm interested in finding out what's the shortest script one can write to replace one XML element in a file with another one from a second file.
I can whip up a simple program to easily do this, but I'm wondering if it's easily do-able using a shell script. It's just a utility tool meant as a convenience. Can this be done using sed? awk? I'm not familiar with those. I suppose I can probably do it with a combination of grep and wc, but it seems likely that there's a much more direct way to do this.
Essentially, I have a large configuration file, say config.xml, which looks something like this:
<config>
  <element name="a">
    <subelement />
  </element>
  <element name="b">
    <subelement />
  </element>
  <element name="c">
    <subelement />
  </element>
  <!-- and so on... -->
</config>
Once in a while, changes require me to modify, add, or delete one subelement. Now, it so happens that there's a generator that will produce an up-to-date subconfig.xml, like the following file:
<config>
  <element name="c">
    <subelement />
    <subelement />
  </element>
</config>
My thinking is that if I can take the element in subconfig.xml and use it to replace the existing one in config.xml, then hey, that'd be great! Yes, it's not much of a work-saver, since it's only needed rarely, but it occurred to me that I could try to do it in a script, and I'm not sure how.
Any help appreciated (including pointing out that I'd be better off writing a program for this ^-^).

If your XML is consistent and your replacement requirement is simple, there's actually no need to use a parser. Plain awk will do. Note that this swaps the whole <config> block in config.xml for the contents of subconfig.xml:
$ subconfig=$(<subconfig.xml)
$ awk -v subconf="$subconfig" '/<config>/{print subconf}/<config>/,/<\/config>/{next}1' config.xml

I wouldn't attempt to do this with command-line tools; you'll run into all sorts of difficulties. The way you should do this is with a proper XML parser. The logic would be: read and parse the original file, read and parse the update file, identify which node the new element is to replace, do the replacement, and write out the result.
I don't know what you are comfortable with coding-wise, but there are XML parsers available in most popular languages.
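The parse-and-replace logic described above could be sketched in Python with the standard library's xml.etree.ElementTree. This is only a minimal sketch: it assumes elements are matched by their name attribute and sit directly under the root, as in the question's sample, and the function name replace_element is made up for illustration.

```python
import xml.etree.ElementTree as ET


def replace_element(config_xml: str, subconfig_xml: str) -> str:
    """Replace each <element> in config_xml whose name attribute matches an
    <element> in subconfig_xml; append the new element if nothing matches."""
    config_root = ET.fromstring(config_xml)
    sub_root = ET.fromstring(subconfig_xml)

    for new_elem in sub_root.findall("element"):
        name = new_elem.get("name")
        # Iterate over a snapshot so the tree can be modified safely.
        for index, old_elem in enumerate(list(config_root)):
            if old_elem.tag == "element" and old_elem.get("name") == name:
                new_elem.tail = old_elem.tail  # keep surrounding whitespace
                config_root.remove(old_elem)
                config_root.insert(index, new_elem)
                break
        else:
            # No element with this name exists yet (assumed behaviour: append).
            config_root.append(new_elem)

    return ET.tostring(config_root, encoding="unicode")
```

To use it on files, read config.xml and subconfig.xml into strings, call the function, and write the result back out. Be aware that the default parser drops comments, so something like <!-- and so on... --> would not survive the round trip.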

Organizing references by year in Pandoc when generating HTML

I am relatively new to Pandoc and I am trying to generate an HTML file with my publications to put up on my website. I'd like to have the publication list numbered and organized by year, with the most recent first and the oldest last.
I can get the numbering fine with the proper csl file, but I can't get the year sorting. The problem is that I'm not first author in all my publications, so what ends up happening is that they are organized alphabetically first and then by date, which is not what I want.
I can get the result I want when generating a PDF by using biblatex with the option sorting=ydnt (Year (Descending), Name, Title), but since Pandoc doesn't use biblatex to generate a list of references to HTML, I can't use this tactic here.
The only way I can see to solve this is to get a citation style from the Zotero style repo that does what I want, but I haven't been able to find one. So I'm trying to modify one to do it, but without success.
This answer teaches a way to change the sorting style, so I'm trying to manually change the sorting style of the Proceedings of the Royal Society B style. Specifically I'm changing
<sort>
  <key variable="citation-number"/>
</sort>
to
<sort>
  <key macro="issued" sort="descending"/>
  <key macro="author"/>
</sort>
But that doesn't work (probably because that only changes the sorting of the text citations, not the reference list). I've tried a couple of other things, but I can't find something that works!
This doesn't matter much I guess, but I'm using Pandoc 2.7.3, citeproc version 0.16.2 and the file that I'm running on is:
---
bibliography: selectedpubs.bib
nocite: '@*'
linestretch: 1.5
fontsize: 12pt
output:
html:
output: pubpage.html
filter: pandoc-citeproc
csl: prsb2.csl
...
The file prsb2.csl is just the Proceedings of the Royal Society B csl.

You have the right idea, but misunderstood the linked thread. Instead of changing the sort keys for the citation, you'll want to add sorting to the bibliography, i.e.
<bibliography second-field-align="flush" et-al-min="11" et-al-use-first="10">
  <sort>
    <key macro="issued" sort="descending"/>
    <key macro="author"/>
  </sort>
  <layout>
Instead of modifying a style, you could also use the APA-CV style that already exists in the repository.

Apache Nifi - move a top-level element into children (JSON/XML)

New to Apache Nifi and trying to process an XML that looks a bit like this:
<?xml version="1.0" encoding="iso-8859-1"?>
<productCatalog xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <version>CHANNELS-VERSION-100</version>
  <channels>
    <channel>
      <id>1</id>
      <name>Super Channel 1</name>
    </channel>
    <channel>
      <id>2</id>
      <name>Super Channel 2</name>
    </channel>
  </channels>
</productCatalog>
What I want is to read the "version" element and then include it in the "channel" children when I process them further down the pipeline, e.g. to produce something like this (in XML or JSON):
<processedChannel>
  <catalogVersion>CHANNELS-VERSION-100</catalogVersion>
  <id>2</id>
  <name>Super Channel 2</name>
</processedChannel>
I've tried various permutations of XQuery, XMLSplit, UpdateAttribute (to put the version in a flow attribute rather than the content), etc., but I cannot seem to make the "version" available to all the "channels" downstream. I can either get a flow that only contains the version, or I can get the channels, but I cannot find a way to combine them.
This seems like it should be easy, but I cannot find an obvious solution.
My real use case has a really big XML file, so I am trying to avoid loading it all in one go - I split it as early as possible so I can stream the children more easily. That's why I want to put the version onto the children if possible.
Any help gratefully received!

ForkRecord should do what you want. From your desired output I think you'll want "extract" as the mode, but you could try both and see what you get for output. ForkRecord and the XML Reader/Writer are available as of NiFi 1.7.0.

@mattyb: Thanks for your suggestions. ForkRecord looks really interesting, but doesn't fit with my current use case because it needs a schema. But the EvaluateXPath and EvaluateXQuery options both seem to work now, even though I spent hours playing around with these previously.
Here's my flow now:
ListFile --> FetchFile --> Evaluate XPath (to get version as flow-file attribute) --> SplitXml --> etc - and now I have the version in my flow-file attributes for the downstream processing, which was what was wanted.
Not sure why it didn't work before, but thanks for prompting me to look at it again.
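Outside NiFi, the same idea (capture the version once, then attach it to every channel as it streams past) might look like the following Python sketch. It is not NiFi configuration; it assumes <version> appears before <channels>, as in the sample, and the function name channels_with_version is hypothetical.

```python
import io
import xml.etree.ElementTree as ET


def channels_with_version(source):
    """Yield one dict per <channel>, each carrying the catalog version.

    `source` is a file name or binary file object. The document is parsed
    incrementally, so the whole tree is never held in memory at once.
    """
    version = None
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "version":
            version = elem.text
        elif elem.tag == "channel":
            yield {
                "catalogVersion": version,
                "id": elem.findtext("id"),
                "name": elem.findtext("name"),
            }
            elem.clear()  # drop the processed channel's children


if __name__ == "__main__":
    sample = (
        b'<?xml version="1.0" encoding="iso-8859-1"?>'
        b"<productCatalog><version>CHANNELS-VERSION-100</version>"
        b"<channels><channel><id>1</id><name>Super Channel 1</name></channel>"
        b"</channels></productCatalog>"
    )
    for row in channels_with_version(io.BytesIO(sample)):
        print(row)
```

This mirrors the flow above: the version plays the role of the flow-file attribute set by EvaluateXPath, and each yielded dict corresponds to one split child.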

Parsing an XML document in Ruby

I am new to Ruby and XML. I have been given an XML file and asked to do some data manipulation on it.
For example, consider the XML file below.
<?xml version="1.0" encoding="UTF-8"?>
<note>
  <to> Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
I have been asked to extract the strings inside the tags (for example "Tove" and "Jani"), do some manipulation on them (for example, replacing "Tove" with "John"), and write the data back to the same XML document.
I know Ruby has a lot of gems and utilities, and there must be a good one for this. If anyone knows of a utility that makes this easy, please let me know.
And if there is no such utility, any ideas on how to proceed would be welcome.

One way is to use REXML that comes as part of the standard library.
Another way is to use Nokogiri (I would recommend using this).
Here are some good tutorials that will definitely help you:
http://ruby.bastardsbook.com/chapters/html-parsing/
https://blog.engineyard.com/2010/getting-started-with-nokogiri/

How to apply two different transformations on one web.config element?

From my VS2010 deployment project I would like to apply two different transformations to two different attributes of one element in my web.config. Consider the following web.config snippet:
<exampleElement attr1="false" attr2="false" attr3="true" attr4="~/" attr5="false">
  <supportedLanguages>
    <!-- Some more elements here -->
  </supportedLanguages>
</exampleElement>
Now how can I change attribute 'attr1' and remove attribute 'attr5' in the transformed web.config? I know how to perform the individual transformations:
<exampleElement attr1="true" xdt:Transform="SetAttributes(attr1)"></exampleElement>
and:
<exampleElement xdt:Transform="RemoveAttributes(attr5)"></exampleElement>
But I don't know how to combine these transforms. Anybody?
EDIT:
I can't answer my own question yet, but the solution seems to be to repeat the same element with different transformations, like so:
<?xml version="1.0"?>
<configuration xmlns:xdt="http://schemas.microsoft.com/XML-Document-Transform">
  <exampleElement attr1="true" xdt:Transform="SetAttributes(attr1)"></exampleElement>
  <exampleElement xdt:Transform="RemoveAttributes(attr5)"></exampleElement>
</configuration>
As said, this seems to work, but I'm not sure whether this is the intended use of the web.config transformation syntax.

As Nick Nieslanik confirmed, this is done by repeating the same element with different transformations, like so:
<?xml version="1.0"?>
<configuration xmlns:xdt="http://schemas.microsoft.com/XML-Document-Transform">
  <exampleElement attr1="true" xdt:Transform="SetAttributes(attr1)"></exampleElement>
  <exampleElement xdt:Transform="RemoveAttributes(attr5)"></exampleElement>
</configuration>

I use the XmlPreprocess tool for config file transformation and manipulation. It uses one mapping file for multiple environments, and you can edit the mapping file in Excel. It is very easy to use.
You can update your config files with XmlPreprocess and pass the configuration (debug, dev, prod, ...) as a parameter for the different setups.

Intelligencia URLRewriter - All .html to .aspx

How do I tell URLRewriter to convert all *.html requests into *.aspx requests?
The following works just fine for one page.
<rewrite url="~/mypage.html" to="mypage.aspx" />
How do I do it in one place for all pages?
Thanks,

I'm not sure that this approach will be easy to maintain in the future. While this is easy to map as a wildcard, can you explain why this rule is needed?
For example, if you ever wanted to legitimately add a .HTML file to your solution, it would not be possible without an explicit <ignore /> statement being added to your XML file.
