I'm new to Apache NiFi and am trying to process an XML file that looks a bit like this:
<?xml version="1.0" encoding="iso-8859-1"?>
<productCatalog xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<version>CHANNELS-VERSION-100</version>
<channels>
<channel>
<id>1</id>
<name>Super Channel 1</name>
</channel>
<channel>
<id>2</id>
<name>Super Channel 2</name>
</channel>
</channels>
</productCatalog>
What I want is to read the "version" element, then include it in the "channel" children when I process them further down the pipeline, e.g. to produce something like this (in XML or JSON):
<processedChannel>
<catalogVersion>CHANNELS-VERSION-100</catalogVersion>
<id>2</id>
<name>Super Channel 2</name>
</processedChannel>
I've tried various permutations of EvaluateXQuery, SplitXml, UpdateAttribute (to put the version in a flow-file attribute rather than the content), etc., but I cannot seem to make the "version" available for all the "channels" downstream. I can either get a flow that only contains the version, or I can get the channels, but I cannot find a way to combine them.
This seems like it should be easy, but I cannot find an obvious solution.
My real use case has a really big XML file, so I am trying to avoid loading it all in one go - I split it as early as possible so I can stream the children more easily. That's why I want to put the version onto the children if possible.
Any help gratefully received!
ForkRecord should do what you want. From your desired output I think you'll want "extract" as the mode, but you could try both and see what you get for output. ForkRecord and the XML Reader/Writer are available as of NiFi 1.7.0.
@mattyb: Thanks for your suggestions. ForkRecord looks really interesting, but it doesn't fit my current use case because it needs a schema. The EvaluateXPath and EvaluateXQuery options both seem to work now, though, even though I had spent hours playing around with them previously.
Here's my flow now:
ListFile --> FetchFile --> EvaluateXPath (to get the version as a flow-file attribute) --> SplitXml --> etc. Now I have the version in my flow-file attributes for the downstream processing, which is what I wanted.
Not sure why it didn't work before, but thanks for prompting me to look at it again.
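For readers who want to see the same idea outside NiFi, here is a minimal Python sketch (not NiFi code) of "capture the version first, then attach it to every channel as it streams past". It assumes, as in the sample above, that `<version>` appears before `<channels>`; the function name `processed_channels` and the `SAMPLE` constant are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET

# Compact copy of the sample catalog from the question (illustrative data only).
SAMPLE = b"""<?xml version="1.0" encoding="iso-8859-1"?>
<productCatalog xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <version>CHANNELS-VERSION-100</version>
  <channels>
    <channel><id>1</id><name>Super Channel 1</name></channel>
    <channel><id>2</id><name>Super Channel 2</name></channel>
  </channels>
</productCatalog>"""


def processed_channels(source):
    """Yield one <processedChannel> per <channel>, stamped with the catalog version.

    Assumption: <version> precedes the channels in document order, so it has
    already been seen by the time the first </channel> end event fires.
    """
    version = None
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "version":
            version = elem.text
        elif elem.tag == "channel":
            out = ET.Element("processedChannel")
            ET.SubElement(out, "catalogVersion").text = version
            out.extend(list(elem))  # move <id> and <name> across
            yield out
            elem.clear()  # drop the processed channel's children from the source tree


for channel in processed_channels(io.BytesIO(SAMPLE)):
    print(ET.tostring(channel, encoding="unicode"))
```

This plays the same role as EvaluateXPath followed by SplitXml in the flow above: the version is held as a side value (the flow-file attribute in NiFi, a local variable here) while the channels are handled one at a time.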
For reasons out of scope for this question I need to be able to handle multiple xml documents of the same structure but belonging to different namespaces (don't ask).
To achieve this I've become very accustomed to using an xpath like the following for many of my value selections:
//*[local-name()='apple']/*[local-name()='flavor']/text()
My lack of understanding of predicates is preventing me from selecting a node's value based upon a sibling node's value. Consider the following xml:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<fruit>
<apple>
<kind>Red Delicious</kind>
<flavor>starchy</flavor>
</apple>
<apple>
<kind>Granny Smith</kind>
<flavor>tart</flavor>
</apple>
<apple>
<kind>Pink Lady</kind>
<flavor>sweet</flavor>
</apple>
</fruit>
Let's say I want to write an xpath that will select the flavor of a Granny Smith apple. While I would normally do something like:
//apple[kind/text()='Granny Smith']/flavor/text()
I cannot figure out how to merge the concept of utilizing local-name() to be namespace agnostic while still selecting a node based upon a sibling's value.
In short, what is the xpath necessary to return "tart" regardless of what namespace the input fruit xml document belongs to?
I need to be able to handle multiple xml documents of the same
structure but belonging to different namespaces (don't ask)
My preferred way to handle this is to first transform the data to use a single namespace, and then do the transformation proper. Doing it this way (a) keeps the real transformation much simpler, (b) puts all the namespace conversion logic in one place, and (c) makes the namespace conversion logic reusable - you can use the same transformation regardless how the data will subsequently be used.
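As a rough illustration of the "normalize first, query second" approach, the Python sketch below uses a simple variation: it strips namespaces from element names entirely rather than mapping them to one common namespace, then runs the plain query. The namespace URI `urn:example:orchard` and the helper name `strip_namespaces` are invented for the example, and attribute names are left untouched.

```python
import xml.etree.ElementTree as ET

# The fruit document from the question, placed in a made-up namespace.
DOC = """<f:fruit xmlns:f="urn:example:orchard">
  <f:apple><f:kind>Red Delicious</f:kind><f:flavor>starchy</f:flavor></f:apple>
  <f:apple><f:kind>Granny Smith</f:kind><f:flavor>tart</f:flavor></f:apple>
  <f:apple><f:kind>Pink Lady</f:kind><f:flavor>sweet</f:flavor></f:apple>
</f:fruit>"""


def strip_namespaces(root):
    """Rewrite '{uri}local' element tags to plain 'local' names, in place."""
    for elem in root.iter():
        # Comments and processing instructions have non-string tags; skip them.
        if isinstance(elem.tag, str) and elem.tag.startswith("{"):
            elem.tag = elem.tag.split("}", 1)[1]
    return root


# Step 1: normalize. Step 2: the query no longer needs local-name().
root = strip_namespaces(ET.fromstring(DOC))
flavor = root.findtext(".//apple[kind='Granny Smith']/flavor")
print(flavor)
```

If a single XPath 1.0 expression is required instead, the local-name() test can be nested inside the predicate: `//*[local-name()='apple'][*[local-name()='kind']='Granny Smith']/*[local-name()='flavor']/text()`.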
I am new to Ruby and XML. I have been given an XML file and asked to do some data manipulation on it.
For example, consider the XML file below.
<?xml version="1.0" encoding="UTF-8"?>
<note>
<to> Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
They are asking me to extract the strings inside the tags (for example "Tove" and "Jani"), manipulate them (for example, replace "Tove" with "John"), and write the result back to the same XML document.
I know Ruby has a lot of gems and utilities, so there must be a good one for this. If anyone knows of a utility that makes this easy, please let me know.
If there is no such utility, some idea of how to proceed would be appreciated.
One way is to use REXML that comes as part of the standard library.
Another way is to use Nokogiri (I would recommend using this).
Here are some good tutorials that will definitely help you:
http://ruby.bastardsbook.com/chapters/html-parsing/
https://blog.engineyard.com/2010/getting-started-with-nokogiri/
I am fetching data from a database and need to send the response as XML, like the example below.
I want to either load the data into an array or hash and then convert that to XML, or build the XML directly.
Please refer to the XML example below:
<Response>
<Tolls>
<Toll>
<Id>123</Id>
<Name>Bradfield Highway</Name>
<Address>Bradfield Highway, New York</Address>
<Charge>5.95</Charge>
<Location lat="41.145556" lng="-73.995"/>
<EntryRects>
<EntryRect>
<Points>
<Point lat="41.145556" lng="-73.995"/>
<Point lat="41.145556" lng="-73.995"/>
<Point lat="41.145556" lng="-73.995"/>
<Point lat="41.145556" lng="-73.995"/>
</Points>
</EntryRect>
...
</EntryRects>
</Toll>
<Toll>
...
</Toll>
...
</Tolls>
</Response>
Please reply if anyone knows how to do this.
You should use Builder::XmlMarkup, which provides a simple way to create XML markup and data structures.
You don't say what database you are using, but many can generate the XML for you as the result of a query, instead of returning a normal "select" statement's output. That would be the fastest/easiest path because the data is going to have to be returned to your app anyway, so let the DBMS do the conversion on the fly.
Second easiest is to use something like Nokogiri, Builder or one of several other gems. They can handle the encoding of non-ASCII characters, specifying the correct headers, and make sure the nesting and tag closure is correct. That's why people use those tools, because they save a huge amount of coding.
The last choice should be attempting to do it yourself. Simply because you asked the question, I suspect you don't really understand what goes into creating well-formed XML. It's possible to generate trivial XML output using something like ERB or maybe HAML to help with the nesting, but encoding will fall directly on you. If you insist on doing it, then start reading all the related links on the right side of the Stack Overflow page, plus any XML documentation you can find.
From my VS2010 deployment project I would like to apply two different transformations to two different attributes of one element in my web.config. Consider the following web.config snippet:
<exampleElement attr1="false" attr2="false" attr3="true" attr4="~/" attr5="false">
<supportedLanguages>
<!-- Some more elements here -->
</supportedLanguages>
</exampleElement>
Now how can I change attribute 'attr1' and remove attribute 'attr5' in the transformed web.config? I know how to perform the individual transformations:
<exampleElement attr1="true" xdt:Transform="SetAttributes(attr1)"></exampleElement>
and:
<exampleElement xdt:Transform="RemoveAttributes(attr5)"></exampleElement>
But I don't know how to combine these transforms. Anybody?
EDIT:
Can't answer my own question yet, but it seems to be possible to repeat the same element with different transformations, like so:
<?xml version="1.0"?>
<configuration xmlns:xdt="http://schemas.microsoft.com/XML-Document-Transform">
<exampleElement attr1="true" xdt:Transform="SetAttributes(attr1)"></exampleElement>
<exampleElement xdt:Transform="RemoveAttributes(attr5)"></exampleElement>
</configuration>
As said, this seems to work, but I'm not sure whether this is the intended use of the web.config transformation syntax.
As Nick Nieslanik confirmed, this is done by repeating the same element with different transformations, like so:
<?xml version="1.0"?>
<configuration xmlns:xdt="http://schemas.microsoft.com/XML-Document-Transform">
<exampleElement attr1="true" xdt:Transform="SetAttributes(attr1)"></exampleElement>
<exampleElement xdt:Transform="RemoveAttributes(attr5)"></exampleElement>
</configuration>
I'm using the XmlPreprocess tool for config file transformation and manipulation. It uses one mapping file for multiple environments, and you can edit the mapping file in Excel. It is very easy to use.
You can update your config files with XmlPreprocess and pass the configuration (debug, dev, prod, ...) as a parameter to select the setup.
I'm interested in finding out what's the shortest script one can write to replace one XML element in a file with another one from a second file.
I can whip up a simple program to easily do this, but I'm wondering if it's easily do-able using a shell script. It's just a utility tool meant as a convenience. Can this be done using sed? awk? I'm not familiar with those. I suppose I can probably do it with a combination of grep and wc, but it seems likely that there's a much more direct way to do this.
Essentially, I have a large configuration file, say config.xml, which looks something like this:
<config>
<element name="a">
<subelement />
</element>
<element name="b">
<subelement />
</element>
<element name="c">
<subelement />
</element>
<!-- and so on... -->
</config>
Once in a while, changes require me to modify/add/delete one subelement. Now, it so happens that there's a sort of generator that will generate an up-to-date subconfig.xml, like the following file:
<config>
<element name="c">
<subelement />
<subelement />
</element>
</config>
My thinking is that if I can take the element in subconfig.xml and replace the existing one in config.xml, then hey, that'd be great! Yeah, it's not much of a work-saver, since it's only needed rarely, but it occurred to me that I could try to do it in a script. I'm just not sure how.
Any help appreciated (including pointing out that I'd be better off writing a program for this ^-^).
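If your XML is consistently formatted and your replacement requirement is simple, there's actually no need to use a parser. Plain awk will do, assuming each tag sits on its own line as in your example. Read subconfig.xml first to capture the replacement element, then swap it in while copying config.xml:
$ awk 'NR==FNR { if (/<element name="c">/) keep=1; if (keep) repl = repl $0 "\n"; if (/<\/element>/) keep=0; next }
    /<element name="c">/ { printf "%s", repl; skip=1 }
    skip { if (/<\/element>/) skip=0; next }
    1' subconfig.xml config.xml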
I wouldn't attempt to do this with command-line tools; you'll run into all sorts of difficulties. The way you should do this is with a proper XML parser. The logic would be: read the original file and parse it, read the update file and parse it, identify which node is to be replaced, do the replacement, and write out the result.
I don't know what you are comfortable with coding-wise, but there are XML parsers available in most popular languages.
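The steps above can be sketched in Python with the standard-library ElementTree parser. This is only one possible shape, under the assumption that elements are identified by their `name` attribute and sit directly under `<config>`; the helper names `find_named` and `replace_element` are invented for the example, and the two documents are inlined as strings so the sketch runs on its own.

```python
import xml.etree.ElementTree as ET

# Inlined stand-ins for config.xml and subconfig.xml from the question.
CONFIG = """<config>
  <element name="a"><subelement /></element>
  <element name="b"><subelement /></element>
  <element name="c"><subelement /></element>
</config>"""

SUBCONFIG = """<config>
  <element name="c"><subelement /><subelement /></element>
</config>"""


def find_named(root, name):
    """Return (index, element) of the direct child <element name=...>."""
    for index, child in enumerate(root):
        if child.tag == "element" and child.get("name") == name:
            return index, child
    raise LookupError(f"no <element name={name!r}> found")


def replace_element(config_root, sub_root, name):
    """Swap the named element in config_root for the one in sub_root."""
    index, old = find_named(config_root, name)
    _, new = find_named(sub_root, name)
    new.tail = old.tail  # keep the whitespace that followed the old element
    config_root[index] = new
    return config_root


config_root = replace_element(ET.fromstring(CONFIG), ET.fromstring(SUBCONFIG), "c")
print(ET.tostring(config_root, encoding="unicode"))
```

For real files, `ET.parse("config.xml")` and `tree.write("config.xml")` would take the place of the inline strings. One caveat: ElementTree's default parser discards comments, so a comment such as the one in the sample config would not survive the round trip unless the parser is configured to keep it.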