Trying to parse XML with Nokogiri and Ruby

I am trying to parse the XML below to get the email address out. I can get the MessageID, but I think that only works because the a: prefix in front lets me use XPath. I am not sure how to pull out the email address. I am trying
xml.xpath("//s:Body/Discover/request/EmailAddress").children.text.to_s
and
xml.xpath("//s:Body/Discover/EmailAddress").children.text.to_s
If I do xml.xpath("//s:Body").children.text.to_s I get the email and the version with all the newlines and tabs, but I would rather not parse the email out of that string if I do not have to.
<s:Envelope xmlns:a="http://www.w3.org/2005/08/addressing" xmlns:s="http://www.w3.org/2003/05/soap-envelope">
<s:Header>
<a:Action s:mustUnderstand="1">test url</a:Action>
<a:MessageID>mid</a:MessageID>
<a:ReplyTo>
<a:Address>test url</a:Address>
</a:ReplyTo>
<a:To s:mustUnderstand="1">test url</a:To>
</s:Header>
<s:Body>
<Discover xmlns="test url">
<request xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
<EmailAddress>bob@xml.com</EmailAddress>
<RequestVersion>1.0</RequestVersion>
</request>
</Discover>
</s:Body>
</s:Envelope>

The default namespace declared on Discover (xmlns="test url") is preventing Nokogiri's XPath from matching the unprefixed element names inside s:Body. As a quick workaround, try simply
email = xml.xpath("//s:Body").first.to_xml.scan(/<EmailAddress>([^<]+)/)[0][0]

The Discover element (and its children) are in a different namespace, and you need to specify this in your query. The second argument to the xpath method is a hash where you can associate prefixes used in the query with namespace URIs. Have a look at the section on namespaces in the Nokogiri tutorial.
With Nokogiri, if you don’t specify a namespace hash it will automatically register any namespaces defined on the root node for you. In this case that is the a prefix for http://www.w3.org/2005/08/addressing and the s prefix for http://www.w3.org/2003/05/soap-envelope. This is why your query for //s:Body works. The namespace declaration for Discover isn’t on the root, so you have to register it yourself.
When you provide your own namespace hash Nokogiri doesn’t add those defined on the root, so you will also need to include any of those used in your query.
In your case the following will find the EmailAddress node. The actual prefix you use doesn’t matter (here I’ve chosen t) as long as the URI matches.
xml.xpath('//s:Body/t:Discover/t:request/t:EmailAddress',
's' => "http://www.w3.org/2003/05/soap-envelope",
't' => "test url")
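
To make the prefix binding concrete, here is a minimal, self-contained sketch of the same lookup. It is an illustration rather than the Nokogiri call above: it uses REXML (bundled with Ruby) so that it runs without installing a gem, and REXML's treatment of unprefixed names differs from Nokogiri's. The document is trimmed, and the urn:example:discover namespace, the sample address, and the extract_email helper are placeholders that do not appear in the question.

```ruby
require "rexml/document"

# Trimmed sample. "urn:example:discover" stands in for the question's "test url".
SOAP_XML = <<~XML
  <s:Envelope xmlns:a="http://www.w3.org/2005/08/addressing" xmlns:s="http://www.w3.org/2003/05/soap-envelope">
    <s:Header>
      <a:MessageID>mid</a:MessageID>
    </s:Header>
    <s:Body>
      <Discover xmlns="urn:example:discover">
        <request xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
          <EmailAddress>user@example.com</EmailAddress>
          <RequestVersion>1.0</RequestVersion>
        </request>
      </Discover>
    </s:Body>
  </s:Envelope>
XML

# The prefixes are local to the query; only the URIs must match the document.
NAMESPACES = {
  "s" => "http://www.w3.org/2003/05/soap-envelope",
  "t" => "urn:example:discover"
}

# Hypothetical helper: returns the EmailAddress text, or nil if nothing matches.
def extract_email(xml)
  doc = REXML::Document.new(xml)
  node = REXML::XPath.first(
    doc, "//s:Body/t:Discover/t:request/t:EmailAddress", NAMESPACES
  )
  node && node.text
end

puts extract_email(SOAP_XML)
```

The t prefix never appears in the document itself. It exists only in the query, which is why any name works as long as it maps to the URI declared on Discover.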

Related

Issue on Camel route - parsing XML tags

I have a complex Camel route that starts with an initialization route, which tries to set headers from the XML used as input.
I can't work out why the route is unable to parse the XML content using XPath.
Before calling the route, I print the XML in my Java JUnit test, and it prints correctly, with all the XML tags.
So I know the information is being sent as I expect.
But that route, which should set the headers using XPath, returns an empty result for every expression I try! I even used an XPath tool (https://codebeautify.org/Xpath-Tester) to check whether it was an XPath coding mistake, but there I get the results I want.
So, let's suppose, I have an XML as:
<bic:Test>
<bic:context>
<bic:memberCode>GOOGLE</bic:memberCode>
</bic:context>
</bic:Test>
So, with the line below:
<setHeader headerName="myHeader">
<xpath resultType="java.lang.String">//<anyTag>/text()</xpath>
</setHeader>
or
<setHeader headerName="myHeader">
<xpath resultType="java.lang.String">//<anyTag></xpath>
</setHeader>
I will see the header with empty content.
I tried so many different things that I finally decided to print all the content, using / as the XPath expression.
It will print only the content ("GOOGLE"), not the tags.
Could you please assist me?
Thank you in advance!
This is probably a namespace-related issue.
You have to define the bic namespace in the Camel context and then use it in the XPath expression.
Have a look at the documentation at https://github.com/apache/camel/blob/master/camel-core/src/main/docs/xpath-language.adoc, particularly the "Using XML configuration" example.
Also look at "Namespace auditing to aid debugging" for more on debugging namespace-related issues in Camel.

How to get a child value with XPath

I have this xml
<S:Envelope xmlns:S="http://schemas.xmlsoap.org/soap/envelope/">
<S:Body>
<loginResponse xmlns="http://www.tedial.com/3rdparty/"
xmlns:ns2="http://www.tedial.com/apiextension/">
<session>1C7AE89A-73BF-01E9-9D3F-0010007FFF00</session>
</loginResponse>
</S:Body>
</S:Envelope>
I have tried many combinations but I am unable to get the session value. Can you help me?
I tried //S:Envelope//S:Body//ns2:loginResponse//ns2:session with no luck.
You used the wrong namespace. The default namespace declared on loginResponse, xmlns="http://www.tedial.com/3rdparty/", is inherited by the session element, so both elements are in that namespace and not in the ns2 namespace you assigned to them.
Define a third namespace prefix for http://www.tedial.com/3rdparty/ (here I used third) and use it for both loginResponse and session:
/S:Envelope/S:Body/third:loginResponse/third:session
Try this:
//S:Envelope//S:Body/loginResponse/session/text()
ns2 prefix not needed.
I got it working now. The problem was specific to SoapUI. For some reason SoapUI automatically assigns prefixes to default namespaces, starting with ns1. In my case this XPath expression worked fine:
//S:Envelope//S:Body//ns1:loginResponse//ns1:session
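
The difference between the two prefix bindings discussed above can be checked outside SoapUI. The following is a rough sketch using REXML (bundled with Ruby); the find_text helper and the constant names are made up for the illustration, and other XPath engines may register prefixes differently.

```ruby
require "rexml/document"

LOGIN_RESPONSE = <<~XML
  <S:Envelope xmlns:S="http://schemas.xmlsoap.org/soap/envelope/">
    <S:Body>
      <loginResponse xmlns="http://www.tedial.com/3rdparty/"
                     xmlns:ns2="http://www.tedial.com/apiextension/">
        <session>1C7AE89A-73BF-01E9-9D3F-0010007FFF00</session>
      </loginResponse>
    </S:Body>
  </S:Envelope>
XML

# "third" is bound to the default namespace that loginResponse and session inherit.
CORRECT_NS = {
  "S"     => "http://schemas.xmlsoap.org/soap/envelope/",
  "third" => "http://www.tedial.com/3rdparty/"
}

# The binding from the question: ns2 names a namespace neither element is in.
WRONG_NS = {
  "S"   => "http://schemas.xmlsoap.org/soap/envelope/",
  "ns2" => "http://www.tedial.com/apiextension/"
}

# Hypothetical helper: text of the first node matching path, or nil.
def find_text(xml, path, namespaces)
  node = REXML::XPath.first(REXML::Document.new(xml), path, namespaces)
  node && node.text
end

good = find_text(LOGIN_RESPONSE,
                 "/S:Envelope/S:Body/third:loginResponse/third:session",
                 CORRECT_NS)
bad = find_text(LOGIN_RESPONSE,
                "//S:Envelope//S:Body//ns2:loginResponse//ns2:session",
                WRONG_NS)

puts good.inspect
puts bad.inspect
```

The first lookup should return the session id, and the second should come back as nil, because matching is done on the namespace URI and never on the prefix text.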

Formatting the response XML of a WebService when using Savon

I'm working using an external Webservice which I consume using the Savon gem.
I want to process the WebService response before Savon does, in order to clean the XML and get the correct Hash to work with. Currently, the Savon call method returns this Hash:
{:envelope => {
:body => {
:get_method_result => {
:result=>"OK",
:dataset_xml => "
<NewDataSet>
<xs:schema id=\"NewDataSet\" xmlns=\"\"........
As you can see, the value of dataset_xml is an XML string, so I have to take it and process it separately to get a complete Hash.
All of this happens because my response contains things like <NewDataSet>\r\n <xs:schema id=\"NewDataSet\" xmlns=\ inside its XML. If I could fix that, I wouldn't need the extra post-processing step to turn it into a Hash.
You could simply parse the XML yourself with the Nokogiri gem. Have you tried that already?
I would simply try
Nokogiri::XML(response[:body])
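
If the value under :dataset_xml is itself an XML string, a second parsing pass over just that string is one way to get nested data. The sketch below shows the idea with REXML (bundled with Ruby) rather than Nokogiri. The payload in the question is truncated, so the Table rows, the field names, and the dataset_rows helper are invented purely for illustration.

```ruby
require "rexml/document"

# Hypothetical stand-in for the Hash Savon returns; the real dataset is elided
# in the question, so these rows and field names are made up.
RESPONSE = {
  envelope: {
    body: {
      get_method_result: {
        result: "OK",
        dataset_xml: "<NewDataSet>\r\n  <Table><Id>1</Id><Name>Alice</Name></Table>\r\n" \
                     "  <Table><Id>2</Id><Name>Bob</Name></Table>\r\n</NewDataSet>"
      }
    }
  }
}

# Parses the embedded XML string and returns one Hash per child element of the root.
def dataset_rows(response)
  inner = response.dig(:envelope, :body, :get_method_result, :dataset_xml)
  doc = REXML::Document.new(inner)
  doc.root.elements.map do |row|
    row.elements.each_with_object({}) do |field, hash|
      hash[field.name.to_sym] = field.text
    end
  end
end

p dataset_rows(RESPONSE)
```

This leaves the outer Hash untouched and converts only the embedded document, so it can run after the normal Savon call without changing anything on the server side.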
A friend solved it by adding the mod_substitute module to Apache. I used it to filter the incoming XML and strip out the CDATA markers. With that done, the Savon gem received clean XML, which it parsed perfectly into a Hash in one step.
<Location />
AddOutputFilterByType SUBSTITUTE text/html
Substitute "s|CDATAREGEX|' '|i"
</Location>

Webapi put parameter from body requires xml to contain xmlns

I'm hoping this is an easy question. I haven't built a public API with Web API or a RESTful service before. I have noticed that when a Put or Post method takes a complex object parameter, the XML sent in the body is required to contain namespace info. For example:
public HttpResponseMessage Put(Guid vendortoken, [FromBody] ActionMessage message)
{
if (message == null)
return Request.CreateErrorResponse(HttpStatusCode.ExpectationFailed,
"actionmessage must be provided in request body.");
return Request.CreateResponse(HttpStatusCode.OK);
}
For message to come back not null my request has to look like this.
<ActionMessage
xmlns:i="http://www.w3.org/2001/XMLSchema-instance"
xmlns="http://schemas.datacontract.org/2004/07/IntegrationModels">
<Message i:type="NewAgreement">
<AgreementGuid>3133b145-0571-477e-a87d-32f165187783</AgreementGuid>
<PaymentMethod>Cash</PaymentMethod>
</Message>
<Status>0</Status>
</ActionMessage>
The key here, of course, is the xmlns. On one hand, the namespace is pretty generic, so I feel it shouldn't be an issue for vendors to provide; on the other hand, should they really need to? If not, how can I fix this so that message comes back populated when they leave the namespace out?
ah if only I could make them all use json :(
The namespace is significant in XML. If you want to remove it, what you can do is to change your ActionMessage class, to annotate it with the appropriate attribute (in your case, I'm assuming it would be the [DataContract(Namespace="")]), and that should remove the need for the namespace in the input (actually, after making that change using the namespace would be an error, so please consider the implications if you already have clients using your API out there).

Obtain XML element's value from REST server response using Ruby

n00b REST question. I'm making a GET request to an API endpoint and getting the proper XML response. How do I get the value of a particular XML element in the server's REST response using Ruby?
So let's say one of the elements is 'Body' and I want to assign its value, 'Blah blah blah', to a variable.
Part of the XML response:
<Body>Blah blah blah</Body>
How would I do that with the response? Basically I want to do something like this
variable = params["Body"]
Thanks in advance!
The best solution is to use RestClient or HTTParty and have it parse the response for you.
Otherwise, you'll have to parse the response itself using a library such as Nokogiri:
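require "nokogiri"
doc = Nokogiri::XML(response)
# XML element names are case-sensitive, so the selector must be "Body", not "body"
variable = doc.at("Body").text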
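
If adding a gem is not an option, the same lookup can be sketched with REXML, which is bundled with Ruby. The wrapper document and the element_text helper below are hypothetical; only the Body element comes from the question. The sketch also shows that element names are matched case-sensitively.

```ruby
require "rexml/document"

# Hypothetical response body; only <Body> is taken from the question.
RESPONSE_XML =
  "<Message><Subject>Hello</Subject><Body>Blah blah blah</Body></Message>"

# Returns the text of the first element called name, or nil if there is none.
def element_text(xml, name)
  node = REXML::XPath.first(REXML::Document.new(xml), "//#{name}")
  node && node.text
end

variable = element_text(RESPONSE_XML, "Body")
puts variable

# A lowercase name is not expected to match, since XML names are case-sensitive.
puts element_text(RESPONSE_XML, "body").inspect
```

Returning nil for a missing element, instead of calling text on it directly, avoids a NoMethodError when the response does not contain the expected tag.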
You'll want to use an XML parser of some kind.
It sounds like you want something like XmlSimple, which will turn an XML document into ruby arrays and hashes. There's tons of examples of how to use it on the page that has been linked.
One thing to be aware of is that XML to native container mappings are imperfect. If you're dealing with a complex document, you'll likely want to use a more robust parser, like Nokogiri.
If you want full XML Object Mapping, HappyMapper is a decent library, although it isn't very active anymore. It can work with XML from any source, so you'll still want something like the libraries mentioned by @Fitzsimmons or @MarkThomas to do the HTTP request.

Resources