XPath using string functions in the middle of the path - xpath

I'm trying to use Web Deploy 3.0 to make changes to my web.config before deployment. Let's say I have the following xml:
<node>
<subnode>
<connectInfo httpURL="http://LookImAUrl.com" />
</subnode>
</node>
And I'd like to match just the "http" in "http://..." so that I can potentially replace it with https.
I looked into XPath string functions and understand them -- I just don't know how to put them in the middle of an expression, for example:
"//node/subnode/connectInfo/@httpURL/substring-before(../@httpURL,':')"
That's basically what I want to do, but it doesn't look right.

"//node/subnode/connectInfo/@httpURL/substring-before(../@httpURL,':')"
That's basically what I want to do, but it doesn't look right.
But it is right, at least in XPath 2.0, where a function call may appear as a path step, and it will return the http.
(Btw, you could write it shorter without ..
//node/subnode/connectInfo/@httpURL/substring-before(.,':')
)
However, it will return the string "http", not some kind of pointer into the value of @httpURL, which is not possible, since there are no partial nodes within the value.
In XPath 2, you can return the attribute and a new value, and then perhaps change it in the calling language:
//node/subnode/connectInfo/@httpURL/(., concat("https:", substring-after(.,':')))
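To illustrate the "change it in the calling language" idea: XPath only locates the node, and the host language rewrites the value. This is a rough sketch in Ruby using REXML, which is an assumption on my part (the question is about Web Deploy, which does its own replacement); upgrade_urls is a hypothetical helper name.

```ruby
require 'rexml/document'

# Sample document from the question, with the root element closed properly.
SAMPLE_XML = <<~DOC
  <node>
    <subnode>
      <connectInfo httpURL="http://LookImAUrl.com" />
    </subnode>
  </node>
DOC

# Hypothetical helper: switches http: to https: on every connectInfo/@httpURL.
def upgrade_urls(xml)
  doc = REXML::Document.new(xml)
  # XPath only selects whole nodes; it cannot address part of an attribute value.
  REXML::XPath.each(doc, '//node/subnode/connectInfo') do |element|
    url = element.attributes['httpURL']
    next unless url&.start_with?('http:')
    # The actual replacement happens in Ruby, not in XPath.
    element.attributes['httpURL'] = url.sub(/\Ahttp:/, 'https:')
  end
  doc
end

doc = upgrade_urls(SAMPLE_XML)
puts REXML::XPath.first(doc, '//connectInfo').attributes['httpURL']
```

The same split (select with XPath, edit in the host language) applies whatever library you actually use.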

Using XPath 1.0, if you want to return the initial part of the URL, use:
substring-before(//node/subnode/connectInfo/@httpURL,':')
Note, though, that this returns the value for ONLY the first connectInfo element.
If you want to get the connectInfo nodes that use HTTP:
//node/subnode/connectInfo[starts-with(@httpURL,'http:')]
If you want to get all httpURL attributes that use HTTP:
//node/subnode/connectInfo/@httpURL[starts-with(.,'http:')]

Related

Access deep nested node from document.xml using nokogiri

I am using Nokogiri to access a docx's document XML file.
Here is a sample of it:
<w:document>
<w:body>
<w:p w:rsidR="00454EDC" w:rsidRDefault="00454EDC" w:rsidP="00454EDC">
<w:drawing>
<wp:inline distT="0" distB="0" distL="0" distR="0">
<wp:extent cx="1926590" cy="1088571"/>
<wp:effectExtent l="0" t="0" r="0" b="0"/>
<wp:docPr id="1" name="Picture 1"/>
<wp:cNvGraphicFramePr>
<a:graphicFrameLocks xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main" noChangeAspect="1"/>
</wp:cNvGraphicFramePr>
<a:graphic xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main">
<a:graphicData uri="http://schemas.openxmlformats.org/drawingml/2006/picture">
<pic:pic xmlns:pic="http://schemas.openxmlformats.org/drawingml/2006/picture">
<pic:nvPicPr>
<pic:cNvPr id="0" name="Picture 1"/>
<pic:cNvPicPr>
<a:picLocks noChangeAspect="1" noChangeArrowheads="1"/>
</pic:cNvPicPr>
</pic:nvPicPr>
<pic:blipFill>
<a:blip r:embed="rId5" cstate="print">
<a:extLst>
<a:ext uri="{28A0092B-C50C-407E-A947-70E740481C1C}">
<a14:useLocalDpi xmlns:a14="http://schemas.microsoft.com/office/drawing/2010/main" val="0"/>
</a:ext>
</a:extLst>
</a:blip>
<a:srcRect/>
<a:stretch>
<a:fillRect/>
</a:stretch>
</pic:blipFill>
<pic:spPr bwMode="auto">
<a:xfrm>
<a:off x="0" y="0"/>
<a:ext cx="1951299" cy="1102532"/>
</a:xfrm>
<a:prstGeom prst="rect">
<a:avLst/>
</a:prstGeom>
<a:noFill/>
<a:ln>
<a:noFill/>
</a:ln>
</pic:spPr>
</pic:pic>
</a:graphicData>
</a:graphic>
</wp:inline>
</w:drawing>
</w:p>
</w:body>
</w:document>
Now I want to access all <w:drawing> tags, and from them I want to access the <a:blip> tag and extract the value of its r:embed attribute.
In this case, as you can see, it is rId5.
I am able to access the <w:drawing> tag using xml.xpath('//w:drawing'), but when I call xml.xpath('//w:drawing').xpath('//a:blip'), it throws an error:
Nokogiri::XML::XPath::SyntaxError: Undefined namespace prefix: //a:blip
What am I doing wrong, can anyone point me in the right direction?
The error is telling you that in your XPath query, //a:blip, Nokogiri doesn’t know what namespace a refers to. You need to specify the namespaces that you are targeting in your query, not just the prefix. The fact that the prefix a is defined in the document doesn’t really matter, it is the actual namespace URI that is important. It is possible to use completely different prefixes in the query than those used in the document, as long as the namespace URIs match.
You may be wondering why the query //w:drawing works. You don’t include the full XML, but I suspect that the w prefix is defined on the root node (something like xmlns:w="http://some.uri.here"). If you don’t specify any namespaces, Nokogiri will automatically register any defined in the root node so they will be available in your query. The namespace corresponding to the a prefix isn’t defined on the root, so it is unavailable, and so you get the error you see.
To specify namespaces in Nokogiri you pass a hash, mapping each prefix (as used in the query) to its namespace URI, to the xpath method (or whichever query method you’re using). Since you are providing your own namespace mappings, you also need to include any you use from the root node; Nokogiri doesn’t include them in this case.
In your case, the code would look something like this:
namespaces = {
'w' => 'http://some.uri', # whatever the URI is for this namespace
'a' => 'http://schemas.openxmlformats.org/drawingml/2006/main'
}
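# You can combine this into a single query.
# Note that <a:blip> is a descendant of <w:drawing>, not a direct child,
# so the second step needs // as well. In a chained call, a leading //
# restarts the search from the document root; use .//a:blip there instead.
xml.xpath('//w:drawing//a:blip', namespaces)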
Have a look at the Nokogiri tutorial section on namespaces for more info.
I would say that this is a bug in the xml parser that you are using :
Indeed, the error seems to be that the namespace prefix a is undefined, however, it has been defined in <a:graphic xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main">, which is a parent of the <a:blip> element.
See here if you want to know more about xml namespaces
It seems that there are a few other questions about problems with namespace prefixes in Nokogiri, for example: Undefined namespace prefix in Nokogiri and XPath

camel:when on header value using blueprint

I have camel routes that make rest calls based on header values.
I had been using xpath to read values from xml and set them as the header and used xpath in a block as so:
<camel:setHeader headerName="clear">
<xpath>/TicketInfo/TicketData/Clear/text()</xpath>
</camel:setHeader>
<camel:choice>
<camel:when>
<camel:xpath>$clear='CLEARED'</camel:xpath>
<camel:doTry>
...
but now I am forced to use json so xpath will not work. I now have:
<camel:setHeader headerName="clear">
<camel:jsonpath>$.ticket.Type</camel:jsonpath>
</camel:setHeader>
<camel:choice>
<camel:when>
<camel:xpath>$clear='CLEARED'</camel:xpath>
<camel:doTry>
...
but obviously the <camel:xpath>$clear='CLEARED'</camel:xpath> part won't work anymore. Is there another way I can check the value of $clear header to restrict when the <camel:doTry> and following execute?
Try the simple language :
<camel:when>
<camel:simple>${in.header.clear} == 'CLEARED'</camel:simple>
<camel:doTry>
See this documentation

SOAP UI: how to use property in xpath match assertion

I am testing a WS that adds events for a user. The last event added has an incremented userEventId, so I don't know its value in advance. To retrieve it, I use a Property Transfer.
Now I would like to use an XQuery match assertion to test my value, but I don't know how to use my property in the XQuery expression.
this matches:
//events[last()]/userEventId = <userEventId>12</userEventId>
returns:
<xml-fragment>true</xml-fragment>
but this does not:
//events[last()]/userEventId = <userEventId>${UserEventId}</userEventId>
returns:
<xml-fragment>false</xml-fragment>
Is there a solution?
I think you need something like:
//events[last()]/userEventId = <userEventId>${#TestCase#UserEventId}</userEventId>
${UserEventId} by itself will not expand to anything in SoapUI.
This works using an XPath Match assertion:
matches(//events[last()]/userEventId, '${#subscribe_one_event_TestCase#user_event_id}')
returns true.

Extracting value from complex hash in Ruby

I am using an API (zillow) which returns a complex hash. A sample result is
{"xmlns:xsi"=>"http://www.w3.org/2001/XMLSchema-instance",
"xsi:schemaLocation"=>"http://www.zillow.com/static/xsd/SearchResults.xsd http://www.zillowstatic.com/vstatic/5985ee4/static/xsd/SearchResults.xsd",
"xmlns:SearchResults"=>"http://www.zillow.com/static/xsd/SearchResults.xsd", "request"=>[{"address"=>["305 Vinton St"], "citystatezip"=>["Melrose, MA 02176"]}],
"message"=>[{"text"=>["Request successfully processed"], "code"=>["0"]}],
"response"=>[{"results"=>[{"result"=>[{"zpid"=>["56291382"], "links"=>[{"homedetails"=>["http://www.zillow.com/homedetails/305-Vinton-St-Melrose-MA-02176/56291382_zpid/"],
"graphsanddata"=>["http://www.zillow.com/homedetails/305-Vinton-St-Melrose-MA-02176/56291382_zpid/#charts-and-data"], "mapthishome"=>["http://www.zillow.com/homes/56291382_zpid/"],
"comparables"=>["http://www.zillow.com/homes/comps/56291382_zpid/"]}], "address"=>[{"street"=>["305 Vinton St"], "zipcode"=>["02176"], "city"=>["Melrose"], "state"=>["MA"], "latitude"=>["42.466805"],
"longitude"=>["-71.072515"]}], "zestimate"=>[{"amount"=>[{"currency"=>"USD", "content"=>"562170"}], "last-updated"=>["06/01/2014"], "oneWeekChange"=>[{"deprecated"=>"true"}], "valueChange"=>[{"duration"=>"30", "currency"=>"USD", "content"=>"42749"}], "valuationRange"=>[{"low"=>[{"currency"=>"USD",
"content"=>"534062"}], "high"=>[{"currency"=>"USD", "content"=>"590278"}]}], "percentile"=>["0"]}], "localRealEstate"=>[{"region"=>[{"id"=>"23017", "type"=>"city",
"name"=>"Melrose", "links"=>[{"overview"=>["http://www.zillow.com/local-info/MA-Melrose/r_23017/"], "forSaleByOwner"=>["http://www.zillow.com/melrose-ma/fsbo/"],
"forSale"=>["http://www.zillow.com/melrose-ma/"]}]}]}]}]}]}]}
I can extract a specific value using the following:
result = result.to_hash
p result["response"][0]["results"][0]["result"][0]["zestimate"][0]["amount"][0]["content"]
It seems odd to have to specify the index of each element in this fashion. Is there a simpler way to obtain a named value?
It looks like this should be parsed into XML. According to the Zillow API Docs, it returns XML by default. Apparently, "to_hash" was able to turn this into a hash (albeit, a very ugly one), but you are really trying to swim upstream by using it this way. I would recommend using it as intended (xml) at the start, and then maybe parsing it into an easier to use format (like a JSON/Hash structure) later.
Nokogiri is GREAT at parsing XML! You can use the xpath syntax for grabbing elements from the dom, or even css selectors.
For example, to get an array of the "content" in every result:
response = zillow_xml # placeholder name: the raw XML string returned by Zillow
results = Nokogiri::XML(response).remove_namespaces!
#using css
content_array = results.css("result content")
#same thing using xpath:
content_array = results.xpath("//result//content")
If you just want the content from the first result, you can do this as a shortcut:
content = results.at_css("result content").content
Since it is indeed XML dumped into a JSON, you could use JSONPath to query the JSON
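If you do stay with the hash, Hash#dig and Array#dig (Ruby 2.3 and later) at least flatten the chain of brackets into one call, and return nil instead of raising when a step is missing. Here is a minimal sketch on a trimmed copy of the hash above; deep_find is a hypothetical helper written for illustration, not part of Ruby or of any Zillow client.

```ruby
# Trimmed-down copy of the hash from the question.
result = {
  'response' => [{
    'results' => [{
      'result' => [{
        'zpid' => ['56291382'],
        'zestimate' => [{ 'amount' => [{ 'currency' => 'USD', 'content' => '562170' }] }]
      }]
    }]
  }]
}

# dig walks the given keys and indexes in order, returning nil if any step is missing.
amount = result.dig('response', 0, 'results', 0, 'result', 0,
                    'zestimate', 0, 'amount', 0, 'content')

# Hypothetical helper: depth-first search for the first value stored under +key+.
def deep_find(node, key)
  case node
  when Hash
    return node[key] if node.key?(key)
    node.each_value do |child|
      found = deep_find(child, key)
      return found unless found.nil?
    end
  when Array
    node.each do |child|
      found = deep_find(child, key)
      return found unless found.nil?
    end
  end
  nil
end

puts amount
puts deep_find(result, 'zpid').first
```

Keep in mind that deep_find stops at the first match, and in the full response a key such as "content" appears in several places, so the explicit dig path is the safer choice when you need one specific value.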

simplexml_load_file with xPath returns empty array

Getting XML from this URL:
$xml = simplexml_load_file('http://geocode-maps.yandex.ru/1.x/?geocode=37.71677,55.75208&kind=metro&spn=1,1&rspn=1');
print_r($xml) shows that XML loaded, but xpath always returns empty array. I tried:
$xml->xpath('/');
$xml->xpath('/ymaps');
$xml->xpath('/GeoObjectCollection');
$xml->xpath('/ymaps/GeoObjectCollection');
$xml->xpath('//GeoObjectCollection');
$xml->xpath('precision');
Why do I get an empty array? I hope I'm just missing something easy.
It might be rather easy, but I guess it is also the most common mistake in the history of XML: You are forgetting namespaces!
A lot of elements in the given XML are changing the default namespace and you have to consider that in your XPath.
You can first register your namespace like so:
$xml->registerXPathNamespace('y', 'http://maps.yandex.ru/ymaps/1.x');
$xml->registerXPathNamespace('a', 'http://maps.yandex.ru/attribution/1.x');
and then you can query your data:
$xml->xpath('//y:ymaps/y:GeoObjectCollection');

Resources