simplexml_load_file with xPath returns empty array - xpath

Getting XML from this URL:
$xml = simplexml_load_file('http://geocode-maps.yandex.ru/1.x/?geocode=37.71677,55.75208&kind=metro&spn=1,1&rspn=1');
print_r($xml) shows that the XML loaded, but xpath() always returns an empty array. I tried:
$xml->xpath('/');
$xml->xpath('/ymaps');
$xml->xpath('/GeoObjectCollection');
$xml->xpath('/ymaps/GeoObjectCollection');
$xml->xpath('//GeoObjectCollection');
$xml->xpath('precision');
Why do I get an empty array? I hope I am just missing something easy.

It might be rather easy, but I guess it is also the most common mistake in the history of XML: You are forgetting namespaces!
A lot of elements in the given XML are changing the default namespace and you have to consider that in your XPath.
You can first register your namespace like so:
$xml->registerXPathNamespace('y', 'http://maps.yandex.ru/ymaps/1.x');
$xml->registerXPathNamespace('a', 'http://maps.yandex.ru/attribution/1.x');
and then you can query your data:
$xml->xpath('//y:ymaps/y:GeoObjectCollection');
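To see why the unprefixed queries come back empty, here is a minimal sketch in Python (standard library only) rather than PHP. The sample document is a hypothetical, heavily simplified stand-in for the geocoder response; only the namespace URI is taken from the answer above, and the prefix y is an arbitrary local choice.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the real response (assumption).
SAMPLE = """<ymaps xmlns="http://maps.yandex.ru/ymaps/1.x">
  <GeoObjectCollection>
    <featureMember>placeholder</featureMember>
  </GeoObjectCollection>
</ymaps>"""

root = ET.fromstring(SAMPLE)

# Map a local prefix to the document's default namespace URI.
ns = {"y": "http://maps.yandex.ru/ymaps/1.x"}

# An unprefixed name only matches elements that are in no namespace.
unqualified = root.findall("GeoObjectCollection")

# The prefixed name matches the namespaced element.
qualified = root.findall("y:GeoObjectCollection", ns)

print(len(unqualified), len(qualified))
```

The prefix used in the query does not have to match anything in the document; only the namespace URI it is bound to matters.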

Related

Java XPath with default Namespace issue

I am not able to read a node with my XPath expression. The XML is:
<ns:Msg xmlns:ns="http://www.noventus.se/epix1" xmlns="http:www.defaultnamespace.com">
<ns:Header>
<SubsysId>1</SubsysId>
<SubsysType>30003</SubsysType>
<SendDateTime>2009-08-13T14:28:15</SendDateTime>
</ns:Header>
</ns:Msg>
This XML uses two namespaces: one bound to the prefix ns and a default one.
I am trying to get the value of SubsysId using org.dom4j.XPath, registering the namespaces with
Map namespaces = new HashMap();
namespaces.put("ns", "http://www.noventus.se/epix1");
namespaces.put("main", "http:www.defaultnamespace.com");
Adding these namespaces like this
xpath.setNamespaceContext(new SimpleNamespaceContext(namespaces));
This is my expression
String expression = "/ns:Msg/ns:Header/SubsysId";
I tried multiple options but was not able to get the value.
NOTE: If I remove the default namespace and run it, I get the value.
Your help is highly appreciated.
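Since you registered the default namespace under the prefix main with namespaces.put("main", "http:www.defaultnamespace.com");
you need to use that prefix in your XPath. An unprefixed name in XPath 1.0 matches only elements in no namespace, so a bare SubsysId finds nothing.
So your XPath becomes: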
String expression = "/ns:Msg/ns:Header/main:SubsysId";
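The same behaviour can be reproduced outside dom4j. The following is a small sketch using Python's standard library on the XML from the question; the prefix main is just the local name chosen in the question's map, not something the document defines.

```python
import xml.etree.ElementTree as ET

# XML copied from the question, including its default namespace URI as written.
MSG = """<ns:Msg xmlns:ns="http://www.noventus.se/epix1" xmlns="http:www.defaultnamespace.com">
  <ns:Header>
    <SubsysId>1</SubsysId>
    <SubsysType>30003</SubsysType>
    <SendDateTime>2009-08-13T14:28:15</SendDateTime>
  </ns:Header>
</ns:Msg>"""

root = ET.fromstring(MSG)  # root is the ns:Msg element
namespaces = {
    "ns": "http://www.noventus.se/epix1",
    "main": "http:www.defaultnamespace.com",
}

# Unprefixed SubsysId looks for an element in no namespace: no match.
without_prefix = root.find("ns:Header/SubsysId", namespaces)

# Prefixed with the name bound to the default namespace URI: match.
with_prefix = root.find("ns:Header/main:SubsysId", namespaces)

print(without_prefix, with_prefix.text)
```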

Extracting value from complex hash in Ruby

I am using an API (zillow) which returns a complex hash. A sample result is
{"xmlns:xsi"=>"http://www.w3.org/2001/XMLSchema-instance",
"xsi:schemaLocation"=>"http://www.zillow.com/static/xsd/SearchResults.xsd http://www.zillowstatic.com/vstatic/5985ee4/static/xsd/SearchResults.xsd",
"xmlns:SearchResults"=>"http://www.zillow.com/static/xsd/SearchResults.xsd", "request"=>[{"address"=>["305 Vinton St"], "citystatezip"=>["Melrose, MA 02176"]}],
"message"=>[{"text"=>["Request successfully processed"], "code"=>["0"]}],
"response"=>[{"results"=>[{"result"=>[{"zpid"=>["56291382"], "links"=>[{"homedetails"=>["http://www.zillow.com/homedetails/305-Vinton-St-Melrose-MA-02176/56291382_zpid/"],
"graphsanddata"=>["http://www.zillow.com/homedetails/305-Vinton-St-Melrose-MA-02176/56291382_zpid/#charts-and-data"], "mapthishome"=>["http://www.zillow.com/homes/56291382_zpid/"],
"comparables"=>["http://www.zillow.com/homes/comps/56291382_zpid/"]}], "address"=>[{"street"=>["305 Vinton St"], "zipcode"=>["02176"], "city"=>["Melrose"], "state"=>["MA"], "latitude"=>["42.466805"],
"longitude"=>["-71.072515"]}], "zestimate"=>[{"amount"=>[{"currency"=>"USD", "content"=>"562170"}], "last-updated"=>["06/01/2014"], "oneWeekChange"=>[{"deprecated"=>"true"}], "valueChange"=>[{"duration"=>"30", "currency"=>"USD", "content"=>"42749"}], "valuationRange"=>[{"low"=>[{"currency"=>"USD",
"content"=>"534062"}], "high"=>[{"currency"=>"USD", "content"=>"590278"}]}], "percentile"=>["0"]}], "localRealEstate"=>[{"region"=>[{"id"=>"23017", "type"=>"city",
"name"=>"Melrose", "links"=>[{"overview"=>["http://www.zillow.com/local-info/MA-Melrose/r_23017/"], "forSaleByOwner"=>["http://www.zillow.com/melrose-ma/fsbo/"],
"forSale"=>["http://www.zillow.com/melrose-ma/"]}]}]}]}]}]}]}
I can extract a specific value using the following:
result = result.to_hash
p result["response"][0]["results"][0]["result"][0]["zestimate"][0]["amount"][0]["content"]
It seems odd to have to specify the index of each element in this fashion. Is there a simpler way to obtain a named value?
It looks like this should be parsed into XML. According to the Zillow API Docs, it returns XML by default. Apparently, "to_hash" was able to turn this into a hash (albeit, a very ugly one), but you are really trying to swim upstream by using it this way. I would recommend using it as intended (xml) at the start, and then maybe parsing it into an easier to use format (like a JSON/Hash structure) later.
Nokogiri is GREAT at parsing XML! You can use the xpath syntax for grabbing elements from the dom, or even css selectors.
For example, to get an array of the "content" in every result:
require 'nokogiri'
# response must already hold the raw XML string returned by Zillow (fetching it is not shown)
results = Nokogiri::XML(response).remove_namespaces!
# using CSS
content_array = results.css("result content")
# same thing using XPath:
content_array = results.xpath("//result//content")
If you just want the content from the first result, you can do this as a shortcut:
content = results.at_css("result content").content
Since it is indeed XML dumped into a JSON, you could use JSONPath to query the JSON
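As a rough illustration of querying the XML directly instead of indexing into the converted hash, here is a sketch in Python's standard library. The sample XML is reconstructed by hand from the hash in the question and is an assumption: it presumes that each "content" key holds an element's text and that sibling keys such as "currency" are attributes. The real Zillow response contains many more fields.

```python
import xml.etree.ElementTree as ET

# Hand-reconstructed, abbreviated sample (assumption, not the real response).
SAMPLE = """<SearchResults:searchresults
    xmlns:SearchResults="http://www.zillow.com/static/xsd/SearchResults.xsd">
  <response>
    <results>
      <result>
        <zpid>56291382</zpid>
        <zestimate>
          <amount currency="USD">562170</amount>
        </zestimate>
      </result>
    </results>
  </response>
</SearchResults:searchresults>"""

root = ET.fromstring(SAMPLE)

# One path expression replaces the chain of ["..."][0] lookups.
amounts = [
    (amount.get("currency"), amount.text)
    for amount in root.findall(".//result/zestimate/amount")
]
print(amounts)
```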

How to get namespace names in XPath?

I have this .xml file
<root xmlns:h="http://www.w3.org/TR/html4/"
xmlns:f="http://www.w3.org/TR/html4/">
<h:table>
<h:tr>
<h:td>Apples</h:td>
<h:td>Bananas</h:td>
</h:tr>
</h:table>
<f:table>
<f:tr>
<f:td>Red</f:td>
<f:td>Yellow</f:td>
</f:tr>
</f:table>
</root>
How can I get only the elements in a specific namespace?
For example, I want to retrieve only the elements in the 'h' namespace.
How can I do that? In eXist-db the 'namespace::' axis no longer works.
Try using the in-scope-prefixes() function in a predicate:
//*[in-scope-prefixes(.)='h']
In the comments you show a solution that parses the return value from name():
//*[substring-before(name(), ":")='h']
There is a far simpler way to get all elements in the namespace that's mapped to the h prefix:
//h:*
Note: when I first tested this, I was getting back all elements in the document. That's because both of your prefixes are mapped to the same namespace:
xmlns:h="http://www.w3.org/TR/html4/"
xmlns:f="http://www.w3.org/TR/html4/"
You should also fix this.
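The effect of the duplicated URI can be checked with a short sketch in Python's standard library. The helper name tags_in_namespace and the second URI http://example.com/furniture are hypothetical, introduced only to show what changes once the two prefixes point at different namespaces.

```python
import xml.etree.ElementTree as ET

TEMPLATE = """<root xmlns:h="http://www.w3.org/TR/html4/" xmlns:f="{f_uri}">
  <h:table><h:tr><h:td>Apples</h:td><h:td>Bananas</h:td></h:tr></h:table>
  <f:table><f:tr><f:td>Red</f:td><f:td>Yellow</f:td></f:tr></f:table>
</root>"""

def tags_in_namespace(xml_text, uri):
    """Return local names of all elements whose namespace URI equals uri."""
    marker = "{" + uri + "}"
    root = ET.fromstring(xml_text)
    return [el.tag[len(marker):] for el in root.iter() if el.tag.startswith(marker)]

H_URI = "http://www.w3.org/TR/html4/"

# Both prefixes bound to the same URI, as in the question.
same_uri = tags_in_namespace(TEMPLATE.format(f_uri=H_URI), H_URI)

# f bound to a different (hypothetical) URI.
distinct_uri = tags_in_namespace(
    TEMPLATE.format(f_uri="http://example.com/furniture"), H_URI
)

print(len(same_uri), len(distinct_uri))
```

Elements are matched by namespace URI, never by prefix, which is why the h and f elements are indistinguishable until the URIs differ.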

XPath format required on namespace node

Can someone please show me the XPath I should use to retrieve the second txnDetail node's billAmount?
I am expecting the value 10.00, but I have issues with the namespace and the "a:" prefix, and my XPath fails to retrieve the correct value.
<TransactionRsp xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
<avlBal>818.00</avlBal>
<blkAmt>0.00</blkAmt>
<cardID>2561683577196298</cardID>
<currBill>GBP</currBill>
<endBal>390.00</endBal>
<logDateTime>2013-04-30T12:17:20.4249292Z</logDateTime>
<msgID>121719721</msgID>
<rspCode>000</rspCode>
<startBal>400.00</startBal>
<txnDetail xmlns:a="http://schemas.datacontract.org/2004/07/CoreModels">
<a:txnDetail>
<a:billAmount>400.00</a:billAmount>
<a:billConvRate>0.00</a:billConvRate>
<a:blkAmount>0.00</a:blkAmount>
<a:debOrCred>1</a:debOrCred>
<a:itemID>2278</a:itemID>
<a:itemType>6</a:itemType>
<a:txnAmount>0.00</a:txnAmount>
<a:txnCurrency/>
<a:txnDateTime>2012-02-23T14:35:45</a:txnDateTime>
<a:txnDescription></a:txnDescription>
</a:txnDetail>
<a:txnDetail>
<a:billAmount>10.00</a:billAmount>
<a:billConvRate>0.00</a:billConvRate>
<a:blkAmount>0.00</a:blkAmount>
<a:debOrCred>0</a:debOrCred>
<a:itemID>3058</a:itemID>
<a:itemType>5</a:itemType>
<a:txnAmount>0.00</a:txnAmount>
<a:txnCurrency/>
<a:txnDateTime>2012-07-30T12:22:14</a:txnDateTime>
<a:txnDescription>Fee: Card Issue</a:txnDescription>
</a:txnDetail>
</txnDetail>
</TransactionRsp>
It's:
//TransactionRsp/txnDetail/a:txnDetail[2]
However, depending on your programming language you might have to register the a namespace. The document might have a default namespace as well. (Don't expect that the xml you've posted is the whole document)
I have managed to pull the relevant data using the following XPath:
/TransactionRsp/txnDetail/*[local-name()='txnDetail'][2]/*[local-name()='billAmount']
Now I need to know how to select only the txnDetail elements with itemType = 6.
Any thoughts ?
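In an XPath 1.0 engine with the prefix a registered, a predicate such as /TransactionRsp/txnDetail/a:txnDetail[a:itemType='6'] selects by child value. The sketch below shows both lookups in Python's standard library, with the filtering done in Python rather than in the path expression. The sample is an abbreviated copy of the question's XML, and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the XML from the question (only the fields used here).
SAMPLE = """<TransactionRsp xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
  <txnDetail xmlns:a="http://schemas.datacontract.org/2004/07/CoreModels">
    <a:txnDetail>
      <a:billAmount>400.00</a:billAmount>
      <a:itemType>6</a:itemType>
    </a:txnDetail>
    <a:txnDetail>
      <a:billAmount>10.00</a:billAmount>
      <a:itemType>5</a:itemType>
    </a:txnDetail>
  </txnDetail>
</TransactionRsp>"""

root = ET.fromstring(SAMPLE)  # root is TransactionRsp
ns = {"a": "http://schemas.datacontract.org/2004/07/CoreModels"}

# The outer txnDetail is in no namespace; the inner ones are in the a namespace.
details = root.findall("txnDetail/a:txnDetail", ns)

# Second a:txnDetail (Python lists are zero-based, XPath positions are one-based).
second_bill = details[1].findtext("a:billAmount", namespaces=ns)

# Keep only the entries whose a:itemType is 6.
type_six_bills = [
    d.findtext("a:billAmount", namespaces=ns)
    for d in details
    if d.findtext("a:itemType", namespaces=ns) == "6"
]

print(second_bill, type_six_bills)
```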

XPath using string functions in the middle of the path

I'm trying to use Web Deploy 3.0 to make changes to my web.config before deployment. Let's say I have the following xml:
<node>
<subnode>
<connectInfo httpURL="http://LookImAUrl.com" />
</subnode>
</node>
And I'd like to match just the "http" in "http://..." so that I can potentially replace it with https.
I looked into XPath string functions and understand them -- I just don't know how to put them in the middle of an expression, for example:
"//node/subnode/connectInfo/@httpURL/substring-before(../@httpURL,':')"
That's basically what I want to do, but it doesn't look right.
"//node/subnode/connectInfo/@httpURL/substring-before(../@httpURL,':')"
That's basically what I want to do, but it doesn't look right.
But it is right in XPath 2.0, where a function call may appear as a path step, and it will return the http.
(By the way, you can write it more briefly without the ..:
//node/subnode/connectInfo/@httpURL/substring-before(.,':')
)
However, it returns the string "http", not some kind of pointer into the value of @httpURL; that is not possible, since there are no partial nodes within an attribute value.
In XPath 2.0 you can return the attribute together with a new value, and then perhaps change it in the calling language:
//node/subnode/connectInfo/@httpURL/(., concat("https:", substring-after(.,':')))
Using XPath 1.0, if you want to return the initial part of the URL, use:
substring-before(//node/subnode/connectInfo/@httpURL,':')
Note, though, that this returns the value for ONLY the first connectInfo element.
If you want to get the connectInfo nodes that use HTTP:
//node/subnode/connectInfo[starts-with(@httpURL,'http:')]
If you want to get all httpURL attributes that use HTTP:
//node/subnode/connectInfo/@httpURL[starts-with(.,'http:')]
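Because XPath can only select whole nodes, the actual replacement has to happen in the calling language. Here is a minimal sketch of that step in Python's standard library. It is not Web Deploy syntax, and it assumes the sample from the question with the closing </node> tag in place.

```python
import xml.etree.ElementTree as ET

# Sample from the question, with the root element properly closed.
CONFIG = """<node>
  <subnode>
    <connectInfo httpURL="http://LookImAUrl.com" />
  </subnode>
</node>"""

root = ET.fromstring(CONFIG)  # root is the outer node element

for info in root.findall("subnode/connectInfo"):
    # Split the attribute value at the first colon: scheme, ":", remainder.
    scheme, sep, rest = info.get("httpURL", "").partition(":")
    if scheme == "http":
        # Select the whole attribute, then rewrite its value.
        info.set("httpURL", "https" + sep + rest)

updated = [info.get("httpURL") for info in root.iter("connectInfo")]
print(updated)
```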

Resources