Google Sheets Importxml Xpath - xpath

I am trying to import both the link to the Google Maps image and the address of the council from https://www.google.com.au/?gws_rd=ssl#q=Albany+City+council+address&time=445678
I have tried all sorts of Xpath expressions and keep getting a result saying the imported results were empty.
For the address I have tried:
//*[@class='_uX kno-fb-ctx']
//div[@class='_eF']
//*[@class='_eF']
//div/div/div/div/div/div/div/div/div/ol/li/div/div/div/ol/li/div
The info I want appears in 3 places on that page - so any Xpath that gets it from one of these locations is what I am looking for:
<div class="_uX kno-fb-ctx" aria-level="3" role="heading" data-hveid="29" data-ved="0CB0QtwcoADAA"><div class="_eF">102 North Road, Yakamia WA 6330</div>
id="lnv_href"></a></div></td><td valign="top" style="color:#222;line-height:1.24">102 North Road, Yakamia WA 6330
<div class="_lR"><div class="_mr"><span style="font-weight:bold">Address:</span> <span>102 North Road, Yakamia WA 6330</span>
Any help would be greatly appreciated.

1) The info you want appears in 2 places on that page; probably the best way is to use an XPath expression with contains().
2) For the map, use
//*[@id="media_result_group"]/ol/div/div/div/div[2]/div/div[3]/a/img/@src
or simply
//div[@class="rhsg4 rhsmap5col"]/a/img/@src

In my opinion, you can't do that, because the data on a Google results page can't easily be pulled into a Google spreadsheet with XPath. The reason is that it is rendered by JavaScript (more tech-savvy people will correct me if I am wrong).

The answer marked correct is incorrect. The content I was looking for is HTML and can be captured with IMPORTHTML("URL", "table", 4), but I did need to add WA (which stands for Western Australia) to the search string.

Related

Xpath between 2 texts for IMPORTXML formula

I have worked out xpath which gives very close to what I need but needs some small refining.
https://www.punters.com.au/form-guide/
I want all the URLs on the page for races held today, and only those in Australia.
These are the xpaths I have now.
This one provides all the races on the page, including all countries racing today - //*[@class='component-wrapper form-guide-index']/table[1]/tbody/tr//td/a/@href
This one provides all races in Australia, but includes races today, tomorrow or any other day on the webpage - //tr[@class="upcoming-race__row"][preceding::tr[@class='upcoming-race__row upcoming-race__row--country'][1][*/.="Australia"]]/td[position()>=2]/a/@href
OK. So this is the related topic :
xpath to obtain texts between 2 tags in IMPORTXML formula
To get the links of all races in Australia today (replace " with ' in GoogleSheets) :
//tr[@class="upcoming-race__row"][preceding::td[@class="upcoming-race__country-title"][1][.="Australia"]][preceding::h2[1][.="Today"]]/td[position()>=2]/a/@href
Alternative XPaths :
//h2[.="Today"]/following::table[1]//tr[@class="upcoming-race__row"][preceding::td[@class='upcoming-race__country-title'][1][.="Australia"]]/td[position()>=2]/a/@href
//div[@class="component-wrapper form-guide-index"]/table[1]//tr[@class="upcoming-race__row"][preceding::td[@class='upcoming-race__country-title'][1][.="Australia"]]/td[position()>=2]/a/@href
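The "nearest preceding heading" logic in these XPaths can be sketched in Python. This is only an illustration, assuming a simplified, hypothetical version of the page markup (the real page may be structured differently) and using the standard library's ElementTree, which does not support the preceding:: axis. Instead, the hypothetical helper race_links walks the elements in document order and remembers the last day heading and the last country title it has seen, which is what preceding::...[1] selects.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup; the real page structure may differ.
SAMPLE = """
<div class="component-wrapper form-guide-index">
  <h2>Today</h2>
  <table>
    <tr class="upcoming-race__row upcoming-race__row--country">
      <td class="upcoming-race__country-title">Australia</td>
    </tr>
    <tr class="upcoming-race__row">
      <td>Flemington</td>
      <td><a href="/form-guide/flemington-r1/">R1</a></td>
      <td><a href="/form-guide/flemington-r2/">R2</a></td>
    </tr>
    <tr class="upcoming-race__row upcoming-race__row--country">
      <td class="upcoming-race__country-title">New Zealand</td>
    </tr>
    <tr class="upcoming-race__row">
      <td>Ellerslie</td>
      <td><a href="/form-guide/ellerslie-r1/">R1</a></td>
    </tr>
  </table>
  <h2>Tomorrow</h2>
  <table>
    <tr class="upcoming-race__row upcoming-race__row--country">
      <td class="upcoming-race__country-title">Australia</td>
    </tr>
    <tr class="upcoming-race__row">
      <td>Randwick</td>
      <td><a href="/form-guide/randwick-r1/">R1</a></td>
    </tr>
  </table>
</div>
"""


def race_links(markup, day="Today", country="Australia"):
    """Collect race hrefs whose nearest preceding day/country match."""
    root = ET.fromstring(markup)
    current_day = None      # text of the last <h2> seen so far
    current_country = None  # text of the last country-title <td> seen so far
    links = []
    for el in root.iter():  # iter() visits elements in document order
        if el.tag == "h2":
            current_day = (el.text or "").strip()
        elif el.tag == "td" and el.get("class") == "upcoming-race__country-title":
            current_country = (el.text or "").strip()
        elif el.tag == "tr" and el.get("class") == "upcoming-race__row":
            # Exact class match, so the "--country" header rows are skipped.
            if current_day == day and current_country == country:
                # Equivalent of td[position()>=2]/a/@href
                for td in el.findall("td")[1:]:
                    for a in td.findall("a"):
                        links.append(a.get("href"))
    return links


print(race_links(SAMPLE))
```

One difference from the XPath: the sketch strips surrounding whitespace before comparing heading text, whereas .="Today" compares the exact string value.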

ImportXML function in Google Sheets produces error 'Imported content is empty'!

Here is the ImportXML formula I am using:
=IMPORTXML("https://finance.yahoo.com/quote/RY.TO/profile",K6)
Cell K6 contains the following xpath query:
//*[@id="Col1-0-Profile-Proxy"]/section/div[1]/div/div/p[2]/strong[1]
I got the xpath query by using the Copy XPath function in Google Chrome (e.g. after inspecting the element I am interested in).
The element I am interested in is the Sector associated with the Royal Bank (e.g. Financial Services)
Any help would be appreciated. Many thanks!!
Using the Copy XPath function is a handy feature. However, the suggested query is usually clumsy and sometimes does not yield the desired result. Here is an alternative approach:
//span[.='Sector']/following-sibling::strong[1]
This selects the span whose text is "Sector" and then the first strong sibling that follows it; we can also select the text() node directly, like this:
=IMPORTXML($A$10;"//span[.='Sector']/following-sibling::strong[1]/text()")
which returns: Financial Services
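To show why anchoring on the label is more robust than a positional path, here is a small Python sketch of the same "find the label, take the next strong sibling" idea. It assumes a hypothetical, simplified fragment rather than the real Yahoo markup, and it emulates following-sibling:: by hand because the standard library's ElementTree does not support that axis. The helper name value_after_label and the "Example Industry" value are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; the real profile page markup may differ.
SAMPLE = """
<div>
  <p>
    <span>Sector</span>: <strong>Financial Services</strong><br/>
    <span>Industry</span>: <strong>Example Industry</strong>
  </p>
</div>
"""


def value_after_label(markup, label):
    """Return the text of the first <strong> sibling after <span>label</span>."""
    root = ET.fromstring(markup)
    for parent in root.iter():
        children = list(parent)
        for i, child in enumerate(children):
            # "".join(itertext()) mirrors the XPath string value used by [.='...']
            if child.tag == "span" and "".join(child.itertext()) == label:
                # Emulate following-sibling::strong[1]
                for sibling in children[i + 1:]:
                    if sibling.tag == "strong":
                        return sibling.text
    return None


print(value_after_label(SAMPLE, "Sector"))
```

Unlike the XPath, which returns a match for every span with that text, this sketch stops at the first one.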

Scraping number of likes and comments from an Instagram post via Google Sheet IMPORTXML

I am a noob at importXML. The XPath to the number of likes is
//*[@id="react-root"]/section/main/div/div/article/div[2]/section[2]/div/a/span
So the formula for scraping the number of likes from this post: https://www.instagram.com/p/BZLli5ll6yz/ should be:
=IMPORTXML("https://www.instagram.com/p/BZLli5ll6yz/", "//*[@id="react-root"]/section/main/div/div/article/div[2]/section[2]/div/a/span")
Right? What am I missing?
Make sure that "react-root" in the XPath is wrapped in single quotes: 'react-root'. The inner double quotes end the string early; single quotes keep the whole XPath contained within the second argument.
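If the formulas are generated by a script, the quoting fix can be applied automatically. The following is a minimal Python sketch, not part of Google Sheets: the helper name build_importxml is hypothetical, and it only addresses the quoting problem, not whether the page actually returns the data.

```python
def build_importxml(url, xpath):
    """Return an IMPORTXML formula string with the XPath safely quoted."""
    if "'" in xpath and '"' in xpath:
        # Swapping quotes blindly could change the meaning of the expression.
        raise ValueError("XPath mixes both quote styles; rewrite it by hand")
    # Double quotes inside the XPath would close the formula's string
    # argument early, so switch them to single quotes.
    safe_xpath = xpath.replace('"', "'")
    return f'=IMPORTXML("{url}", "{safe_xpath}")'


formula = build_importxml(
    "https://www.instagram.com/p/BZLli5ll6yz/",
    '//*[@id="react-root"]/section/main/div/div/article/div[2]/section[2]/div/a/span',
)
print(formula)
```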

ImportXML xpath to google sheets returning #N/A

I am a beginner to programming in general and google in particular. I've been trying to get this (what seems to me) simple web query working for a while using the importxml() function. I am trying to pull a reference from a citation generation website, where you search a pubmed ID number (PMID).
The site is https://mickschroeder.com/citation/?q=18515037 where 18515037 is the PMID. This brings up a citation.
Allison MA, Kwan K, Ditomasso D, Wright CM, Criqui MH. The epidemiology of
abdominal aortic diameter. J Vasc Surg. 2008;48(1):121-7.
I did inspect element and got the XPath as:
//*[@id="citation_formatted"]/text()
So I have tried
=importxml(ttps://mickschroeder.com/citation/?q=18515037, "//*[@id="citation_formatted"]/text()")
And it returns #N/A or blank. I've tried taking out the * but can't get it working. Do I need to escape the () in the text()? Or do I have the Xpath totally wrong. I did a search for the answer but I figure I'm so new I can't apply those concepts.
Thanks for any help you can give.

ImportXML Xpath Query Return txt

I need to use Google Spreadsheet's ImportXML to return a value from this website...
http://www.e-go.com.au/calculatorAPI2?pickuppostcode=2000&pickupsuburb=SYDNEY+CITY&deliverypostcode=4000&deliverysuburb=BRISBANE&type=Carton&width=40&height=35&depth=65&weight=2&items=3
the website simply displays the below in text and code...
error=OK
eta=Overnight
price=64.69
I need to return the value after 'price=' on the last line. Being a newbie, I'm struggling with the XPath query (?) required to make this happen...
=importxml("url",?)
Your help is greatly appreciated.
Thank you in advance.
Regards
First of all, IMPORTXML() won't work because your webpage is not formatted correctly for XML, and Google Sheets doesn't like it.
All hope is not lost though, as your output is so simple. You can simply load the whole output using IMPORTDATA() and then process it within Google Sheets.
Have a look at the output of the following formulae (where the URL is stored in A1):
=IMPORTDATA(A1)
=transpose(IMPORTDATA(A1))
=index(IMPORTDATA(A1),3,1) - this will work if there are always 3 results and the price is always in the third one
=filter(IMPORTDATA(A1),left(IMPORTDATA(A1),5)="price") - use this if the price can appear in any of the result lines, but the line always starts with "price"
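For comparison, the same "find the line that starts with price" idea can be sketched outside Sheets. This is a minimal Python example that assumes the response body is exactly the three key=value lines quoted in the question; the helper name parse_key_values is made up for illustration, and the text is hard-coded rather than fetched from the URL.

```python
def parse_key_values(text):
    """Parse lines of the form key=value into a dict, skipping other lines."""
    result = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep:  # only keep lines that actually contain "="
            result[key] = value
    return result


# Sample response copied from the question, not fetched live.
SAMPLE = "error=OK\neta=Overnight\nprice=64.69\n"

fields = parse_key_values(SAMPLE)
price = float(fields["price"])
print(price)
```

Looking the value up by key, like the filter() formula, does not depend on the price being on the third line.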

Resources