I am looking for an updated XPath to use in Google Sheets with =IMPORTXML for this page:
http://www.sportsline.com/nfl/odds/
This was the original one I was using, but it now comes up #N/A
=importhtml("http://www.sportsline.com/nfl/odds/","Table",0)
This still works:
=IMPORTHTML("http://www.sportsline.com/nfl/odds/", "TABLE", 0)
But you can also try:
=IMPORTHTML("http://www.sportsline.com/nfl/odds/", "TABLE", 1)
I'm trying to import a Google search result into my spreadsheet. I've had success with Wikipedia pages, but for some reason Google Search isn't working correctly (it gives a "could not fetch url" error). I'm sure the problem is somewhere in my URL or XPath, but I've tried a variety of things and I'm lost. Here is what I've got:
=IMPORTXML("https://www.google.com/search?q=dom+fera+easy+thing+released", "//div[@class='Z0LcW XcVN5d']")
I'm linking the spreadsheet below as view-only for reference as well. Ultimately the goal is to be able to webscrape release years of songs. I'd appreciate any help!
https://docs.google.com/spreadsheets/d/1bt8MJ23nfGAv6ianaR-sd7DM5DNn98p7zWSG1UzBlEY/edit?usp=sharing
AFAIK, you can't parse Google Search results in Google Sheets.
Discogs, MusicBrainz, AllMusic, and similar sites could be useful for getting release dates.
But some of your bands seem to be little known, so you can use YouTube to fetch the dates instead.
Note: we assume the year of publication on YouTube corresponds to the year of release.
Of course, that's not 100% true. For example, artists can publish a video months after the release, or publish nothing on YouTube at all.
So this method will work for a wide range of songs, but not all of them. With recent bands and songs, it should be OK.
To do this, you can use the YouTube API or IMPORTXML formulas. In both cases, we always take the first search result (sorted by relevance) as the source.
You need an API key and an ImportJSON script (credits to Brad Jasper) to use the API method. Once you have installed the script and activated your API key, you can paste in cell B3:
="https://www.googleapis.com/youtube/v3/search?key={yourAPIKey}&part=snippet&type=video&filter=items&regionCode=FR&q="&ENCODEURL(A3)
We generate the url to query with the content you input in column A.
We use "regionCode=FR" since some songs are not available in the US ("i need you FMLYBND"). That way we get the correct release date.
In C3, you can paste:
=LEFT(QUERY(ImportJSON(B3);"SELECT Col11 LIMIT 1 label Col11''";1);4)
We parse the JSON, select the column of interest, the line of interest, then we clean the result.
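If you want to check the API route outside of Sheets, here is a rough Python sketch of the same two steps: build the search URL, then keep the first four characters of the first result's publication date. The key `YOUR_API_KEY`, the helper names, and the sample response are placeholders made up for illustration, so no network call is made. Only the `snippet.publishedAt` field is meant to mirror what a search result carries.

```python
import json
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://www.googleapis.com/youtube/v3/search"


def build_search_url(query, api_key, region="FR"):
    """Build the search URL; urlencode escapes spaces and special characters."""
    params = {
        "key": api_key,
        "part": "snippet",
        "type": "video",
        "regionCode": region,
        "q": query,
    }
    return SEARCH_ENDPOINT + "?" + urlencode(params)


def first_result_year(response_text):
    """Return the publication year of the first result, or None if there is none."""
    items = json.loads(response_text).get("items", [])
    if not items:
        return None
    # publishedAt is an ISO 8601 timestamp, so the year is the first 4 characters.
    return items[0]["snippet"]["publishedAt"][:4]


# Made-up response with the same shape as a search result (assumption).
SAMPLE_RESPONSE = (
    '{"items": [{"snippet": {"publishedAt": "2013-06-11T15:00:02Z", '
    '"title": "placeholder"}}]}'
)

url = build_search_url("i need you FMLYBND", "YOUR_API_KEY")
year = first_result_year(SAMPLE_RESPONSE)
print(url)
print(year)
```

Taking only the first item matches the "first result by relevance" assumption above, so it inherits the same limitation: the first hit is not always the original release.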
With the IMPORTXML method, you can paste in E3:
="https://www.youtube.com"&IMPORTXML("https://www.youtube.com/results?search_query="&A3;"(//div[@class='yt-lockup-thumbnail contains-addto'])[3]/a/@href")
We construct the url with the first search result of the search engine.
In F3, you can paste:
=LEFT(IMPORTXML(E3;"//meta[@itemprop='datePublished']/@content");4)
We parse the previously built url, then we extract the year of publication.
As you can see, there's a difference in the results on line 5. That's because the song is not available in the US. The first result returned in the IMPORTXML method is different from the one of the API method which uses a "FR" flag.
Side note: I'm based in Europe, so my formulas use ";" as the argument separator. Depending on your locale, you may need to replace it with ",".
Google does not support scraping Google Search results into Google Sheets. This option was disabled 2 years ago. You will need to use an alternative search engine.
I'm pretty new to this and am trying to pull a value from a website into Sheets using an XPath.
Url: "https://www.btcmarkets.net/"
Xpath: (from chrome copy xpath function) : //*[@id="LastPriceAUDBTC"]
I keep getting
formula parse error
I have managed to get the table headings on with:
Xpath: "//tr"
but not the information within
Is this even possible?
I know the google finance add-ons but I am analyzing the difference in prices of different exchanges.
QUERY #2
I would also like to
=importxml("http://www.xe.com/currencyconverter/convert/?Amount=1&From=EUR&To=CAD","//*[@id="ucc-container"]/span[2]/span[2]")
Should I be using =importDATA and shaving off what I don't want?
You need to use double quotes around the entire XPath but single quotes around the attribute value (the class name, id, and so on):
"//*[@id='LastPriceAUDBTC']"
And
=importxml("http://www.xe.com/currencyconverter/convert/?Amount=1&From=EUR&To=CAD","//*[@id='ucc-container']/span[2]/span[2]")
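If you generate these formulas from a script, the quoting rule above can be sketched in a few lines of Python. The helper name `importxml_formula` is made up for this example. It simply swaps the double quotes that Chrome's "Copy XPath" produces for single quotes, which assumes the attribute values themselves contain no apostrophes.

```python
def importxml_formula(url, xpath):
    """Return an =IMPORTXML(...) formula string that Sheets can parse."""
    # Double quotes delimit the Sheets string, so the XPath inside it
    # must use single quotes around attribute values.
    # Assumption: no attribute value contains an apostrophe.
    safe_xpath = xpath.replace('"', "'")
    return f'=IMPORTXML("{url}", "{safe_xpath}")'


# XPath exactly as copied from Chrome, with double quotes around the id.
copied_xpath = '//*[@id="LastPriceAUDBTC"]'
formula = importxml_formula("https://www.btcmarkets.net/", copied_xpath)
print(formula)
```

This only fixes the "formula parse error". Whether the page actually serves that node in its static HTML is a separate question.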
I am trying to scrape data from https://www.snpedia.com/index.php/Rs7136259 to create an automated database of genomic information using google sheets.
I would like to retrieve the odds ratio contained in a table on the page. I have tried to figure out the XPath, but nothing I do works. I copied as XPath from InspectElement but that's returning a #N/A error. The information I am trying to scrape is the "Odds Ratio".
My current query:
=importxml(J2,"//*div[@id="mw-content-text"]/table/tr[7]/td")
Thanks for your input. I have searched the other links but could not figure it out. Sorry for being so green.
As noted in the comments, *div is not valid XPath. Another problem is that you have double quotes inside of double quotes, which is also invalid.
It looks like this works:
=importxml(J2,"//*[@id='mw-content-text']/table/tr[7]/td")
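To sanity-check an expression like this before pasting it into a sheet, you can run it against a small document locally. Below is a minimal sketch using Python's standard `xml.etree.ElementTree`, which supports only a subset of XPath and needs a leading `.` on the path. The markup is a made-up stand-in with seven table rows, not the real SNPedia page, and the cell text is invented.

```python
import xml.etree.ElementTree as ET

# Invented stand-in document: a div with the target id wrapping a 7-row table.
rows = "".join(f"<tr><td>row {i}</td></tr>" for i in range(1, 8))
doc = (
    "<html><body><div id='mw-content-text'>"
    f"<table>{rows}</table>"
    "</div></body></html>"
)
root = ET.fromstring(doc)

# Same shape as the working formula: any element with that id,
# then its table child, the 7th row, and that row's cells.
cells = root.findall(".//*[@id='mw-content-text']/table/tr[7]/td")
texts = [cell.text for cell in cells]
print(texts)
```

Real pages are rarely well-formed XML, so this only checks the shape of the expression, not what IMPORTXML will see on the live site.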
I tried to use this formula
=ImportXML("http://www.google.com/search?q=philadelphia seo company&num=100", "//h3[@class='r']/a/@href")
from http://www.seerinteractive.com/blog/importxml-cookbook/
and I get a formula error. Do you need to enable something in Google Spreadsheets before using this formula?
You need to encode the search query part, "q=philadelphia seo company", meaning all the spaces should be converted to "%20".
The end result should look like this:
=ImportXML("http://www.google.com/search?q=philadelphia%20seo%20company&num=100", "//h3[@class='r']/a/@href")
Also, I use IMPORTXML all the time with Google search results. You can also use "//cite", depending on how much of the URL you want.
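The encoding step can be checked outside of Sheets too. Here is a small Python sketch using `urllib.parse.quote` from the standard library. The function name `google_search_url` is made up for the example.

```python
from urllib.parse import quote


def google_search_url(terms, num=100):
    """Build the search URL with the query percent-encoded (spaces become %20)."""
    return f"http://www.google.com/search?q={quote(terms)}&num={num}"


url = google_search_url("philadelphia seo company")
print(url)
```

Inside a sheet, wrapping the search terms in ENCODEURL before concatenating them into the URL should have the same effect.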
I continue to get this error when I try to run this XPath query
//div[@iti='0']
on this link (flight search from google)
https://www.google.com/flights/#search;f=LGW;t=JFK;d=2014-05-22;r=2014-05-26
I get something like this:
=ImportXML("https://www.google.fr/flights/#search;f=jfk;t=lgw;d=2014-02-22;r=2014-02-26";"//div[@iti='0']")
I verified and the XPath is correct (I get the answer wanted using XPath helper, the answer wanted are the data relative to the first flight selected).
I guess that it is a problem of syntax, but I tried more or less all the combinations of lower/uppercase, punctuation (replacing ; , ' ") and I tried to link the URI and the XPath query stored in cells, but nothing works.
Any help will be appreciated.
As a matter of fact, maybe it is a bug in the new Google Sheets, or they have changed how the function works. I've activated mine, and when I try to use ImportXML it simply won't work. I have some old sheets here (on the old mechanism) and they still work normally. If I copy and paste the formula from the old one to the new one, it simply doesn't get any data.
Here is an example:
=ImportXML("http://www.nytimes.com/pages/todayspaper/index.html";"//div[@class='columnGroup first']//h3")
If I run this on the old mechanism it works fine, but if I run the same on the new mechanism, first it will exchange my ";" for a "," and then it will bring a "#N/A" with a warning of "Error: Imported XML content cannot be parsed".
Edit (05/05/2015):
I am happy to say that I tested this function again today on the new spreadsheets and they've fixed it. I was checking that every two months and now finally they have solved this issue. The example I've added above is now returning information.
I'm sorry, but you won't be able to easily parse Google result pages. The reason your function throws an error is because the content of the page you see in your browser is generated by javascript, and Google spreadsheet doesn't execute js.
Your ImportXML has the right syntax, it doesn't return anything because the node you're looking for isn't there (importXML Parse Error).
You will have to find another source if you want these results in your spreadsheet. For info, some libraries already parse the usual result page (http://www.seerinteractive.com/blog/google-scraper-in-google-docs-update for example, if it still works), but I doubt finding one for your special case will be easy.
This gives the answer (importXML Parse Error), but it's not entirely obvious.
ImportXML doesn't load Javascript. When you're building ImportXML queries on Google results, make sure you're testing against a version of the page that has Javascript turned off. You can do this using the Chrome DevTools.
(But I agree that ImportXML is fickle, idiosyncratic, and generally rage-inducing).
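To see why a node that shows up in the browser can be missing for ImportXML, here is a small Python sketch using the standard `html.parser` module. Both HTML snippets are invented for illustration: the first stands in for the raw page source, where the target `div` exists only inside a script string, and the second stands in for the DOM after the browser has run that script.

```python
from html.parser import HTMLParser


class AttrFinder(HTMLParser):
    """Count start tags that carry a given attribute value."""

    def __init__(self, tag, attr, value):
        super().__init__()
        self.target = (tag, attr, value)
        self.matches = 0

    def handle_starttag(self, tag, attrs):
        want_tag, want_attr, want_value = self.target
        if tag == want_tag and dict(attrs).get(want_attr) == want_value:
            self.matches += 1


def count_matches(html, tag, attr, value):
    finder = AttrFinder(tag, attr, value)
    finder.feed(html)
    finder.close()
    return finder.matches


# Raw source (made up): the div is only text inside a <script> block.
RAW_HTML = """<html><body>
<div id="results"></div>
<script>
document.getElementById("results").innerHTML = "<div iti='0'>first flight</div>";
</script>
</body></html>"""

# What the browser would show after running the script (made up).
RENDERED_HTML = "<div id=\"results\"><div iti='0'>first flight</div></div>"

static_count = count_matches(RAW_HTML, "div", "iti", "0")
rendered_count = count_matches(RENDERED_HTML, "div", "iti", "0")
print(static_count, rendered_count)
```

A parser that does not execute scripts finds nothing in the raw source, which is the situation ImportXML is in. An XPath helper in the browser works on the rendered DOM, so it can report a match that ImportXML never sees.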