Google Sheets IMPORTXML fails for ASX data

I am trying to extract the "Forward Dividend & Yield" value from https://finance.yahoo.com/ for multiple companies in different markets, into Google Sheets.
This is successful:
=IMPORTXML("https://finance.yahoo.com/quote/WBS", "//*[@id='quote-summary']/div[2]/table/tbody/tr[6]/td[2]")
But this fails with #N/A:
=IMPORTXML("https://finance.yahoo.com/quote/CBA.AX", "//*[@id='quote-summary']/div[2]/table/tbody/tr[6]/td[2]")
I cannot work out what needs to be different for ASX ticker codes. Why does CBA.AX cause a problem?
Huge thanks for any help

When I tested the formula =IMPORTXML("https://finance.yahoo.com/quote/CBA.AX", "//*"), it returned the error "Resource at url not found." I think this is the cause of your issue.
Fortunately, when I tried to retrieve the HTML from the same URL using Google Apps Script, the HTML could be retrieved. So in this answer I propose retrieving the value with a custom function created in Google Apps Script. The sample script is as follows.
Sample script:
Please copy and paste the following script into the script editor of the Google Spreadsheet and save it. Then put the formula =SAMPLE("https://finance.yahoo.com/quote/CBA.AX") in a cell. The value is then retrieved.
function SAMPLE(url) {
  const res = UrlFetchApp.fetch(url).getContentText().match(/DIVIDEND_AND_YIELD-value.+?>(.+?)</);
  return res && res.length > 1 ? res[1] : "No value";
}
Result:
When the above script is used, the retrieved value appears in the cell.
Note:
When this script is used, you can also use =SAMPLE("https://finance.yahoo.com/quote/WBS").
Note that if the HTML structure of the page changes, this script might stop working. The same applies to IMPORTXML with an XPath, so please be careful about this.
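Since the match depends on the page markup, it may help to keep the extraction step separate from the fetch so it can be checked against saved HTML. The following is only a sketch: the function names extractDividendAndYield and SAMPLE_SAFE are hypothetical, and the regular expression is the one from the script above.

```javascript
// Sketch: the extraction step from SAMPLE, separated from the fetch.
// extractDividendAndYield and SAMPLE_SAFE are hypothetical names.
function extractDividendAndYield(html) {
  const res = html.match(/DIVIDEND_AND_YIELD-value.+?>(.+?)</);
  return res && res.length > 1 ? res[1] : "No value";
}

function SAMPLE_SAFE(url) {
  // UrlFetchApp exists only inside Google Apps Script.
  // muteHttpExceptions returns the response instead of throwing on HTTP errors.
  const response = UrlFetchApp.fetch(url, { muteHttpExceptions: true });
  if (response.getResponseCode() !== 200) {
    return "No value";
  }
  return extractDividendAndYield(response.getContentText());
}
```

In a cell this would be used like the original, for example =SAMPLE_SAFE("https://finance.yahoo.com/quote/CBA.AX").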
References:
Custom Functions in Google Sheets
Class UrlFetchApp

Another solution is to decode the JSON contained in the source of the web page. You can't use IMPORTXML here because the page is built on the client side by JavaScript, not on the server side. You can access the data this way and get a lot of information:
var source = UrlFetchApp.fetch(url).getContentText()
var jsonString = source.match(/(?<=root.App.main = ).*(?=}}}})/g) + '}}}}'
For example, for the value you are looking for, you can use:
function trailingAnnualDividendRate(){
  var url='https://finance.yahoo.com/quote/CBA.AX'
  var source = UrlFetchApp.fetch(url).getContentText()
  // the lookahead stops before the last '}}}}', so it is appended again
  var jsonString = source.match(/(?<=root.App.main = ).*(?=}}}})/g) + '}}}}'
  var data = JSON.parse(jsonString)
  var dividendRate = data.context.dispatcher.stores.QuoteSummaryStore.summaryDetail.trailingAnnualDividendRate.raw
  Logger.log(dividendRate)
}
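The same extraction can be sketched without a regular expression, which avoids lookbehind (not supported by older JavaScript engines) and the array-to-string concatenation. This assumes the page still embeds a single-line "root.App.main = {...}" assignment ending in "}}}}" as described above; extractRootAppMain is a hypothetical name.

```javascript
// Sketch: pull the JSON assigned to root.App.main out of the page source.
// Returns the parsed object, or null when the marker is not found.
function extractRootAppMain(source) {
  const marker = "root.App.main = ";
  const start = source.indexOf(marker);
  if (start === -1) {
    return null;
  }
  const from = start + marker.length;
  // Like the regex, only look at the rest of the marker's line.
  const lineEnd = source.indexOf("\n", from);
  const line = lineEnd === -1 ? source.slice(from) : source.slice(from, lineEnd);
  // Keep everything up to and including the last '}}}}' on that line.
  const close = line.lastIndexOf("}}}}");
  if (close === -1) {
    return null;
  }
  return JSON.parse(line.slice(0, close + 4));
}
```

With UrlFetchApp this would be called as extractRootAppMain(UrlFetchApp.fetch(url).getContentText()), and the dividend rate read from the same path as in the function above.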

How to edit the data in CK editor that comes from API?

I post data to an API and want to edit that data after getting it back from the API. When I try to edit the data, I get the following error:
CKEditorError: datacontroller-set-non-existent-root: Attempting to set data on a non-existing root. Read more: https://ckeditor.com/docs/ckeditor5/latest/framework/guides/support/error-codes.html#error-datacontroller-set-non-existent-root
<CKEditor
editor={ClassicEditor}
onChange={this.handleChange}
data={html}
></CKEditor>
Here is a workaround for now; I don't know the actual cause of the error:
Create a variable:
let a = "";
Replace the content of the variable a with the content coming from the API, parsed with htmlparser:
let data = a.replace("", htmlparser(/*data coming from your api*/))
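The cause of the error is not confirmed here. One guess is that data is not yet a string when the editor first renders (for example undefined while the API call is still pending); the workaround always yields a string, because String.prototype.replace returns one. Under that assumption only, a more direct sketch would be a small guard. toEditorData is a hypothetical helper, not part of CKEditor.

```javascript
// Hypothetical helper: always hand the editor a string.
// Anything that is not a string (undefined, null, objects) becomes "".
function toEditorData(value) {
  return typeof value === "string" ? value : "";
}
```

It would be used as data={toEditorData(html)} in the component above.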

Setting Google Photos date

I uploaded many photos with no EXIF data, but with their date in the name. Google Photos used the upload date to sort them. I'd like to use the date in their names to correct this.
So far I have tried using the Drive API to change the modification date; I can change it, but Google Photos does not use it. I also tried to modify imageMediaMetadata.date, but it seems to be read-only.
Code:
function myFunction() {
  var files = DriveApp.getFilesByName("IMG-20150402-WA0002_1.jpg");
  while (files.hasNext()) {
    var file = files.next();
    var name = file.getName().toUpperCase();
    if (name.indexOf("-WA") > -1) {
      if (name.indexOf("IMG-20") == 0 || name.indexOf("VID-20") == 0) {
        // Names look like IMG-YYYYMMDD-WAnnnn, so the date starts at index 4.
        var y = name.substr(4, 4);
        var m = name.substr(8, 2);
        var d = name.substr(10, 2);
        var file2 = Drive.Files.get(file.getId());
        file2.imageMediaMetadata.date = y+"-"+m+"-"+d+"T12:00:00.000Z";
        var file3 = Drive.Files.patch(file2, file.getId());
        Logger.log(name + " no ok " + file3.imageMediaMetadata.date); // same as file2
      }
    }
  }
}
I could delete them, modify the original files and re-upload, but before that I'd like to be sure there is no other way.
Thank you.
Perhaps you could programmatically write an EXIF header to the files?
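For that approach, the date to write would come from the filename. A minimal sketch of the parsing step, assuming names of the form IMG-YYYYMMDD-WAnnnn or VID-YYYYMMDD-WAnnnn as in the question (this pattern is slightly stricter than the indexOf checks there). dateFromWhatsAppName is a hypothetical name, and the noon time is the same placeholder the question uses. The exif value follows the "YYYY:MM:DD HH:MM:SS" layout that EXIF date fields use; actually writing it into the file needs an EXIF library, which is not shown.

```javascript
// Sketch: derive a date from a WhatsApp-style filename.
// Returns null when the name does not match the expected pattern.
function dateFromWhatsAppName(name) {
  const m = /^(?:IMG|VID)-(\d{4})(\d{2})(\d{2})-WA\d+/i.exec(name);
  if (!m) {
    return null;
  }
  const [, y, mo, d] = m;
  return {
    iso: `${y}-${mo}-${d}T12:00:00.000Z`, // same format as in the question
    exif: `${y}:${mo}:${d} 12:00:00`,     // EXIF-style date string
  };
}
```

For example, dateFromWhatsAppName("IMG-20150402-WA0002_1.jpg").iso gives "2015-04-02T12:00:00.000Z".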
I am also looking for a convenient way to supply a photo date when uploading old photos using the new-ish Google Photos API. My photos do not necessarily have EXIF data. I tried setting the creation/last-modified date on one of my JPEGs on my macOS machine's disk, then manually uploading it via the Google Photos web interface; the date of the JPEG file on local disk became the photo's date in Google Photos, as expected. The file has no EXIF data, and if it did, it would not contain that same date, so apparently the Google Photos web uploader respects the local filesystem date for the photo.
I then tried to sniff the traffic using Charles Proxy, but apparently the web interface does not use the Google Photos API the same way that we external developers would: it doesn't POST to https://photoslibrary.googleapis.com/v1/uploads, or so it seems. So I couldn't reverse engineer that process, and I couldn't see where the file creation date was passed in.
What would be great is an HTTP header in the upload POST request to set this date. I don't see any means of setting it in the batchCreate method (https://developers.google.com/photos/library/reference/rest/v1/mediaItems/batchCreate).

Issues in Updating Metadata while Generating PDF

I am working on an ExtendScript script that saves a FrameMaker book as a PDF. The script is able to save the PDF, but when I try to add PDF metadata (Author, CreationDate, Keywords, Subject, Title, etc.), it is not reflected in the generated PDF.
On closer inspection I found that the metadata elements were not added to the PDFDocInfo property of the book.
Here is the code I wrote to update the author details in PDFDocInfo:
$.writeln("Length before" + doc.PDFDocInfo.length);
doc.PDFDocInfo.push("Author");
doc.PDFDocInfo.push("Mr Bond");
$.writeln("Length after" + doc.PDFDocInfo.length);
where doc is an object of type Book.
The output is:
Length before0
Length after0
Shouldn't PDFDocInfo have 2 elements in it now? Am I missing anything here?
The following code did the trick...
var pdfDocInfo = new Strings();
pdfDocInfo.push("Author");
pdfDocInfo.push("Mr Bond");
book.PDFDocInfo = pdfDocInfo;

Best way to get definitions out of Google?

I'm trying to make a simple feature where a user can specify a term and the program fetches a definition for it and returns it. The best definition system I know of is Google's "define" keyword in search queries: if you start the query with "define " or "define:" etc., it returns very accurate and sufficient definitions. However, I have no idea how to access this information programmatically.
Google's new Custom Search Engine API doesn't show definitions and the old one gives slightly better results but is deprecated and still doesn't show the same definitions I see when I Google the term in the browser.
Failing Google, I turned to Wikipedia, which has a huge API but I still couldn't find a way to extract summaries like Google definitions.
So my question is, does anybody know how I can get this information out of Google via the API or any other means?
There is an older question asking the same thing, but the answers given are no longer applicable as Google Dictionary no longer exists.
Update: So I'm now going down the route of trying to scrape the definitions straight out of the page itself. Now the problem is, when I visit the page in the browser (Firefox), the definitions show up, but when I'm scraping them using cheerio, they don't show up anywhere on the page. I must mention I'm scraping the page through nitrous.io so it's rendering the page from a different region and operating system to the one I'm viewing it in the browser with so maybe it's region related. Will look into it further.
Update 2.0: I think maybe the definitions are loaded asynchronously and so I have no idea how to scrape them because I've never really done scraping before and I'm just a newbie :(
Update 3.0: Ok, so now I think it's not to do with the asynchronous loading but the renderer of the page. When I load this in Firefox, the page looks like this:
However, when I load it in IE (8) it looks like this:
Anybody got some insight on this?
Finally got to the answer. Had to set user agent when screen scraping. My resulting code for getting definitions via scraping:
var request = require('request')
  , cheerio = require('cheerio');
var searchTerm = 'test';
request({url:'https://www.google.co.uk/search?q=define+'+searchTerm,headers:{"User-Agent":"Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 Firefox/24.0"}}, function(err, resp, body){
  if (err) { console.error(err); return; } // body is undefined when the request fails
  var $ = cheerio.load(body);
  var defineBlocks = $(".lr_dct_sf_sen");
  var numOfBlocks = (defineBlocks.length < 3) ? defineBlocks.length : 3;
  for (var i=0; i<numOfBlocks; i++){
    var block = defineBlocks[i].children[1].children[0]; //font-size:small level
    process(block);
  }
  function process (block) {
    for (var i=0; i<block.children.length; i++){
      var line = block.children[i];
      if ("style" in line.attribs){ // main text
        var exampleStr = "";
        for (var k=0; k<line.children.length; k++){
          exampleStr += line.children[k].children[0].data;
        }
        console.log(exampleStr);
      } else if ("class" in line.attribs){ // example
        console.log("\""+line.children[1].children[0].data+"\"");
      } else { // nothing i want
      }
    }
  }
});
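One detail the code above leaves open is search terms containing spaces or characters such as "&", since the term is concatenated into the URL as-is. A small sketch of building the request options with the term URL-encoded; buildDefineRequest is a hypothetical helper, and the User-Agent string is the one from the answer.

```javascript
// Sketch: build the options object passed to request(), with the search
// term URL-encoded so spaces and '&' do not break the query string.
function buildDefineRequest(searchTerm) {
  return {
    url: "https://www.google.co.uk/search?q=" +
      encodeURIComponent("define " + searchTerm),
    headers: {
      "User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 Firefox/24.0"
    }
  };
}
```

It would replace the inline object, as in request(buildDefineRequest(searchTerm), function(err, resp, body){ ... }).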

Google Spreadsheet API - returns remote 500 error

Has anyone battled 500 errors with the Google spreadsheet API for google domains?
I have copied the code in this post (2-legged OAuth): http://code.google.com/p/google-gdata/source/browse/trunk/clients/cs/samples/OAuth/Program.cs, substituted in my domain's API ID and secret and my own credentials, and it works.
So it appears my domain setup is fine (at least for the contacts/calendar apis).
However swapping the code out for a new Spreadsheet service / query instead, it reverts to type: remote server returned an internal server error (500).
var ssq = new SpreadsheetQuery();
ssq.Uri = new OAuthUri("https://spreadsheets.google.com/feeds/spreadsheets/private/full", "me", "mydomain.com");
ssq.OAuthRequestorId = "me@mydomain.com"; // can do this instead of using OAuthUri for queries
var feed = ssservice.Query(ssq); //boom 500
Console.WriteLine("ss:" + feed.Entries.Count);
I are befuddled
I had to make sure to use the "correct" class:
not
//using SpreadsheetQuery = Google.GData.Spreadsheets.SpreadsheetQuery;
but
using SpreadsheetQuery = Google.GData.Documents.SpreadsheetQuery;
stinky-malinky
It seems you need the Google Docs API to query for spreadsheets, but the Spreadsheets API to query inside a spreadsheet. Nowhere on the internet until now will you find this undeniably important tidbit. Google sucks hard on that one.
